dataspec.dev
Independent Consultant · Remote & On-Site

Senior Data Engineering Consultant & Architect

Scalable data pipelines. Engineered for impact.

Designing and implementing data pipelines, optimizing existing architectures, and integrating AI/LLM technologies for batch and streaming workloads.

// What I Do

Engineering Services

End-to-end data engineering solutions tailored to your business needs

Data Pipeline Design & Implementation

Architect and build end-to-end data pipelines for both batch and streaming workloads. From ingestion to transformation to delivery, every stage is designed for reliability and scale.

ETL/ELT · Batch · Streaming · Airflow

Pipeline Optimization & Modernization

Revise, refactor, and optimize existing data pipelines for better performance and lower cost. Migrate legacy systems to modern cloud-native architectures.

Performance · Cost Reduction · Migration

Data Warehousing & Lakehouse

Design and implement scalable data warehouses and lakehouse architectures using Medallion patterns for reliable, queryable, and maintainable data.

Medallion · Databricks · Data Mart
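
For readers unfamiliar with the Medallion pattern, the layering can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the sample records, the field names, and the helpers `to_silver` and `to_gold` are hypothetical, and a real lakehouse would use tables on a platform such as Databricks rather than in-memory lists.

```python
from collections import defaultdict

# Bronze: raw records kept exactly as they arrived (hypothetical sample data).
bronze = [
    {"order_id": "1", "region": "EU", "amount": "10.50"},
    {"order_id": "1", "region": "EU", "amount": "10.50"},  # duplicate delivery
    {"order_id": "2", "region": "US", "amount": "n/a"},    # unparseable amount
    {"order_id": "3", "region": "US", "amount": "4.25"},
]


def to_silver(rows):
    """Silver: typed, validated, deduplicated rows."""
    seen = set()
    silver_rows = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount cannot be parsed
        if row["order_id"] in seen:
            continue  # keep only the first copy of each order
        seen.add(row["order_id"])
        silver_rows.append(
            {"order_id": row["order_id"], "region": row["region"], "amount": amount}
        )
    return silver_rows


def to_gold(rows):
    """Gold: an aggregate ready for reporting (total amount per region)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Keeping the raw bronze records untouched means the silver and gold layers can be rebuilt whenever the cleaning or aggregation rules change.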

AI & LLM Integration

Integrate the latest AI, LLM, and GenAI technologies into your data platform, from embedding models in pipelines to building intelligent data products.

GenAI · LLM · ML Pipelines
// About Me

Engineering Data Solutions at Enterprise Scale

I'm Arnob Kumar Dey, an independent Senior Data Engineering Consultant with over 10 years of experience helping organizations transform their data infrastructure.

I hold an M.Tech from BITS Pilani and specialize in cloud-native data architectures, particularly the Medallion Architecture pattern. I work with Fortune 500 companies and fast-growing startups alike, designing pipelines that are reliable, scalable, and cost-effective.

10+ Years Experience
50+ Projects Delivered
5+ Fortune 500 Clients
// Technical Skills

My Tech Stack

Battle-tested tools and platforms I use to deliver production-grade solutions

AWS
PySpark
Python
SQL
Terraform
Airflow
Databricks
ETL/ELT
AI / LLM
Medallion Architecture
// Case Studies

Featured Projects

Enterprise-scale data solutions that drive business value

Data Modernization · Automotive

Recall Modernization & Data Mart Development

S&P Global

A legacy system hosted on VMware needed modernization to scale. I migrated the end-to-end system to a managed AWS environment and architected a centralized "Recall Data Mart".

Migrated legacy VMware system to managed AWS
Architected centralized Recall Data Mart
Optimized query performance with AWS Glue & Apache Iceberg
AWS Glue · Apache Iceberg · Data Mart

Cloud-Native · Aviation

OE Scheduling Optimizer

Delta Air Lines

Manual pilot scheduling needed automation to ensure strict compliance with FAA regulations. I led the design of a cloud-native scheduling optimizer using AWS CDK, Lambda for orchestration, and SageMaker for model deployment.

Automated pilot scheduling with FAA compliance
Cloud-native architecture with AWS CDK
ML model deployment via SageMaker
AWS CDK · Lambda · SageMaker

Data Quality · Healthcare

Data Validation Framework

Internal Project

Designed a custom framework using Great Expectations to enforce data quality standards across pipelines handling FHIR healthcare data, ensuring compliance and reliability.

Custom Great Expectations framework
FHIR healthcare data compliance
Automated data quality enforcement
Great Expectations · FHIR · Data Quality
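
To give a flavor of what a data quality rule for FHIR data can look like, here is a minimal sketch in plain Python. It does not use the Great Expectations API and is not the framework described above. The function `validate_patient` and its rules are hypothetical: requiring `id` and a full `YYYY-MM-DD` `birthDate` are assumed pipeline rules that are stricter than the FHIR specification, while the gender codes are the four values FHIR defines for `Patient.gender`.

```python
from datetime import date

# The four administrative gender codes defined by FHIR for Patient.gender.
ALLOWED_GENDERS = {"male", "female", "other", "unknown"}


def validate_patient(resource):
    """Return a list of human-readable failures for one Patient resource.

    The rules below are illustrative assumptions, not an official profile.
    """
    failures = []
    if resource.get("resourceType") != "Patient":
        failures.append("resourceType must be 'Patient'")
    if not resource.get("id"):
        # Assumed pipeline rule: downstream joins need a stable identifier.
        failures.append("id is required")
    gender = resource.get("gender")
    if gender is not None and gender not in ALLOWED_GENDERS:
        failures.append(f"gender {gender!r} is not an allowed code")
    birth_date = resource.get("birthDate")
    if birth_date is not None:
        try:
            # Assumed pipeline rule: only full calendar dates are accepted.
            date.fromisoformat(birth_date)
        except (TypeError, ValueError):
            failures.append(f"birthDate {birth_date!r} is not a valid date")
    return failures


good = {"resourceType": "Patient", "id": "p1", "gender": "female", "birthDate": "1980-05-17"}
bad = {"resourceType": "Patient", "gender": "F", "birthDate": "1980-13-01"}

good_failures = validate_patient(good)
bad_failures = validate_patient(bad)
```

Returning a list of failures instead of raising on the first one lets a pipeline report every problem with a record in a single pass.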
// How I Work

My Process

A proven approach to delivering data engineering solutions

1. Discovery

Understand your data landscape, existing infrastructure, and business objectives.

2. Architecture

Design scalable, cost-effective solutions aligned with your tech stack and goals.

3. Implementation

Build, test, and deploy with CI/CD best practices and thorough documentation.

4. Optimization

Monitor performance, tune for efficiency, and iterate for continuous improvement.

// Credentials

Professional Certifications

Verified expertise in cloud and data engineering technologies

AWS Certified Developer Associate

Expertise in AWS services, deployment, and security best practices

Databricks Certified Data Engineer Associate

Proficiency in Databricks, Apache Spark, and lakehouse architecture

// Get In Touch

Start a Conversation

Have a data engineering challenge or project in mind? Let's discuss how I can help.