Data Engineer

SPIRO

Pune City | Full-time | Mid Level | On-site

Job Description

Data Engineer - Data Platform Engineering

We are looking for a highly skilled Data Engineer to design, build, and scale a modern cloud-native Data Platform supporting batch, streaming, analytics, and real-time data use cases across the organization. The ideal candidate should have strong expertise in distributed data systems, platform engineering, cloud infrastructure, and building reliable, scalable, and observable data ecosystems.

Key Responsibilities

- Design and develop scalable enterprise data platform architecture on AWS
- Build reusable frameworks for data ingestion, transformation, orchestration, and consumption
- Develop batch and real-time data pipelines using distributed processing technologies
- Build and maintain streaming platforms for event-driven architectures
- Create standardized data lake/lakehouse foundations for analytics and downstream applications
- Implement metadata management, schema evolution, lineage, governance, and access control
- Develop platform capabilities for data reliability, observability, monitoring, and alerting
- Enable self-service data access and scalable data consumption layers
- Build CDC and incremental ingestion frameworks from databases and external systems
- Optimize platform scalability, performance, fault tolerance, and cloud cost
- Automate deployments, infrastructure provisioning, and CI/CD for data workloads
- Collaborate with Analytics, Product, and Engineering teams to onboard new data domains
- Ensure platform security, compliance, and operational excellence

Technical Skills

Core Technologies

Strong expertise in:
- Apache Spark
- Spark Streaming
- Apache Flink
- SQL
- Python
- Distributed data processing systems

Streaming & Event Platforms

Hands-on experience with:
- Apache Kafka
- Kafka Connect / Kafka Streams
- Real-time event-driven architectures

AWS Data Platform

Strong experience with:
- Amazon Web Services
- AWS Glue
- AWS Athena
- AWS DMS
- Amazon S3
- Amazon CloudWatch
- Understanding of IAM, networking, security, and cloud-native architecture

Data Platform Experience

Experience building:
- Data lakes / lakehouse platforms
- Batch and streaming ingestion frameworks
- Metadata-driven ETL systems
- Data catalog and lineage systems
- Multi-tenant data platforms
- Data reliability and observability frameworks
- Data quality validation pipelines
- Scalable consumption layers for analytics and APIs

Strong understanding of:
- Data partitioning and file optimization
- CDC architecture and incremental processing
- Schema registry and schema evolution
- High availability and fault-tolerant systems
- Data governance and access management

Good to Have

Experience with:
- Apache Airflow
- dbt
- Delta Lake
- Apache Iceberg
- Docker / Kubernetes
- Terraform / Infrastructure as Code
- CI/CD pipelines and DevOps practices

Functional Expectations

- Strong system design and architecture skills
- Ability to build reusable and scalable platform components
- Strong debugging and production support capabilities
- Excellent communication and stakeholder collaboration
- Ownership mindset toward platform stability, scalability, and reliability
- Ability to drive platform modernization initiatives independently

Experience

- 3+ years of experience in Data Engineering / Data Platform Engineering
- Experience building enterprise-scale cloud data platforms preferred
- Prior experience with high-volume real-time systems is a plus

Posted 1 week ago
