Data Engineer

Apexon

Bengaluru | Full-time | Mid Level | On-site

Job Description

Responsibilities & Duties

- Lead the design, development, and scaling of data platforms and pipelines using Snowflake on AWS
- Build and optimize data ingestion frameworks using StreamSets and AWS S3 for structured, semi-structured, and unstructured data
- Develop and manage robust ETL/ELT pipelines leveraging AWS services (S3, Glue, Lambda, DMS) and Python-based data processing
- Design and enhance database schemas (MySQL/PostgreSQL) with a focus on performance, scalability, and maintainability
- Collaborate with development teams to optimize queries and schema design for new features prior to production deployment
- Support and implement data lake and data warehouse architectures, ensuring efficient data flow into Snowflake
- Lead database migration initiatives (e.g., MySQL to PostgreSQL) with minimal downtime and high data integrity
- Write and maintain Python (or Go) scripts for data ingestion, transformation, and automation
- Proactively identify improvements to enhance data platform scalability, resiliency, and performance
- Troubleshoot production issues, perform root cause analysis, and ensure rapid recovery with no data loss
- Implement data security and governance best practices including access control, encryption, and compliance standards
- Work closely with cross-functional teams (engineering, analytics, business) to deliver scalable data solutions
- Mentor junior engineers and contribute to best practices, coding standards, and architectural decisions
- Maintain technical documentation, data flow diagrams, and operational procedures

Required Qualifications & Skills

- Bachelor’s degree in Computer Science, Engineering, or related field
- 6-8 years of experience in data engineering, ETL development, or data platform engineering
- Strong hands-on experience with Snowflake (data modeling, performance tuning, virtual warehouses)
- Deep expertise in AWS ecosystem, especially AWS S3, Glue, Lambda, DMS, and IAM
- Proven experience with StreamSets for data ingestion and pipeline orchestration
- Strong proficiency in Python for data engineering, automation, and pipeline development
- Advanced experience in SQL, schema design, and performance optimization (MySQL/PostgreSQL)
- Solid understanding of data lakes, data warehouses, and ETL/ELT best practices
- Experience with Apache tools such as Airflow, Kafka, or Superset
- Familiarity with Linux systems, networking basics, and system performance tuning
- Strong problem-solving skills with experience in incident management and root cause analysis
- Excellent communication and collaboration skills

Preferred Qualifications

- Experience with Kubernetes and containerized data platforms
- Exposure to real-time data streaming and event-driven architectures
- Experience with CI/CD pipelines and DevOps practices
- Knowledge of data governance, security, and compliance frameworks
- Experience in regulated industries (finance, healthcare, insurance)
- Certifications in AWS or Snowflake (e.g., SnowPro)

Posted 3 weeks ago
