Lead Data Engineer (Azure Data, Fabric) - Remote
Altysys
Job Description
We have a new opening for a Lead Data Engineer (Azure Data, Fabric). Please find the details below.
No. of positions: 01
Experience: 9 to 12 years
Work mode: Remote
Note: The client has offices in Pune, Hyderabad, Bangalore, Ahmedabad, Mumbai, Chennai, and Coimbatore. Candidates based in these locations must attend the F2F interview round at the client office. Also, if selected, the candidate will have to visit the client office to collect the laptop. Therefore, please share profiles accordingly.
Notice: Immediate
Please find the JD below:
We are seeking a highly skilled and experienced Azure Cloud Lead / Data Architect to evaluate, optimize, and elevate our existing data platform built on Microsoft Fabric.
In this role, you will conduct a comprehensive audit of our current production architecture, capacity configurations, Spark notebooks, orchestration pipelines, and storage layer. Your primary mission will be to identify performance bottlenecks, optimize cost efficiency, ensure architectural best practices, and deliver a robust roadmap for scalability and governance.

Key Responsibilities

1. Architectural Assessment & Optimization
Audit the end-to-end Microsoft Fabric architecture (SaaS infrastructure, Capacity settings, and workspace deployment strategies).
Evaluate the Storage Layer (OneLake / Delta Lake) to ensure optimal data tiering, file sizing (V-Order, compaction), and partitioning strategies.
Review security, access control (RBAC, PBAC), and data governance implementation across shortcuts, lakehouses, and warehouses.

2. Code & Pipeline Engineering Audit
Conduct deep-dive code reviews of existing PySpark / Spark SQL notebooks to optimize execution plans, caching strategies, and memory management.
Assess data integration pipelines (Data Factory Pipelines, Dataflows Gen2) for efficiency, error handling, retry mechanisms, and concurrency.
Identify opportunities to reduce compute costs and execution times within Fabric capacities (F-SKUs).

3. Best Practices & Roadmap
Provide a detailed gap analysis report comparing the current state against enterprise-grade well-architected frameworks.
Define guidelines for CI/CD automation, version control (Git integration), and deployment pipelines within Microsoft Fabric.
Mentor the existing engineering team on advanced Spark tuning and Fabric-specific optimization techniques.
Required Skills & Qualifications

Technical Expertise
Core Azure Data Stack: 8+ years of experience engineering and architecting enterprise data platforms using Azure Data Factory, Azure Synapse Analytics, Azure Databricks, and ADLS Gen2.
Advanced Spark Engineering: Deep, hands-on expertise in PySpark, Spark optimization, DataFrame performance tuning, and debugging complex Spark jobs.
Storage Frameworks: Strong understanding of Delta Lake storage formats, ACID transactions, time travel, and optimization techniques (Z-ordering, vacuuming).
Microsoft Fabric (Highly Preferred): Practical understanding of Fabric components: OneLake, Lakehouse, Data Warehouse, Data Factory, and Capacity management.
Data Modeling: Expertise in designing Medallion Architectures (Bronze/Silver/Gold) and Star Schema modelling.
DevOps: Experience setting up CI/CD pipelines for data engineering workloads using Azure DevOps or GitHub.
Soft Skills & Leadership
Proven experience entering an existing project environment and quickly performing comprehensive technical audits.
Strong consulting acumen with the ability to articulate complex technical findings and ROI to stakeholders.
Excellent technical documentation skills.
Preferred Certifications
Microsoft Certified: Azure Data Engineer Associate (DP-203)
Microsoft Certified: Azure Solutions Architect Expert (AZ-305)
Microsoft Certified: Fabric Data Analyst Associate (DP-600) or Fabric Analytics Engineer