MLOps Manager
Anblicks
Job Description
Job Role: MLOps Manager
Experience: 12–17 Years
Location: Hyderabad, India
Early Joiners Preferred

About the Role
We are looking for an MLOps Manager to lead the design, development, and scaling of enterprise-grade MLOps and DevOps platforms. This role combines hands-on engineering excellence with team leadership, focusing on productionizing machine learning systems, enabling developer productivity, and driving automation at scale. You will work at the intersection of ML engineering, cloud infrastructure, and platform engineering, helping teams deliver reliable, scalable, and secure ML solutions in production.

Key Responsibilities
- Lead and mentor a team of MLOps/DevOps engineers, driving technical excellence and delivery outcomes
- Architect, build, and scale end-to-end MLOps platforms and CI/CD pipelines for ML workloads
- Design and implement automated deployment pipelines for training, testing, and model serving at scale
- Operationalize ML models into production with a focus on performance, reliability, and observability
- Partner with data scientists and engineering teams to enable self-service ML platforms and developer tooling
- Implement Infrastructure-as-Code (IaC) and automation frameworks for cloud environments
- Ensure platform compliance with security, governance, and reliability standards
- Troubleshoot complex production issues and continuously improve developer experience and system resilience
- Drive best practices for CI/CD, testing, monitoring, and release management across ML and data platforms
- Evaluate and optimize environments supporting large-scale data pipelines and ML workflows

Required Qualifications
- 10+ years of experience in DevOps, MLOps, or Platform Engineering roles
- 5+ years of people management experience, leading teams of 5+ engineers
- Strong hands-on expertise in building and scaling MLOps pipelines and platforms
- Proven experience with Infrastructure-as-Code (Terraform preferred) in public cloud environments
- Deep experience with CI/CD tools such as GitHub Actions, Jenkins, and code quality/security tools (e.g., Snyk)
- Strong knowledge of MLOps and orchestration frameworks such as Airflow, Kubeflow, MLflow, or similar
- Experience deploying and managing ML models in production at scale
- Hands-on experience with distributed data processing frameworks such as Apache Spark, EMR, or Databricks
- Strong programming skills in Python (preferred) or Node.js/Bash
- Experience with containerization and orchestration (Docker, Kubernetes)
- Strong understanding of cloud platforms (AWS, Azure, or GCP) and cloud-native services
- Experience with data platforms and services such as Snowflake, Redshift, Glue, BigQuery, or similar
- Solid understanding of distributed systems, monitoring, logging, and reliability engineering
- Experience with Git-based workflows and version control best practices

Preferred Qualifications
- Experience with configuration management tools (Ansible, Chef, Puppet)
- Familiarity with ML libraries and frameworks such as scikit-learn, PyTorch, TensorFlow
- Exposure to large-scale inference systems and batch/real-time scoring architectures
- Experience supporting multi-runtime environments (Node.js, Java/Spark/Scala, React)
- Cloud certifications (AWS/GCP/Azure)