Generative AI Engineer

Qantra

Jaipur | Full-time | Mid Level | On-site

Job Description

AI Platform Engineer

Location: India, Remote
Pay: INR 45 - INR 50 LPA
Experience: 10–14+ years overall, operating at Lead / Principal level
Employment Type: Full time

We are seeking a Lead Azure GenAIOps / LLMOps Engineer to design, build, and operate a secure, observable, governed Azure GenAI platform that can be reused by multiple product and business teams. This role is not focused on model training or fine-tuning. Instead, it owns LLM operationalization, governance, observability, safety, cost control, and platform reliability across enterprise environments.

You will work at the intersection of AI Platform Engineering, LLMOps, Cloud Architecture, and DevSecOps, partnering closely with application teams, security teams, and cloud platform teams.

Key Responsibilities

1. Azure GenAI Platform Ownership
• Architect and operate a shared, multi-tenant Azure GenAI platform using:
  - Azure OpenAI
  - Azure AI Foundry (must-have)
• Define reference architectures for RAG, agents, and LLM-powered apps.
• Decide and document usage patterns across AKS, App Service, and Azure ML. (Candidate should have strong experience with at least one; platform design should support multiple runtimes.)

2. LLM Runtime, Agent & Tool Governance
• Implement AI Gateway / Azure API Management for:
  - Model routing and abstraction
  - Throttling and quota enforcement
  - Authentication and authorization
• Govern agent runtimes, including:
  - Tool access control
  - Permissions and identity boundaries
  - Authentication, audit logging, and traceability
• Define MCP server / tool governance standards:
  - Function calling approvals
  - Tool versioning
  - Change control and auditability

3. CI/CD, Environment Promotion & Configuration Management
• Build reusable pipeline templates for GenAI workloads.
• Define environment promotion models across DEV → NON-PROD → PROD.
• Enforce:
  - Git-based prompt, agent, and config versioning
  - Approval workflows
  - Rollback and hotfix strategies
• Manage golden datasets and regression test suites for:
  - Prompts
  - Agents
  - RAG pipelines

4. Observability, Quality & Reliability
• Implement LLM observability using tools such as:
  - Langfuse
  - OpenTelemetry
  - Azure Monitor / Application Insights
• Enable:
  - Prompt & response tracing
  - Retrieval tracing
  - Tool-call tracing
  - Token usage tracking
  - Cost and latency dashboards
• Define and enforce SLIs/SLOs for GenAI workloads.
• Own incident response, on-call readiness, rollback, and DR testing.

5. RAG Quality & Evaluation
• Implement continuous monitoring for:
  - Retrieval quality
  - Chunk quality
  - Citation quality
  - Grounding score
  - Hallucination regression
• Automate evaluation gates in CI/CD pipelines.
• Maintain baseline and golden datasets to detect quality drift.

6. GenAI Safety & Responsible AI Controls
• Implement enterprise safety controls:
  - Prompt shields
  - Jailbreak detection
  - Groundedness checks
  - Content moderation
  - PII / PHI masking
• Design human-in-the-loop review and escalation workflows for risky outputs.
• Collaborate with security teams on policy definitions (ownership is shared, not siloed).

7. Security, Networking & Identity (Design Ownership)
• Design secure Azure architectures using:
  - Private networking
  - Private Endpoints
  - Managed Identities
  - Azure Key Vault
  - VNet isolation
• Clarify responsibility boundaries:
  - Own GenAI platform security design
  - Collaborate with core security / platform teams for enterprise controls
• Heavy DevSecOps controls (SBOM, image signing, admission checks) are good-to-have unless mandated by environment.

8. Cost, Routing & Performance Optimization
• Implement:
  - Model routing and fallback strategies
  - Throttling and quota management
• Optimize cost by:
  - Model
  - Application
  - User
  - Environment
  - Tenant
• Build token and cost dashboards for leadership visibility.

9. Compliance & Audit Automation
• Automate compliance evidence generation:
  - Policy enforcement proofs
  - Audit trails
  - Access logs
  - Promotion records
• Reduce reliance on manual audit documentation.

Core Deliverables (Expected Outcomes)
• Enterprise-grade Azure GenAI reference architectures
• Reusable CI/CD pipeline templates
• Secure AI Gateway patterns
• Governed agent and tool frameworks
• Observability dashboards and alerts
• Regression test suites and golden datasets
• Platform onboarding guides and standards

Required Skills

Azure & AI Platform
• Azure OpenAI, Azure AI Foundry (mandatory)
• AKS or App Service or Azure ML (deep expertise in at least one)
• Azure API Management / AI Gateway patterns
• Private networking, Managed Identity, Key Vault

LLMOps & Governance
• RAG architectures and evaluation
• Prompt, agent & config lifecycle management
• Model routing, fallback, and throttling strategies
• Multi-tenant GenAI platform experience (strongly preferred)

Automation & Engineering
• Python, Bash, YAML
• REST APIs and SDK-based automation
• CI/CD using Azure DevOps or GitHub Actions
• Terraform or Bicep

Observability & Reliability
• Langfuse, OpenTelemetry, Azure Monitor, App Insights
• SLIs/SLOs, incident management, production support

Good to Have
• Semantic Kernel
• Microsoft Agent Framework
• LangChain, Agno
• FastAPI
• Advanced DevSecOps controls (SBOM, image signing, admission checks)
• Azure security and architecture certifications

Posted 3 weeks ago
