AI Platform Engineer
Gala Solutions
Job Description
Job Title: AI Platform Engineer
Experience Level: Level 3 (senior): 5-7 years
12-month contract
Location: Montreal (Day 1 onboarding onsite / in-office presence 3x week)

In the Technology division, we leverage innovation to build the connections and capabilities that power our Firm, enabling our clients and colleagues to redefine markets and shape the future of our communities.

This is an AI Platform Engineering role at Director level, which is part of the job family responsible for providing specialist GenAI expertise that drives decision-making and business insights during GenAI development, as well as uplifting platform features by introducing cutting-edge technology capabilities, enhancements and innovative solutions.

Since 1935, *** has been known as a global leader in financial services, always evolving and innovating to better serve our clients and our communities in more than 40 countries around the world.

What you’ll do in the role:
- Design and build a firmwide AI development and evaluation platform with a strong focus on enterprise-scale GenAI benchmarking, assurance, and governance.
- Develop self-service tooling, SDKs, and APIs to enable teams to build, evaluate, and deploy GenAI applications efficiently and safely.
- Build reusable, scalable platform components for GenAI and agentic systems, including orchestration, evaluation pipelines, and model lifecycle workflows.
- Lead the implementation of container-native GenAI workloads on Kubernetes / OpenShift using GitOps-driven deployment patterns.
- Integrate and operate GenAI ecosystem components including LLMs, vector databases, embeddings, and agent frameworks.
- Drive key architecture, product, and design decisions across security, authentication, observability, scalability, and reliability.
- Establish platform best practices for GenAI evaluations, agentic systems, ModelOps / LLMOps, and production operations.
- Collaborate closely with engineers, data scientists, security, and product teams to accelerate safe enterprise adoption of GenAI.
What you’ll bring to the role:
- 6+ years of strong hands-on software engineering experience, preferably in Python (FastAPI, Flask), building large-scale, cloud-native platforms.
- Deep experience designing and operating Kubernetes / OpenShift workloads using Helm, Kustomize, container registries, and GitOps practices.
- Hands-on experience building GenAI and LLM-based applications, including agentic orchestration, embeddings, evaluation workflows, and fine-tuning.
- Strong understanding of microservices, RESTful API design, asynchronous and concurrent programming, and performance-oriented systems.
- Solid foundation in data engineering principles including SQL/NoSQL stores, Kafka, Redis, vector databases, and state management at scale.
- Proficiency in DevOps, CI/CD, observability (OpenTelemetry, Prometheus, Grafana), and SRE-inspired operational practices.
- Strong working knowledge of security-first design, OAuth2, secure coding practices, and enterprise-grade platform controls.
- Experience with agent-based frameworks or orchestration systems.
- Exposure to LLMOps / ModelOps / evaluation platforms.
- Experience working in enterprise-scale platforms or internal developer platforms.