Engineering Manager - Forward Deployed Engineering (LLM)
Baseten
Job Description
About Baseten
Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction.
Join us and help build the platform that engineers turn to when shipping AI products.

The Role
As an Engineering Manager (Player & Coach), you will lead and mentor a team of Forward Deployed Engineers focused on building, scaling, and optimizing LLM inference workloads for Baseten customers. Applying both hands-on technical ownership and managerial leadership, you will guide your team through designing, deploying, and managing high-performance, low-latency AI applications on Baseten's platform.
FDE at Baseten is not a sales function: we are a mix of engineering, product, and customer architects who contribute to the core Baseten codebase, drive large portions of our feature roadmap, and execute on complicated customer engagements. You will also partner with product, infrastructure, and other customer engineering teams to ensure that large language models (LLMs) and other generative AI systems deliver best-in-class performance, reliability, and cost efficiency in production environments.

Example Initiatives
Forward Deployed Engineering on the frontier of AI
The fastest, most accurate Whisper transcription
Deploy production-ready model servers from Docker images
Deploy custom ComfyUI workflows as APIs

Responsibilities

Leadership & Team Management
Lead, mentor, and grow a team of Forward Deployed Engineers, providing guidance on technical direction, project execution, and professional development.
Set clear goals and ensure timely, high-quality delivery across multiple customer-facing projects involving LLM deployment and inference optimization.
Collaborate with leadership to align team priorities with company and customer goals, balancing short-term delivery, widely varying customer priorities, and long-term technical initiatives.
Player-coach: while much of this role will be leading the team, you will also be expected to be a key driver on strategic product initiatives and customer engagements.
The best managers derive credibility from being able to be hands-on when needed.

Technical Ownership
Develop and maintain software systems and product features using one or more general-purpose programming languages in a production-level environment, with a preference for Python due to its relevance in ML projects.
Drive customer impact by designing, implementing, and deploying Baseten solutions end-to-end (problem framing → evaluation → production deployment → monitoring).
This involves working with customers' engineering teams at every stage of the customer journey, including sales, implementation, and expansion.
Deliver with velocity: turn vague objectives into clear specs and well-defined PoCs so we can rapidly ship well-tested services and outcomes for our customers.
Optimize and enhance AI/ML projects, contributing to the continuous improvement of our technical stack.
This includes developing features and PRDs with other engineering and product orgs.
Own products and customer projects end-to-end, functioning as engineer, project manager, and product manager, with a focus on user empathy, project specification, and end-to-end execution.

Requirements
Bachelor's, Master's, or Ph.D. in Computer Science, Engineering, or related field.
4+ years of professional software engineering experience, including 1+ year in a leadership or mentorship capacity.
Strong programming skills in Python, with production experience in building or optimizing ML inference systems.
Proven experience with LLMs, inference optimization, or serving frameworks (e.g., vLLM, TensorRT, Triton, Hugging Face, Ray Serve).
Familiarity with observability, profiling, and cost/performance tradeoffs in production ML systems.
Excellent communication and collaboration skills, with the ability to lead cross-functional efforts and drive outcomes in ambiguous, fast-paced environments.

Bonus Points
Experience leading customer-facing engineering teams or working directly with enterprise partners.
Deep understanding of GPU infrastructure, distributed inference, or model compression techniques.

Benefits
Competitive compensation, including meaningful equity.
100% coverage of medical, dental, and vision insurance for employee and dependents.
Flexible PTO policy including company-wide Winter Break (our offices are closed from Christmas Eve to New Year's Day!).
Paid parental leave.
Fertility and family-building stipend through Carrot.
Company-facilitated 401(k).
Exposure to a variety of ML startups, offering unparalleled learning and networking opportunities.

At Baseten, we are committed to fostering a diverse and inclusive workplace. We provide equal employment opportunities to all employees and applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status. We are an Equal Opportunity Employer and will consider qualified applicants with criminal histories in a manner consistent with applicable law (for example, the requirements of the San Francisco Fair Chance Ordinance, where applicable).
Compensation Range: $260K - $380K