LLM Engineer – LLM & Generative AI

Aerocraft Engineering

Ahmedabad | Full-time | Mid Level | On-site

Job Description

About the job

About Our Company:

Aerocraft Engineering India Pvt Ltd, based in Ahmedabad, provides services to a US-based group of Architecture, Engineering, and Construction companies:

• Russell and Dawson – an Architecture/Engineering/Construction firm (www.rdaep.com)
• United-BIM – a BIM modeling services firm (www.united-bim.com)
• AORBIS – a Procurement as a Service provider (www.aorbis.com)

We are a nimble and growing organization where every role matters to the company’s business success. Each team member’s contribution directly affects the company’s performance in meeting its business and financial objectives.

We are looking for an experienced AI Engineer with hands-on experience building production-grade applications powered by Large Language Models.

You will design, develop, and optimize LLM-based systems, with a strong focus on scalable Retrieval-Augmented Generation (RAG) pipelines and model fine-tuning, to deliver intelligent solutions that drive real business impact.

Experience Required: Minimum 3 years
Job Location: Ahmedabad (Siddhivinayak Towers, Makarba)
Shift Timings: 11am to 8pm / 9am to 6pm (shift may change as per business requirement)
Working Days: Monday to Friday, work from office
Immediate joiners are preferred.

Key Responsibilities

• Design, build, and maintain scalable RAG systems for knowledge-intensive applications, including document ingestion, chunking strategies, vector store management, and retrieval optimization.
• Fine-tune open-source and proprietary LLMs using parameter-efficient techniques such as LoRA and QLoRA to adapt models for domain-specific use cases.
• Develop and deploy end-to-end LLM-powered applications, including chatbots, agents, summarization tools, and search systems.
• Evaluate model performance using quantitative metrics (BLEU, ROUGE, perplexity) and qualitative benchmarks; iterate on prompt engineering and fine-tuning strategies accordingly.
• Optimize inference pipelines for latency, cost, and throughput across cloud and on-premise environments.
• Collaborate with data engineers, product managers, and stakeholders to translate business requirements into AI-driven solutions.
• Stay current with rapidly evolving LLM research, tools, and frameworks, and advocate for best practices across the team.

Must-Have Requirements

• 3+ years of experience in AI/ML engineering, with at least 3 years focused on LLMs and generative AI.
• Proven experience developing scalable RAG systems (vector databases such as Pinecone, Weaviate, Qdrant, Chroma DB, or FAISS; embedding models; retrieval and re-ranking strategies).
• Hands-on experience with model fine-tuning using LoRA, QLoRA, or similar PEFT techniques, with frameworks such as Hugging Face Transformers, PEFT, or Axolotl.
• Strong proficiency in Python and ML frameworks (PyTorch, Transformers, LangChain, LlamaIndex).
• Solid understanding of transformer architectures, attention mechanisms, and tokenization.
• Experience with cloud platforms (AWS, GCP, or Azure) for model training and deployment.

Nice-to-Have

• Experience with multi-agent frameworks (AutoGen, CrewAI, LangGraph).
• Familiarity with model serving tools such as vLLM, TGI, or Triton Inference Server.
• Knowledge of MLOps practices: experiment tracking (W&B, MLflow), CI/CD for ML, and model monitoring.
• Experience with quantization techniques (GPTQ, AWQ, GGUF) for efficient deployment.
• Contributions to open-source AI/ML projects.

Benefits:

• Exposure to US projects, design, and standards
• Company-provided dinner, snacks, tea, and coffee
• 5-day working week
• 15 days of paid leave annually and 8-10 public holidays

Posted Today
