Role overview

We’re building the next generation of intelligent claims solutions, and you’ll be at the heart of it. As our Lead LLM Engineer, you will own the design, deployment, and optimization of our Azure-based LLM stack. From fine-tuning GPT-style models for claims quality assurance to standing up scalable inference endpoints and retrieval pipelines, you’ll bring cutting-edge ML techniques to production in a high-stakes, performance-sensitive environment.

This is a player-coach position that combines hands-on LLM development with strategic technical leadership. You’ll guide engineers, shape the architecture of our language-model platforms, and collaborate closely with product, data, and platform teams to deliver high-impact AI capabilities for text understanding, information extraction, and domain-specific generation.

What you'll work on

Lead the design and deployment of large language model infrastructure using Azure ML and AKS.
Fine-tune transformer models (e.g., GPT) using parameter-efficient fine-tuning (PEFT) methods such as LoRA and QLoRA for downstream QA and classification tasks.
Build and manage vector stores (e.g., FAISS, Pinecone) and retrieval pipelines as part of RAG architectures.
Develop low-latency, fault-tolerant inference services with FastAPI or Flask, integrated with Azure AD and secured via Key Vault.
Optimize model performance using quantization, distillation, and other compression techniques.
Monitor runtime systems using Azure Monitor, Grafana, and related tooling to meet enterprise SLAs.
Collaborate across product, engineering, and operations teams to align on model behavior, deployment strategies, and performance goals.
Own cost visibility and optimization across Azure ML, AKS, and related infrastructure.

What we're looking for

Bachelor’s degree in Computer Science, Data Science, or related field; Master’s preferred
6–10+ years of experience in AI/ML engineering, with hands-on deployment of LLMs in production
Demonstrated success building scalable, enterprise-grade ML systems, ideally in regulated industries
Strong track record of architectural ownership and mentoring in a high-performance team
Effective cross-functional communicator with a focus on delivering production-ready AI solutions

Tags & focus areas

Used for matching and alerts on DevFound

Full-time, AI, Machine Learning, Data Science, Generative AI
