Role overview
**Job Title:**
LLM Engineer
**Job Type:**
Contract
*(W2 Only)*
**Contract Duration:**
ASAP through 12/31/2025 (with good potential for extension into 2026)
**Work Location:**
San Jose, CA (hybrid role; onsite 2 days per week)
**Work Schedule/Hours:**
Monday–Friday, 8 hours per day, 40 hours per week (standard business hours)
**Compensation:**
$85 to $95 per hour
**Overview:**
A leading Big Four consulting firm is seeking a highly skilled **LLM Engineer** to design, train, and optimize large language models that drive cutting-edge applications in generative AI and natural language understanding. This role offers the opportunity to work on advanced model development, scalable deployment systems, and innovative research alongside cross-functional product and engineering teams.
**Responsibilities:**
*Model Development & Optimization*
* Design, train, fine-tune, and evaluate large language models (LLMs) to ensure high performance, efficiency, and alignment with research or product goals.
* Optimize model architectures, tokenization strategies, and data pipelines to enhance throughput and model accuracy.
*Systems Integration & Deployment*
* Build and maintain scalable inference pipelines for production environments.
* Optimize serving infrastructure using techniques such as quantization, caching, pruning, and distillation.
* Integrate trained models into enterprise applications, APIs, or end-user products.
*Research & Cross-Functional Collaboration*
* Lead experimentation with new architectures, retrieval-augmented generation (RAG) frameworks, and prompt-engineering techniques.
* Collaborate closely with product managers, data scientists, and ML operations teams to translate research into production-grade solutions.
* Stay current with advancements in transformer architectures, fine-tuning methods, and LLM safety/alignment best practices.
**Qualifications:**
*Required:*
* High school diploma or GED required; Bachelor’s degree or higher preferred.
* 5+ years of experience in machine learning, NLP, or large-scale model development.
* Strong understanding of deep learning frameworks such as PyTorch or TensorFlow.
* Experience building, training, or fine-tuning large language models (e.g., GPT, LLaMA, PaLM, or Falcon).
* Solid programming skills in Python, with experience in distributed training and cloud-based ML infrastructure (AWS, GCP, or Azure).
* Strong problem-solving and communication skills, with the ability to work cross-functionally in fast-paced environments.
*Preferred:*
* Experience with retrieval systems, vector databases, or RAG pipelines.
* Familiarity with model alignment, evaluation metrics, and responsible AI practices.