Bernard Nickels & Associates
AI

LLM Engineer

Bernard Nickels & Associates · San Jose, CA · $197k

Actively hiring · Posted 2 months ago

Role overview

Job Title:
LLM Engineer

Job Type:
Contract (W2 Only)

Contract Duration:
ASAP through 12/31/2025 (with good potential for extension into 2026)

Work Location:
San Jose, CA (hybrid role; onsite 2 days per week)

Work Schedule/Hours:
Monday–Friday, 8 hours per day, 40 hours per week (standard business hours)

Compensation:
$85 to $95 per hour

Overview:
A leading Big Four consulting firm is seeking a highly skilled LLM Engineer to design, train, and optimize large language models that drive cutting-edge applications in generative AI and natural language understanding. This role offers the opportunity to work on advanced model development, scalable deployment systems, and innovative research alongside cross-functional product and engineering teams.

Responsibilities:

Model Development & Optimization

  • Design, train, fine-tune, and evaluate large language models (LLMs) to ensure high performance, efficiency, and alignment with research or product goals.
  • Optimize model architectures, tokenization strategies, and data pipelines to enhance throughput and model accuracy.

Systems Integration & Deployment

  • Build and maintain scalable inference pipelines for production environments.
  • Optimize serving infrastructure using techniques such as quantization, caching, pruning, and distillation.
  • Integrate trained models into enterprise applications, APIs, or end-user products.

Research & Cross-Functional Collaboration

  • Lead experimentation with new architectures, retrieval-augmented generation (RAG) frameworks, and prompt-engineering techniques.
  • Collaborate closely with product managers, data scientists, and ML operations teams to translate research into production-grade solutions.
  • Stay current with advancements in transformer architectures, fine-tuning methods, and LLM safety/alignment best practices.

Qualifications:

Required:

  • High school diploma or GED required; Bachelor’s degree or higher preferred.
  • 5+ years of experience in machine learning, NLP, or large-scale model development.
  • Strong understanding of deep learning frameworks such as PyTorch or TensorFlow.
  • Experience building, training, or fine-tuning large language models (e.g., GPT, LLaMA, PaLM, Falcon).
  • Solid programming skills in Python, with experience in distributed training and cloud-based ML infrastructure (AWS, GCP, or Azure).
  • Strong problem-solving and communication skills, with the ability to work cross-functionally in fast-paced environments.

Preferred:

  • Experience with retrieval systems, vector databases, or RAG pipelines.
  • Familiarity with model alignment, evaluation metrics, and responsible AI practices.
