Robert Half
AI

Generative AI Engineer

Robert Half · Chicago, IL

Actively hiring · Posted about 1 month ago

Role overview

The Opportunity

While 2024 was the year of AI experimentation, 2026 is the year of Agentic AI. Our client, a Financial Services leader in Fulton Market, is moving beyond simple chatbots to build autonomous AI agents capable of multi-step reasoning and enterprise-scale task execution.

We are looking for an AI Engineer who doesn't just "call an API" but understands how to build resilient, cost-effective, and secure production systems. You will be joining a high-priority squad tasked with automating core business logic using the latest LLM orchestration frameworks.

Key Responsibilities

  • Agent Orchestration: Design and deploy multi-agent systems using frameworks like LangGraph, CrewAI, or PydanticAI to handle complex, non-linear workflows.
  • Advanced RAG Pipelines: Implement and optimize Retrieval-Augmented Generation (RAG) using vector databases (Pinecone, Weaviate, or Milvus) and advanced reranking strategies.
  • Model Optimization: Fine-tune open-source models (Llama 3/4, Mistral) using LoRA/QLoRA for domain-specific tasks while maintaining low latency.
  • AI FinOps & Guardrails: Implement token-cost monitoring and security guardrails (Zero Trust integration) to ensure LLM outputs are safe, compliant, and within budget.
  • Production Engineering: Containerize AI microservices using Docker/Kubernetes and set up CI/CD pipelines for model deployment and monitoring (MLOps).

Technical Requirements

  • Python Mastery: 5+ years of Python development (including experience with Python 3.12+ features).
  • LLM Experience: Proven track record of shipping LLM-powered applications to production (OpenAI, Claude, or local hosting).
  • Data Architecture: Strong SQL skills and experience with unstructured data processing.
  • Chicago Connection: Must be able to commute to the Chicago office 3 days a week to collaborate with the engineering leadership team.

Tags & focus areas

Full-time · AI · AI Engineer · Robotics · Generative AI