Role overview

The Opportunity

While 2024 was the year of AI experimentation, 2026 is the year of Agentic AI. Our client, a Financial Services leader in Fulton Market, is moving beyond simple chatbots to build autonomous AI agents capable of multi-step reasoning and enterprise-scale task execution.

We are looking for an AI Engineer who doesn't just "call an API" but understands how to build resilient, cost-effective, and secure production systems. You will be joining a high-priority squad tasked with automating core business logic using the latest LLM orchestration frameworks.

Key Responsibilities

Agent Orchestration: Design and deploy multi-agent systems using frameworks like LangGraph, CrewAI, or PydanticAI to handle complex, non-linear workflows.
Advanced RAG Pipelines: Implement and optimize Retrieval-Augmented Generation (RAG) using vector databases (Pinecone, Weaviate, or Milvus) and advanced reranking strategies.
Model Optimization: Fine-tune open-source models (Llama 3/4, Mistral) using LoRA/QLoRA for domain-specific tasks while maintaining low latency.
AI FinOps & Guardrails: Implement token-cost monitoring and security guardrails (Zero Trust integration) to ensure LLM outputs are safe, compliant, and within budget.
Production Engineering: Containerize AI microservices using Docker/Kubernetes and set up CI/CD pipelines for model deployment and monitoring (MLOps).

Technical Requirements

Python Mastery: 5+ years of Python development (including experience with Python 3.12+ features).
LLM Experience: Proven track record of shipping LLM-powered applications to production (OpenAI, Claude, or local hosting).
Data Architecture: Strong SQL skills and experience with unstructured data processing.
Chicago Connection: Must be able to commute to the Chicago office 3 days a week to collaborate with the engineering leadership team.

Tags & focus areas

Used for matching and alerts on DevFound

Full-time, AI, AI Engineer, Robotics, Generative AI

Generative AI Engineer
