Role overview
**Senior ML Infrastructure / MLOps Engineer**
**Location:** SF Bay Area (On-site)
We’re representing an ambitious AI research organization building physical autonomy systems powered by large-scale ML. You’ll own the infrastructure that makes cutting-edge model development reliable, reproducible, and scalable — from training to deployment.
The Opportunity
Be a core part of the team responsible for the machine learning foundation of a next-generation AI platform. You will help build and maintain the systems that enable performant model training, experimentation, and production workflows at scale.
What You’ll Do
- Build and maintain scalable ML infrastructure supporting training, fine-tuning, RLHF/DPO workflows, and distributed experiments.
- Develop and manage data pipelines, dataset versioning, experiment tracking, and reproducible evaluation frameworks.
- Operate containerized training and inference environments, including CI/CD automation for models.
- Partner with researchers, engineers, and systems teams to enable rapid iteration and robust deployments.
What You Bring
- Strong experience with ML infrastructure, distributed training systems, and production-grade MLOps practices.
- Familiarity with containerization, orchestration, and reproducible ML workflows.
- Hands-on experience with experiment management, dataset governance, and automation tooling.
- A pragmatic mindset and ability to work across research and engineering functions.
Why This Role Excites People
- Directly shape the backbone of ML systems that support real, high-impact AI research and autonomous behavior.
- Work with a tight-knit, world-class team tackling foundational problems at the intersection of ML, systems, and autonomy.
- Competitive salary, meaningful equity, and a strong benefits package.