Role overview
We are collaborating with a leading AI research lab to support the evaluation of advanced machine learning systems. We are seeking experienced machine learning engineers and researchers to help design high-quality evaluation suites that measure AI performance on real-world machine learning engineering tasks. The work focuses on translating practical ML research and engineering workflows into structured benchmarks for frontier models. This is a project-based, remote opportunity suited to experts with hands-on ML research experience.
What you'll work on
- Design and write detailed evaluation suites for machine learning engineering tasks
- Assess AI-generated solutions across areas such as model training, debugging, optimization, and experimentation
What we're looking for
- 3+ years of experience in machine learning engineering or applied ML research
- Hands-on experience with model development, experimentation, and evaluation
- Background in ML research (industry lab or academic setting strongly preferred)
- Strong ability to reason about ML system design choices and tradeoffs
- Clear written communication and high attention to technical detail
We consider all qualified applicants without regard to legally protected characteristics and provide reasonable accommodations upon request.