Role overview
Design and develop machine learning and LLM-based solutions for ML model and system evaluation use cases such as:
Automatic large-scale data generation
Automatic UI and non-UI test evaluation
Running evaluation jobs at scale
Building and optimizing LLM judges
Intelligent log summarization and anomaly detection
Fine-tune or prompt-engineer foundation models (e.g., Apple, GPT, Claude) for evaluation-specific applications
Collaborate with QA teams to integrate models into testing frameworks
Continuously evaluate and improve model performance through A/B testing, human feedback loops, and retraining
Monitor advances in LLMs and NLP and propose innovative applications within the ML evaluation domain
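The LLM-judge responsibility above can be illustrated with a minimal sketch. This is not Apple's framework; the names (`JudgeResult`, `build_judge_prompt`, `evaluate_outputs`, `judge_fn`) are hypothetical, and the judge itself is abstracted as any callable that maps a prompt to a numeric score (in practice, a wrapper around a hosted LLM API):

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical sketch of an LLM-judge harness; names and structure are
# illustrative, not from any specific evaluation framework.

@dataclass
class JudgeResult:
    passed: bool
    score: float
    rationale: str

def build_judge_prompt(criterion: str, output: str) -> str:
    """Render a rubric-style prompt asking the judge for a 0-10 score."""
    return (
        f"Rate the following output against this criterion: {criterion}\n"
        f"Output:\n{output}\n"
        "Respond with a single number from 0 to 10."
    )

def evaluate_outputs(
    outputs: List[str],
    criterion: str,
    judge_fn: Callable[[str], float],
    threshold: float = 7.0,
) -> List[JudgeResult]:
    """Score each candidate output with the judge and apply a pass threshold."""
    results = []
    for out in outputs:
        score = judge_fn(build_judge_prompt(criterion, out))
        results.append(
            JudgeResult(
                passed=score >= threshold,
                score=score,
                rationale=f"score={score} vs threshold={threshold}",
            )
        )
    return results
```

In a real deployment the pass/fail results would feed the A/B testing and human-feedback loops mentioned above, with judge scores periodically calibrated against human labels.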
What we're looking for
3+ years of proven experience in machine learning, including hands-on work with LLMs.
Strong programming skills in Python and experience with ML/NLP libraries
Experience building or fine-tuning LLMs for software engineering tasks
Understanding of prompt engineering and retrieval-augmented generation (RAG)
Experience developing LLM-based automated evaluation frameworks
Excellent knowledge of software testing methodologies & practices
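For the RAG requirement above, a minimal sketch of the retrieve-then-prompt pattern follows. All names (`retrieve`, `build_rag_prompt`) are illustrative, and word overlap stands in for a real embedding-based retriever purely to keep the example self-contained:

```python
from typing import List

def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Rank documents by word overlap with the query.

    Word overlap is a toy stand-in for embedding similarity; a production
    retriever would use a vector index over document embeddings.
    """
    query_words = set(query.lower().split())
    ranked = sorted(
        docs,
        key=lambda d: len(query_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_rag_prompt(query: str, docs: List[str], k: int = 2) -> str:
    """Augment the query with retrieved context before sending it to an LLM."""
    context = "\n".join(retrieve(query, docs, k))
    return (
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer using only the context above."
    )
```

The same pattern applies to evaluation work: retrieving relevant logs or test specifications as context before asking an LLM to summarize failures or judge an output.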
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.