Role overview
Develop models, tools, metrics, and datasets for evaluating the safety of generative models across the model deployment lifecycle
Develop methods, models, and tools to interpret and explain failures in language and diffusion models
Build and maintain human annotation and red-teaming pipelines to assess the quality and risk of various Apple products
Prototype, implement, and evaluate new ML models and algorithms for red-teaming LLMs
Work with highly sensitive material, including exposure to offensive and controversial content
What we're looking for
Strong engineering skills and experience writing production-quality code in Python, Swift, or other programming languages
Background in generative models, natural language processing, LLMs, or diffusion models
Experience with failure analysis, quality engineering, or robustness analysis for AI/ML-based features
Experience working with crowd-based annotations and human evaluations
Experience working on explainability and interpretation of AI/ML models
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.