Role overview
**JOB DESCRIPTION:**
**Job Title: Data Scientist + ML Engineer (Gen AI) (NO C2C)**
**Location: Cupertino, CA (remote, but needs to be onsite occasionally)**
**Duration: 12-month contract with possible extension**
**POSITION OVERVIEW:**
We are looking for a highly skilled Data Scientist + ML Engineer (Generative AI) to join our team. In this role, you will be responsible for developing, fine-tuning, and applying advanced generative AI models, including diffusion models, large language models (LLMs), and other state-of-the-art architectures. You will collaborate closely with cross-functional partners in research, data engineering, and operations to deliver high-quality machine learning solutions and scalable datasets.
This position requires a balance of technical depth and creative problem-solving. You should be comfortable working with large, complex datasets and possess a strong grasp of modern ML frameworks, distributed computing environments, and end-to-end data pipelines.
**RESPONSIBILITIES:**
* Design and implement LLM-driven synthetic data pipelines: build workflows using LLMs and generative AI techniques to create high-volume, high-quality synthetic data for model training and testing
* Design, implement, and deploy machine learning models with a focus on generative AI (diffusion models, LLMs, and related architectures)
* Fine-tune, evaluate, and optimize large language models for specific downstream tasks and data needs
* Develop and maintain scalable data pipelines supporting training, evaluation, and inference workflows
* Conduct exploratory data analysis to surface insights and identify opportunities for model or data improvement
* Partner cross-functionally with researchers, engineers, and data program managers to define requirements and deliver high-impact ML solutions
* Build and enhance internal tools, libraries, and automation workflows to accelerate experimentation and iteration
**REQUIRED EXPERIENCE AND SKILLS:**
* Bachelor’s degree in Computer Science or a related field from an accredited U.S. institution
* 2+ years of experience in Machine Learning or Software Engineering
* Expert-level proficiency in Python and familiarity with deep learning frameworks such as PyTorch
* Strong foundation in machine learning algorithms, data preprocessing, and evaluation techniques
* Demonstrated experience working with diffusion models, Stable Diffusion, or large language models (LLMs)
* Excellent analytical, problem-solving, and debugging skills
* Strong communication and documentation skills with the ability to explain complex concepts clearly
* Ability to work independently in a fast-paced, iterative development environment