Role overview
As a core member of our AI engineering team, you will design, develop, and optimize cutting-edge AI models and workloads that run natively on our high-performance GPU clusters. You will leverage our state-of-the-art infrastructure to train, fine-tune, and serve massive-scale models with high efficiency, and collaborate across infrastructure, product, and research teams to align hardware capabilities with real-world AI demands, driving breakthroughs in performance, scalability, and innovation.
Responsibilities
Design, implement, and train state-of-the-art ML models for high-impact applications (e.g., NLP, Computer Vision, Network Optimization).
Optimize AI workloads for extreme performance and scalability on large-scale GPU systems like GB200 NVL72, using tools such as Dynamo, vLLM, and advanced inference engines.
Partner with cross-functional teams to co-design hardware-software solutions that maximize AI processing efficiency.
Build robust tools, data pipelines, evaluation frameworks, and deployment systems.
Track and incorporate the latest AI research and technological advancements.
Contribute to product requirements documents (PRDs) and agile execution (sprint planning and delivery).
Champion a culture of humility, bold innovation, and high-velocity product delivery.
What we're looking for
Bachelor's degree in Computer Science, Electrical Engineering, Mathematics, Statistics, or a related technical field.
3+ years of hands-on experience in machine learning, deep learning, and software engineering.
Proficiency in Python; experience with C/C++.
Strong working knowledge of major AI/ML frameworks (PyTorch, TensorFlow, JAX, or similar).
Solid foundation in data structures, algorithms, and software design principles.
Preferred qualifications
Master's or PhD in Computer Science, AI/ML, or a related discipline.
Experience with Large Language Models (LLMs), Generative AI, or Computer Vision.
Familiarity with distributed training frameworks and techniques (e.g., Ray, DeepSpeed, Megatron-LM).
Proven expertise optimizing models for GPU inference (e.g., TensorRT, Triton Inference Server).
Knowledge of MLOps tools and practices (Kubeflow, MLflow, etc.).
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Information Technology and Engineering
Industries
Technology, Information and Media, Software Development, and IT Services and IT Consulting