Artificial Intelligence Engineer


Role overview

Job: Computer Vision/AI Engineer
Duration: Long-term contract
Location: Orlando, FL

Designing, building, and optimizing all aspects of large-scale training and fine-tuning, from data loading to inference, to maximize Model FLOPs Utilization (MFU) on large compute clusters.
Working closely and proactively with research scientists to translate models and algorithms into high-performance, production-ready code, integrating and testing the latest advancements.
Relentlessly profiling and resolving training performance bottlenecks, optimizing the entire training stack for speed and efficiency.
Contributing to the evaluation and selection of hardware, software, and cloud services for the AI infrastructure platform.
Using MLOps frameworks (MLflow, Weights & Biases, etc.) to apply best practices across the model lifecycle, ensuring reproducibility, reliability, and continuous improvement.
Creating thorough documentation for infrastructure and training procedures, staying updated on advancements in training strategies, and driving improvements in workflows and infrastructure.

Master's degree or higher in Computer Science, Engineering, or a related technical field.
5 or more years of experience as a Data & AI (Artificial Intelligence) Engineer or Machine Learning Engineer, focusing on building and optimizing infrastructure for large-scale machine learning systems. Candidates with more experience may be considered for a higher level, and vice versa.
Deep practical expertise with AI frameworks (PyTorch, JAX, PyTorch Lightning, etc.), large-scale multi-node GPU training, and optimization strategies for large foundation models on distributed compute infrastructure.
Excellent problem-solving, debugging, and performance optimization skills, with a data-driven approach to identifying and resolving technical challenges.
