Role overview
This position is based in Europe. Relocation assistance and work sponsorship will be provided.
We are looking for a highly skilled Machine Learning Engineer with a strong background in Large Language Models (LLMs) to join our team. This position offers the chance to contribute to cutting-edge innovation in a world-class research environment. You will work on core model development, building and optimizing LLMs from the ground up, not just applying or fine-tuning pre-trained models. If you're passionate about transformer architecture, model compression, distributed training, and LLM research, this is the role for you.
What you'll work on
- Design, build, and train LLMs from scratch, including architecture design, dataset preparation, and optimization
- Lead research and development in transformer architectures, model internals, and efficient training techniques
- Apply model compression and optimization strategies (LoRA, QLoRA, quantization, pruning, distillation, etc.)
- Implement and scale distributed training using frameworks such as DeepSpeed, FSDP, and HuggingFace Accelerate
- Optimize inference performance using tools like TensorRT, vLLM, and other runtime frameworks
- Evaluate and benchmark model accuracy, robustness, and efficiency across real-world use cases
- Contribute to the design of quantum-inspired AI methods for model compression and acceleration
- Collaborate cross-functionally to integrate LLMs into products and guide junior team members
- Stay current with state-of-the-art techniques in LLM research and contribute to our innovation pipeline
What we're looking for
- Ph.D. in AI, ML, NLP, or a related field
- 2+ years of experience in deep learning and neural networks
- 2+ years of direct, hands-on experience building or training LLMs or transformer models
- Deep understanding of transformer internals, including architecture, training loops, attention, and loss functions
- Proficiency with PyTorch, HuggingFace Transformers, Accelerate, and similar libraries
- Demonstrated ability to optimize training and inference (e.g., mixed precision, FlashAttention, gradient accumulation)
- Experience with distributed systems, cloud platforms (AWS preferred), and GPU-accelerated workloads
- Strong written and verbal English communication skills