Role overview
The Spatial AI Lab is part of the Applied Sciences Group, a Microsoft research and development organization dedicated to creating next-generation human-computer interaction technologies leveraging the most recent AI developments and exploring new hardware capabilities and device form-factors. Our team of scientists and engineers has strong expertise in computer vision and multi-modal AI, with a particular focus on spatial and embodied AI.
As a Scientist - Multimodal Foundation Models & Robotics on our growing team, you will conduct research at the intersection of large-scale generative modeling and embodied AI, with an emphasis on robotics. Your primary focus will be building the core intelligence for a new generation of agents: training the multimodal foundation models that enable them to perceive complex environments, reason about tasks, and act seamlessly across both the physical and digital worlds. This role will allow you to deepen your expertise in training embodied foundation models and in deploying algorithms on robotic hardware and large-scale AI systems, and to contribute to our pioneering research through publications and collaborations with partners like ETH Zurich.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
What we're looking for
Experience in one or more of the following areas:
- Foundation Models: Hands-on training experience in at least one of the following: LLMs; large vision-language models (VLMs); video generative models and diffusion algorithms; or action-based transformers and vision-language-action models (VLAs).
- Large-Scale ML Systems: Experience with large-scale machine learning compute systems.
- Robotics:
  - Hands-on training experience in robot learning techniques such as reinforcement learning and imitation learning, as well as classical control methods.
  - Solid understanding of robot kinematics, dynamics, and sensors.
  - Familiarity with control algorithms such as PID, model predictive control (MPC), and whole-body control.
Track record of impact, either via first-author research publications at top-tier machine learning or robotics conferences (CoRL, RSS, NeurIPS, ICML, ICLR, CVPR) or via contributions to successful industry initiatives.
This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process