Senior Research Engineer, Foundation Model Training Infrastructure - Santa Clara
1 day ago

Job description
NVIDIA is searching for a senior or principal engineer who specializes in building cutting-edge infrastructure for large-scale foundation model training in the Generalist Embodied Agent Research (GEAR) group.
Our team is leading Project GR00T, NVIDIA's moonshot initiative to build foundation models and full-stack technology for humanoid robots.
You will work with an amazing and collaborative research team that consistently produces influential work on multimodal foundation models, large-scale robot learning, embodied AI, and physics simulation.
What You Will Be Doing
Design and maintain large-scale distributed training systems to support multi-modal foundation models for robotics.
Optimize GPU and cluster utilization for efficient model training and fine-tuning on massive datasets.
Implement scalable data loaders and preprocessors tailored for multimodal datasets, including videos, text, and sensor data.
Develop robust monitoring and debugging tools to ensure the reliability and performance of training workflows on large GPU clusters.
Collaborate with researchers to integrate cutting-edge model architectures into scalable training pipelines.
What We Need To See
Bachelor's degree in Computer Science, Robotics, Engineering, or a related field.
10+ years of full-time industry experience in large-scale MLOps and AI infrastructure.
Proven experience designing and optimizing distributed training systems with frameworks like PyTorch, JAX, or TensorFlow.
Deep understanding of GPU acceleration, CUDA programming, and cluster management tools like Kubernetes.
Strong programming skills in Python and a high-performance language such as C++ for efficient system development.
Strong experience with large-scale GPU clusters, HPC environments, and job scheduling/orchestration tools (e.g., SLURM, Kubernetes).
Ways To Stand Out From The Crowd
Master's or PhD degree in Computer Science, Robotics, Engineering, or a related field.
Demonstrated Tech Lead experience, coordinating a team of engineers and driving projects from conception to deployment.
Strong experience building large-scale LLM and multimodal LLM training infrastructure.
Contributions to popular open-source AI frameworks or research publications in top-tier AI conferences, such as NeurIPS, ICRA, ICLR, CoRL.
NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and productive people in the world. Please join us and be at the forefront of developing general-purpose robots and large-scale foundation models.
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 224,000 USD - 356,500 USD.
You will also be eligible for equity and benefits.
Applications for this job will be accepted at least until January 13, 2026.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer.
As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
JR1992361
Similar jobs
We are looking for an experienced Training Infrastructure Engineer to take our infrastructure to the next level. This role is focused on managing the training cluster, implementing distributed training algorithms, data loaders and developer tools for AI researchers. · ...
1 month ago
Figure is an AI robotics company developing autonomous general-purpose humanoid robots. The goal of the company is to ship humanoid robots with human level intelligence. · ...
1 week ago
· Design, deploy, and maintain Figure's training clusters · Architect and maintain scalable deep learning frameworks for training on massive robot datasets · ...
1 month ago
We are looking for a systems-minded engineer who lives at the intersection of large-scale model inference, distributed systems, and performance optimization. · ...
1 week ago
Senior Research Engineer, Foundation Model Training Infrastructure
We are searching for a senior or principal engineer who specializes in building cutting-edge infrastructure for large-scale foundation model training in the Generalist Embodied Agent Research (GEAR) group. · We will work with an amazing and collaborative research team that consis ...
1 month ago
Senior Research Engineer, Foundation Model Training Infrastructure
NVIDIA is searching for a senior or principal engineer who specializes in building cutting-edge infrastructure for large-scale foundation model training in the Generalist Embodied Agent Research (GEAR) group. Our team is leading Project GR00T, NVIDIA's moonshot initiative at building ...
1 month ago
Senior Research Engineer, Foundation Model Training Infrastructure
We are searching for a senior or principal engineer who specializes in building cutting-edge infrastructure for large-scale foundation model training. · ...
1 month ago
Senior Research Engineer, Foundation Model Training Infrastructure
NVIDIA is searching for a senior or principal engineer who specializes in building cutting-edge infrastructure for large-scale foundation model training in the Generalist Embodied Agent Research (GEAR) group. · NVIDIA is widely considered to be one of the technology world's most ...
1 week ago
We are looking for a Software Engineer to work at the forefront of developing and optimizing the software infrastructure and tools necessary for training cutting-edge AI models. · You will focus on building robust, scalable, efficient training pipelines and frameworks that support enti ...
1 month ago
Nuro is seeking an experienced Technical Lead Manager with deep expertise in quantized training and model compression to join our ML Infrastructure team. In this role, you will drive the adoption of state-of-the-art quantization techniques, enabling training and deployment of high ...
1 month ago
Senior Research Engineer, Training Data Infrastructure in Foundation Models
Our team is dedicated to solving the high-quality training data problem at the scale required to train advanced Foundation Models. · We believe that the advanced model performance fundamentally depends on a data-centric approach to Machine Learning. · Our objective is to engineer ...
4 weeks ago
Senior Research Engineer, Training Data Infrastructure in Foundation Models
We are seeking a Senior Research Engineer who possesses a deep understanding of distributed systems and a strong intuition for Machine Learning. · Description · This position operates at the convergence of Software Engineering and Machine Learning Research. You will work alongs ...
1 month ago
Senior Research Engineer, Training Data Infrastructure in Foundation Models
We are seeking a Senior Research Engineer who possesses a deep understanding of distributed systems and a strong intuition for Machine Learning. · You will join a culture that values engineering craftsmanship, privacy, and rigorous scientific inquiry. · ...
1 week ago
Senior Research Engineer, Training Data Infrastructure in Foundation Models
We are seeking a Senior Research Engineer who possesses a deep understanding of distributed systems and a strong intuition for Machine Learning. · ...
1 month ago
We're building the future of generative AI infrastructure. Our platform delivers the highest-quality models with the fastest and most scalable inference in the industry. We've been independently benchmarked as the leader in LLM inference speed and are driving cutting-edge innovati ...
4 weeks ago
Member of Technical Staff, Pre-training Data Infrastructure
Job summary · xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. · Design and implement petabyte-scale high-throughput data processing systems that involve both CPU- based processing. · Build run ...
1 month ago
We build the infrastructure that powers large-scale ML training and inference workloads. · ...
1 week ago
We are now looking for a Senior Software Engineer for Generative AI Research. At NVIDIA, we believe the next generation of AI will be physical AI – systems that perceive, reason, and act in the real world. · Cosmos enables large-scale AI models for robots, autonomous agents, and AI s ...
1 month ago
NVIDIA is seeking a Senior Software Engineer for Generative AI Research to build infrastructure that enables physical AI at scale. · ...
1 month ago