Drive down wall-clock time to convergence by profiling and eliminating bottlenecks across the foundation model training stack, from data pipelines to GPU kernels
Design, build, and optimize distributed training systems (PyTorch) for multi-node GPU clusters, ensuring scalability, robustness, and high utilization
Implement efficient low-level code (CUDA, cuDNN, Triton, custom kernels) and integrate it seamlessly into high-level training frameworks
Optimize workloads for hardware efficiency: CPU/GPU compute balance, memory management, data throughput, and networking
Develop monitoring and debugging tools for large-scale runs, enabling rapid diagnosis of performance regressions and failures
Deep experience in distributed systems, ML infrastructure, or high-performance computing (8+ years)
Production-grade expertise in Python
Low-level performance mastery: CUDA/cuDNN/Triton, CPU–GPU interactions, data movement, and kernel optimization
Scaling at the frontier: experience with PyTorch and training jobs using data, context, pipeline, and model parallelism
System-level mindset with a track record of tuning hardware–software interactions for maximum utilization
-
About the Company · Models are what they eat. But a large portion of training compute is wasted training on data that are already learned, irrelevant, or even harmful, leading to worse models that cost more to train and deploy. · At DatologyAI, we've built a state-of-the-art data ...
Redwood City · 1 week ago
-
Company Overview · At Skild AI, we are building the world's first general purpose robotic intelligence that is robust and adapts to unseen scenarios without failing. We believe massive scale through data-driven machine learning is the key to unlocking these capabilities for the w ...
San Mateo · $100,000 - $300,000 (USD) · 2 weeks ago
-
We are looking for a Robotics Engineer specialising in Post Training & Deployment to help further the performance and capabilities of Skild AI's world-class high-level software and products. · ...
San Mateo · $100,000 - $300,000 (USD) · Full time · 1 month ago
-
Company Overview · At Skild AI, we are building the world's first general purpose robotic intelligence that is robust and adapts to unseen scenarios without failing. We believe massive scale through data-driven machine learning is the key to unlocking these capabilities for th ...
San Mateo, CA · $100,000 - $300,000 (USD) per year · 1 week ago
-
Company Overview · At Skild AI, we are building the world's first general purpose robotic intelligence that is robust and adapts to unseen scenarios without failing. We believe massive scale through data-driven machine learning is the key to unlocking these capabilities for the w ...
San Mateo, CA · 1 week ago
-
Meta is seeking Research Engineers to join the Post-Training team within Meta Superintelligence Labs. High-quality data is the engine of AI progress at MSL, determining the capabilities we can unlock and how fast our models improve. As a Research Engineer on this team, you will b ...
Menlo Park · $58.65 - $181,000 (USD) · 6 days ago
-
Meta is seeking Research Engineers to join the Post-Training team within Meta Superintelligence Labs. High-quality data is the engine of AI progress at MSL, determining the capabilities we can unlock and how fast our models improve. As a Research Engineer on this team, you will b ...
Menlo Park · $88.46 - $257,000 (USD) · 6 days ago
-
Meta is seeking Research Engineers to join the Post-Training team within Meta Superintelligence Labs. High-quality data is the engine of AI progress at MSL, determining the capabilities we can unlock and how fast our models improve. As a Research Engineer on this team, you will b ...
Menlo Park · $74.04 - $217,000 (USD) · Full time · 1 week ago
-
Meta is seeking Research Engineers to join the Post-Training team within Meta Superintelligence Labs. High-quality data is the engine of AI progress at MSL, determining the capabilities we can unlock and how fast our models improve. As a Research Engineer on this team, you will b ...
Menlo Park · $74.04 - $217,000 (USD) · 6 days ago
-
Meta is seeking Research Engineers to join the Post-Training team within Meta Superintelligence Labs. High-quality data is the engine of AI progress at MSL, determining the capabilities we can unlock and how fast our models improve. As a Research Engineer on this team, you will b ...
Menlo Park · $58.65 - $181,000 (USD) · Full time · 1 week ago
-
Meta is seeking Research Engineers to join the Post-Training team within Meta Superintelligence Labs. High-quality data is the engine of AI progress at MSL, determining the capabilities we can unlock and how fast our models improve. As a Research Engineer on this team, you will b ...
Menlo Park · $88.46 - $257,000 (USD) · Full time · 1 week ago
-
Meta is seeking Research Engineers to join the Post-Training team within Meta Superintelligence Labs. High-quality data is the engine of AI progress at MSL, determining the capabilities we can unlock and how fast our models improve. As a Research Engineer on this team, you will b ...
Menlo Park, CA · 5 days ago
-
We are seeking a highly motivated and skilled Research Engineer with expertise in model training, · Conduct research and implement solutions in areas such as model architecture, algorithms, data processing, and optimization. · Optimize and scale our training infrastructure to imp ...
Palo Alto · 2 months ago
-
Meta is seeking a Research Engineering Manager to lead the Post-Training team within Meta Superintelligence Labs. · ...
Menlo Park · $219,000 - $301,000 (USD) · Full time · 2 weeks ago
-
Meta is seeking a Research Engineering Manager to lead the Post-Training team within Meta Superintelligence Labs. High-quality data is the core of AI progress at MSL, fueling our frontier capabilities and determining how our models interact with the world. In this leadership role ...
Menlo Park · $219,000 - $301,000 (USD) · 2 weeks ago
-
We are seeking a highly motivated and skilled Research Engineer with expertise in model training, · Conduct research and implement solutions in areas such as model architecture, · Optimize and scale our training infrastructure to improve efficiency and reliability in a reinforcem ...
Palo Alto, CA · 1 month ago
-
Meta is seeking a Research Engineering Manager to lead the Post-Training team within Meta Superintelligence Labs. High-quality data is the core of AI progress at MSL, fueling our frontier capabilities and determining how our models interact with the world. In this leadership role ...
Menlo Park, CA · 2 weeks ago
-
We are building a unified, modular runtime that meets researchers where they are and moves with them up the scaling curve. · Success for us is measured by raising both training throughput (how fast models train) and researcher throughput (how fast ideas become experiments and pro ...
San Francisco · $250,000 - $460,000 (USD) · 2 months ago
-
JOB · Pay Rate: Pay Band E06 (Non-Rep) $128, minimum) - $194, maximum). The negotiable initial salary offer will be between $128, $149,055.16 commensurate with education and experience. · Reports To: Manager of Train Control Engineering or designee. · Current Assignment: This is a capital posi ...
Oakland, CA · 2 weeks ago
-
Ride BART to a satisfying career that lets you both: 1) make a difference to Bay Area residents, and 2) enjoy excellent pay, benefits, and employment stability. · BART offers a competitive salary, · comprehensive health benefits, · paid time off, · and the CalPERS retirement prog ...
Oakland · $49,055 - $128,685 (USD) · 2 weeks ago
-
About Black Forest Labs · We're a team of world-class researchers and engineers creating the generative models that power how people make images and video—tools used by millions of creators, developers, and businesses worldwide. Our FLUX models are among the most advanced in the ...
San Francisco · $180,000 - $300,000 (USD) · 3 days ago
Staff Software Engineer, Training - San Carlos - Genesis AI
Description
What You'll Do
What You'll Bring
-
Software Engineer, Training
Redwood City
-
Research Engineer, Post-training
San Mateo
-
Robotics Engineer, Post Training
Full time · San Mateo
-
Research Engineer, Post-training
San Mateo, CA
-
Research Engineer, Post-training
San Mateo, CA
-
Research Engineer, Post-Training
Menlo Park
-
Research Engineer, Post-Training
Menlo Park
-
Research Engineer, Post-Training
Full time · Menlo Park
-
Research Engineer, Post-Training
Menlo Park
-
Research Engineer, Post-Training
Full time · Menlo Park
-
Research Engineer, Post-Training
Full time · Menlo Park
-
Research Engineer, Post-Training
Menlo Park, CA
-
Research Engineer, Distributed Training
Palo Alto
-
Research Engineering Manager, Post-Training
Full time · Menlo Park
-
Research Engineering Manager, Post-Training
Menlo Park
-
Research Engineer, Distributed Training
Palo Alto, CA
-
Research Engineering Manager, Post-Training
Menlo Park, CA
-
Training Performance Engineer
San Francisco
-
Train Control Engineer
Oakland, CA
-
Train Control Engineer
Oakland
-
Training Infrastructure Engineer
San Francisco