Data/ML Infrastructure Lead, Robot Learning - Palo Alto, CA
16 hours ago

Job description
Start Date: ASAP
About Us
Mundane is a venture-backed seed-stage robot learning startup founded by a team of Stanford researchers and builders. We're deploying a massive fleet of humanoid robots to perform mundane tasks in commercial environments, collecting data to build the next generation of embodied intelligence.
We are a fast-paced, execution-driven team of engineers, roboticists, and builders. Our robots operate in real customer environments — and improve through real-world experience.
About the Role
You will build and own the data engine that powers robot learning at Mundane.
We already have robots collecting real-world data and training runs up and running. Your job is to make the entire system scale: turning raw robot operation into high-quality datasets, fast training iteration, reproducible model releases, and reliable multi-user compute workflows.
This role sits at the foundation of our learning velocity. Your infrastructure decisions will directly determine how quickly our robots improve in the field.
At Mundane, your systems won't support toy benchmarks — they will power training runs that deploy to humanoids operating in customer environments.
What You'll Own
End-to-end robot learning infrastructure:
Logging → ingestion → storage → dataset versioning → indexing & sampling → training I/O performance → experiment reproducibility → GPU compute usability
You will define how data becomes models — reliably and repeatedly.
Responsibilities
- Design and implement a robust dataset format for robot learning (episodes, metadata, manifests, versioning).
- Build ingestion pipelines from robots (office + customer deployments) into centralized storage.
- Implement scalable storage and compression strategies, including time-synchronized video and high-rate sensor streams.
- Build fast dataset indexing and sampling tools (task-balanced sampling, hard-example mining, curriculum support).
- Improve dataloader throughput and stability (prefetching, caching, sharding, distributed loading).
- Standardize reproducible training workflows (dataset version + config + code commit + artifact lineage).
- Own and improve on-prem GPU training usability (multi-user workflows, monitoring, job hygiene).
- Enable multi-GPU training and distributed experimentation when needed.
- Partner closely with hardware, deployment, and learning teams to ensure logging, calibration, and data integrity are correct.
Qualifications
- Strong software engineering skills in Python; experience building reliable systems.
- Experience designing ML data and training pipelines (data formats, I/O optimization, reproducibility).
- Strong PyTorch experience; familiarity with distributed training.
- Comfortable working with messy real-world data (timestamp alignment, dropouts, calibration drift).
- Strong debugging skills across data, compute, and infrastructure.
Nice to Have
- Experience with robotics datasets and ROS2-based systems.
- Familiarity with object storage systems (S3/GCS), Parquet catalogs, or large-scale data lakes.
- Experience working with video pipelines (FFmpeg or similar).
- Experience with Ray, Slurm, K8s, or building lightweight internal schedulers.
- Background in embodied AI or large-scale real-world learning systems.
What You'll Get
- Direct ownership of the core infrastructure that determines robot learning speed.
- Early equity with meaningful upside in a venture-backed robotics company.
- The ability to see your systems power training runs that deploy to live humanoid robots.
- Exposure to the full stack: data collection, training systems, deployment, hardware integration.
- A front-row seat in scaling a technically ambitious robotics company from seed stage.
Perks:
Competitive salary + equity, flexible PTO, legendary merch, coffee, robots, sauna & cold plunge (pending)