Software Engineer, Frontier Clusters Infrastructure - San Francisco, CA
3 days ago

Similar jobs
The Frontier Systems team at OpenAI builds, launches, and supports the largest supercomputers in the world that OpenAI uses for its most cutting-edge model training. · We take data center designs, turn them into real, working systems and build any software needed for running larg ...
1 month ago
About the Team · The Frontier Systems team at OpenAI builds, launches, and supports the largest supercomputers in the world that OpenAI uses for its most cutting-edge model training. · We take data center designs, turn them into real, working systems and build any software needed ...
1 day ago
We are looking for engineers to operate the next generation of compute clusters that power OpenAI's frontier research. · This role blends distributed systems engineering with hands-on infrastructure work on our largest datacenters. You will scale Kubernetes clusters to massive sca ...
3 weeks ago
We are looking for a Cluster & Infrastructure Engineer to build and operate large-scale AI clusters that power frontier-level training and inference workloads. You'll design reliable infrastructure for multi-node, multi-rack GPU and TPU systems, optimize cluster utilization and s ...
4 weeks ago
Exa is building a search engine from scratch to serve every AI application. We build massive-scale infrastructure to crawl the web, train state-of-the-art embedding models to index it, and develop high-performance vector databases in Rust to search over it. We also own a $5M ...
1 week ago
We're looking for a Platform Engineer who will be instrumental in building and evolving ClusterdOS. · You'll design GitOps workflows, build Kubernetes operators and controllers, and create automation that makes cluster management invisible to end users. ...
1 month ago
This is a full-time on-site role based in San Francisco, CA, for a Kubernetes Platform Engineer at Aranya Inc. We're looking for a Platform Engineer who will be instrumental in building and evolving ClusterdOS. · ...
1 month ago
I'm working with a well-funded early-stage infrastructure company building and operating large-scale GPU clusters for AI and HPC workloads. The team focuses on deploying, scaling, and automating high-performance compute infrastructure globally. · Responsibilities · • Deploy and o ...
2 hours ago
We are looking for an AI Infra engineer to join our growing team. We work with Kubernetes, Slurm, Python, C++, PyTorch, and primarily on AWS. As an AI Infrastructure Engineer, you will be partnering closely with our Inference and Research teams to build, deploy, and optimize our ...
1 week ago
We're hiring an Infrastructure Engineer to own a Kubernetes-based platform, improving developer velocity, production reliability, and the cloud foundations we scale on. · What you'll do: Own and evolve our Kubernetes platform (cluster lifecycle, upgrades, networking, autoscaling polici ...
1 month ago
Site Reliability Engineer, Frontier Systems Infrastructure
About the Team · The Frontier Systems team at OpenAI builds, launches, and supports the largest supercomputers in the world that OpenAI uses for its most cutting-edge model training. · We take data center designs, turn them into real, working systems and build any software needed ...
1 day ago
Physical Intelligence is bringing general-purpose AI into the physical world. We are a team of engineers, scientists, roboticists, and company builders developing foundation models and learning algorithms to power the robots of today and the physically-actuated devices of the fut ...
4 weeks ago
Build and scale platform systems: Operate and evolve Kubernetes clusters and service deployment patterns · Own workflow orchestration infrastructure: Take platform-level ownership of async and multi-stage workflows · Drive observability and cost-aware infrastructure: Treat logg ...
4 weeks ago
Be a part of the AI revolution with sustainable technology at Crusoe. Here you'll drive meaningful innovation, make a tangible impact, and join a team that's setting the pace for responsible, transformative cloud infrastructure. · ...
4 weeks ago
The Infrastructure team builds and operates the backbone of everything PI does: from training state-of-the-art VLA models, to orchestrating large-scale simulation, to reliably deploying intelligence across fleets of physical robots. · ...
4 weeks ago
We're building the company which will de-risk the largest infrastructure build-out in history. · When people finance GPU clusters, the datacenters housing them, and the infrastructure powering them, they need "offtake" - meaning someone has signed a contract to lease the cluster ...
1 day ago
About Krea · At Krea, we are building next-generation AI creative tools. · We are dedicated to making AI intuitive and controllable for creatives. Our mission is to build tools that empower human creativity, not replace it. · We believe AI is a new medium that allows us to expres ...
1 day ago
Recruiting for an SF-based deep-tech startup building foundation ML models for multi-physics simulation. It's a small, highly technical team, including multiple professors from top colleges like Berkeley, multiple PhDs, and even a Nobel Prize winner. Well funded, hybrid environment ...
1 week ago
We're building the company which will de-risk the largest infrastructure build-out in history. · When people finance GPU clusters, the datacenters housing them, and the infrastructure powering them, they need "offtake" - meaning someone has signed a contract to lease the cluster ...
2 days ago