Software Engineer, Frontier Clusters Infrastructure - San Francisco
1 week ago

Job description
Similar jobs
The Frontier Systems team at OpenAI builds, launches, and supports the largest supercomputers in the world that OpenAI uses for its most cutting-edge model training. · We take data center designs, turn them into real, working systems and build any software needed for running larg ...
1 month ago
We are looking for engineers to operate the next generation of compute clusters that power OpenAI's frontier research. · This role blends distributed systems engineering with hands-on infrastructure work on our largest datacenters. You will scale Kubernetes clusters to massive sca ...
2 weeks ago
We are looking for a Cluster & Infrastructure Engineer to build and operate large-scale AI clusters that power frontier-level training and inference workloads. You'll design reliable infrastructure for multi-node, multi-rack GPU and TPU systems, optimize cluster utilization and s ...
3 weeks ago
We build massive-scale infrastructure to crawl the web, train state-of-the-art embedding models to index it, and develop super high performant vector databases in Rust to search over it. · ...
2 weeks ago
We're looking for a Platform Engineer who will be instrumental in building and evolving ClusterdOS. · You'll design GitOps workflows, build Kubernetes operators and controllers, and create automation that makes cluster management invisible to end users. ...
1 month ago
We are looking for an AI Infra engineer to join our growing team. · We work with Kubernetes, Slurm, Python, C++, PyTorch, and primarily on AWS. · Design scalable clusters for AI model inference and training · Benchmark system performance and diagnose bottlenecks · ...
1 month ago
This is a full-time on-site role based in San Francisco, CA, for a Kubernetes Platform Engineer at Aranya Inc. We're looking for a Platform Engineer who will be instrumental in building and evolving ClusterdOS. · ...
1 month ago
We're hiring an Infrastructure Engineer to own a Kubernetes-based platform, improving developer velocity, production reliability and the cloud foundations we scale on. · What you'll do: Own and evolve our Kubernetes platform (cluster lifecycle, upgrades, networking, autoscaling polici ...
4 weeks ago
Site Reliability Engineer, Frontier Systems Infrastructure
We are looking for engineers to operate the next generation of compute clusters that power OpenAI's frontier research. This role blends distributed systems engineering with hands-on infrastructure work on our largest datacenters. · Spin up and scale large Kubernetes clusters, ...
1 week ago
Physical Intelligence is bringing general-purpose AI into the physical world. We are a team of engineers, scientists, roboticists, and company builders developing foundation models and learning algorithms to power the robots of today and the physically-actuated devices of the fut ...
2 weeks ago
The Infrastructure team builds and operates the backbone of everything PI does: from training state-of-the-art VLA models, to orchestrating large-scale simulation, to reliably deploying intelligence across fleets of physical robots. · ...
5 days ago
Build and scale platform systems: Operate and evolve Kubernetes clusters and service deployment patterns · Own workflow orchestration infrastructure: Take platform-level ownership of async and multi-stage workflows · Drive observability and cost-aware infrastructure: Treat logg ...
2 weeks ago
We are building next-generation AI creative tools at Krea. We're dedicated to making AI intuitive and controllable for creatives. · Krea's mission is to build tools that empower human creativity, not replace it. · ...
1 week ago
The Infrastructure team builds and operates the backbone of everything PI does: from training state-of-the-art VLA models, to orchestrating large-scale simulation, to reliably deploying intelligence across fleets of physical robots. · ...
3 weeks ago
Be a part of the AI revolution with sustainable technology at Crusoe. Here you'll drive meaningful innovation, make a tangible impact, and join a team that's setting the pace for responsible, transformative cloud infrastructure. · ...
2 weeks ago
Recruiting for an SF-based deep-tech startup building foundation ML models for multi-physics simulation. It's a small, highly technical team - including multiple professors from top colleges like Berkeley, multiple PhDs and even a Nobel Prize winner. Well funded, hybrid environment ...
1 day ago
We build infrastructure and tooling that is iterable, scalable, and secure. We write everything in code from network infrastructure to server management, database provisioning, and data pipelines. · ...
3 weeks ago
Anyscale is looking for a Software Engineer to join the Infrastructure team. Anyscale aims to provide the next generation of tools and infrastructure to make developing and running distributed AI applications in the cloud as easy as on your laptop. · ...
3 weeks ago
We build infrastructure and tooling that is iterable, scalable, and secure. We write everything in code from network infrastructure to server management, database provisioning, and data pipelines. Recently we have focused on scaling up our Kubernetes infrastructure. · Additionall ...
1 month ago
This is a senior, hands‑on infrastructure role that works directly with customers to design, deploy and troubleshoot large GPU clusters used for AI training and inference. · ...
2 weeks ago