Member of Technical Staff, Supercomputing Platform - San Francisco - Magic Inc

Magic Inc San Francisco

2 days ago

Description

Magic's Mission

Magic's mission is to build safe AGI that accelerates humanity's progress on the world's most important problems. We believe the most promising path to safe AGI lies in automating research and code generation to improve models and solve alignment more reliably than humans can alone. Our approach combines frontier-scale pre-training, domain-specific RL, ultra-long context, and inference-time compute to achieve this goal.

About The Role

As an engineer on the Supercomputing Platform & Infrastructure team, you will design, build, and operate the large-scale GPU infrastructure that powers Magic's model training and inference workloads.

A core part of this role is building and maintaining our infrastructure using Terraform-driven infrastructure-as-code practices, ensuring reproducibility, reliability, and operational clarity across clusters spanning thousands of GPUs.

Magic's long-context models create sustained pressure on compute, networking, and storage systems. Long-running distributed jobs, high-throughput data movement, and strict availability requirements demand infrastructure that is automated, observable, and resilient by design. You will own the systems and IaC foundations that make this possible.

This role can evolve into broader ownership of supercomputing platform architecture, shaping how Magic scales GPU clusters and infrastructure reliability as model workloads grow.

What You'll Work On

Design and operate large-scale GPU clusters for training and inference
Build and maintain infrastructure using Terraform across cloud and hybrid environments
Develop modular, scalable IaC patterns for compute, networking, and storage provisioning
Improve deployment reproducibility, environment consistency, and operational safety
Optimize networking and storage systems for high-throughput AI workloads
Automate fault detection and recovery across distributed clusters
Debug complex cross-layer issues spanning hardware, drivers, networking, storage, OS, and cloud
Improve observability, monitoring, and reliability of core platform systems

What We're Looking For

Strong systems engineering fundamentals
Deep, hands-on experience with Terraform, including module design, state management, environment isolation, and large-scale deployments
Experience operating production GPU infrastructure or high-performance distributed systems
Strong understanding of networking and storage systems
Experience with major cloud platforms (GCP, AWS, Azure, OCI, etc.)
Track record of owning production-critical infrastructure end-to-end

Compensation, Benefits, And Perks (US):

Annual salary range of $200K to $550K, depending on experience
Equity is a significant part of total compensation, in addition to salary
401(k) plan with 6% salary matching
Generous health, dental and vision insurance for you and your dependents
Unlimited paid time off
Visa sponsorship and a relocation stipend to bring you to SF, where possible
A small, fast-paced, highly focused team

Magic strives to be the place where high-potential individuals can do their best work. We value quick learning and grit just as much as skill and experience.

Our Culture

Integrity. Words and actions should be aligned
Hands-on. At Magic, everyone is building
Teamwork. We move as one team, not N individuals
Focus. Safely deploy AGI. Everything else is noise
Quality. Magic should feel like magic

