Manager – AI Infrastructure Operations - Sunnyvale
2 weeks ago

Job summary
We are hiring senior AI infrastructure leaders who've operated hyperscale AI fleets and understand what it means to run platforms where minutes of downtime matter.
Responsibilities
- Defining availability targets, escalation policies, and the operating strategy for next‑generation AI systems.
- Serving as the executive escalation point for complex, customer‑impacting events — driving technical clarity, cross‑org alignment, and durable systemic fixes.
Similar jobs
AI Infrastructure Operations Engineer
2 weeks ago
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. · Manage and operate multiple advanced AI compute infrastructure clusters. · Monitor and oversee cluster health, proactively identifying and resolving potential issues. · ...
Manager – AI Infrastructure Operations
5 days ago
We are seeking a senior leader to oversee the operation and reliability of our advanced AI compute infrastructure. · ...
AI Infrastructure Operations Engineer
6 days ago
The AI Infrastructure Operations Engineer (SiteOps) is an entry-level individual contributor role focused on the deployment, bring-up, monitoring, and first-line troubleshooting of Cerebras AI infrastructure in data center environments. This approach allows Cerebras to deliver ...
AI Infrastructure Operations Engineer
6 days ago
The AI Infrastructure Operations Engineer (SiteOps) role supports the deployment and reliable operation of Cerebras AI clusters in data center environments. · • Execute defined hardware bring-up and validation procedures, · • Monitor telemetry, · • Perform first-line troubleshoot ...
Infrastructure Operations Program Manager
1 month ago
The Customer Experience (CX) Organization at CoreWeave is dedicated to ensuring every client running AI workloads at scale has a seamless, reliable, and high-performance experience. · Own and drive RMA workflows · ...
AI Infrastructure Operations Engineer
6 days ago
The AI Infrastructure Operations Engineer (SiteOps) role involves deploying and maintaining Cerebras AI infrastructure in data center environments. · Assist with deployment and bring-up of CS-X systems, cluster servers, and networking hardware · ...
Infrastructure Operations Program Manager
1 month ago
We are looking for an Infrastructure Operations Program Manager to operationalise and scale our bare metal support & RMA programs across EMEA. You will lead a pivotal function, serving as the liaison between clients, internal teams, and external vendors to ensure effective commun ...
Join our team that transforms Apple's cloud infrastructure planning strategy. · Perform power and cooling capacity assessments for current and future deployments · Develop strategic plans for facility expansions and infrastructure upgrades · Evaluate and pilot new cooling technol ...
Operations Manager, Integrations/Infrastructure
1 month ago
We're looking for a builder - someone who loves solving hard operational problems with code, data, and product intuition. Part consultant, part product thinker, part data analyst, you'll help us automate and scale traditionally manual workflows that underpin healthcare operatio ...
Operations Manager, Integrations/Infrastructure
4 weeks ago
+ Automate manual workflows with SQL and APIs · + Design and implement scalable solutions using LLMs · + Work as a team with Product, Engineering, and account teams to influence roadmap priorities. ...
We're looking for a builder - someone who loves solving hard operational problems with code, data, and product intuition. · We want people who are excellent operators, strong problem solvers, and comfortable building from scratch. · This full-time position requires working 5 days ...
The Senior Director of Strategy and Operations for the Infrastructure and Solutions Group (ISG) in Google Cloud will work directly with business executives and key leaders on key business projects. · ...
Senior Infrastructure Operations Engineer
1 week ago
We are seeking a proactive, motivated Infrastructure Operations Engineer to support datacenter operations, network/server incident response, and automation improvements. · ...
The Chief Operating Officer (COO) for Hybrid and On-prem Infrastructure is a critical strategic leadership and business management position within Google Cloud. ...
Cerebras Systems builds the world's largest AI chip. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip with the programming simplicity of a single device. · ...
As a Program Manager at Google, you'll lead complex, multi-disciplinary projects from start to finish — working with stakeholders to plan requirements, manage project schedules, identify risks, and communicate clearly with cross-functional partners across the company. · ...
A problem isn't truly solved until it's solved for all. That's why Googlers build products that help create opportunities for everyone. · Support program management operations for AI chip development teams. · Collect and report technical program status across a broad portfolio ...