Network Engineer, Reliability - San Francisco
1 month ago

About The Role
We are seeking a Network Engineer to serve as a reliability engineer, championing and building process data collection and reliability metrics to improve the quality and reliability of AI networks from deployment through operations.
About You
- Strong Operations Background: 5+ years in network engineering and at least 3 years in operations, with significant hands-on operational experience
- Datacenter Fabric Expertise: Deep experience operating modern datacenter networks, including EVPN/VXLAN, BGP, Clos topologies, and high-radix switching
- Incident Response Excellence: Proven ability to lead incident response, perform systematic troubleshooting, and drive issues to resolution
Similar jobs
Cisco Silicon One ASICs are transforming the Future of the Internet. · Owning reliability test plans for new products. · Supporting High power Burn In, biased HAST and ESD/LU bring-up and debug for reliability qualification and evaluation. · ...
4 days ago
We're growing incredibly fast and need someone who thrives in a dynamic, high-pressure environment to work side-by-side with our engineering and leadership team. · Write tests, monitor, and evaluate the health of our platform. · Design, write, and maintain automated test ...
1 month ago
We are making sure that when businesses build AI agents the experience of doing so doesn't suck. We're growing fast and need someone who thrives in a dynamic environment to work side-by-side with our engineering team. Your job is to write tests, monitoring, and evaluating the heal ...
1 month ago
We are making sure that when businesses build AI agents the experience of doing so doesn't suck. Our team is a group of ex-athletes, founders, and builders with low egos and a strong belief that life is not about taking the easy road but about challenging ourselves to find the most we can be. ...
4 days ago
Verrus is redefining the future of data centers with an emphasis on innovation, flexibility, and sustainability. · ...
2 weeks ago
Job summary · Our mission is to increase economic freedom in the world. It's a massive opportunity that demands the best of us every day. · Responsibilities: Improve observability, reliability, and availability by defining and measuring key metrics · Build automation and improve sys ...
2 weeks ago
Mercor is creating a new category of work where expertise powers AI advancement. · Own reliability and production safety for core shared services and customer-facing systems. · Partner directly with infrastructure leadership to define SRE priorities, reliability standards, and pr ...
1 month ago
We're hiring an SRE to join our engineering team at Plenful. · You'll bring strong technical judgment, calm problem solving during incidents and a practical approach to improving reliability. · ...
3 weeks ago
We're hiring an SRE to join our engineering team at Plenful and take ownership of the reliability and performance of the systems that power our product. · You'll work across our distributed workflow engine, serverless pipelines, containerized services, and Postgres-based data laye ...
3 weeks ago
Join the engineering teams that bring OpenAI's ideas safely to the world as a Software Engineer in Reliability role at OpenAI in San Francisco. · ...
1 month ago
We're a fully distributed team with employees across North American time zones. · We build the systems and practices that keep everything running smoothly, handling hundreds of millions of requests, minimizing downtime, and continuously improving service performance. The Site Rel ...
1 month ago
We are seeking an experienced Site Reliability Engineer to join our Platform Engineering team in the Bay Area. · ...
1 month ago
We are looking for experienced problem-solving engineers to ensure our systems scale. We seek to learn from deployment and distribute the benefits of AI while ensuring that this powerful tool is used responsibly and safely. · ...
6 days ago
+ Reliability expert to maintain and enhance the stability and scalability of our rapidly evolving infrastructure. · + Design and implement solutions to ensure the scalability of our infrastructure. · + Build and maintain load, chaos and synthetic testing software. · Job summary: ...
6 days ago
We are building the infrastructure for abundant intelligence at Fluidstack. We partner with top AI labs, governments, and enterprises, including Mistral, Poolside, Black Forest Labs, Meta · Fluidstack seeks a Network Engineer to champion and build process reliability metrics f ...
6 days ago
We are creating a new category of work where expertise powers AI advancement. · Ambitious team that works alongside researchers, operators, and AI companies shaping systems redefining society. · ...
1 month ago
We are seeking an experienced Site Reliability Engineer to join our Platform Engineering team in the Bay Area. You'll be instrumental in ensuring the high availability, performance, and scalability of CodeRabbit's AI-powered code review platform. This role sits at the intersection of ...
1 month ago
We are looking for a Hardware Reliability Engineer. In this role, you will be responsible for planning and executing hardware reliability tasks for Oura's wearable products. · Plan, document, and execute reliability testing for Oura hardware products and accessories un ...
2 weeks ago
We are seeking an experienced Site Reliability Engineer to join our Platform Engineering team in the Bay Area. · Design and implement scalable infrastructure on Google Cloud Platform. · Own critical platform services. · ...
1 month ago
We are looking for people who are passionate about a more sustainable future and want to make that vision a reality. · Collaborate with mechanical and electrical engineers during hardware design to establish and refine component- and system-level reliability targets for our power ...
1 month ago