- Keep a large production service up and running, including:
  - Host OS upgrades
  - Docker image upgrades
  - SSL certificate upgrades
- Define and refine metrics to track service health and performance.
- Automate software releases and service failovers.
- Bachelor's degree in Engineering, Mathematics, or a related field, and 4+ years of relevant experience
- Experience supporting multiple production services
- Experience using tools such as Ansible, Terraform, or Salt effectively
- Ability to extract and report useful performance or service metrics
- Linux experience in any capacity
- Familiarity with Python/Bash scripting
- Experience with AWS, ECS, Kubernetes
- Master's degree in computer science or a related field
Site Reliability Engineer - Foster City, United States - Bayone