- Keep production multimodal services operational.
- Instrument, monitor and optimize the performance and reliability of our service.
- Implement and maintain automation tools and processes to prevent and mitigate service disruptions.
- Collaborate with development teams to design and implement scalable, reliable systems.
- Participate in on-call rotations to provide support for critical incidents and outages.
- 5+ years of experience in a development-focused SRE role at a technology organization operating at scale
- Possess deep coding experience with specific expertise in Python, SQL, Linux, CI/CD, Kubernetes, Terraform
- Have worked with multiple cloud computing platforms such as GCP
- Can reliably troubleshoot across a range of platforms and systems
- Have familiarity with WebRTC stacks and services
- Have familiarity with GPU clusters and/or HPC environments
- Have familiarity with monitoring and logging tools such as Prometheus and Grafana
- Have first-hand experience scaling a consumer product from early days into hypergrowth
Senior Site Reliability Engineer - Menlo Park, United States - Character
Description
About the role
Responsibilities:
As a Multimodal Site Reliability Engineer (SRE) at Character, you will be responsible for ensuring the reliability, scalability, and performance of our app and AI multimodal services (e.g., voice interfacing services). You will work closely with our development team to design and implement processes and systems that ensure the stability and availability of our service.
Requirements:
Ready to empower the world with AGI?
Founded in 2021 by AI pioneers Noam Shazeer and Daniel De Freitas, Character is a full-stack AI powerhouse and ranks among the most utilized AI research platforms globally. Our innovative approach allows users to customize their experience with personalized AI 'Characters.'
In just two years, we achieved unicorn status and were named Google Play's AI App of the Year - a testament to our groundbreaking technology and vision.
Noam co-invented core LLM tech and was recently honored as one of TIME's 100 Most Influential in AI. Daniel created LaMDA, the breakthrough conversational AI now powering Google's Bard.
We encourage you to apply even if you don't meet all qualifications. Underrepresented individuals often experience imposter syndrome, so don't underestimate yourself.
Our commitment to diversity:
Character values diversity and welcomes applicants of all backgrounds. We are an equal opportunity employer and firmly uphold a non-discrimination policy based on race, religion, national origin, gender, sexual orientation, age, veteran status, or disability. Your unique perspectives are vital to us.