-
Reliability Engineer
2 days ago
Mainspring Energy, Inc. Menlo Park, United States · Job Description · Company Overview · Driven by our vision of the affordable, reliable, net-zero carbon grid of the future, Mainspring has developed a new category of power generation - the linear generator - that delivers local, scalable, and fuel-flexible power to ...
-
Reliability Engineer
2 days ago
Mainspring Energy Menlo Park, United States · Company Overview · Driven by our vision of the affordable, reliable, net-zero carbon grid of the future, Mainspring has developed a new category of power generation - the linear generator - that delivers local, scalable, and fuel-flexible power to help accelerate the transition ...
-
Reliability Engineer
2 weeks ago
Mainspring Energy, Inc. Menlo Park, United States · Job Description · Company Overview · Driven by our vision of the affordable, reliable, net-zero carbon grid of the future, Mainspring has developed a new category of power generation - the linear generator - that delivers local, scalable, and fuel-flexible pow ...
-
Site Reliability Engineer
3 weeks ago
Aptos Palo Alto, United States · Aptos is a people-first blockchain on a mission to help billions of people achieve universal and fair access to decentralized assets in a safe and scalable way. · Founded by some of the original creators and maintainers that researched, designed, and built the Diem blockchain to ...
-
Hardware Reliability Engineer
2 days ago
Wing Aviation Palo Alto, United States · About Wing: · Wing offers drone delivery as a safe, fast, and sustainable solution for last mile logistics. Consumer appetites for on-demand services are increasing, but current delivery methods are inefficient, costly, and contribute to road accidents and air pollution. Wing's ...
-
Site Reliability Engineer
1 hour ago
Salesforce Palo Alto, United States · To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts. · Job Category: Software Engineering · Job Details · About Salesforce · We're Salesforce, the Customer Company, inspiring the future of busine ...
-
Hardware Reliability Engineer
22 hours ago
Wing Aviation Palo Alto, United States · About Wing: · Wing offers drone delivery as a safe, fast, and sustainable solution for last mile logistics. Consumer appetites for on-demand services are increasing, but current delivery methods are inefficient, costly, and contribute to road accidents and air pollution. Wing's ...
-
Site Reliability Engineer
11 hours ago
Glean Palo Alto, United States · About Glean · We're on a mission to make knowledge work faster and more humane. We believe that AI will fundamentally transform how people work. In the future, everyone will work in tandem with expert AI assistants who find knowledge, create and synthesize information, and execu ...
-
Site Reliability Engineer
2 weeks ago
Rubrik Palo Alto, United States · Must be a US citizen in order to be considered for this role. This is a FedRAMP requirement. · Site Reliability Engineers at Rubrik are systems/software engineers who ensure that Rubrik's infrastructure services run smoothly and have the capacity for future growth. · As a Site Rel ...
-
Site Reliability Engineer
3 weeks ago
Mediaocean Palo Alto, United States · Mediaocean is powering the future of the advertising ecosystem with technology that empowers brands and agencies to deliver impactful omnichannel marketing experiences. With over $200 billion in annualized ad spend running through its software products, Mediaocean deploys AI and ...
-
Site Reliability Engineer
3 days ago
Insight Global Redwood City, United States · Job Description · Insight Global is looking for a skilled Site Reliability Engineer (SRE) to work remotely in Peru or Guatemala for a large AAA game employer on a 9-12 month contract. You will be working within the Production Infrastructure & Engineering (PI&E) organization that ...
-
Site Reliability Engineer
3 days ago
Insight Global Redwood City, United States · Insight Global is looking for a skilled Site Reliability Engineer (SRE) to work remotely in Peru or Guatemala for a large AAA game employer on a 9-12 month contract. You will be working within the Production Infrastructure & Engineering (PI&E) organization that provides the essen ...
-
Site Reliability Engineer
3 weeks ago
C3 AI Redwood City, United States · C3 AI, Inc. (NYSE:AI) is a leading Enterprise AI software provider for accelerating digital transformation. The proven C3 AI Platform provides comprehensive services to build enterprise-scale AI applications more efficiently and cost-effectively than alternative approaches. The C3 AI ...
-
Site Reliability Engineer
2 days ago
C3 AI Redwood City, United States · C3 AI, Inc. (NYSE:AI) is a leading Enterprise AI software provider for accelerating digital transformation. The proven C3 AI Platform provides comprehensive services to build enterprise-scale AI applications more efficiently and cost-effectively than alternative approaches. The C3 AI ...
-
Senior Engineering Manager, Reliability
5 days ago
Robinhood Menlo Park, United States · Join a leading fintech company that's democratizing finance for all. · Robinhood was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood is lowering barriers and providing greater access to fin ...
-
Site Reliability Engineer
2 weeks ago
C3 AI Inc. Redwood City, United States · C3 AI, Inc. (NYSE:AI) is a leading Enterprise AI software provider for accelerating digital transformation. The proven C3 AI Platform provides comprehensive services to build enterprise-scale AI applications more efficiently and cost-effectively than alternative approaches. The C3 AI ...
-
Staff Site Reliability Engineer
3 weeks ago
GRAIL, Inc. Menlo Park, United States · GRAIL is a healthcare company whose mission is to detect cancer early, when it can be cured. GRAIL is focused on alleviating the global burden of cancer by developing pioneering technology to detect and identify multiple deadly cancer types early. The company is using the power o ...
-
Database Site Reliability Engineer
21 hours ago
Robinhood Menlo Park, United States · Join a leading fintech company that's democratizing finance for all. · Robinhood was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood is lowering barriers and providing greater access to fina ...
-
Site Reliability Engineer
2 weeks ago
Box Redwood City, United States · WHAT IS BOX? · Box is the market leader for Cloud Content Management. Our mission is to power how the world works together. Box is partnering with enterprise organizations to accelerate their digital transformation by creating a single platform for secure content management, coll ...
Staff Site Reliability Engineer - Menlo Park, United States - Character
Description
About us
Character's mission is to empower everyone with AGI. Our vision is to enable people with our technology so that they can use Character.AI any moment of any day.
Character.AI is one of the world's leading personal AI platforms. Founded in 2021 by AI pioneers Noam Shazeer and Daniel De Freitas, Character.AI is a full-stack AI company with a globally scaled direct-to-consumer platform. As of 2023 that platform was #2 in the space in user engagement. Character.AI is uniquely centered around people, letting users personalize their experience by interacting with AI "Characters." The company achieved unicorn status in 2023 and was named Google Play's AI App of the Year.
AI list.
TIME called him "one of the most important and impactful people of the space's past, present, and future." Daniel created and led LaMDA, the breakthrough conversational tech project currently powering Bard.
To learn more, please visit.
About the role
The Role:
As the founding member of our DevOps/Site Reliability Engineer function here at Character, you'll have the opportunity to support our infrastructure with thousands of nodes, terabytes of data and millions of daily active users on our site.
You'll be responsible for ensuring our product's reliability, scalability, and performance as we aggressively grow our user base, with a goal of growing to 3 billion users.
Work closely with our development team to design and implement processes and systems that ensure the stability and availability of our service.
Specific Responsibilities:
Maintain production services and keep them operational.
Develop tools, instrumentation, and automation to monitor and optimize the performance and reliability of our service.
Develop, implement, and maintain automation tools and processes to prevent and mitigate service disruptions.
Collaborate with development teams to design and implement scalable, reliable systems and CI/CD processes for deployment.
Establish and support SLAs and SLOs for our site.
Provide system monitoring and incident alerts.
Participate in on-call rotations to provide support for critical incidents and outages.
Develop plans for site reliability and disaster recovery.
Job Requirements:
5+ years of experience in a development-focused DevOps/SRE role within a technology organization that has significant scale.
Deep experience with, and proven success in, developing software tools and automation wherever needed using Python and Golang.
Expertise with SQL, Linux, CI/CD, Kubernetes, and Terraform to support a site/application within a large multi-node infrastructure and a growing user base.
Demonstrated ability to troubleshoot technical issues and challenges successfully and reliably across a range of platforms and systems.
Experience with incident management and event postmortems.
Desired Experience:
Familiarity with GPU clusters and/or HPC environments is preferred.
Experience with monitoring and logging tools such as Prometheus and Grafana.
Hands-on experience scaling a consumer product from early days into hypergrowth.
Character is an equal opportunity employer and does not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.