
    Staff Site Reliability Engineer - Menlo Park, United States - Character

    Character Menlo Park, United States

    3 weeks ago

    Description
    About us

    Character's mission is to empower everyone with AGI. Our vision is to enable people with our technology so that they can use Character.AI any moment of any day.

    Character.AI is one of the world's leading personal AI platforms. Founded in 2021 by AI pioneers Noam Shazeer and Daniel De Freitas, Character.AI is a full-stack AI company with a globally scaled direct-to-consumer platform. As of 2023 that platform was #2 in the space in user engagement.

    Character.AI is uniquely centered around people, letting users personalize their experience by interacting with AI "Characters." The company achieved unicorn status in 2023 and was named Google Play's AI App of the Year.

    Noam co-invented the key tech powering LLMs and was recently named to TIME100's Most Influential People in AI list.

    TIME called him "one of the most important and impactful people of the space's past, present, and future." Daniel created and led LaMDA, the breakthrough conversational tech project currently powering Bard.

    About the role


    The Role:


    As the founding member of our DevOps/Site Reliability Engineering function here at Character, you'll have the opportunity to support our infrastructure with thousands of nodes, terabytes of data, and millions of daily active users on our site.

    You'll be responsible for ensuring our product's reliability, scalability, and performance as we aggressively grow our user base, with a goal of growing to 3 billion users.

    You'll work closely with our development team to design and implement processes and systems that ensure the stability and availability of our service.


    Specific Responsibilities:
    Maintain production services and keep them operational.

    Develop tools, instrumentation, and automation to monitor and optimize the performance and reliability of our service.

    Develop, implement and maintain automation tools and processes to prevent and mitigate service disruptions.

    Collaborate with development teams to design and implement scalable, reliable systems and CI/CD processes for deployment.

    Establish and support SLAs and SLOs for our site.

    Provide system monitoring and incident alerts.

    Participate in on-call rotations to provide support for critical incidents and outages.

    Develop plans for site reliability and disaster recovery.


    Job Requirements:
    5+ years of experience in a development-focused DevOps/SRE role within a technology organization that has significant scale.

    Deep experience with and proven success in developing software tools and automation wherever needed using Python and Golang.

    Expertise with SQL, Linux, CI/CD, Kubernetes, and Terraform to support a site/application within a large multi-node infrastructure and a growing user base.

    Experience working with multiple cloud computing platforms such as GCP is also a must.

    Demonstrated ability to successfully and reliably troubleshoot technical issues and challenges across a range of platforms and systems.

    Experience with incident management and event postmortems.


    Desired Experience:
    Familiarity with GPU clusters and/or HPC environments is preferred.

    Experience with monitoring and logging tools such as Prometheus and Grafana.

    Hands-on experience scaling a consumer product from early days into hypergrowth.


    Character is an equal opportunity employer and does not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or any other legally protected status.

    We value diversity and encourage applicants from a range of backgrounds to apply.



  • Mainspring Energy, Inc. Menlo Park, United States

    Company Overview · Driven by our vision of the affordable, reliable, net-zero carbon grid of the future, Mainspring has developed a new category of power generation - the linear generator - that delivers local, scalable, and fuel-flexible power to ...


  • Mainspring Energy Menlo Park, United States

    Company Overview · Driven by our vision of the affordable, reliable, net-zero carbon grid of the future, Mainspring has developed a new category of power generation - the linear generator - that delivers local, scalable, and fuel-flexible power to help accelerate the transition ...


  • Mainspring Energy, Inc.

    Reliability Engineer

    2 weeks ago


    Mainspring Energy, Inc. Menlo Park, United States

    Company Overview · Driven by our vision of the affordable, reliable, net-zero carbon grid of the future, Mainspring has developed a new category of power generation - the linear generator - that delivers local, scalable, and fuel-flexible pow ...


  • Aptos Palo Alto, United States

    Aptos is a people-first blockchain on a mission to help billions of people achieve universal and fair access to decentralized assets in a safe and scalable way. · Founded by some of the original creators and maintainers that researched, designed, and built the Diem blockchain to ...


  • Wing Aviation Palo Alto, United States

    About Wing: · Wing offers drone delivery as a safe, fast, and sustainable solution for last mile logistics. Consumer appetites for on-demand services are increasing, but current delivery methods are inefficient, costly, and contribute to road accidents and air pollution. Wing's ...


  • Salesforce Palo Alto, United States

    To get the best candidate experience, please consider applying for a maximum of 3 roles within 12 months to ensure you are not duplicating efforts. · Job Category: Software Engineering · Job Details · About Salesforce · We're Salesforce, the Customer Company, inspiring the future of busine ...


  • Glean Palo Alto, United States

    About Glean · We're on a mission to make knowledge work faster and more humane. We believe that AI will fundamentally transform how people work. In the future, everyone will work in tandem with expert AI assistants who find knowledge, create and synthesize information, and execu ...


  • Rubrik Palo Alto, United States

    Must be a US citizen in order to be considered for this role - this is a FedRAMP requirement. · Site Reliability Engineers at Rubrik are systems/software engineers who ensure that Rubrik's infrastructure services run smoothly and have the capacity for future growth. · As a Site Rel ...


  • Mediaocean Palo Alto, United States

    Mediaocean is powering the future of the advertising ecosystem with technology that empowers brands and agencies to deliver impactful omnichannel marketing experiences. With over $200 billion in annualized ad spend running through its software products, Mediaocean deploys AI and ...


  • Insight Global Redwood City, United States

    Insight Global is looking for a skilled Site Reliability Engineer (SRE) to work remotely in Peru or Guatemala for a large AAA game employer on a 9-12 month contract. You will be working within the Production Infrastructure & Engineering (PI&E) organization that ...


  • Insight Global Redwood City, United States

    Insight Global is looking for a skilled Site Reliability Engineer (SRE) to work remotely in Peru or Guatemala for a large AAA game employer on a 9-12 month contract. You will be working within the Production Infrastructure & Engineering (PI&E) organization that provides the essen ...


  • C3 AI Redwood City, United States

    C3 AI, Inc. (NYSE:AI) is a leading Enterprise AI software provider for accelerating digital transformation. The proven C3 AI Platform provides comprehensive services to build enterprise-scale AI applications more efficiently and cost-effectively than alternative approaches. The C3 AI ...


  • Robinhood Menlo Park, United States

    Join a leading fintech company that's democratizing finance for all. · Robinhood was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood is lowering barriers and providing greater access to fin ...


  • C3 AI Inc. Redwood City, United States

    C3 AI, Inc. (NYSE:AI) is a leading Enterprise AI software provider for accelerating digital transformation. The proven C3 AI Platform provides comprehensive services to build enterprise-scale AI applications more efficiently and cost-effectively than alternative approaches. The C3 AI ...


  • GRAIL, Inc. Menlo Park, United States

    GRAIL is a healthcare company whose mission is to detect cancer early, when it can be cured. GRAIL is focused on alleviating the global burden of cancer by developing pioneering technology to detect and identify multiple deadly cancer types early. The company is using the power o ...


  • Robinhood Menlo Park, United States

    Join a leading fintech company that's democratizing finance for all. · Robinhood was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood is lowering barriers and providing greater access to fina ...


  • Box Redwood City, United States

    WHAT IS BOX? · Box is the market leader for Cloud Content Management. Our mission is to power how the world works together. Box is partnering with enterprise organizations to accelerate their digital transformation by creating a single platform for secure content management, coll ...