Software Engineer, Site Reliability - San Mateo

San Mateo, United States

2 weeks ago

Job summary

We're building the future of generative AI infrastructure. Our platform delivers the highest-quality models with the fastest and most scalable inference in the industry.

Responsibilities

  • Ensure systems are designed and implemented with high availability, scalability, and performance.
  • Incident Management & Response: Lead efforts in incident detection, response, and resolution for critical production issues.
  • Observability & Monitoring: Develop, implement, and maintain comprehensive monitoring, alerting, logging, and tracing solutions to provide deep insights into system health and performance.
  • Automation & Toil Reduction: Identify and automate repetitive operational tasks to reduce toil and improve operational efficiency.
  • Capacity Planning & Performance Tuning: Work proactively on capacity planning to ensure our infrastructure can gracefully handle growth and peak loads. Optimize system performance and resource utilization.
  • Reliability Best Practices: Collaborate with software engineers to embed reliability principles into the development lifecycle, promoting a culture of operational excellence.

Similar jobs

  • San Mateo $130,000 - $280,000 (USD)

    We are actively looking for a talented Site Reliability Engineer to join the Infrastructure team. As a member of the infrastructure team, your role will be to manage this infrastructure and continue to make it easier for our team to monitor and scale it. · ...

  • San Mateo, CA

    We are actively looking for a talented Site Reliability Engineer to join the Infrastructure team. · To manage this infrastructure and continue to make it easier for our team to monitor and scale it, · be it by adopting third-party tools or designing your own. · Keep our infrastructure u ...

  • San Mateo, CA United States

    Verkada is transforming how organizations protect their people and places with an integrated, AI-powered platform. A leader in cloud physical security, Verkada helps organizations strengthen safety and efficiency through one connected software platform that include ...

  • San Mateo, CA, USA

    We are seeking a Site Reliability Engineer to join our team in San Mateo, CA. The ideal candidate will have 6+ years of experience in an SRE role for online services in a multi-region, multi-cloud environment with specific experience in reliability and resiliency. · Serve as a men ...

  • Foster City, CA

    Zoox is seeking a Site Reliability Engineer to help ensure the availability, performance, and resilience of the services that power the development and operation of our autonomous vehicles. · ...

  • Foster City $170,000 - $205,000 (USD)

    Zoox is seeking a Site Reliability Engineer to help ensure the availability, performance, and resilience of the services that power the development and operation of our autonomous vehicles. · ...

  • Foster City Full time $170,000 - $205,000 (USD)

    Zoox is seeking a Site Reliability Engineer to help ensure the availability, performance, and resilience of the services that power the development and operation of our autonomous vehicles. · ...

  • Foster City $160,000 - $250,000 (USD)

    We need our team to be representative of the world. · Join our Site Reliability Engineering team. · ...

  • San Mateo

    We're looking for a Senior Site Reliability Engineer to join our SRE team, the group responsible for keeping our systems fast, reliable and secure. This is more than just keeping the lights on. · You'll be engineering the future of a platform trusted by developers and companies ar ...

  • San Mateo, CA

    As a Site Reliability Engineer at Fireworks AI, you will play a critical role in making our world-scale virtual AI cloud reliable, performant, and efficient. · Ensuring System Reliability: Ensure systems are designed and implemented with high availability, scalability, and perfor ...

  • San Mateo, CA

    IXL Learning is seeking a Senior Site Reliability Engineer to join our team and help maintain the reliability and optimal performance of our products. We are seeking engineers with a passion for problem solving and optimization. · ...

  • San Mateo, CA

    At Roblox, we're building the tools and platform that empower our community to bring any experience that they can imagine to life. · ...

  • Redwood City Full time

    We believe consumers and businesses can coexist. Our platform allows consumers to access savings tools, earned wages and rewards without cost or hidden fees. · Write Terraform modules for deploying infrastructure resources via our GitLab pipelines · Develop Helm charts for deploy ...

  • San Mateo

    Backend/infrastructure engineer to join our founding team. Working on building ML and data pipelines. · Owning the observability and deployments of Wisdom stack · ...

  • Mountain View $197,000 - $291,000 (USD)

    SRE ensures that Google's services have reliability, uptime appropriate to users' needs and a fast rate of improvement. SREs will keep an ever-watchful eye on our systems' capacity and performance. · ...

  • Mountain View, CA

    Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. · ...

  • Redwood City, California, USA

    You won't just be 'managing' systems; you will be the architect of their health ... · ...

  • Redwood City $19.62 - $53 (USD)

    You will be joining the OCSC (Oracle Cloud Service Centre) as an SRD (site reliability developer). Your job role will be helping Oracle ensure the availability of cloud services 24x7x365. · As a Cloud Service Centre Site Reliability Developer Intern you will be involved with: · A ...

  • Redwood City

    You will be joining the OCSC (Oracle Cloud Service Centre) as an SRD (site reliability developer). Your job role will be helping Oracle ensure the availability of cloud services 24x7x365. · The Oracle Cloud Service Centre monitors and responds to Service Events that are impacting ...