
    Staff Site Reliability Engineer #3718 - Menlo Park, United States - Grail

    Grail Menlo Park, United States

    1 week ago

    Description
    GRAIL is a healthcare company whose mission is to detect cancer early, when it can be cured.

    GRAIL is focused on alleviating the global burden of cancer by developing pioneering technology to detect and identify multiple deadly cancer types early.

    The company is using the power of next-generation sequencing, population-scale clinical studies, and state-of-the-art computer science and data science to enhance the scientific understanding of cancer biology, and to develop its multi-cancer early detection blood test.

    GRAIL is headquartered in Menlo Park, CA with locations in Washington, D.C., North Carolina, and the United Kingdom. GRAIL, LLC is a wholly-owned subsidiary of Illumina, Inc. (NASDAQ: ILMN). For more information, please visit


    GRAIL is seeking a Staff Software Engineer in our Site Reliability Engineering (SRE) team to help us improve the security and reliability of production systems that are critical to our mission to detect cancer early and save lives.

    You will contribute to the architecture, design, development, and implementation of critical cloud-based infrastructure, services, and applications, and be responsible for their secure, healthy, and reliable operation.

    You are someone who enjoys learning and implementing the industry's best technology trends and practices. You foster and contribute to a creative and collaborative culture to deliver results. You embrace ambiguity and enjoy exploring new technologies to deliver robust, scalable solutions.

    This is a hybrid role and requires you to be onsite 2 days a week in Menlo Park, CA.

    Responsibilities


    • Ensure High Availability: Implement and maintain resilient cloud architectures, monitor system performance, and proactively identify and resolve potential bottlenecks or points of failure.
    • Incident Management: Play an active role in production on-call, responding swiftly to troubleshoot and resolve production issues. Minimize service disruptions and downtime by conducting thorough triaging and debugging of product or system issues. Continuously optimize the on-call process for sustainability and efficiency.
    • Automation and Tooling: Develop and maintain automation scripts, tools, and processes to streamline system deployment, monitoring, and management tasks. Your contributions will be vital in efficiently scaling cloud operations.
    • Performance Optimization: Optimize cloud infrastructure and applications for performance, scalability, and cost-effectiveness.
    • Security and Compliance: Collaborate with security engineers to implement best practices and ensure compliance with security standards and policies.
    • Monitoring and Alerting: Design and configure advanced monitoring systems to gain insights into system behavior, set up alerts, and respond proactively to potential issues. Create and maintain comprehensive dashboards and playbooks for production on-call.
    • Software Development Consultation: Engage actively in the entire software development lifecycle. Participate in system design reviews and provide valuable Site Reliability Engineer (SRE) insights during launch reviews, influencing and enhancing system architecture.
    Preferred Qualifications


    • Bachelor's degree in Computer Science, a related field, or equivalent practical experience.
    • 3+ years of professional experience maintaining production systems on cloud-based services and infrastructure.
    • 8+ years of software development experience in one or more programming languages, with a primary focus on building on and working with cloud-based services and infrastructure.
    • Strong knowledge of the AWS cloud platform.
    • Practical experience with containerization technologies, including Docker and Kubernetes.
    • Familiarity with Python, Bash scripting, and Ansible.
    • Familiarity with infrastructure as code tools like Terraform is essential.
    • Solid understanding of databases, networking, security principles, and best practices.
    • Proficiency in using monitoring and alerting tools to detect and respond to potential issues effectively.
    Desired Skills


    • AWS Certifications (such as Solutions Architect, Security, etc.)
    • Experience in a regulated industry or healthcare field
    The expected, full-time, annual base pay scale for this position is $180,000 - $210,000. Actual base pay will consider skills, experience, and location.


    Based on the role, colleagues may be eligible to participate in an annual bonus plan tied to company and individual performance, or an incentive plan.

    We also offer a long-term incentive plan to align company and colleague success over time.


    In addition, GRAIL offers a progressive benefit package, including flexible time off, a 401(k) with a company match, and, alongside our medical, dental, and vision plans, carefully selected mindfulness offerings.


    GRAIL is an Equal Employment Employer and does not discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability or any other legally protected status.

    We will reasonably accommodate all individuals with disabilities so that they can participate in the job application or interview process, perform essential job functions, and receive other benefits and privileges of employment.

    Please contact us to request accommodation. GRAIL maintains a drug-free workplace.

  • Mainspring Energy, Inc.

    Reliability Engineer

    2 weeks ago


    Mainspring Energy, Inc. Menlo Park, United States

    Job Description · Company Overview · Driven by our vision of the affordable, reliable, net-zero carbon grid of the future, Mainspring has developed a new category of power generation — the linear generator — that delivers local, scalable, and fuel-flexible pow ...


  • Rubrik Palo Alto, United States

    Must be a US citizen in order to be considered for this role. This is a FedRAMP requirement. · Site Reliability Engineers at Rubrik are systems/software engineers who ensure that Rubrik's infrastructure services run smoothly and have the capacity for future growth. · As a Site Rel ...


  • Aptos Palo Alto, United States

    Aptos is a people-first blockchain on a mission to help billions of people achieve universal and fair access to decentralized assets in a safe and scalable way. · Founded by some of the original creators and maintainers that researched, designed, and built the Diem blockchain to ...


  • Mediaocean Palo Alto, United States

    Mediaocean is powering the future of the advertising ecosystem with technology that empowers brands and agencies to deliver impactful omnichannel marketing experiences. With over $200 billion in annualized ad spend running through its software products, Mediaocean deploys AI and ...


  • C3 AI Inc. Redwood City, United States

    C3 AI, Inc. (NYSE: AI) is a leading Enterprise AI software provider for accelerating digital transformation. The proven C3 AI Platform provides comprehensive services to build enterprise-scale AI applications more efficiently and cost-effectively than alternative approaches. The C3 AI ...


  • Insight Global Redwood City, United States

    Job Description · Insight Global is looking for a skilled Site Reliability Engineer (SRE) to work remotely in Peru or Guatemala for a large AAA game employer on a 9-12 month contract. You will be working within the Production Infrastructure & Engineering (PI&E) organization that ...


  • C3 AI Redwood City, United States

    C3 AI, Inc. (NYSE: AI) is a leading Enterprise AI software provider for accelerating digital transformation. The proven C3 AI Platform provides comprehensive services to build enterprise-scale AI applications more efficiently and cost-effectively than alternative approaches. The C3 AI ...


  • Robinhood Menlo Park, United States

    Join a leading fintech company that's democratizing finance for all. · Robinhood was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood is lowering barriers and providing greater access to fin ...


  • GRAIL, Inc. Menlo Park, United States

    GRAIL is a healthcare company whose mission is to detect cancer early, when it can be cured. GRAIL is focused on alleviating the global burden of cancer by developing pioneering technology to detect and identify multiple deadly cancer types early. The company is using the power o ...


  • Box Redwood City, United States

    WHAT IS BOX? · Box is the market leader for Cloud Content Management. Our mission is to power how the world works together. Box is partnering with enterprise organizations to accelerate their digital transformation by creating a single platform for secure content management, coll ...


  • Plume Design Inc Palo Alto, United States

    We're looking for a seasoned Technical Manager, experienced with customer-facing environments, to captain our Site Reliability Engineering Team. This team is focused on deployments, fixes, and sustainability. The right candidate needs to have strong technical knowledge in key are ...


  • Rivian Palo Alto, United States

    About Rivian: · Rivian is on a mission to keep the world adventurous forever. This goes for the emissions-free Electric Adventure Vehicles we build, and the curious, courageous souls we seek to attract. · As a company, we constantly challenge what's possible, never simply accept ...


  • Velocity Global, LLC Palo Alto, United States

    POSITION SUMMARY: · Velocity Global seeks a Senior Site Reliability Engineer (SRE) with extensive observability experience. In this role, you will help to lead the automation and support efforts of our cloud Infrastructure, identify strategies to improve our full-stack telemetry ...


  • General Motors Palo Alto, United States

    Job Description · Software-defined vehicles represent a new paradigm for automakers and consumers, fueled by technological advancements and an escalating demand for transportation solutions that are not only intelligent but also safer and more environmentally sustainable. At the ...


  • Rubrik Job Board Stanford, United States

    Senior Site Reliability Engineers at Rubrik are systems/software engineers who ensure that Rubrik's infrastructure services run smoothly and have the capacity for future growth. · As a Senior Site Reliability Engineer, you will be responsible for: · Ensure we maintain high avai ...


  • Assured Palo Alto, United States

    Job Description · Assured is on a mission to modernize insurance. Claims processing (i.e. should we pay this claim?), while often overlooked, is the foundation of the entire industry. It's currently highly manual, involving phone calls, faxes, and gut instinct—cost ...


  • Rubrik Palo Alto, United States

    Must be a US citizen. This is a FedRAMP requirement for this role. · Sr. Site Reliability Engineers at Rubrik are systems/software engineers who ensure that Rubrik's infrastructure services run smoothly and have the capacity for future growth. · As a Sr. Site Reliability Enginee ...


  • Plume Design Inc Palo Alto, United States

    Life at Plume · At Plume, we believe that technology isn't about moving faster, it's about making life's moments better. Which is why we've built the world's first, and only, open and hardware-independent service delivery platform for smart homes, small businesses, enterprises, a ...