
    Product Hardware Reliability Engineer, Global Hardware Reliability Engineering - Sunnyvale, United States - Google LLC

    Description

    Minimum qualifications:

    • Bachelor's degree in Electrical/Industrial/Mechanical Engineering or equivalent practical experience.
    • 10 years of experience in reliability engineering of cloud infrastructure hardware and technology, dealing with failure analysis and fault isolation techniques and applying them to isolate root causes.
    • 7 years of experience with system level reliability tools (RBDs, MCFs, HPPs, NHPPs).

    Preferred qualifications:

    • Master's degree or PhD in Electrical/Industrial/Mechanical Engineering or equivalent practical experience.
    • Experience leading cross-functional problem-solving teams using practical approaches.
    • Knowledge of Industry Test Standards (e.g., JEDEC, ASTM, IEEE).
    • Understanding of Physics of Failure and Reliability Physics.
    • Ability to effectively lead teams to meet corporate and customer reliability expectations, and effectively communicate to the project team with excellent people management skills.
    About the job

    Google has one of the largest and most powerful computing infrastructures in the world. Your team is responsible for providing the manufacturing capability to deliver this state-of-the-art physical infrastructure.

    As a Manufacturing Engineer, you evaluate the product designs and create the processes, tools and procedures behind Google's powerful search technology.

    When vendors build parts for our infrastructure, you're right there alongside ensuring manufacturing processes are repeatable and controlled. You collaborate with Commodity Managers and Design Engineers to determine Google's infrastructure needs and product specifications.

    Your work ensures the various pieces of Google's infrastructure fit together perfectly and keep our systems humming along smoothly for a seamless user experience.


    Google Cloud is responsible for providing the hardware design and the manufacturing capability to deliver state-of-the-art physical infrastructure that powers Machine Learning applications, among others.

    Quality and Reliability are the foundational cornerstones for the success of this complex offering.


    As a Reliability Engineer, you will lead new product introduction (NPI) reliability-related activities between our Engineering teams, Contract Manufacturers (CM), and suppliers for Machine Learning hardware, identifying and managing risks, and clearly communicating project deliverables and status to stakeholders.

    You will evaluate complex product designs and provide insights to the design teams on potential improvements and tradeoffs.

    You will create procedures and tools to drive product development and manufacturing in a fast-paced environment while focusing on the root causes of failure.

    You will also evaluate the reliability status of the fleet and support product improvement initiatives. Finally, you will work with external partners to ensure their products meet our customer expectations.

    Behind everything our users see online is the architecture built by the Technical Infrastructure team to keep it running.

    From developing and maintaining our data centers to building the next generation of Google platforms, we make Google's product portfolio possible.

    We're proud to be our engineers' engineers and love voiding warranties by taking things apart so we can rebuild them.

    We keep our networks up and running, ensuring our users have the best and fastest experience possible.

    The US base salary range for this full-time position is $134,000-$198,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location.

    The range displayed on each job posting reflects the minimum and maximum target salaries for the position across all US locations.

    Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training.

    Your recruiter can share more about the specific salary range for your preferred location during the hiring process.


    Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits.

    Learn more about benefits at Google.

    Responsibilities


    • Lead system design analysis to enable evaluations and product de-risk at an early stage of development.
    • Enable the implementation of the reliability plan and lead efforts to assess and mitigate risk of failure early during NPI.
    • Lead system reliability monitoring efforts and flag unwanted system behavior. Extract field reliability data and drive failure analysis efforts, identification of root causes of failure, and creation of actionable insights.
    • Maintain relationships with outside partners, testing labs, cross-functional internal groups, and Contract Manufacturer (CM) partners, while developing in-house test and qualification capabilities where needed.
    • Lead system reliability efforts by working with other organizations to define reliability goals and plans, securing the resources needed to execute the plan. Drive reliability test plans and collect, analyze, and synthesize the data to enable verification of the design reliability goals.


    Information collected and processed as part of your Google Careers profile, and any job applications you choose to submit, is subject to Google's Applicant and Candidate Privacy Policy.

    Google is proud to be an equal opportunity and affirmative action employer.

    We are committed to building a workforce that is representative of the users we serve, creating a culture of belonging, and providing an equal employment opportunity regardless of race, creed, color, religion, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition (including breastfeeding), expecting or parents-to-be, criminal histories consistent with legal requirements, or any other basis protected by law.


    See also Google's EEO Policy, Know your rights: workplace discrimination is illegal, Belonging at Google, and How we hire.

    If you have a need that requires accommodation, please let us know by completing our Accommodations for Applicants form.


    Google is a global company and, in order to facilitate efficient collaboration and communication globally, English proficiency is a requirement for all roles unless stated otherwise in the job posting.


    To all recruitment agencies:
    Google does not accept agency resumes. Please do not forward resumes to our jobs alias, Google employees, or any other organization location. Google is not responsible for any fees related to unsolicited resumes.

