    Senior Hardware Reliability Engineer - Redmond, United States - Microsoft Corporation

    Description

    Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the team behind Microsoft's expanding Cloud Infrastructure and responsible for powering Microsoft's "Intelligent Cloud" mission.

    SCHIE delivers the core infrastructure and foundational technologies for Microsoft's over 200 online businesses including Bing, MSN, Office 365, Xbox Live, Teams, OneDrive, and the Microsoft Azure platform globally with our server and data center infrastructure, security and compliance, operations, globalization, and manageability solutions.

    Our focus is on smart growth, high efficiency, and delivering a trusted experience to customers and partners worldwide and we are looking for engineers to help achieve that mission.

    We are looking for a Senior Hardware Reliability Engineer to join the team.


    As Microsoft's cloud business continues to grow, the ability to deploy new offerings and hardware infrastructure on time, in high volume, with high quality, and at the lowest cost is of paramount importance.

    To achieve this goal, the Hardware, Infrastructure Management, and Fundamentals Engineering (HIFE) team is instrumental in defining and delivering operational measures of success for hardware manufacturing, improving the planning process, quality, delivery, scale and sustainability related to Microsoft cloud hardware.

    We are looking for engineers with a dedicated passion for customer-focused solutions, along with the insight and industry knowledge to envision and implement future technical solutions that will manage and optimize the Cloud infrastructure.

    Microsoft's mission is to empower every person and every organization on the planet to achieve more.

    As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals.

    Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

    In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.


    Requirements:

    • 12+ years relevant technical engineering experience
    • OR Bachelor's Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 5+ years technical engineering experience
    • OR Master's Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 4+ years technical engineering experience
    • OR Doctorate Degree in Mechanical Engineering, Materials Engineering, Reliability Engineering, Electrical Engineering, or related field AND 2+ years technical engineering experience
    • 3+ years of experience in hardware development of moderate to complex hardware systems

    Other:
    The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role.

    These requirements include but are not limited to the following specialized security screenings:

    Microsoft Cloud Background Check:

    This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.


    Preferred Qualifications:

    • Fundamental knowledge of Computer Architecture, Server architecture at block level, Electrical/Power Hardware Design and Hardware/Firmware/OS interactions.
    • Knowledge of electronic components/devices and their failure modes and failure mechanisms, with demonstrated experience in design for reliability, failure analysis, and troubleshooting.
    • Experience creating and executing electronic subsystem reliability qualification requirements.
    • Direct experience developing, using, or executing stress tools or power viruses for computer/server system hardware reliability testing is a plus.
    • Knowledge of statistical & probability techniques, reliability modeling and experience using tools such as ReliaSoft & JMP statistical software packages.
    • Knowledge of industry standards such as IPC, JEDEC, Telcordia, and MIL-STD.
    • Basic understanding of mechanical drawings, CAD design tools, and tolerance analysis, as well as PCB stackup.
    • Working knowledge of thermal engineering, liquid cooling design and infrastructure is a plus.
    • Familiarity with programming languages such as SQL and Python is a plus.

    Reliability Engineering IC - The typical base pay range for this role across the U.S. is USD $112,000 - $218,400 per year.

    There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $145,800 - $238,600 per year.

    Certain roles may be eligible for benefits and other compensation.

    Find additional benefits and pay information here:

    Microsoft will accept applications for the role until May 5, 2024.

    Microsoft is an equal opportunity employer.

    All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances.

    We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.

    If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.


    Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

    #azurehwjobs #HIFE


    Responsibilities:

    • The successful candidate will be responsible for using Design for Reliability principles, such as DFMEA, accelerated life testing, and physics of failure, to ensure that the cloud hardware developed and delivered to Microsoft's datacenters meets specified use conditions and stresses, assuring its design intent.
    • Participate in system/component vendor selection activities, and drive system/component qualifications that are critical and strategic to Microsoft product requirements.
    • Create and up-level reliability engineering guidelines to improve product field performance through design enhancements to meet reliability goals.
    • Collaborate with other development functional teams and internal stakeholders regarding the application of Design for Reliability principles, including system thermal-mechanical, PCB stackup, etc., to ensure products meet customer expectations.
    • Define qualification plans for new product introduction (NPI), mass production, and multi-sourcing according to use conditions.
    • Develop reliability models that represent the expected environment and operational conditions.
    • Select, analyze, and interpret the results of various test methods used during product development.
    • Evaluate and drive effectiveness of the reliability stresses (operational and non-operational) to identify, debug, and resolve reliability issues related to products and components.
    • Identify, collect, analyze, and manage various types of data, including from fleet telemetry, to minimize failures and improve product performance, reliability, availability, and maintainability with strategic redundancy and spare planning.
    • Other
    • Embody our culture and values.


  • Quadrant Technologies Redmond, United States

    Role: Site Reliability Engineer · Location: Redmond, WA · Responsibilities include but are not limited to: · Monitor and maintain the Reliability, Availability, and Performance of the Cosmos DB service. · Design and implement Disaster Recovery and Business Continuity plans. · Col ...


  • WaferWire Cloud Technologies Redmond, United States

    WaferWire is currently seeking a Site Reliability Engineer to join its innovative team. This role involves implementing and maintaining robust DevOps practices within an Azure cloud environment to ensure smooth deployment and operation of services. The responsibilities include us ...


  • Space Exploration Technologies Corporation Redmond, United States

    The Equipment Reliability Engineer is responsible for providing engineering support on planned and unplanned repairs, modifications, and upgrades of production equipment in the Starlink programs in Redmond. Equipment Reliability Engineers are the pri ...


  • Space Exploration Technologies Redmond, United States

    SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today SpaceX is actively developing the technologies to make this possible, with the ultimate goal of enabling human life on Mars. ...


  • Space Exploration Technologies Corp. Redmond, United States

    SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today SpaceX is actively developing the technologies to make this possible, with the ultimate goal of enabling human life on Mars. ...


  • SPACE EXPLORATION TECHNOLOGIES CORP Redmond, United States

    HARDWARE RELIABILITY ENGINEER (STARLINK) · Starlink believes in providing fast, reliable internet to serve populations with little or no connectivity. We design, build, launch, and operate the world's largest constellation of satellites, enabling us to operate a global internet ...


  • SPACE EXPLORATION TECHNOLOGIES CORP Redmond, United States

    SITE RELIABILITY ENGINEER (STARSHIELD) · Starshield leverages SpaceX's Starlink technology and launch capability to support national security efforts. While Starlink is designed for consumer and commercial use, Starshield is designed for government use, with an initial focus on ...


  • WaferWire LLC Redmond, United States

    WaferWire is currently seeking a Site Reliability Engineer to join its innovative team. This role involves implementing and maintaining robust DevOps practices within an Azure cloud environment to ensure smooth deployment and operation of services. · The responsibilities include ...


  • Microsoft Corporation Redmond, United States

    Microsoft Silicon and Cloud Hardware Infrastructure Engineering (SCHIE) is the team behind Microsoft's expanding Cloud Infrastructure and responsible for powering Microsoft's "Intelligent Cloud" mission. SCHIE delivers the core infrastructure and foundational technologies for Microsof ...


  • Microsoft Redmond, United States Full time

    Overview · Do you have a passion for high scale services and working with some of Microsoft's most critical customers? We're looking for a Senior Site Reliability Engineer with the right mix of software development, on-line services experience and passion for quality to envision ...


  • Microsoft Corporation Redmond, United States

    Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the team behind Microsoft's expanding Cloud Infrastructure and responsible for powering Microsoft's "Intelligent Cloud" mission. SCHIE delivers the core infrastructure and foundational technologies for M ...


  • SpaceX Redmond, United States

    SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today SpaceX is actively developing the technologies to make this possible, with the ultimate goal of enabling human life on Mars. ...


  • SPACE EXPLORATION TECHNOLOGIES CORP Redmond, United States

    ELECTRICAL TEST AND RELIABILITY ENGINEER (STARLINK) · SpaceX is leveraging its experience in building rockets and spacecraft to deploy Starlink, the world's most advanced broadband internet system. Starlink is the world's largest satellite constellation and is providing fast, re ...


  • Cascade Engineering Services Redmond, United States

    JR. RELIABILITY TEST ENGINEER · BASIC QUALIFICATIONS: · •Associate degree in Electrical or Mechanical Engineering, or a related field. A bachelor's degree is preferred. · •2+ years of hands-on experience working in a reliability and/or manufacturing test environment. · •Excel ...


  • Microsoft Redmond, WA, United States

    Microsoft is a company where passionate innovators come to collaborate, envision what can be and take their careers further. This is a world of more possibilities, more innovation, more openness, and the sky is the limit thinking in a cloud-enabled world. Microsoft's Azure Data e ...


  • Space Exploration Technologies Corporation Redmond, United States

    Starshield leverages SpaceX's Starlink technology and launch capability to support national security efforts. While Starlink is designed for consumer and commercial use, Starshield is designed for government use, with an initial focus on earth observ ...


  • Quadrant Technologies Redmond, United States

    Responsibilities include but are not limited to: · Monitor and maintain the Reliability, Availability, and Performance of the Cosmos DB service. · Design and implement Disaster Recovery and Business Continuity plans. · Collaborate with engineering teams to build and enhance tooli ...

