
    Site Reliability Engineer, HPC Infrastructure and Platforms - Salt Lake City, United States - Battelle Applied Solutions, LLC

    Description
    Requisition Id 11979


    Overview:


    The National Center for Computational Sciences (NCCS) at Oak Ridge National Laboratory (ORNL) hosts several of the world's most powerful computer systems. NCCS is seeking highly qualified individuals to play a key role in improving the security, performance, and reliability of its computing infrastructure, which supports multiple highly ranked Top500 supercomputers, including Frontier, the first exaflop supercomputer.


    The Team:


    As a Site Reliability Engineer, you will work within the HPC Infrastructure and Platforms group to support all activities of our supercomputer center.

    Our primary platform is the OLCF Slate Service, built on Kubernetes and Red Hat OpenShift. It provides a container orchestration service for critical operational applications and user-managed persistent applications that run alongside our OLCF supercomputer systems and other OLCF-supported HPC clusters.


    Major Duties/Responsibilities:
    Improve the reliability, scalability, and quality of our Kubernetes and Linux-based applications and services.
    Define and implement critical metrics and processes, and drive continuous improvement.
    Capture and analyze metrics to assist in tuning operating systems and applications.
    Diagnose system operational problems quickly and effectively.
    Participate in an on-call rotation that provides 24-hour, 7-day support and covers off-hours maintenance windows.
    Coordinate with vendors to resolve hardware and software problems.

    Deliver ORNL's mission by aligning behaviors, priorities, and interactions with our core values of Impact, Integrity, Teamwork, Safety, and Service.

    Promote diversity, equity, inclusion, and accessibility by fostering a respectful workplace - in how we treat one another, work together, and measure success.


    Basic Qualifications:


    Bachelor's Degree in computer science or closely related field and a minimum of 5 years of experience as an SRE/Systems Engineer.

    An equivalent combination of education and experience may be considered.


    Preferred Qualifications:
    Excellent interpersonal/communication skills, and the ability to work as part of a team.
    Strong working knowledge of Unix system fundamentals and common network protocols.
    Experience managing Linux/UNIX operating systems in a heterogeneous environment.
    Proven understanding of networked computing environment concepts.

    Ability to develop and maintain programs and scripts that aid in operations and automation, using shell (primarily Bash) and high-level languages (Python or Go).

    Ability to proactively identify performance issues, problems, and areas for improvement.
    Experience with continuous integration and continuous deployment software methodologies and how they apply to SRE/systems engineering.
    Understanding of code review and familiarity with tools like GitHub and GitLab.
    Experience using tools such as Nagios, Grafana, and Prometheus to monitor systems and metrics and to create dashboards.
    Experience implementing systems/services using virtual machines and Kubernetes resources.
    Experience deploying and maintaining automated configuration management software such as Puppet or Ansible.
    Experience implementing systems-level security technologies like SELinux and following best security practices.


    Special Requirement:
    This position requires the ability to obtain and maintain a clearance from the Department of Energy.

    As such, this position is a Workplace Substance Abuse Program (WSAP) testing designated position, which requires passing a pre-placement drug test and participating in an ongoing random drug testing program in which employees are subject to random selection for testing.

    The occupant of this position will also be subject to an ongoing requirement to report to ORNL any drug-related arrest or conviction or receipt of a positive drug test result.


    This position will remain open for a minimum of 5 days, after which it will close when a qualified candidate is identified and/or hired.


    We accept Word (.doc, .docx), Adobe (unsecured .pdf), Rich Text Format (.rtf), and HTML (.htm, .html) up to 5MB in size.

    Resumes from third party vendors will not be accepted; these resumes will be deleted and the candidates submitted will not be considered for employment.

    If you have trouble applying for a position, please email

    ORNL is an equal opportunity employer. All qualified applicants, including individuals with disabilities and protected veterans, are encouraged to apply. UT-Battelle is an E-Verify employer.

