
    Staff HPC Engineer - Mountain View, United States - ASRC Federal Holding Company

    Job Description

    ASRC Federal InuTeq provides High Performance Computing services throughout the HPC lifecycle for computational requirements, architecture, acquisition, and operations to federal government customers. Our employees embrace innovation and are committed to a culture of continuous, standards-driven process improvement, and assimilation of industry best practices. We are seeking to fill a role that primarily provides development for Supercomputing Batch Scheduling with Supercomputing Systems Administration secondary support for our NASA NACS High Performance Computing (HPC) contract.

    Summary: The successful candidate will be an active supporting member of the ASRC Federal team reporting directly to the Manager of the Application Performance and Productivity (APP) group and matrixed directly to the Supercomputing Systems Team Manager.

    An individual at this skill level should have demonstrated extensive experience working with common HPC batch schedulers (e.g., PBS, Slurm, or Moab/Torque) while supporting users of HPC resources on the various issues they might have getting applications to run efficiently. This individual should demonstrate experience installing, maintaining, and upgrading HPC systems. The individual, along with the entire HPC team, will be engaged in the day-to-day operations and support of the HPC resources. Activities may include system patching, OS upgrades, deploying new systems, writing scripts, and troubleshooting system issues on the HPC system. The ability to interact with users to determine symptoms, and then reproduce their issues to isolate the causes, is a critical skill for this work. There will also be activities in testing, benchmarking, user tool scripting, and analyzing trouble tickets to find patterns indicating system or user education issues.

    Duties and Responsibilities:
    • Designs, deploys and maintains HPC clusters of 2000+ nodes with InfiniBand and 100+ petabytes of data storage in production.
    • Writes and shepherds scalable feature designs through the entire software development process, from requirements and use cases to release.
    • Designs and develops scripts for system administration, monitoring and usage reporting.
    • Modifies existing software to correct errors and/or improve performance.
    • Designs and develops scripts for system regression and performance testing (file systems (Lustre), scheduler (PBS), interconnect (HDR/NDR, Slingshot), high availability, etc.).
    • Troubleshoots, isolates and resolves application, system and other technical problems (hardware, software, and network).
    • Understands research use cases, researches and deploys new technologies, defining cost, performance and other trade-offs.
    • Manages and maintains tools for configuration management (HPCM, Ansible, Git), resource management, scheduling and all necessary aspects of HPC in accordance with best practices.
    • Researches, deploys and manages networking and security infrastructure, including development of policies and procedures.
    • Assists in developing and writing proposals and publications.
    • Creates and provides clear documentation.
    • Mentors junior staff and cross-trains peers.
    • Provides after-hours/weekend support as required.
    • Moderate Supercomputing System Administration that contributes to:
      • Day-to-day operations of the Linux HPC clusters and storage systems
      • Proactive monitoring, analysis, and correction of system issues
      • Development of scripts to automate repetitive tasks or tools to enhance support of the HPC systems
      • System performance analysis and tuning
      • Building, installing, and supporting user-requested software
      • Supporting evaluation and assessment of new HPC technology
      • Resolving user-reported issues and managing support ticket requests in Remedy
    Requirements:
    • Bachelor's degree in computer science or related field
    • Strong computer science background with in-depth systems-level knowledge in operating systems and networking
    • A minimum of 5 years of experience administering HPC systems and scheduling software (PBS, Slurm, or Moab/Torque)
    • A minimum of 5 years of experience in systems programming in heterogeneous, multi-platform HPC environments
    • Strong ability to analyze, debug and maintain the integrity of an existing code base
    • Demonstrated equivalent of 5 years of Linux/UNIX user support experience and hands-on experience with administration of Linux systems
    • Experience working with HPC applications and proficiency in at least C, C++, or Fortran
    • Superior scripting skills and excellent attention to detail; proficiency in at least Python, Perl, or Bash
    • Strong ability to interact with customers to understand needs, elicit requirements, and get feedback on prototype solutions
    • Excellent communication and people skills; excellent time management and organizational skills
    • Experience with system configuration management tools (e.g., Puppet, Chef, Ansible)
    • Experience with revision control software (e.g., CVS, SVN, Git)
    • Track record of delivering commercial-quality software on schedule through multiple release cycles
    • Proficiency at technical writing
    Preferred Skills:
    • Proficiency with analysis and problem-solving skills for debugging and optimization of applications
    • Familiarity/proficiency with OpenMP and Message Passing Interface (MPI) programming
    • Experience with Lustre and InfiniBand
    • Experience with cloud technologies (AWS, Azure, GCP), OpenStack or Kubernetes is a plus
    EEO Statement

    ASRC Federal and its Subsidiaries are Equal Opportunity / Affirmative Action employers. All qualified applicants will receive consideration for employment without regard to race, gender, color, age, sexual orientation, gender identification, national origin, religion, marital status, ancestry, citizenship, disability, protected veteran status, or any other factor prohibited by applicable law.

  • ASRC Federal Holding Company

    Staff HPC Engineer

    2 weeks ago


    ASRC Federal Holding Company Mountain View, United States Full time

    Job Title · Staff HPC Engineer · Location · NASA/AMES, MOFFETT FIELD-CA026 · Job Description · ASRC Federal InuTeq provides High Performance Computing services throughout the HPC lifecycle for computational requirements, architecture, acquisition, and operations to federal government c ...

  • ASRC Federal Holding Company

    Staff HPC Engineer

    2 weeks ago


    ASRC Federal Holding Company Mountain View, United States Full time

    Job Title · Staff HPC Engineer · Location · NASA/AMES, MOFFETT FIELD-CA026 · Job Description · ASRC Federal is searching for a Staff HPC Engineer to support Inuteq LLC out of NASA AMES, CA · ASRC Federal InuTeq provides High Performance Computing services throughout the HPC lifecycle f ...

  • ASRC Federal Holding Company

    Senior HPC Engineer

    4 weeks ago


    ASRC Federal Holding Company Mountain View, United States Full time

    Job Title · Senior HPC Engineer · Location · NASA/AMES, MOFFETT FIELD-CA026 · Job Description · ASRC Federal InuTeq provides High Performance Computing services throughout the HPC lifecycle for computational requirements, architecture, acquisition, and operations to federal government ...

  • ASRC Federal Holding Company

    Senior HPC Engineer

    2 weeks ago


    ASRC Federal Holding Company Mountain View, United States Full time

    Job Title · Senior HPC Engineer · Location · NASA/AMES, MOFFETT FIELD-CA026 · Job Description · ASRC Federal is searching for a Senior HPC Engineer to support Inuteq LLC which this role is fully telework · ASRC Federal InuTeq provides High Performance Computing services throughout the ...

  • ASRC Federal Holding Company

    Staff HPC Engineer

    1 week ago


    ASRC Federal Holding Company Mountain View, United States

    Job Description · ASRC Federal is searching for a Staff HPC Engineer to support Inuteq LLC out of NASA AMES, CA · ASRC Federal InuTeq provides High Performance Computing services throughout the HPC lifecycle for computational requirements, architecture, acquisition, and operati ...

  • Randstad

    Staff hpc engineer

    8 hours ago


    Randstad Mountain View, United States

    job summary: · Randstad Federal is seeking a Staff HPC Engineer for a role supporting NASA · location: Mountain View, California · job type: Contract · salary: $ per hour · work hours: 8am to 4pm · education: Bachelors · responsibilities: · Duties and Responsibilities: · Design ...

  • ASRC Federal Holding Company, LLC

    Staff HPC Engineer

    4 weeks ago


    ASRC Federal Holding Company, LLC Mountain View, United States

    The successful candidate will be an active supporting member of the ASRC Federal team reporting directly to the Manager of the Application Performance and Productivity (APP) group and matrixed directly to the Supercomputing Systems Team Manager. An i Staff, Engineer, Computer Sci ...

  • Randstad North America, Inc.

    Staff HPC Engineer

    4 days ago


    Randstad North America, Inc. Mountain View, United States

    Designs, deploys and maintains HPC clusters with over 2000 nodes with InfiniBand, 100 petabytes of data storage in production. · Write and shepherd scalable feature designs through the entire software development process, from requirements Staff, Engineer, Computer Science, ...

  • ASRC Federal Holding Company

    Senior HPC Engineer

    1 week ago


    ASRC Federal Holding Company Mountain View, United States

    Job Description · ASRC Federal is searching for a Senior HPC Engineer to support Inuteq LLC which this role is fully telework · ASRC Federal InuTeq provides High Performance Computing services throughout the HPC lifecycle for computational requirements, architecture, acquisition ...

  • Randstad

    staff hpc engineer

    6 days ago


    Randstad Mountain View, United States

    staff hpc engineer. · mountain view , california · posted 1 day ago · job details · summary · $60 - $70 per hour · temp to perm · bachelor degree · category computer and mathematical occupations · reference · job details · job summary: · Randstad Federal is seeking a Staff ...

  • ASRC Federal Holding Company

    Senior HPC Engineer

    1 week ago


    ASRC Federal Holding Company Mountain View, United States

    Job Description · ASRC Federal InuTeq provides High Performance Computing services throughout the HPC lifecycle for computational requirements, architecture, acquisition, and operations to federal government customers. Our employees embrace innovation and are committed to a cult ...

  • ASRC Federal Holding Company

    Senior HPC Engineer

    4 weeks ago


    ASRC Federal Holding Company Mountain View, United States

    Job Description · ASRC Federal InuTeq provides High Performance Computing services throughout the HPC lifecycle for computational requirements, architecture, acquisition, and operations to federal government customers. Our employees embrace innovation and are committed to a cult ...


  • 1000 KLA Corporation Milpitas, United States Full time

    Description · /Preferred Qualifications Responsibilities for this exciting role will include: · Design, implementation & support of high-performance compute clusters · Solid knowledge on HPC systems, including CPU/GPU architecture, scalable/robust storage, high-bandwidth inte ...


  • Sustainable Talent Santa Clara, United States

    Are you ready to make your mark in the forefront of technological innovation? As an · HPC Cluster Engineer , you'll play a pivotal role in shaping the future of AI, deep learning, and machine learning initiatives. Join us and leverage Nvidia's cutting-edge GPU technology to driv ...


  • NVIDIA Santa Clara, United States Full time

    Salary 180, ,250 USD per year · Requirements: · - Bachelor's degree in Computer Science, Electrical Engineering, or related field, or equivalent experience. · - 8+ years of experience designing and operating large scale storage infrastructure. · - Experience analyzing and tuning ...

  • Sustainable Talent

    HPC Cluster Engineer

    4 weeks ago


    Sustainable Talent Santa Clara, United States

    Are you ready to make your mark in the forefront of technological innovation? As an HPC Cluster Engineer, you'll play a pivotal role in shaping the future of AI, deep learning, and machine learning initiatives. Join us and leverage Nvidia's cutting-edge GPU technology to drive gr ...


  • Guardant Health Palo Alto, United States

    Job Description · Company Description · Guardant Health is a leading precision oncology company focused on helping conquer cancer globally through use of its proprietary tests, vast data sets and advanced analytics. The Guardant Health oncology platform leverages c ...


  • TECHFUJI LLC Cupertino, United States

    Job Description · We are looking for a Senior Systems Developer with expertise in AWS, HPC Job Schedulers (PBS), Python, DevOps, Linux Administration, FlexLM, and Managing SQL and NoSQL on AWS. · Job Responsibilities · Designing and implementing the next generation ...


  • NVIDIA Santa Clara, United States

    NVIDIA has continuously reinvented itself over two decades. Our invention of the GPU in 1999 sparked the growth of the PC gaming market, redefined modern computer graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI - the next era of ...


  • ASML San Jose, United States

    Job ID: J · Introduction to the job · The hands-on job of a software engineer for HPC platform is responsible for the design, review and collaboration with computation infrastructure team for a future proof cloud and virtual compute platform with optimization on both in-house a ...