Jobs

    Senior Infrastructure Engineer - California, United States - Sustainable Talent

    Sustainable Talent California, United States

    Found in: Appcast US C2 - 3 days ago

    Description

    Sustainable Talent is partnering with NVIDIA, a global leader that has been transforming computer graphics, PC gaming, and accelerated computing for over 25 years.

    We are looking for an HPC Cluster Engineer to support our client's GPU/HPC Infrastructure Team.

    This is a full-time W-2 contract based in Santa Clara, CA, with a hybrid work option. Pay is $90-$120/hr, depending on factors such as experience, education, and location, and the role includes full benefits, PTO, and an amazing company culture.

    As a member of the GPU/HPC Infrastructure team, you will provide leadership in the design and implementation of groundbreaking GPU compute clusters that run demanding deep learning, high-performance computing, and computationally intensive workloads. We seek an expert to identify architectural changes and/or completely new approaches for our GPU compute clusters. As an expert, you will help us with the strategic challenges we encounter, including compute, networking, and storage design for large-scale, high-performance workloads; effective resource utilization in a heterogeneous compute environment; evolving our private/public cloud strategy; capacity modeling; and growth planning across our global computing environment.

    What you'll be doing:

    • Building and improving our ecosystem around GPU-accelerated computing including developing large scale automation solutions
    • Maintaining and building deep learning clusters at scale
    • Supporting our researchers to run their flows on our clusters including performance analysis and optimizations of deep learning workflows
    • Performing root cause analysis and suggesting corrective action for problems at both large and small scales
    • Finding and fixing problems before they occur.

    What we need to see:

    • Bachelor's degree in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
    • Minimum 5 years of experience designing and operating large scale compute infrastructure.
    • Experience analyzing and tuning performance for a variety of HPC workloads.
    • Working knowledge of cluster configuration management tools such as Ansible, Puppet, or Salt.
    • Experience with HPC cluster job schedulers such as SLURM or LSF.
    • In-depth understanding of container technologies like Docker, Singularity, Shifter, and Charliecloud.
    • Proficiency in CentOS/RHEL and/or Ubuntu Linux distributions, as well as Python programming and Bash scripting.
    • Experience with HPC workflows that use MPI.

    Ways to stand out from the crowd:

    • Understanding of MLPerf benchmarking
    • Familiarity with InfiniBand with IBOP and RDMA
    • Understanding of fast, distributed storage systems like Lustre and GPFS for HPC workloads.
    • Background with Software Defined Networking and HPC cluster networking
    • Familiarity with deep learning frameworks like PyTorch and TensorFlow.

    Sustainable Talent is an M/F+, disabled, and veteran equal employment opportunity and affirmative action employer.


  • Intelletec

    Machine Learning Infrastructure Engineer

    Found in: Appcast US C2 - 6 hours ago


    Intelletec California, United States

    ML Systems Engineer - Lead Role · Join our team as the Lead ML Systems Engineer and drive the development of cutting-edge machine learning systems for video foundation (VFM) and language model (VLM) in production. In this role, you'll lead a talented team, set technical strategie ...

  • Sigmaways Inc

    Senior Infrastructure Automation Engineer

    Found in: Appcast US C2 - 2 hours ago


    Sigmaways Inc California, United States

    We are seeking a Senior Infrastructure Automation Engineer for our direct client with expertise in developing Infrastructure as Code using Terraform, AWS, and CI/CD pipelines · Responsibilities: · In this role, you will get an opportunity to broadly apply your engineering skills acro ...

  • Woodard & Curran

    Water Infrastructure Project Engineer

    Found in: Appcast Linkedin GBL C2 - 2 days ago


    Woodard & Curran California, United States

    Woodard & Curran is a national engineering, science, and operations firm with a simple vision for clean water, a safe environment, healthy communities, and happy people. As an employee-owned company, we strive to cultivate diverse teams and encourage collaboration in an equitable ...

  • Woodard & Curran

    Water Infrastructure Project Engineer

    Found in: Talent US C2 - 2 days ago


    Woodard & Curran California, United States Full time

    Woodard & Curran is a national engineering, science, and operations firm with a simple vision for clean water, a safe environment, healthy communities, and happy people. As an employee-owned company, we strive to cultivate diverse teams and encourage collaboration in an equitable ...

  • Stealth Startup

    Machine Learning Infrastructure Engineer

    Found in: Appcast US C2 - 3 days ago


    Stealth Startup California, United States

    About Us · We're building a co-pilot for hardware designers. Our mission is to enable 9M mechanical engineers to iterate through designs 1000x faster. · We are building our geometry + physics driven foundation model for each class of part design · We've raised a first round of c ...

  • developrec

    Site Reliability Engineering Manager

    Found in: Appcast US C2 - 2 days ago


    developrec California, United States

    SRE Lead/Manager | San Diego, CA | Full-time · Role Overview: As the Engineering Manager for Site Reliability, you'll lead the charge in transitioning to cloud-based solutions while ensuring the stability of our existing systems for our rapidly growing user base, currently standi ...

  • Harvey Nash

    Head of Information Technology

    Found in: Appcast US C2 - 2 days ago


    Harvey Nash California, United States

    Job Title: Head of IT Infrastructure Technology · Location: Pier 400, in Los Angeles, CA · Perm/FTE Role · This position is based in Pier 400, in Los Angeles, CA and it's 100% onsite. · US citizens and Green Card Holders and those authorized to work in the US are encouraged to app ...

  • People Source Consulting

    Data Engineer

    Found in: Appcast US C2 - 2 days ago


    People Source Consulting California, United States

    Would you be interested in a data engineering role at a fast-paced AI (LLM) start-up comprised of Meta, Google, AWS, and Microsoft alumni? · You can expect to: · Curate and manage large-scale data ingestion and indexing pipelines, ensuring data quality and error handling. · Desi ...

  • Acceler8 Talent

    Senior Machine Learning Engineer

    Found in: Appcast US C2 - 4 days ago


    Acceler8 Talent California, United States

    About Us: · Your journey as a Senior ML Engineer is not just about engineering; it's about pioneering. You will spearhead the development of novel software systems designed to empower data scientists and engineers across the spectrum. Your mission will be to diagnose and rectify ...

  • Cyber Spring

    Senior Software Engineer

    Found in: Appcast US C2 - 2 days ago


    Cyber Spring California, United States

    I am currently working with a Seed-Stage AI business developing AI & Cloud-Based Security technologies and improving their field with unprecedented quality led by serial entrepreneurs and experts in the AI & Robotics space. · Working closely with the CTO, my client are looking fo ...

  • Borneo

    Principal SRE

    Found in: Appcast US C2 - 2 days ago


    Borneo California, United States

    Overview: · Borneo is seeking a skilled, experienced, and hands-on Principal Engineer to drive innovation and contribute to our mission of transforming data security and privacy. As the Principal Engineer, you will be a driving force in shaping the technical strategy and architec ...

  • Quantum Search Partners

    Principal Cloud Product Security Architect

    Found in: Appcast US C2 - 1 day ago


    Quantum Search Partners California, United States

    A Quantum Search Partners client ($90B+ revenue global leader in electronics, media, & entertainment) is seeking a Principal Cloud Product Security Architect. This person will work cross-functionally with teams across R&D, product development, product security information securit ...

  • Storm2

    Data Team Lead

    Found in: Appcast US C2 - 2 hours ago


    Storm2 California, United States

    Founding Data Lead - Permanent · AI-powered Web3 Security platform · Up to $200k + Stock Options · US San Francisco Bay Area / hybrid, open to other commutable areas · Our key client is a US AI-powered Web3 Security platform, which has had 5 rounds of funding and raised millions o ...

  • MWH

    Resident Engineer

    Found in: MyJobHelper US C2 - 4 days ago


    MWH, CA, United States

    MWH Constructors (MWH), a global leader in heavy civil construction of water and wastewater facilities, is currently seeking a Resident Engineer to join our construction management services (CMS) group in support of critical infrastructure construction work in Southern California ...

  • MWH

    Resident Engineer

    Found in: MyJobHelper US C2 - 4 days ago


    MWH, CA, United States

    MWH Constructors, a global leader in heavy civil construction of water and wastewater facilities, is currently seeking a Resident Engineer to join our construction management services (CMS) group in support of critical infrastructure construction work in Southern California. · Th ...

  • Stealth Startup

    Senior Network Engineer

    Found in: Appcast US C2 - 2 days ago


    Stealth Startup California, United States

    Title: Sr. Network Engineer. · Location: US Remote (Must be in US Pacific Time Zone) · Looking for a Senior Network Engineer who will be responsible for managing routing, switching, VPN, and firewall infrastructure at our office locations and in the cloud. · Duties: · Manage, mo ...

  • Dar Group

    Resident Engineer

    Found in: MyJobHelper US C2 - 4 days ago


    Dar Group, CA, United States

    TYLin is a globally recognized, full-service infrastructure consulting firm committed to providing innovative, cost-effective, constructible designs for the global infrastructure market. With over 3,000 employees throughout the Americas, Asia, and Europe, the firm provides suppor ...

  • Databricks

    Research Scientist

    Found in: MyJobHelper US C2 - 4 days ago


    Databricks, CA, United States

    P-1131 · At Databricks, we are obsessed with enabling data teams to solve the world's toughest problems. We do this by building and running the world's best data and AI infrastructure platform, so our customers can focus on the high value challenges that are central to their own ...

  • Plexus Resource Solutions

    DevOps Engineer

    Found in: Appcast US C2 - 2 days ago


    Plexus Resource Solutions California, United States

    Plexus is working with a leading zero-knowledge proof-based Layer 1 blockchain. They are seeking an experienced DevOps Engineer to join their dynamic infrastructure team. · The ideal candidate will have a strong background in AWS cloud infrastructure, adept at managing and optimi ...