Senior High Performance Computing Cluster Administrator - Santa Clara, United States - NVIDIA
Description
NVIDIA's Deep Learning Optimized Frameworks Group is looking for a deeply technical HPC cluster administrator to lead a diverse cluster of GPU-accelerated systems and provide architectural mentorship to product teams in the deep learning and scientific computing domains.
As a member of the DLFW Infrastructure team, you will provide leadership in the design and implementation of a groundbreaking GPU compute cluster that runs demanding deep learning, high-performance computing, and other computationally intensive workloads.
We are looking for an expert to identify architectural changes and/or completely innovative approaches for our GPU compute cluster. In this role, you will help us with the strategic challenges we encounter, including compute, networking, and storage design for large-scale, high-performance workloads and effective resource utilization in a heterogeneous compute environment.
What you'll be doing:
Administer Linux systems, ranging from powerful DGX servers to embedded systems, and from bring-up hardware to publicly available systems.
Coordinate Storage Solutions and plan for growth.
Automate configuration management, software updates, maintenance, and monitoring of system availability using modern DevOps tools (Ansible, GitLab, etc.).
Actively communicate with management about any problems with the equipment and propose resolutions.
Plan, build, and install or upgrade systems that support NVIDIA DL software.
What we need to see:
You have a BA, BS, or MS in CS, EE, CE or equivalent experience
4+ years of experience deploying and administering HPC clusters
Familiarity with resource scheduling managers (Slurm preferred, LSF, etc.)
Proven ability to script in Bash, Perl, or Python
Experience with containers (Docker, Singularity, LXC)
Deep understanding of operating systems, computer networks, and high-performance applications
Ability to work well with developers & test engineers
Hard-working dedication to providing quality support for your users
Ways to stand out from the crowd:
Familiarity and prior work experience with technologies such as Ansible, Git, Slurm, Zabbix, Prometheus, Grafana, and Docker
Familiarity with GPU usage in compute clusters and with CUDA
Experience with mobile and embedded systems
Basic knowledge of deep learning
Experience coding/scripting in Perl, Python, or Bash
The base salary range is 148,000 USD - 230,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer.
As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
NVIDIA is a Learning Machine
NVIDIA pioneered accelerated computing to tackle challenges no one else can solve. Our work in AI and the metaverse is transforming the world's largest industries and profoundly impacting society.
Learn more about NVIDIA.