Senior Engineer, GPU Infrastructure - West Palm Beach, United States - Vultr
Description
Join Vultr
The Engineering team is a central pillar of our growth strategy, and we are looking for a Principal Engineer, GPU Infrastructure to help build and support our GPU-based product offerings.
You and your team will have ownership over the setup and provisioning of our GPU-based systems and help drive engineering and operational excellence around our GPU infrastructure.
Our team's mission is to provide a fast, performant, and stable infrastructure for all of our customers.
What to expect:
Develop and maintain GPU infrastructure in bare metal and containerized environments
Work directly with our networking team to build scalable and supportable GPU clusters
Deliver an excellent customer experience through consistent and reliable provisioning of GPU infrastructure
Build and maintain test automation for GPU-based products to ensure fast and reliable provisioning
Implement and maintain GPU-based solutions to meet the needs of diverse applications and computational workloads
Conduct in-depth benchmarking, performance testing, and troubleshooting of GPU systems to identify and resolve any hardware or software limitations
Work with vendors to obtain all supported drivers and packages
Work with vendors on bugs, performance-related issues, hardware problems, and reference architectures
Address any hardware, software, or performance issues promptly, coordinating with vendors, technical support, and internal teams as required
Our ideal candidate will have:
Hands-on experience working with current, high-performance GPUs and related technologies, primarily NVIDIA products (e.g. NVLink, InfiniBand, GRID drivers, vGPU, and NVAIE)
In-depth, hands-on experience working with and automating bare metal internals, including BIOS, BMC, firmware, NICs, Redfish/IPMI, and PCIe
Experience with Linux, package management, and device drivers
Experience with commercial firmware
Experience with Python, Bash, and PHP
Experience with Machine Learning software
Compensation
$120,000 - $135,000
This salary can vary based on location, years of experience, background and skill set.