- Design, implement, and maintain high-performance computing (HPC) infrastructure on both AWS cloud and on-premises platforms.
- Manage HPC clusters on AWS cloud using AWS ParallelCluster and all related AWS services including Amazon EC2, AWS CloudFormation, Amazon FSx, and Amazon EFS.
- Implement and optimize the use of the Slurm cluster management software for efficient HPC job scheduling and management.
- Collaborate with researchers and faculty to understand their scientific computing and machine learning (ML) needs and provide tailored solutions.
- Actively seek to understand the latest AI research computing requirements and plan infrastructure upgrades to keep up with evolving trends.
- Provide end users with training, scripting assistance, software installation, and technical troubleshooting.
- Document use cases, reusable patterns, and technical guidelines.
- Ensure quality outcomes through best practices in security, infrastructure as code, streamlined release processes, and thorough testing and validation.
- 3+ years of experience in Linux administration.
- 2+ years as an HPC Engineer, including HPC cluster user support and troubleshooting.
- 1+ year of AWS cloud infrastructure experience with AWS services used for managing HPC clusters including AWS ParallelCluster, EC2, CloudFormation, FSx, and EFS.
- Experience with Slurm cluster management software.
- Scripting experience with Python or Bash, as well as related tools such as Ansible and Git.
- Knowledge of scientific computing and machine learning.
- Experience working with researchers within an academic, research, or scientific institution.
- Experience with specialized computing including GPU utilization, parallelization, and DevOps aspects such as containerization and automation.
- Knowledge of scientific data, bioinformatics packages, big data analysis methods, and machine learning algorithms.
- AWS Certified Solutions Architect certification.
HPC Engineer - Atlanta, United States - Brooksource
Description
HPC Engineer (HPC and AWS Environment)
100% Remote (9AM-5PM EST Work Hours)
Direct Hire (Full-Time Employment)
We are hiring a High-Performance Computing (HPC) Engineer with experience working in a hybrid on-premises HPC and AWS cloud environment. As an HPC Engineer, you will join an innovative HPC team responsible for configuring, integrating, and managing HPC clusters on AWS cloud for our prestigious client, a private research university based in Atlanta, GA.
You will play a pivotal role in supporting the university's hybrid on-prem HPC infrastructure and AWS cloud-based HPC, while continually expanding and integrating HPC clusters with AWS services to meet the growing scientific computing needs of its researchers. This work allows researchers to run computationally intensive workloads more quickly and securely, particularly in the multidisciplinary field of Artificial Intelligence (AI).
Key Responsibilities:
Minimum Requirements:
Preferred Qualifications: