No more applications are being accepted for this job
- CUDA (Compute Unified Device Architecture) / OpenCL (Open Computing Language)
- Bachelor's or higher degree in Computer Science, Electrical Engineering, or a related field
- 10+ years of relevant systems engineering experience
- Proven experience in GPU architecture design and GPU performance optimization.
- Expertise in operating system integration for Linux.
- Strong understanding of computer hardware architecture, particularly as it relates to Linux systems.
- Knowledge of parallel computing, graphics algorithms, and real-time rendering in Linux environments.
- Familiarity with GPU debugging tools and profiling software for Linux.
- Excellent problem-solving skills and the ability to collaborate within a team.
- Strong communication skills for conveying technical information in a Linux context.
- Proficiency with scripting languages such as Python or Bash.
- Proficiency with automation tools such as Ansible, Puppet, Salt, Terraform, etc.
- Candidate must, at a minimum, meet DoD IAT Level II certification requirements (currently Security+ CE, CCNA-Security, GICSP, GSEC, or SSCP along with an appropriate computing environment (CE) certification)
- Published research or contributions in the GPU industry, especially related to Linux.
- Experience with machine learning and neural network frameworks on GPUs in Linux.
- Knowledge of GPU virtualization, cloud computing, and emerging Linux-based technologies in the field.
- Proficiency in GPU-specific programming languages, such as CUDA or OpenCL.
- Experience with container technologies (Docker, Kubernetes)
- Experience with Prometheus/Grafana for monitoring
- Knowledge of distributed resource scheduling systems [Slurm (preferred), LSF, etc.]
- Familiarity with CUDA and managing GPU-accelerated computing systems
- Basic knowledge of deep learning frameworks and algorithms
Linux Server GPU Engineer - Bethesda, MD, United States - Xcelerate Solutions
Description
Linux Server/NVIDIA Admin/GPU Engineer - TS/SCI
Xcelerate Solutions is seeking a Linux Server GPU Engineer to support the National Media Exploitation Center (NMEC).
This role requires an individual who has technical experience administering NVIDIA DGX-1 and A100 servers within a physical and virtual environment.
This individual should be detail-oriented in order to capture customer inquiries appropriately.
This role is responsible for interacting with administrators to handle service inquiries and problems.
Duties include examining customer problems and implementing appropriate corrective action to initiate a repair or return to service.
This role analyzes recurring problems, initiates solutions to prevent recurrence, and analyzes existing infrastructure for tuning and performance enhancements.
The individual will provide systems and software operations and maintenance support in a large, multi-enclave enterprise environment.
This individual will work in a team environment to ensure that mission needs are met and that customer capabilities remain functional.
Individuals in this role may be required to perform technical software configuration, rebooting, and other remedial actions on customer servers.
The customer uses an Agile framework to plan and successfully complete all initiatives.
The work location is in Bethesda at the Intelligence Community Campus.
Security Clearance: TS/SCI
Location: Bethesda, MD
Responsibilities:
GPU Architecture and Design:
Collaborate with a multidisciplinary team to define, develop, and optimize GPU architectures, ensuring they meet stringent performance, power efficiency, and feature requirements.
Leverage industry insights to drive design decisions.
Ensure that GPU designs and integrations are not only optimized for Linux but are also adaptable to other operating systems.
Operating System Integration:
Work closely with operating system developers to ensure smooth GPU integration with Linux-based systems.
Optimize GPU drivers for compatibility, performance, and reliability in a Linux environment.
Provide regular maintenance and updates to ensure continued compatibility.
Hardware Expertise:
Contribute to the design and development of GPU hardware, providing insights into hardware architecture to ensure efficient interaction with software components.
Maintain and update hardware designs as needed.
Programming:
Develop and optimize applications using CUDA or OpenCL, harnessing the full potential of GPU hardware for parallel processing, high-performance computing, and machine learning on Linux platforms.
Maintain and update software for optimal performance.
Performance Analysis:
Analyze GPU performance, identify bottlenecks, and develop strategies to enhance performance across various applications in Linux, addressing both hardware and software considerations.
Regularly monitor and improve performance.
GPU Tooling:
Create and maintain debugging tools, profiling utilities, and performance analysis software tailored for Linux systems to facilitate efficient GPU development and troubleshooting.
Keep tools up to date and functional.
Power Efficiency:
Work on power management techniques to optimize GPU power consumption, ensuring efficient operation on both mobile and desktop Linux platforms.
Continuously assess and enhance power efficiency strategies.
Testing and Validation:
Design and execute tests to validate GPU performance and functionality on Linux, including stress testing, benchmarking, and debugging to ensure robust operation.
Maintain and expand the testing suite.
Documentation:
Maintain comprehensive technical documentation, including architectural specifications, code documentation, and Linux-specific best practices for GPU development.
Keep documentation up to date with changes and improvements.
Industry Insight:
Stay updated on the latest trends, innovations, and competitive landscapes within the GPU industry, contributing to research efforts and proposing Linux-specific approaches to GPU design and optimization.
Share regular updates and insights with the team.
Minimum Requirement
Preferred Qualification
About Xcelerate Solutions:
Founded in 2009 and headquartered in McLean, VA, Xcelerate Solutions is one of America's fastest-growing companies.
Xcelerate's culture is defined by our diversified workforce of dynamic and versatile professionals, supported with growth and development opportunities that contribute to individual and company growth.
This strong commitment to our employees has been recognized by our inclusion on the Washington Business Journal's "50 Best Places to Work" list as well as being a "Great Place to Work" certified company with a 4.6-star Glassdoor rating and 99% CEO approval.
Come find out why Xcelerate Solutions is one of the DC Metro top employers.
Xcelerate Solutions is an Equal Employment Opportunity/Affirmative Action Employer.
We evaluate qualified applicants without regard to race, color, national origin, religion, age, equal pay, disability, veteran status, sex, sexual orientation, gender identity, genetic information, or expression of another protected characteristic.
As part of this commitment to the full inclusion of all qualified individuals, Xcelerate provides reasonable accommodations if needed because of an applicant's or an employee's disability.
Pay Transparency Notice:
Xcelerate Solutions will not discharge or in any other manner discriminate against employees or applicants because they have inquired about, discussed, or disclosed their own pay or the pay of another employee or applicant.