Principal Software Engineer – PyTorch Training Frameworks - San Jose

Only for registered members San Jose, United States

2 weeks ago

Job summary

The Role: AMD is looking for a Principal-level PyTorch training framework expert to help drive performance, scalability, and correctness of large-scale AI training on AMD Instinct accelerators. You will work at the intersection of PyTorch internals, distributed training, and hardware-aware optimization.



Similar jobs

  • Only for registered members San Jose, CA

    At AMD, Principal-level PyTorch training framework expert to help drive performance, scalability, and correctness of large-scale AI training on AMD Instinct accelerators. · ...

  • Only for registered members Santa Clara $224,000 - $356,500 (USD)

    We are looking for engineers to design, develop, and optimize diverse real-world workloads. · NVIDIA is an open-source scalable and cloud-native framework built for researchers and developers working on Large Language Models (LLM) and Multimodal (MM) foundation model pretraining an ...

  • Only for registered members Santa Clara Full time $184,000 - $356,500 (USD)

    NVIDIA is looking for engineers for our NeMo Framework team to design, develop, and optimize diverse real-world workloads. · Develop algorithms for AI/DL, data analytics, machine learning, or scientific computing · Solve large-scale end-to-end AI training and inference challenges spa ...

  • Only for registered members US, CA, Santa Clara

    NVIDIA is looking for engineers for our NeMo Framework team to design, develop, and optimize diverse real-world workloads. Our GenAI Frameworks provide end-to-end model training... You will collaborate with internal partners... · What you'll be doing: Develop algorithms... · ...

  • Only for registered members Santa Clara, CA

    NVIDIA is looking for engineers to design and develop diverse real world workloads using the NeMo Framework. · ...

  • Only for registered members Palo Alto, CA

    The reasoning infrastructure team builds an end-to-end RL training framework to enable pretraining-scale RL. In this role you might: design and implement state-of-the-art distributed RL systems; profile, debug, and optimize system performance; software and algorithm co-design with en ...

  • Only for registered members Palo Alto $180,000 - $440,000 (USD)

    xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. · Design and implement state-of-the-art distributed RL systems · Profile, debug, and optimize system performance · Software and algorithm co-design with ...

  • Only for registered members Palo Alto, CA

    xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. · Design and implement state-of-the-art distributed RL systems · ...

  • Only for registered members Sunnyvale

    Invented for Life drives us at Bosch and our vision of future mobility. Autonomous vehicles will change the way we move and at Bosch we are working on making this future a reality. · Define and drive execution of the technical roadmap and strategy for the E2E AI machinery, includ ...

  • Only for registered members Sunnyvale, CA

    We are growing our team to solve some of the hardest automated driving problems. As the Senior Principal Engineer, E2E AI Training Framework for Autonomous Driving Systems, you will spearhead the development and optimization of the machinery that efficiently trains, optimizes t ...

  • OCTOPYD San Jose

    Job Description: · As an Applied Scientist at Octopyd you will tackle real-world scientific and coding challenges—developing novel reinforcement-learning algorithms and physics-informed models. · Bridge research and production by collaborating closely with ML Engineers and infras ...

  • Only for registered members San Jose

    We are seeking a highly experienced Senior Staff Engineer to lead and drive test development, performance benchmarking, and validation for AI/ML frameworks, AI models, and AI agent-based systems on GPU platforms. ...

  • Only for registered members San Jose

    We are an AI platform engineering group focused on large-scale model training systems and performance acceleration. The team builds distributed training infrastructure and optimization technologies for next-generation generative AI and computer vision models. · ...

  • Only for registered members San Jose

    We are looking for a Principal Machine Learning Engineer to join our Models and Applications team. If you are excited by the challenge of distributed training of large models on a large number of GPUs, and if you are passionate about improving training efficiency while innovating ...

  • Only for registered members San Jose, CA

    Drive the performance of post-training workloads on AMD Instinct GPUs. · Lead performance for finetuning and RL training solutions on AMD GPUs. · Improve throughput, memory efficiency, and stability across data, model, and optimizer steps. · ...

  • Only for registered members San Jose

    We are looking for a Principal Machine Learning Engineer to join our Models and Applications team. · Train large models to convergence on AMD GPUs at scale. · Improve the end-to-end training pipeline performance. · ...

  • Only for registered members San Jose, CA

    We are looking for an AI/ML engineer to join our SaaS Engineering team at Nutanix. You will design, develop and deploy production-scale machine learning solutions for our dynamic education platform. · Participate in ML sprint planning. · Design, develop and deploy machine learnin ...

  • Only for registered members San Jose

    The AI system optimization team at AMD is looking for a specialized Principal-level engineer who is passionate about enabling innovative and efficient Generative AI training/inferencing at scale. You will be part of a core team of incredibly talented specialists and work on scalin ...

  • Only for registered members San Jose Full time

    Develops and optimizes machine learning models for various applications. · ...

  • Only for registered members San Jose, CA

    We are looking for a world class AI frameworks engineer who can provide technical leadership in the development of various AI frameworks in the AMD ecosystem. · ...