Software Development Engineer, AI/ML, AWS Neuron, Model Inference - Cupertino
2 weeks ago

Job summary
This role will help lead the effort to build distributed inference support for PyTorch in the Neuron SDK. You will tune models to ensure the highest performance and maximize their efficiency when running on customers' AWS Trainium and Inferentia silicon servers.
- Design, develop, and optimize machine learning models and frameworks for deployment on custom ML hardware accelerators.
- Participate in all stages of the ML system development lifecycle: distributed-computing-based architecture design, implementation, performance profiling, hardware-specific optimization, testing, and production deployment.