Manager, Large Language Model Inference - Santa Clara

Santa Clara, United States

4 days ago

$184,000 - $356,500 (USD)
At NVIDIA, we aren't just powering the AI revolution—we're accelerating it. The TensorRT inference platform is the backbone of modern AI, delivering the industry's fastest and most efficient deployment of cutting-edge deep learning models on every NVIDIA GPU. · With demand for AI ...

Similar jobs

  • Manager, Large Language Model Inference

    NVIDIA is accelerating the AI revolution by developing cutting-edge deep learning models on every GPU. · We're seeking a highly skilled Engineering Manager to lead in developing the next generation of LLM inference software technologies. · Your work will be collaborative, interfa ...

    Santa Clara $184,000 - $356,500 (USD)

    1 month ago

  • Manager, Large Language Model Inference

    We're seeking a highly skilled and driven Engineering Manager to take the lead in developing the next generation of LLM/VLM/VLA inference software technologies that will define the future of AI. At NVIDIA, we aren't just powering the AI revolution—we're accelerating it. · The Ten ...

    Santa Clara, CA

    3 weeks ago

  • Manager, Large Language Model Inference

    At NVIDIA, we aren't just powering the AI revolution—we're accelerating it. The TensorRT inference platform is the backbone of modern AI, delivering the industry's fastest and most efficient deployment of cutting-edge deep learning models on every NVIDIA GPU. With demand for AI e ...

    US, CA, Santa Clara $184,000 - $287,500 (USD) per year

    4 days ago

  • NVIDIA is at the forefront of the generative AI revolution. The Algorithmic Model Optimization Team specifically focuses on optimizing generative AI models such as large language models (LLMs) and diffusion models for maximal inference efficiency using techniques ranging from quan ...

    Santa Clara

    1 month ago

  • Multimodal Model Training and Inference Optimization Engineer

    We are seeking an experienced Multimodal Model Training and Inference Optimization Engineer with expertise in optimizing AI model training and inference, including distributed training/inference and acceleration. The ideal candidate will work at the cutting edge of AI efficiency ...

    San Jose, CA

    1 week ago

  • NVIDIA is at the forefront of the generative AI revolution. The Algorithmic Model Optimization Team specifically focuses on optimizing generative AI models such as large language models (LLMs) and diffusion models for maximal inference efficiency using techniques ranging from quant ...

    US, CA, Santa Clara $152,000 - $287,500 (USD) per year

    14 hours ago

  • We are seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative AI models like LLMs, VLMs, multimodal and diffusion models. · ...

    Santa Clara, CA

    3 weeks ago

  • Multimodal Model Training and Inference Optimization Engineer

    The Vision-Applied Research team focuses on applied research in Generative AI and CV/Multimodal Understanding, and delivering intelligent solutions to ByteDance products. · Optimize large model training pipelines to improve efficiency, speed, and scalability. · Benchmark and prof ...

    San Jose $136,800 - $359,720 (USD)

    1 week ago

  • Multimodal Model Training and Inference Optimization Engineer

    We are seeking an experienced Multimodal Model Training and Inference Optimization Engineer with expertise in optimizing AI model training and inference, including distributed training/inference and acceleration. The ideal candidate will work at the cutting edge of AI efficiency, ...

    San Jose, CA

    1 week ago

  • Multimodal Model Training and Inference Optimization Engineer

    We are seeking an experienced Multimodal Model Training and Inference Optimization Engineer with expertise in optimizing AI model training and inference. · Optimize large model training pipelines to improve efficiency, speed, and scalability. · Develop distributed training strategies s ...

    San Jose $136,800 - $359,720 (USD)

    1 week ago

  • Bilingual Large Model Inference Acceleration Engineer

    We are seeking an experienced AI model optimization engineer specializing in large model inference acceleration. · Design and optimize large model inference pipelines for low-latency and high-throughput production deployments · Benchmark and profile deep learning models to identi ...

    San Francisco Bay Area

    2 weeks ago

  • NVIDIA is at the forefront of the generative AI revolution. The Algorithmic Model Optimization Team specifically focuses on optimizing generative AI models such as large language models (LLMs) and diffusion models for maximal inference efficiency using techniques ranging from neura ...

    Santa Clara $224,000 - $356,500 (USD) Full time

    1 month ago

  • The Algorithmic Model Optimization Team specifically focuses on optimizing generative AI models such as large language models (LLMs) and diffusion models for maximal inference efficiency using techniques ranging from neural architecture search and pruning to sparsity, quantization ...

    Santa Clara $152,000 - $287,500 (USD)

    3 weeks ago

  • NVIDIA is at the forefront of the generative AI revolution. The Algorithmic Model Optimization Team specifically focuses on optimizing generative AI models such as large language models (LLMs) and diffusion models for maximal inference efficiency using techniques ranging from neural ...

    Santa Clara $224,000 - $356,500 (USD)

    1 month ago

  • We are now looking for a Senior Deep Learning Software Engineer to develop and scale up our automated inference and deployment solution. · As part of the team, you will be instrumental in pushing the limits of inference efficiency and large-scale, automated deployment. · Increasi ...

    Santa Clara $152,000 - $287,500 (USD) Full time

    3 weeks ago

  • NVIDIA is at the forefront of the generative AI revolution. The Algorithmic Model Optimization Team specifically focuses on optimizing generative AI models such as large language models (LLMs) and diffusion models for maximal inference efficiency using techniques ranging from neura ...

    US, CA, Santa Clara $184,000 - $356,500 (USD) per year

    14 hours ago

  • We are seeking an experienced Multimodal Model Training Optimization Engineer with expertise in optimizing AI model training and inference, including distributed training/inference. · Optimize large model training pipelines to improve efficiency, speed, and scalability. · Benchmark and ...

    San Jose, CA

    1 week ago

  • We are seeking an experienced Multimodal Model Training and Inference Optimization Engineer with expertise in optimizing AI model training and inference, including distributed training/inference and acceleration. · The ideal candidate will work at the cutting edge of AI efficienc ...

    San Jose $208,800 - $438,000 (USD)

    1 week ago

  • We are seeking an experienced Multimodal Model Training and Inference Optimization Engineer with expertise in optimizing AI model training and inference, including distributed training/inference and acceleration. The ideal candidate will work at the cutting edge of AI efficiency, ...

    San Jose $208,800 - $438,000 (USD)

    1 week ago

  • We are seeking an experienced Multimodal Model Training and Inference Optimization Engineer with expertise in optimizing AI model training and inference. · Optimize large model training pipelines to improve efficiency, speed, and scalability. · Benchmark and profile deep learning ...

    San Jose, CA

    1 week ago