Solutions Architect, Inference Deployments - Santa Clara

Santa Clara, United States

1 day ago

$152,000 - $241,500 (USD)
We're forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA's GPU technology and Kubernetes. · As a Solutions Architect focused on inference, you'll collaborate closely with our engineering, DevOps, and customers to develop ent ...

Similar jobs

  • We're forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA's GPU technology and Kubernetes. As a Solutions Architect (Inference Focus), you'll collaborate closely with our engineering, DevOps, and customer success teams to fos ...

    Santa Clara $152,000 - $218,500 (USD)

    1 week ago

  • Solutions Architect, Inference Deployments

    We're forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA's GPU technology and Kubernetes. · Help customers craft, deploy, and maintain scalable GPU-accelerated inference pipelines on Kubernetes for large language models (LLMs) an ...

    Santa Clara, CA

    1 week ago

  • Solutions Architect, Inference Deployments

    We're forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA's GPU technology and Kubernetes. As a Solutions Architect focused on inference, you'll collaborate closely with our engineering, DevOps, and customers to develop enter ...

    Santa Clara $152,000 - $241,500 (USD) Full time

    1 day ago

  • Solutions Architect, Inference Deployments

    We're forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA's GPU technology and Kubernetes. As a Solutions Architect (Inference Focus), you'll collaborate closely with our engineering, DevOps, and customer success teams to fos ...

    US, CA, Santa Clara $152,000 - $218,500 (USD) per year

    1 week ago

  • Solutions Architect, Inference Deployments

    We're forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA's GPU technology and Kubernetes. As a Solutions Architect focused on inference, you'll collaborate closely with our engineering, DevOps, and customers to develop enter ...

    US, CA, Santa Clara

    6 hours ago

  • Deployment Engineer, AI Inference

    We are seeking a highly skilled Deployment Engineer to build and operate our cutting-edge inference clusters. · These clusters would provide the candidate an opportunity to work with the world's largest computer chip, the Wafer-Scale Engine (WSE), and the systems that harness its ...

    Sunnyvale

    1 month ago

  • Product Manager MBA Intern, AI Platform Inference - Summer 2026

    We are seeking an intern for our Product Management organization at NVIDIA. As a Product Manager MBA Intern for AI Platform Inference, you will be responsible for building tools, SDKs, and libraries that enable developers' inference deployments to thrive on NVIDIA GPUs. · ...

    Santa Clara $27 - $82 (USD)

    1 month ago

  • Product Manager MBA Intern, AI Platform Inference - Summer 2026

    As NVIDIA Product Managers, our goal is to enable developers to be successful on the NVIDIA Platform and push the boundaries of what is possible with their AI deployments. For Inference, we are the champions inside NVIDIA for AI developers looking to accelerate their deployments on ...

    Santa Clara $27 - $82 (USD) Internship

    1 month ago

  • Principal Software Engineer

    About · NVIDIA Dynamo is an innovative, open-source platform focused on efficient, scalable inference for large language and reasoning models in distributed GPU environments. By bringing to bear sophisticated techniques in serving architecture, GPU resource management, and intell ...

    Santa Clara $272,000 - $431,250 (USD)

    4 days ago

  • Principal Software Engineer

    NVIDIA Dynamo is an innovative, open-source platform focused on efficient, scalable inference for large language and reasoning models in distributed GPU environments. By bringing to bear sophisticated techniques in serving architecture, GPU resource management, and intelligent re ...

    Santa Clara $272,000 - $425,500 (USD)

    1 month ago

  • Senior Deep Learning Software Engineer

    We are looking for a Senior Deep Learning Software Engineer to design and build our automated inference and deployment solution. · Leverage and build upon the torch 2.0 ecosystem (TorchDynamo, torch.compile, etc.) to analyze and extract standardized model graph representation ...

    Santa Clara $224,000 - $356,500 (USD)

    4 weeks ago

  • Principal Software Engineer

    NVIDIA Dynamo is an innovative platform focused on efficient inference for large language models in distributed GPU environments. · ...

    Santa Clara $272,000 - $425,500 (USD) Full time

    1 month ago

  • Principal Software Engineer

    NVIDIA Dynamo is an innovative platform for efficient inference of large language and reasoning models in distributed GPU environments. · We're searching for engineers to build the next generation of scalable AI systems as a Principal Software Engineer on the Dynamo project. · Ad ...

    Santa Clara $272,000 - $431,250 (USD) Full time

    1 month ago

  • Senior Inference Technical Product Marketing Manager

    We are looking for a Senior Technical Product Marketing Manager to join our rapidly growing data center business. You will be focused on working with engineering to understand the technical capabilities of our inference stack from GPUs, CPUs, networking, CUDA libraries, model ...

    Santa Clara $148,000 - $287,500 (USD)

    1 month ago

  • NVIDIA is at the forefront of the generative AI revolution. The Algorithmic Model Optimization Team specifically focuses on optimizing generative AI models such as large language models (LLMs) and diffusion models for maximal inference efficiency using techniques ranging from neural ...

    Santa Clara $224,000 - $356,500 (USD)

    1 month ago

  • We are now looking for a Senior Deep Learning Software Engineer, LLM Performance. NVIDIA is seeking an experienced Deep Learning Engineer passionate about analyzing and improving the performance of LLM inference. NVIDIA is rapidly growing our research and development for Deep Learn ...

    Santa Clara $184,000 - $356,500 (USD)

    1 week ago

  • AI Software Applications Engineer

    We are seeking individuals passionate about tackling challenges and driven by execution to join our team as an AI Software Applications Engineer. · ...

    Santa Clara $180,000 - $300,000 (USD)

    1 week ago

  • The Algorithmic Model Optimization Team specifically focuses on optimizing generative AI models such as large language models (LLM) and diffusion models for maximal inference efficiency using techniques ranging from neural architecture search and pruning to sparsity, quantization ...

    Santa Clara $152,000 - $287,500 (USD)

    1 month ago

  • Senior Software Engineer

    We are seeking a Senior Software Engineer to join our Software Infrastructure Team in Santa Clara, CA. This team is at the heart of the NVIDIA AI Factory initiative, building and maintaining the core infrastructure that powers our closed and open source AI models. · What You'll Be ...

    Santa Clara $200,000 - $322,000 (USD)

    2 weeks ago

  • NVIDIA is at the forefront of the generative AI revolution. The Algorithmic Model Optimization Team specifically focuses on optimizing generative AI models such as large language models (LLM) and diffusion models for maximal inference efficiency using techniques ranging from neura ...

    Santa Clara $152,000 - $287,500 (USD)

    3 days ago