Work in company

Inference Frontend

Only for registered members

Cerebras Systems builds the world's largest AI chip. Our novel wafer-scale architecture provides AI compute power of dozens of GPUs on a single chip. · ...

Sunnyvale Full time

1 month ago

Work in company

Solutions Architect, Inference Deployments, Solutions Architect, Inference Deployments

Only for registered members

We're forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA's GPU technology and Kubernetes. As a Solutions Architect (Inference Focus), you'll collaborate closely with our engineering, DevOps, and customer success teams to fos ...

Santa Clara $152,000 - $218,500 (USD)

2 weeks ago

Work in company Remote job

Inference Engineer

Only for registered members

We're looking for engineers who can bridge the gap between ML research and high-performance inference. · JAX / Equinox / Pallas stack · Rust systems programming with a focus on developer experience · ...

San Francisco Bay Area

1 month ago

Work in company

AI Inference Engineer

Only for registered members

What you can expect · We are looking for an AI Inference Engineer with a solid background in speech recognition and model inference. In this role, you will develop state-of-the-art automatic speech recognition system and ship it to various Zoom products. You will work on the mos ...

San Jose (CA) $90,000 - $170,000 (USD) per year

1 day ago

Work in company

Solutions Architect, Inference Deployments

Only for registered members

We're forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA's GPU technology and Kubernetes. As a Solutions Architect focused on inference, you'll collaborate closely with our engineering, DevOps, and customers to develop enter ...

Santa Clara $152,000 - $241,500 (USD) Full time

1 week ago

Work in company

Principal Engineer Inference Stack

Only for registered members

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. · Develop techniques for optimizing scale-up and scale-out inference. · Develop methods and tooling to utilize dynam ...

Santa Clara, CA

1 month ago

Work in company

AI Inference Engineer

Only for registered members

What You Can Expect · We are looking for an AI Inference Engineer with a solid background in speech recognition and model inference. In this role, you will develop state-of-the-art automatic speech recognition system and ship it to various Zoom products. You will work on the most ...

San Jose, CA $90,000 - $170,000 (USD) per year

1 week ago

Work in company

Head of Inference Kernels

Only for registered members

As a core member of the team, you will play a pivotal role in leading a high-performing team to build optimized kernels and implement highly optimized inference stacks for state-of-the-art transformer models. · Architect Best-in-Class Inference Performance on Sohu: Deliver contin ...

San Jose, CA

1 month ago

Work in company

AI Inference Engineer

Only for registered members

+ Develop state-of-the-art automatic speech recognition system for various Zoom products · + Work on cutting edge speech modeling and inference technologies with world-class speech scientists · + Collaborate with cross-functional teams to deliver high-impact projects from ground ...

San Jose $151,800 - $332,200 (USD) Full time

3 weeks ago

Work in company

Solutions Architect, Inference Deployments

Only for registered members

We're forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA's GPU technology and Kubernetes. As a Solutions Architect focused on inference, you'll collaborate closely with our engineering, DevOps, and customers to develop enter ...

Santa Clara $152,000 - $241,500 (USD)

1 week ago

Work in company

Solutions Architect, Inference Deployments

Only for registered members

We're forming a team of innovators to roll out and enhance AI inference solutions at scale demonstrating NVIDIA's GPU technology and Kubernetes. · Help customers craft deploy and maintain scalable GPU-accelerated inference pipelines on Kubernetes for large language models LLMs an ...

Santa Clara, CA

3 weeks ago

Work in company

AI Inference Engineer

Only for registered members

We are looking for an AI Inference Engineer with a solid background in speech recognition and model inference. · In this role, you will develop state-of-the-art automatic speech recognition system and ship it to various Zoom products. You will work on the most cutting edge speech ...

San Jose, CA

1 month ago

Work in company

Principal Engineer Inference Stack

Only for registered members

We are looking for a strategic software engineering lead who is passionate about improving the performance of key applications and benchmarks.Join us as we shape the future of AI and beyond. · ...

Santa Clara

1 month ago

Work in company

AI Inference Engineer

Only for registered members

+Job summary · We are looking for an AI Inference Engineer with a solid background in speech recognition and model inference.Developing state-of-the-art speech services for Zoom products. · Optimizing ASR inference systems for production deployment. · +We are developing speech re ...

San Jose $151,800 - $332,200 (USD) Full time

1 month ago

Work in company

Solutions Architect, Inference Deployments

Only for registered members

We're forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA's GPU technology and Kubernetes. As a Solutions Architect (Inference Focus), you'll collaborate closely with our engineering, DevOps, and customer success teams to fos ...

US, CA, Santa Clara $152,000 - $218,500 (USD) per year

1 day ago

Work in company

Inference Software Engineer

Only for registered members

About Etched · Etched is building the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like ...

San Jose $100,000 - $120,000 (USD) per year

1 week ago

Work in company

AI Inference Engineer

Only for registered members

What you can expect · We are looking for an AI Inference Engineer with a solid background in speech recognition and model inference. In this role, you will develop state-of-the-art automatic speech recognition system and ship it to various Zoom products. You will work on the mos ...

San Jose $151,800 - $332,200 (USD) Full time

1 day ago

Work in company

Head of Inference Kernels

Only for registered members

About Etched · Etched is building the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like ...

San Jose

1 week ago

Work in company

AI Inference Engineer

Only for registered members

+We are looking for an AI Inference Engineer with a solid background in speech recognition and model inference. · +Developing state-of-the-art speech services for Zoom products. · Optimizing ASR inference systems for production deployment. · + ...

San Jose, CA

1 month ago

Work in company

Manager, Large Language Model Inference, Manager, Large Language Model Inference

Only for registered members

At NVIDIA, we aren't just powering the AI revolution—we're accelerating it. The TensorRT inference platform is the backbone of modern AI, delivering the industry's fastest and most efficient deployment of cutting-edge deep learning models on every NVIDIA GPU. With demand for AI e ...

Santa Clara $184,000 - $356,500 (USD)

2 weeks ago

Work in company

Inference Software Engineer

Only for registered members

About Etched · Etched is building the world's first AI inference system purpose-built for transformers - delivering over 10x higher performance and dramatically lower cost and latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like ...

San Jose, CA

1 week ago

Senior GenAI Algorithms Engineer — Model Optimizations for Inference - US, CA, Santa Clara

Job description

Similar jobs

Inference Frontend

Solutions Architect, Inference Deployments, Solutions Architect, Inference Deployments

Inference Engineer

AI Inference Engineer

Solutions Architect, Inference Deployments

Principal Engineer Inference Stack

AI Inference Engineer

Head of Inference Kernels

AI Inference Engineer

Solutions Architect, Inference Deployments

Solutions Architect, Inference Deployments

AI Inference Engineer

Principal Engineer Inference Stack

AI Inference Engineer

Solutions Architect, Inference Deployments

Inference Software Engineer

AI Inference Engineer

Head of Inference Kernels

AI Inference Engineer

Manager, Large Language Model Inference, Manager, Large Language Model Inference

Inference Software Engineer

Directory

for Recruiters

Information