
Inferact Jobs in United States
14 jobs at Inferact in United States
-
We're looking for a performance engineer to squeeze every FLOP out of modern accelerators. You'll write the kernels and low-level optimizations that make vLLM the fastest inference engine in the world. · Design and implement high-performance kernels for attention, GEMM, sampling, ...
San Francisco, CA · 1 month ago
-
We're looking for a cloud orchestration engineer to build the operational backbone that keeps vLLM running reliably at massive scale. · ...
San Francisco · 3 weeks ago
-
We're looking for an infrastructure engineer to build the distributed systems that power inference at global scale. · ...
San Francisco · 1 month ago
-
We're looking for a performance engineer to squeeze every FLOP out of modern accelerators. · You'll write kernels and low-level optimizations that make vLLM the fastest inference engine in the world. · Your code will run on hundreds of accelerator types from NVIDIA GPUs to emergi ...
San Francisco · 3 weeks ago
-
We're looking for an infrastructure engineer to build the distributed systems that power inference at global scale. · You'll design and implement the foundational layers that enable vLLM to serve models across thousands of accelerators with minimal latency and maximum reliability ...
San Francisco · 3 weeks ago
-
We are looking for an MLOps engineer to build the operational backbone that keeps vLLM running reliably at massive scale. · Design the systems for cluster management, deployment automation, and production monitoring. · Evaluate machine-level issues across diverse hardware configura ...
San Francisco, CA · 1 month ago
-
We're looking for an infrastructure engineer to build the distributed systems that power inference at global scale. · Bachelor's degree or equivalent experience in computer science, engineering, or similar. · Strong systems programming skills in Rust, Go, or C++. · Experience des ...
San Francisco, CA · 1 month ago
-
We're looking for an inference runtime engineer to push the boundaries of what's possible in LLM and diffusion model serving. · Bachelor's degree or equivalent experience in computer science. · Deep understanding of transformer architectures. · ...
San Francisco · 3 weeks ago
-
Inferact is looking for an infrastructure engineer to build the distributed systems that power inference at global scale. · ...
San Francisco, CA · 1 month ago
-
We're looking for a cloud orchestration engineer to build the operational backbone that keeps vLLM running reliably at massive scale. · ...
San Francisco · 1 month ago
-
Inferact's mission is to make vLLM the world's inference engine, and accelerate AI by building the systems to run it everywhere. · We're looking for an inference runtime engineer to push the boundaries of what's possible in LLM and diffusion model serving. · Bachelor's degree or ...
San Francisco, CA · 1 month ago
-
We're looking for a performance engineer to squeeze every FLOP out of modern accelerators. · You'll write the kernels and low-level optimizations that make vLLM the fastest inference engine in the world. · Your code will run on hundreds of accelerator types, from NVIDIA GPUs to e ...
San Francisco · 1 month ago
-
We're looking for an inference runtime engineer to push the boundaries of what's possible in LLM and diffusion model serving. · Optimizing how models execute across diverse hardware and architectures. · ...
San Francisco · 1 month ago
-
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. · This is a globally remote opportunity for exceptional generalist engineers who can work across the entire vLLM stack: from low-level GPU ke ...
United States · 1 month ago