Inferact

Inferact Jobs in United States

14 jobs at Inferact in United States

  • We're looking for a performance engineer to squeeze every FLOP out of modern accelerators. You'll write the kernels and low-level optimizations that make vLLM the fastest inference engine in the world. · Design and implement high-performance kernels for attention, GEMM, sampling, ...

    San Francisco, CA

    1 month ago

  • We're looking for a cloud orchestration engineer to build the operational backbone that keeps vLLM running reliably at massive scale. · ...

    San Francisco

    3 weeks ago

  • We're looking for an infrastructure engineer to build the distributed systems that power inference at global scale. · ...

    San Francisco

    1 month ago

  • We're looking for a performance engineer to squeeze every FLOP out of modern accelerators. · You'll write kernels and low-level optimizations that make vLLM the fastest inference engine in the world. · Your code will run on hundreds of accelerator types from NVIDIA GPUs to emergi ...

    San Francisco

    3 weeks ago

  • We're looking for an infrastructure engineer to build the distributed systems that power inference at global scale. · You'll design and implement the foundational layers that enable vLLM to serve models across thousands of accelerators with minimal latency and maximum reliability ...

    San Francisco

    3 weeks ago

  • We're looking for an MLOps engineer to build the operational backbone that keeps vLLM running reliably at massive scale. · Design the systems for cluster management, deployment automation, and production monitoring. · Evaluate machine-level issues across diverse hardware configura ...

    San Francisco, CA

    1 month ago

  • We're looking for an infrastructure engineer to build the distributed systems that power inference at global scale. · Bachelor's degree or equivalent experience in computer science, engineering, or similar. · Strong systems programming skills in Rust, Go, or C++. · Experience des ...

    San Francisco, CA

    1 month ago

  • We're looking for an inference runtime engineer to push the boundaries of what's possible in LLM and diffusion model serving. · Bachelor's degree or equivalent experience in computer science. · Deep understanding of transformer architectures. · ...

    San Francisco

    3 weeks ago

  • Inferact is looking for an infrastructure engineer to build the distributed systems that power inference at global scale. · ...

    San Francisco, CA

    1 month ago

  • We're looking for a cloud orchestration engineer to build the operational backbone that keeps vLLM running reliably at massive scale. · ...

    San Francisco

    1 month ago

  • Inferact's mission is to make vLLM the world's inference engine, and accelerate AI by building the systems to run it everywhere. · We're looking for an inference runtime engineer to push the boundaries of what's possible in LLM and diffusion model serving. · Bachelor's degree or ...

    San Francisco, CA

    1 month ago

  • We're looking for a performance engineer to squeeze every FLOP out of modern accelerators. · You'll write the kernels and low-level optimizations that make vLLM the fastest inference engine in the world. · Your code will run on hundreds of accelerator types, from NVIDIA GPUs to e ...

    San Francisco

    1 month ago

  • We're looking for an inference runtime engineer to push the boundaries of what's possible in LLM and diffusion model serving. · Optimizing how models execute across diverse hardware and architectures. · ...

    San Francisco

    1 month ago

  • Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. · This is a globally remote opportunity for exceptional generalist engineers who can work across the entire vLLM stack: from low-level GPU ke ...

    United States

    1 month ago