Senior Platform Engineer, Model Serving - San Francisco - Sciforium

    Sciforium
    Sciforium San Francisco

    1 day ago

    Description

    A leading AI infrastructure company in San Francisco is seeking a Senior Technical Leader to architect and develop a cutting-edge model serving platform.


    To be considered for an interview, please make sure your application is fully in line with the job specs found below.

    This role involves hands-on development of core components and mentoring other engineers.


    The ideal candidate has over 5 years of experience in scalable backend systems, strong skills in C++ and Python, and a collaborative mindset.

    Benefits include health insurance and flexible time off.

  • Only for registered members San Francisco

    Sesame believes in a future where computers are lifelike - with the ability to see, hear, and collaborate with us in ways that feel natural and human. · ...

  • Only for registered members San Francisco

    Turbocharge our serving layer consisting of various LLM, speech, and vision models. · Partner with ML infrastructure and training engineers to build a fast, accurate, and reliable serving layer to power a new consumer product category. · Modify and extend LLM serving frameworks like vLLM, SGL ...

  • Only for registered members San Francisco, CA

    Sesame believes in a future where computers are lifelike - with the ability to see, hear, and collaborate with us in ways that feel natural and human. · We're designing a new kind of computer, focused on making voice companions part of our daily lives. · Join us in shaping a futu ...

  • Only for registered members San Francisco, California

    We are passionate about enabling data teams to solve the world's toughest problems — from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. · ...

  • Menlo Ventures San Francisco

    We are passionate about enabling data teams to solve the world's toughest problems — from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. · ...

  • Only for registered members San Francisco $192,000 - $260,000 (USD)

    We are passionate about enabling data teams to solve the world's toughest problems, from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. · We do this by building and running the world's best data and AI infrastructure p ...

  • Only for registered members San Francisco, California

    At Databricks, we are passionate about enabling data teams to solve the world's toughest problems, from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. We build and run the world's best platform for ...

  • Only for registered members San Francisco

    We do this by building and running the world's best data and AI infrastructure platform so our customers can use deep data insights to improve their business. · Databricks' Model Serving product provides enterprises with a unified, scalable, and governed platform to deploy and ma ...

  • Only for registered members San Francisco, California

    We are passionate about enabling data teams to solve the world's toughest problems — from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. · Foundation Model Serving is the API Product for hosting and serving frontier AI m ...

  • Only for registered members San Francisco

    Job summary · This is a rare chance to help architect and lead the development of Sciforium's next-generation model serving platform, the high-performance engine that will bring a multimodal, highly efficient foundation model to market. As a senior technical leader, you'll not on ...

  • Only for registered members San Francisco, CA

    We are passionate about enabling data teams to solve the world's toughest problems by building and running the world's best data and AI infrastructure platform. · Design and implement core systems and APIs that power Databricks Foundation Model Serving, ensuring scalability, reliability, and operational ex ...

  • Only for registered members San Francisco, CA

    A well-funded AI infrastructure company is building a next-generation model serving platform to power real-time, multimodal foundation models at scale. · ...

  • Only for registered members San Francisco $180,000 - $300,000 (USD)

    You'll be the bridge between research breakthroughs and production reality. This isn't about maintaining existing APIs; it's about taking models fresh from research, optimizing them for inference, wrapping them in robust serving infrastructure, and shipping demos that show the worl ...

  • Only for registered members San Francisco, California

    We are passionate about enabling data teams to solve the world's toughest problems — from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. · Databricks' Model Serving product provides enterprises with a unified, scalable, ...

  • Only for registered members San Francisco

    We’re redefining the architecture of intelligence itself at Liquid. Our mission is to build efficient AI systems at every scale. · We believe great talent powers great technology. · ...

  • Only for registered members San Francisco, CA

    Inferact's mission is to make vLLM the world's inference engine, and accelerate AI by building the systems to run it everywhere. · We're looking for an inference runtime engineer to push the boundaries of what's possible in LLM and diffusion model serving. · Bachelor's degree or ...

  • Only for registered members San Francisco

    Help us make inference blazingly fast. If you love squeezing every last drop of performance out of GPUs, diving deep into CUDA kernels, and turning optimization techniques into production systems, we'd love to meet you. · Implement and productionize optimization techniques incl ...

  • Only for registered members San Francisco

    GPU inference optimization · vLLM, SGLang, or TensorRT-LLM experience · Distributed compute with GPUs is a super plus · Deploy and tune models with optimizations like KV caching, paged attention, sequence packing, etc. · Conducting model performance reviews · Improve scheduler, batcher, autoscal ...

  • Only for registered members San Francisco

    We are looking for a Senior Software Engineer to join our team. The ideal candidate will have experience in ML systems, inference optimization, and GPU programming. · ...

  • Only for registered members San Francisco Full time $220,000 - $320,000 (USD)

    We are hiring a Senior Software Engineer to join our team in San Francisco. The ideal candidate will have experience in ML systems, inference optimization, and GPU programming. They will be responsible for making our inference stack as fast and efficient as possible. · The role r ...
