Member of Technical Staff, Inference - San Francisco

San Francisco, United States

6 days ago

Full time $200,000 - $400,000 (USD)

Job summary

We're looking for an inference runtime engineer to push the boundaries of what's possible in LLM and diffusion model serving.

Skills and Qualifications

  • Bachelor's degree in computer science, or equivalent experience.
  • Deep understanding of transformer architectures.

