Senior Runtime Engineer - Sunnyvale - CEREBRAS SYSTEMS INC.

    CEREBRAS SYSTEMS INC. Sunnyvale

    3 days ago

    Description

    Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.
    Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras, to deploy 750 megawatts of scale, transforming key workloads with ultra high-speed inference.
    Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order of magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.
    About The Role
    We are building the next generation of large-scale AI systems that power training and inference workloads at unprecedented scale and efficiency.
    You will design and develop high-performance distributed software that orchestrates massive compute and data pipelines across heterogeneous clusters. Your work will push the limits of concurrency, throughput, and scalability, enabling efficient execution of models at massive scale. This role sits at the intersection of systems engineering and machine learning performance, demanding both architectural depth and low-level implementation skills. You will help shape how models are executed and optimized end-to-end, from data ingestion to distributed execution, across cutting-edge hardware platforms.
    We're hiring for runtime roles across both Training and Inference.
    Responsibilities

    • Design and implement distributed runtime components to efficiently manage large-scale execution workloads.
    • Develop and optimize high-performance data and communication pipelines that fully utilize CPU, memory, storage, and network resources.
    • Enable scalable execution across multiple compute nodes, ensuring high concurrency and minimal bottlenecks.
    • Collaborate closely with ML and compiler teams to integrate new model architectures, training regimes, and hardware-specific optimizations.
    • Diagnose and resolve complex performance issues across the software stack using profiling and instrumentation tools.
    • Contribute to overall system design, architecture reviews, and roadmap planning for large-scale AI workloads.
    Skills & Qualifications
    • 3+ years of experience developing high-performance or distributed system software.
    • Strong programming skills in C/C++, with expertise in multi-threading, memory management, and performance optimization.
    • Experience with distributed systems, networking, or inter-process communication.
    • Solid understanding of data structures, concurrency, and system-level resource management (CPU, I/O, and memory).
    • Proven ability to debug, profile, and optimize code across scales, from threads to clusters.
    • Bachelor's, Master's, or equivalent experience in Computer Science, Electrical Engineering, or related field.
    Preferred Skills & Qualifications
    • Familiarity with machine learning training or inference pipelines, especially distributed training and large-model scaling.
    • Exposure to Python and PyTorch, particularly in the context of model training or performance tuning.
    • Experience with compiler internals, custom hardware interfaces, or low-level protocol design.
    • Prior work on high-performance clusters, HPC systems, or custom hardware/software co-design.
    • Deep curiosity about how to unlock new levels of performance for large-scale AI workloads.
    This offer is contingent upon Cerebras successfully obtaining an export license from the U.S. Department of Commerce's Bureau of Industry and Security authorizing the release to you of certain software source code and/or technology that is subject to the Export Administration Regulations. However, we can make no assurances with respect to the final disposition of an export license application.
    Why Join Cerebras
    People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we've reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:
    • Build a breakthrough AI platform beyond the constraints of the GPU.
    • Publish and open source their cutting-edge AI research.
    • Work on one of the fastest AI supercomputers in the world.
    • Enjoy job stability with startup vitality.
    • Work in a simple, non-corporate culture that respects individual beliefs.
    Read our blog: Five Reasons to Join Cerebras in 2026.
    Apply today and join the forefront of groundbreaking advancements in AI.
    Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.
