Staff Software Engineer, Training - San Francisco Bay Area
1 month ago

Job description
Similar jobs
We are building a unified, modular runtime that meets researchers where they are and moves with them up the scaling curve. · Success for us is measured by raising both training throughput (how fast models train) and researcher throughput (how fast ideas become experiments and pro ...
1 month ago
About the Team · Training Runtime designs the core distributed machine-learning training runtime that powers everything from early research experiments to frontier-scale model runs. With a dual mandate to accelerate researchers and enable frontier scale, we're building a unified, ...
2 days ago
Sciforium is an AI infrastructure company developing next-generation multimodal AI models and a proprietary, high-efficiency serving platform. Backed by multi-million-dollar funding and direct sponsorship from AMD, with hands-on support from AMD engineers, the team is scaling rapid ...
6 days ago
About The Team · Training Runtime designs the core distributed machine-learning training runtime that powers everything from early research experiments to frontier-scale model runs. With a dual mandate to accelerate researchers and enable frontier scale, we're building a unified, ...
5 days ago
The Sora team is working on making video a key capability of OpenAI's foundation models. As a Distributed Systems/ML engineer, you will work on improving the training throughput for our internal training framework and enable researchers to experiment with new ideas. Collaborate with ...
1 month ago
About the Team · The Sora team is working on making video a key capability of OpenAI's foundation models. We are a hybrid research and product team that seeks to understand and expand the capabilities of our video models, while ensuring their reliability and safety. We accomplish ...
2 days ago
The Sora team is working on making video a key capability of OpenAI's foundation models. As a Distributed Systems/ML engineer, you will work on improving the training throughput for our internal training framework and enable researchers to experiment with new ideas. ...
1 month ago
About the Team · Training Runtime designs the core distributed machine-learning training runtime that powers everything from early research experiments to frontier-scale model runs. With a dual mandate to accelerate researchers and enable frontier scale, we're building a unified, ...
2 days ago
Job summary · We are seeking a Research Engineer to join our Pre-training team, responsible for developing the next generation of large language models. · Responsibilities · Conduct research and implement solutions in areas such as model architecture, data processing, ...
1 month ago
Drive down wall-clock time to convergence by profiling and eliminating bottlenecks across the foundation model training stack, from data pipelines to GPU kernels. Design, build, and optimize distributed training systems (PyTorch) for multi-node GPU clusters, ensuring scalability, r ...
1 month ago
Thinking Machines Lab's mission is to empower humanity through advancing collaborative general intelligence. We're building a future where everyone has access to the knowledge and tools to make AI work for their unique needs and goals. · We are scientists, engineers, and buil ...
3 days ago
We are building the world's first general purpose robotic intelligence that is robust and adapts to unseen scenarios without failing. · We believe massive scale through data-driven machine learning is the key to unlocking these capabilities for the widespread deployment of robots ...
1 month ago
We are a group of engineers supporting the training of foundation models at Apple. We build infrastructure to support training foundation models with general capabilities such as understanding and generation of text, images, speech, videos, and other modalities, and apply these models to Appl ...
1 month ago
Job Title: Machine Learning Engineer, Training Infrastructure · Position Type: Full time · Location: San Francisco, CA, USA · Salary Range: $150,000 - $250,000 (USD) · Job ID#: 158135 · Job Description: We are looking for an ML Engineer with 3+ YOE in high-performance computin ...
2 days ago
We are building cutting-edge infrastructure to enable efficient and scalable training of large language models (LLMs). We focus on optimizing training frameworks, algorithms, and infrastructure to push the boundaries of AI performance, scalability, and cost-efficiency. We invite ...
1 month ago
Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working togethe ...
1 month ago
We're looking for a DevOps/IaC Engineer to shape the data that powers frontier AI. · Fluent in English with strong writing and communication skills. · Expertise in DevOps and Infrastructure as Code (IaC): containers (Docker), orchestration (Kubernetes), CI/CD (GitHub Actions, Cir ...
3 weeks ago
Perform structural analyses under supervision · Design new structures and repairs of existing structures · Work on routine problems independently and on project teams · Assist in preparation of proposals, letters, reports, calculations, drawings, specifications, budgeting, scheduling, and ...
2 weeks ago
LLM Training Dataset and Checkpoint Optimization Engineer
We are seeking a Training Dataset and Checkpoint Acceleration Engineer to optimize data pipelines and checkpoint mechanisms for large-scale machine learning workloads. In this role, you will work at the intersection of data engineering and distributed systems, ensuring that traini ...
1 month ago
We are seeking an entry-level Mechanical Designer/Engineer-in-Training to join our team. · As a member of our Buildings team, you will work with guidance and direction on tasks and smaller projects, and as a team member of a larger project under the guidance of a Senior Enginee ...
3 weeks ago
We are an AI platform engineering group focused on large-scale model training systems and performance acceleration. · Optimize large model training pipelines for performance and scalability · Design and improve distributed training systems · ...
3 weeks ago