- Ensure that training, fine-tuning, and inference jobs can meet performance, throughput, and cost efficiency needs for various multi-modal use cases.
- Evaluate and possibly integrate academic, OSS or enterprise training and inference performance optimizations.
- Design and implement tooling for productizing generative foundation models, such as RAG, version control, and prompt management.
- Design easy-to-use APIs and interfaces for experienced ML practitioners, as well as non-experts.
- 2-5 years of experience in ML engineering on production systems dealing with training or inference of deep learning models.
- MS/PhD in Computer Science or a related field
- Experience with performance tuning and optimization in at least one of PyTorch, TensorFlow, or JAX
- Development, debugging, and optimization experience on GPUs and other accelerators and associated software ecosystem
- Development, debugging, and optimization experience on NVIDIA GPUs
- Experience with cloud computing providers such as AWS
- Comfortable with ambiguity and able to take on and execute 0-to-1 projects
- Experience partnering closely with ML researchers
- Excellent written and verbal communication skills.
- Experience with training, fine-tuning, or serving large deep learning models, or LLMs, at the scale of millions of users.
- Experience adapting OSS CUDA kernels, or writing your own to maximize training or inference performance of deep learning models.
- Experience with popular optimized LLM serving libraries such as DeepSpeed, TensorRT, or vLLM.
- Experience with large-scale distributed training and different parallelism techniques for scaling up training, such as FSDP and tensor/pipeline parallelism.
- Invited paper at RecSys: "InTune: RL based pipeline optimization for Deep RecSys"
- Synergistic Signals: Exploiting Co-Engagement and Semantic Links via Graph Neural Networks
- Talk on heterogeneous compute environments for ML at Ray Summit 2023
- OSS LLM Serving & Benchmarking - Talk at ML Platform Meetup Dec 2023
- Opportunities for OSS Gen AI in the Enterprise - Panel Discussion at ML Platform Meetup Dec 2023
ML Engineer, Generative AI Core Infra - Los Gatos, CA, United States - Netflix
Description
Netflix is the world's leading streaming entertainment service with 260+ million paid memberships in over 190 countries enjoying TV series, documentaries, and feature films across a wide variety of genres and languages. Machine Learning drives innovation across all product functions and decision-support needs. Building highly scalable and differentiated ML infrastructure is key to accelerating this innovation.
The Opportunity
We are looking for driven and experienced Machine Learning Engineers to join a new team, GenAI Core Infra, under our Machine Learning Platform (MLP) org. MLP's charter is to maximize the business impact of all ML at Netflix. We develop innovative ML infrastructure to support key product functions such as personalized recommendations, studio algorithms, virtual productions, growth intelligence, and content demand modeling, among others. In this role, you will scale the training, customization, fine-tuning, and serving capabilities for large language and multi-modal foundation models. You will partner closely with ML researchers and data scientists to address critical performance, usability, and scalability challenges that come with training and tuning generative foundation models at Netflix scale.
Responsibilities