Staff Research Engineer, Pre-training Science - Remote - United States

Remote - United States

11 hours ago

Job description


Reddit is a community of communities. It's built on shared interests, passion, and trust, and is home to the most open and authentic conversations on the internet. Every day, Reddit users submit, vote, and comment on the topics they care most about. With 100,000+ active communities and approximately 121 million daily active unique visitors, Reddit is one of the internet's largest sources of information. For more information, visit

Reddit is continuing to grow our teams with the best talent. This role is fully remote-friendly within the United States. If you happen to live close to one of our physical office locations (San Francisco, Los Angeles, New York City, and Chicago), our doors are open for you to come into the office as often as you'd like.

The AI Engineering team at Reddit is embarking on a strategic initiative to build our own Reddit-native foundational Large Language Models (LLMs). This team sits at the intersection of applied research and massive-scale infrastructure, tasked with training models that truly understand the unique culture, language, and structure of Reddit communities. You will be joining a team of distinguished engineers and safety experts to build the "engine room" of Reddit's AI future—creating the foundational models that will power Safety & Moderation, Search, Ads, and the next generation of user products.

As a Staff Research Engineer for Pre-training Science, you will serve as the technical lead for defining the Continual Pre-Training (CPT) strategies that transform generic foundation models into Reddit-native experts. You will bridge the gap between "General Intelligence" and "Community Context," designing scientific frameworks that inject Reddit's unique knowledge (conversational trees, slang, multimodal memes) into base models without causing catastrophic forgetting. You will define the "learning recipe"—the precise mix of data, hyperparameters, and architectural adaptations needed to build a model that speaks the language of the internet.

Responsibilities:

  • Architect and validate rigorous Continual Pre-Training (CPT) frameworks, focusing on domain adaptation techniques that effectively transfer Reddit's knowledge into licensed frontier models.
  • Design the "Science of Multimodality": Lead research into fusing vision and language encoders to process Reddit's rich media (images, video) alongside conversational text threads.
  • Formulate data curriculum strategies: scientifically determining the optimal ratio of "Reddit data" vs. "General data" to maximize community understanding while maintaining safety and reasoning capabilities.
  • Conduct deep-dive research into Scaling Laws for Graph-based data: investigating how Reddit's tree-structured conversations impact model convergence compared to flat text.
  • Design and scale continuous evaluation pipelines (the "Reddit Gym") that monitor model reasoning and safety capabilities in real-time, enabling dynamic adjustments to training recipes.
  • Drive high-stakes architectural decisions regarding compute allocation, distributed training strategies (3D parallelism), and checkpointing mechanisms on AWS Trainium/Nova clusters.
  • Serve as a force multiplier for the engineering team by setting coding standards, conducting high-level design reviews, and mentoring senior engineers on distributed systems and ML fundamentals.

Required Qualifications:

  • 7+ years of experience in Machine Learning engineering or research, with a specific focus on LLM Pre-training, Domain Adaptation, or Transfer Learning.
  • Expert-level proficiency in Python and deep learning frameworks (PyTorch or JAX), with a track record of debugging complex training instabilities at scale.
  • Deep theoretical understanding of Transformer architectures and Pre-training dynamics—specifically regarding Catastrophic Forgetting and Knowledge Injection.
  • Experience with multimodal models (VLMs): understanding how to align image/video encoders (e.g., CLIP, SigLIP) with language decoders.
  • Experience implementing continuous integration/evaluation systems for ML models, measuring generalization and reasoning performance.
  • Demonstrated ability to communicate complex technical concepts (like loss spikes or convergence issues) to leadership and coordinate efforts across Infrastructure and Data teams.

Nice to Have:

  • Published research or open-source contributions in Continual Learning, Curriculum Learning, or Efficient Fine-Tuning (LoRA/PEFT).
  • Experience with Graph Neural Networks (GNNs) or processing tree-structured data.
  • Proficiency in low-level optimization (CUDA, Triton) or distributed training frameworks (Megatron-LM, DeepSpeed, FSDP).
  • Familiarity with Safety alignment techniques (RLHF/DPO) to understand how pre-training objectives impact downstream safety.

Benefits:

  • Comprehensive Healthcare Benefits and Income Replacement Programs
  • 401k with Employer Match
  • Global Benefit programs that fit your lifestyle, from workspace to professional development to caregiving support
  • Family Planning Support
  • Gender-Affirming Care
  • Mental Health & Coaching Benefits
  • Flexible Vacation & Paid Volunteer Time Off
  • Generous Paid Parental Leave 


Pay Transparency:

This job posting may span more than one career level.

In addition to base salary, this job is eligible to receive equity in the form of restricted stock units, and depending on the position offered, it may also be eligible to receive a commission. Additionally, Reddit offers a wide range of benefits to U.S.-based employees, including medical, dental, and vision insurance, 401(k) program with employer match, generous time off for vacation, and parental leave. To learn more, please visit

To provide greater transparency to candidates, we share base salary ranges for all US-based job postings regardless of state. We set standard base pay ranges for all roles based on function, level, and country location, benchmarked against similar-stage growth companies. Final offer amounts are determined by multiple factors, including skills, depth of work experience, and relevant licenses/credentials, and may vary from the amounts listed below.

The base salary range for this position is:

$230,000 - $322,000 USD

In select roles and locations, the interviews will be recorded, transcribed and summarized by artificial intelligence (AI). You will have the opportunity to opt out of recording, transcription and summarization prior to any scheduled interviews.

During the interview, we will collect the following categories of personal information: Identifiers, Professional and Employment-Related Information, Sensory Information (audio/video recording), and any other categories of personal information you choose to share with us. We will use this information to evaluate your application for employment or an independent contractor role, as applicable. We will not sell your personal information or disclose it to any third party for their marketing purposes. We will delete any recording of your interview promptly after making a hiring decision. For more information about how we will handle your personal information, including our retention of it, please refer to our Candidate Privacy Policy for Potential Employees and Contractors.

Reddit is proud to be an equal opportunity employer, and is committed to building a workforce representative of the diverse communities we serve. Reddit is committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans in our job application procedures. If, due to a disability, you need an accommodation during the interview process, please let your recruiter know.


