Research Scientist Intern, LLM Evaluation - Menlo Park, CA
5 days ago

Job description
Similar jobs
We're partnering with a deep-tech AI company building autonomous agentic systems for complex physical and real-world environments. The team operates at the edge of what's possible today, designing AI systems that plan, act, recover, and improve over long horizons in high-stakes setti ...
1 month ago
We're partnering with a deep-tech AI company building autonomous agentic systems for complex physical environments. · Build eval harnesses for agentic LLM systems offline + in-workflow ...
1 month ago
You'd be building the UI that turns messy LLM evaluation outputs into clear, explorable artifacts that researchers can trust. · ...
1 month ago
Lensa is a career site that helps job seekers find great jobs in the US. We are not a staffing firm or agency. Lensa does not hire directly for these jobs, but promotes jobs on LinkedIn on behalf of its direct clients, recruitment ad agencies, and marketing partners. Lensa partne ...
3 days ago
We are looking for a Member of Technical Staff to develop and implement cutting-edge methodologies to evaluate how well Copilot performs in real-world usage scenarios. · Leverage expertise to measure the performance of Copilot... · ...
2 weeks ago
Our Client is a well-funded nonprofit research organization focused on measuring frontier AI capabilities—especially agentic / autonomous capabilities and the ability of models to conduct AI R&D because those capabilities can create outsized societal and security risk if they sca ...
1 month ago
We're looking for outstanding individuals with experience in the social sciences, machine learning, and analysis of natural language to develop and implement cutting-edge methodologies to help us evaluate how well Copilot performs in real-world usage scenarios. · Leverage expertise ...
2 weeks ago
Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building the Waymo Driver—The World's Most Experienced Driver—to improve access to mobili ...
3 days ago
+ Work with a creative team of people who help to build the state-of-the-art Foundation Models that are used throughout Waymo's systems. · + Lead the development of end-to-end evaluation systems and benchmarks for Waymo Foundation models · + Implement and extend large scale data ...
2 weeks ago
Develop machine learning solutions addressing open problems in autonomous driving to safely operate Waymo vehicles in dozens of cities and under all driving conditions. · ...
2 weeks ago
Senior Research Engineer, LLM Evaluation and Behavioral Analysis
About The Role · Together AI is building the fastest, most capable open-source-aligned LLMs and inference stack in the world. As part of the Turbo organization, you will be a critical bridge between cutting-edge model research and real-world behavioral reliability. This role focu ...
5 days ago
+ Build Multi-Modal LLM backbones · + Prepare human-labeled or auto-generated data for pre-training and fine-tuning · + Fine-tune/RLHF for downstream content understanding and user understanding tasks · + Model evaluation with human labeling or auto judge · We conduct focused resear ...
1 month ago
Meta is seeking research engineers to help us build solutions for Personalization of Meta AI. · We're looking for researchers with LLM post-training expertise to join us in improving the personalization of LLM responses. · Our team contributes to post trainin ...
1 month ago
AI Research Scientist, Personalization, SuperIntelligence Labs
Meta is seeking AI research scientists to help us build solutions for Personalization of Meta AI. · We're looking for researchers with LLM post-training expertise to join us in improving the personalization of LLM responses. · Our team contributes to post-training re ...
1 month ago
Meta is seeking Research Engineers to join the Safety System and Foundations team within Meta Superintelligence Labs. · Design novel safety techniques for large language models. · Create datasets for safety system evaluation. · ...
1 month ago
Design novel safety alignment techniques for large language models and multimodal AI systems · Create high-quality datasets for safety alignment · Fine-tune LLMs to adhere to Meta's safety policies ...
4 weeks ago
We are an AI + physical sciences lab building state-of-the-art models to make novel scientific discoveries. · ...
1 month ago