Research Engineer, Evaluations - Menlo Park
1 week ago

Job summary
We are seeking Research Engineers to join the Evaluations team within Meta Superintelligence Labs. You will curate and build benchmarks for our most advanced AI models.
Bullet points
- Curate and integrate publicly available and internal benchmarks to direct the capabilities of frontier model development
- Develop and implement evaluation environments, including environments for novel model capabilities and modalities
Similar jobs
Job summary · Meta is seeking Research Engineers to join the Evaluations team within Meta Superintelligence Labs. · Curate and integrate publicly available and internal benchmarks to direct the capabilities of frontier model development · Develop and implement evaluation envir ...
1 week ago
Evaluations are the core of AI progress at MSL, determining what capabilities get built, which features get prioritized, and how fast our models improve. · ...
6 days ago
Meta is seeking Research Engineers to join the Evaluations team within Meta Superintelligence Labs. · Evaluations are the core of AI progress at MSL, determining what capabilities get built, which features get prioritized, and how fast our models improve. · ...
6 days ago
Luma's mission is to build multimodal AI to expand human imagination and capabilities. · Evaluate generative model performance across diverse tasks, prompts, and modalities. · ...
1 month ago
We are seeking a candidate to help shape and scale the way we understand, measure and improve model performance. · Evaluate generative model performance across diverse tasks and modalities. · ...
3 weeks ago
Responsibilities: · Curate and integrate publicly available and internal benchmarks to direct the capabilities of frontier model development · Develop and implement evaluation environments, including environments for novel model capabilities and modalities · ...
5 days ago
We are currently seeking a security-focused Embedded Systems Engineer/Evaluator for our Electrical Engineering and Computer Science Practice in Menlo Park, CA. · We will work as part of a team to test and troubleshoot secure identity credentials and other security peripherals suc ...
1 month ago
We are currently seeking a security-focused Embedded Systems Engineer/Evaluator for our Electrical Engineering and Computer Science Practice in Menlo Park, CA. · ...
3 days ago
Responsible for conducting secondary laboratory experiments for data acquisition. · Develops labeling guidelines and prepares data for AI model development. · Contributes to the design, execution, and analysis of validation studies. · Focuses on practic ...
3 weeks ago
We are looking for an experienced Senior AI Data and Validation Engineer to maintain the safety, effectiveness, and quality of AI-enabled medical devices throughout their product lifecycle. The engineer will contribute to design and execution of validation studies ensuring device ...
3 weeks ago
We are looking for an experienced and highly skilled Senior AI Data and Validation Engineer responsible for dry and wet lab experiments for AI functionality: acquiring datasets, developing labeling guidelines, and preparing datasets for AI model development. The engineer will contribut ...
2 weeks ago
We're partnering with a deep-tech AI company building autonomous agentic systems for complex physical and real-world environments. The team operates at the edge of what's possible today, designing AI systems that plan, act, recover, and improve over long horizons in high-stakes setti ...
1 month ago
We're partnering with a deep-tech AI company building autonomous agentic systems for complex physical environments. · Build eval harnesses for agentic LLM systems offline + in-workflow ...
1 month ago
The GenAI Model Evaluation team is the main line of defense in ensuring customer safety. We are looking for an experienced full-stack developer to take ownership of the tooling we use to determine the best model for release. · Create, maintain and expand internal tools for evaluati ...
1 month ago
Job summary · Anomali is headquartered in Silicon Valley and is the Leading AI-Powered Security Operations Platform that is modernizing security operations. At the center of it is an omnipresent, intelligent, and multilingual Anomali copilot that automates important tasks and empowers your team to deliver risk insights to man ...
6 days ago
Looking for a place that values your unique talents? Discover Stryker's award-winning culture. · We are looking for an experienced Senior AI Data Engineer to join our team. · This role is crucial for maintaining the safety, effectiveness, and quality of AI-en ...
3 weeks ago
Research Engineering Manager, Evaluations, Superintelligence Labs
Meta is seeking a Research Engineering Manager to lead the Evaluations team within Meta Superintelligence Labs. In this leadership role, you will guide a team of research engineers who curate and build the benchmarks for our most advanced AI models. · You'll partner with world-cla ...
6 days ago
Research Engineering Manager, Evaluations, Superintelligence Labs
Meta is seeking a Research Engineering Manager to lead the Evaluations team within Meta Superintelligence Labs. · ...
5 days ago
Elicit radically increases the amount of good reasoning in the world. · ...
3 weeks ago