Test & Evaluation Engineer (Entry-Level, Associate, and Experienced) - Berkeley
1 month ago

Job description
Similar jobs
Elicit is an AI research platform that uses language models to help researchers figure out what's true and make better decisions. · ...
3 weeks ago
Elicit is an AI research platform that uses language models to help researchers figure out what's true and make better decisions. · At Elicit, we're focused on understanding and hill-climbing towards models that help us make better decisions. · Build a comprehensive system that r ...
4 weeks ago
Elicit radically increases the amount of good reasoning in the world. · ...
4 weeks ago
We need someone to own the technical foundation of our auto-evaluation systems. Our evals are currently much slower than they need to be, and our interfaces aren't optimized for the diverse set of people who need to use them—ML engineers iterating on models, product managers moni ...
4 weeks ago
We are looking for engineers to join our Engineering Team on a 6-month contract (with the possibility of extension). · The primary work is split between porting external benchmarks to run on internal infrastructure and developing novel model evaluations. · ...
1 month ago
At Retool, we're on a mission to bring good software to everyone. · We believe that the future of software development lies in abstracting away the tedious and repetitive tasks developers waste time on, while creating reusable components that act as a force multiplier for futur ...
4 weeks ago
We are looking for engineers to join our Engineering Team on a 6-month contract (with the possibility of extension). The primary work is split between porting external benchmarks to run on internal infrastructure and developing novel model evaluations. · Porti ...
1 month ago
We're on a mission to bring good software to everyone. · We believe that the future of software development lies in abstracting away the tedious and repetitive tasks developers waste time on, while creating reusable components that act as a force multiplier for future developer ...
3 weeks ago
About Distyl AI · Distyl AI develops production-grade AI systems to power core operational workflows for Fortune 500 companies. Powered by a strategic partnership with OpenAI, in-house software accelerators, and deep enterprise AI expertise, we deliver working AI systems with ra ...
17 hours ago
We are looking for engineers to join our Engineering Team on a 6-month contract (with the possibility of extension). The primary work is split between porting external benchmarks to run on internal infrastructure and developing novel model evaluations. · ...
1 month ago
We are looking for engineers to join our Engineering Team on a 6-month contract (with the possibility of extension). · The primary work is split between porting external benchmarks to run on internal infrastructure and developing novel model evaluations. · You ...
1 month ago
We are seeking Research Engineers to join the Evaluations team within Meta Superintelligence Labs. You will curate and build benchmarks for our most advanced AI models. · Curate and integrate publicly available and internal benchmarks to direct the capabilities of frontier ...
1 week ago
Meta is seeking Research Engineers to join the Evaluations team within Meta Superintelligence Labs. · Evaluations are the core of AI progress at MSL, determining what capabilities get built, which features get prioritized, and how fast our models improve. · ...
1 week ago
Job summary · Meta is seeking Research Engineers to join the Evaluations team within Meta Superintelligence Labs. · Curate and integrate publicly available and internal benchmarks to direct the capabilities of frontier model development · Develop and implement evaluation envir ...
1 week ago
Luma's mission is to build multimodal AI to expand human imagination and capabilities. · Evaluate generative model performance across diverse tasks, prompts, and modalities. · ...
1 month ago
Evaluations are the core of AI progress at MSL, determining what capabilities get built, which features get prioritized, and how fast our models improve. · ...
1 week ago
Responsibilities: · Curate and integrate publicly available and internal benchmarks to direct the capabilities of frontier model development · Develop and implement evaluation environments, including environments for novel model capabilities and modalities · ...
1 week ago
Build and maintain infrastructure and tooling for the AI evaluations platform used by internal teams, including an automated testing platform for AI voice agents and debugging and observability tools. · Develop and productionalize evaluation frameworks for individual system component ...
4 days ago
Waymo is looking for a Software Engineer to build the metrics and pipelines that grade its hybrid environment simulator. The ideal candidate will have experience in systems engineering and AI, with proficiency in C++ or Python. · ...
1 week ago
We are seeking a candidate to help shape and scale the way we understand, measure and improve model performance. · Evaluate generative model performance across diverse tasks and modalities. · ...
4 weeks ago