Evaluation Engineer - Oakland
3 weeks ago

Job summary
Elicit is an AI research platform that uses language models to help researchers figure out what's true and make better decisions.
Similar jobs
Elicit radically increases the amount of good reasoning in the world. · ...
3 weeks ago
We need someone to own the technical foundation of our auto-evaluation systems. Our evals are currently much slower than they need to be, and our interfaces aren't optimized for the diverse set of people who need to use them—ML engineers iterating on models, product managers moni ...
3 weeks ago
Elicit is an AI research platform that uses language models to help researchers figure out what's true and make better decisions. · At Elicit, we're focused on understanding and hill-climbing towards models that help us make better decisions. · Build a comprehensive system that r ...
3 weeks ago
At Retool, we're on a mission to bring good software to everyone. · We believe that the future of software development lies in abstracting away the tedious and repetitive tasks developers waste time on, while creating reusable components that act as a force multiplier for futur ...
3 weeks ago
We are looking for engineers to join us on a 6-month contract (with the possibility of extension) on our Engineering Team. · The primary work is split between engineering work to port external benchmarks to run on internal infrastructure and developing novel model evaluations. · ...
1 month ago
We are looking for engineers to join us on a 6-month contract (with the possibility of extension) on our Engineering Team. The primary work is split between engineering work to port external benchmarks to run on internal infrastructure and developing novel model evaluations. · Porti ...
1 month ago
We're on a mission to bring good software to everyone. · We believe that the future of software development lies in abstracting away the tedious and repetitive tasks developers waste time on, while creating reusable components that act as a force multiplier for future developer ...
2 weeks ago
We are looking for engineers to join us on a 6-month contract (with the possibility of extension) on our Engineering Team. The primary work is split between engineering work to port external benchmarks to run on internal infrastructure and developing novel model evaluations. · ...
1 month ago
We are looking for engineers to join us on a 6-month contract (with the possibility of extension) on our Engineering Team. · The primary work is split between engineering work to port external benchmarks to run on internal infrastructure and developing novel model evaluations. · You ...
1 month ago
Build and maintain infrastructure and tooling for the AI evaluations platform used by internal teams. · Develop and productionalize evaluation frameworks for individual system components such as ASR, LLMs, TTS, knowledge bases. · 5+ years of professional software engineering experie ...
1 week ago
Join the team bringing advanced autonomy to the built world. · We're moving AI out of the lab and into the real world. · Our team is composed of industry veterans who helped launch Waymo, scaled Segment to a $3.2B acquisition, and grew Uber Freight to $5B in revenue. · ...
1 day ago
Waymo is looking for a Software Engineer to build the metrics and pipelines that grade its hybrid environment simulator. The ideal candidate will have experience in systems engineering and AI, with proficiency in C++ or Python. · ...
1 week ago
Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. · ...
6 days ago
We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. · ...
1 month ago
Join our team bringing advanced autonomy to the built world. · We're deploying autonomous systems on heavy construction machinery across the country, accelerating project schedules of billion-dollar infrastructure projects and improving safety on job sites. · W ...
1 week ago
Build and maintain infrastructure and tooling for the AI evaluations platform used by internal teams, including automated testing platform for AI voice agents, debugging and observability tools. · Develop and productionalize evaluation frameworks for individual system component ...
1 day ago
This is where algorithms meet steel-toed boots. You'll collaborate with construction veterans and world-class engineers to solve physical-world problems that simulations can't touch. · Responsibilities · Design and maintain eval systems: Build pipelines for measuring system perfo ...
1 week ago
We seek to learn from deployment and broadly distribute the benefits of AI, while ensuring that this powerful tool is used responsibly and safely. · About the Role · In this role, you'll lead development of the systems we use to evaluate the quality of our AI models and products. · ...
1 month ago
We are looking for an experienced metrics engineer to join our Data Insights team. · The successful candidate will have a strong background in quantitative fields such as computer science or mathematics. · Responsibilities include architecting new analytics data collection, workin ...
4 weeks ago
We're moving AI out of the lab and into the real world. Our team is composed of industry veterans who helped launch Waymo, scaled Segment to a $3.2B acquisition, and grew Uber Freight to $5B in revenue. · Develop Metrics · Predict System Performance · Please apply anyway We ...
18 hours ago