Staff Machine Learning Research Scientist, LLM Evals - Seattle
2 weeks ago

Job summary
We are building industry-leading LLM evals, setting new standards for model performance assessment. Our mission is to develop rigorous, scalable, and fair evaluation methodologies to drive the next generation of AI capabilities.
Responsibilities
- Drive research on the effectiveness and limitations of existing LLM evaluation techniques.
- Design and develop novel evaluation benchmarks for large language models.
- Publish research findings in top-tier AI conferences.
Similar jobs
Technical Program Manager, Model Evaluations
4 weeks ago
Job summary · As a Technical Program Manager for model evaluations, you'll own end-to-end coordination of our evaluation ecosystem, building a feedback loop from shaping eval strategy during early model development through launch execution. · Responsibilities · Standardize how eva ...
AI Research Engineer
4 days ago
We're looking for an AI Research Engineer who deeply understands LLMs from the inside: someone who doesn't just call APIs, but tinkers, probes, breaks, fine-tunes, and rebuilds models to understand how and why they behave the way they do. · Design, experiment with, and optimize LLM ...
Customer Solutions Architect
1 month ago
We're looking for a Customer Solutions Architect to join our Field Engineering team at Braintrust. This deeply technical role focuses on helping teams deploy and operate Braintrust in production environments. · You'll work with platform, DevOps, and infrastructure teams at leading AI-fo ...
Customer Solutions Architect
1 month ago
We're looking for a Customer Solutions Architect to join our Field Engineering team. · This deeply technical, customer-facing role focuses on helping teams deploy Braintrust in production environments. · You'll support deployments, troubleshoot issues across layers, guide best pract ...
Software Engineer
1 month ago
We are seeking a driven and analytical Software Engineer to join Apple's Generative AI Evaluations team. · In this role, you will help define how we measure, monitor, and improve the performance of AI systems that power next-generation user experiences. · ...
Senior Principal Machine Learning Engineer
1 month ago
Working at Atlassian · Atlassians have flexibility in where they work – whether in an office, from home, or a combination of the two. · ...
Senior Principal Machine Learning Engineer
4 weeks ago
Atlassian is seeking a Senior Principal Machine Learning Engineer to join our GenAI Platform organization, focusing on the quality and reliability of Rovo Chat. · ...
Applied AI Engineer
3 weeks ago
About Anthropic · Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. · As a member of the Applied AI team at Anthropic, you will be a technical Product Engineer focused on becoming ...
Software Engineer
2 weeks ago
We are looking for a Software Engineer to design and implement core platform capabilities for AI/ML and AI Agents in SingleStore Cloud. · You'll work on services that enable model/tool orchestration (e.g., MCP-style tool discovery and execution), agent workflows, retrieval pipelin ...
Senior Forward Deployed Engineer, Agentic AI
1 month ago
Robots & Pencils is seeking an outcome-oriented Forward Deployed Senior Software Engineer to partner with strategic clients on high-impact agentic AI applications. Robots & Pencils builds digital-first products that matter. · ...
Senior Staff Machine Learning Engineer
1 month ago
We offer a rewarding career where your ambitions are met with endless possibilities. · Every day we honor our iconic brand by offering quality coverage to millions of customers and being there when they need us most. · ...
AI Engineer
1 week ago
Robots & Pencils is seeking a Senior Forward Deployed Engineer to partner with strategic clients on high-impact agentic AI applications. · You'll embed directly with customers throughout the product lifecycle, working hands-on from design to production. · ...
Senior Forward Deployed Engineer, Agentic AI
1 week ago
We don't just ship features, we build digital-first products that matter. · As a Senior Forward Deployed Engineer, you'll join a team that values deep craft, cross-functional collaboration, and relentless focus on quality. · ...
AI Engineer
1 month ago
We're looking for an AI Engineer to work directly with our most strategic customers and help them successfully deploy, scale, and extract value from Braintrust in real production environments. · ...
Solutions Architect, Applied AI
3 weeks ago
About Anthropic · Anthropic's mission is to create reliable, interpretable, and steerable AI systems. · ...
Senior Staff Machine Learning Engineer
3 weeks ago
We are seeking an accomplished Senior Staff ML Engineer who will serve as a technical leader for the generative AI domain at GEICO. · This individual contributor role involves collaborating with a dynamic team of AI and software engineers to design, develop and deploy systems tha ...
We are hiring a UX writer and content designer to develop and execute content strategies for ad products on Search. As a stellar writer, your portfolio of work demonstrates content that simplifies and beautifies the overall user experience. · ...
AIML - Sr Machine Learning Engineer
5 days ago
Contribute to model hill climbing for Apple Intelligence features that leverage Apple Foundation Models, and work with the people who built the intelligent products that help millions of people get things done, just by asking or typing. · ...
Staff Software Engineer, Platform
1 month ago
Job summary: AI has the potential to exponentially augment human intelligence. Every person will have a personal tutor, coach, assistant, personal shopper, travel guide and therapist throughout life. · ...