Software Engineer, Applied Evals - San Francisco
1 week ago

Job description
Similar jobs
We're hiring product-minded engineers to design and build evals and harnesses that capture real-world quality for advanced AI systems. ...
1 month ago
We're hiring product-minded engineers to design and build evals and harnesses that capture real-world quality for advanced AI systems. · ...
1 month ago
I'm hiring a Lead LLM Evals Engineer to join an early-stage physical AI startup building systems with general physical ability to experiment, engineer, and manufacture anything. · → Build eval harnesses for agentic LLM systems in complex workflows · → Turn eval failures int ...
1 month ago
I'm hiring a Lead LLM Evals Engineer to join an early-stage physical AI startup building systems with general physical ability to experiment, engineer, and manufacture anything. · ...
1 month ago
We are on a mission to create the most helpful search engine in the world—one that prioritizes transparency, privacy, and user control. · Define and own what means for search-augmented and agentic AI systems by designing evaluation frameworks that measure real-world quality, reli ...
1 week ago
We're building a team of innovators, problem solvers, and visionaries who are passionate about shaping the future of AI and technology. · ...
1 week ago
We're looking for a DevRel Engineer to help grow, activate, and inspire our developer community. · ...
1 month ago
We're looking for a DevRel Engineer to help grow, activate, and inspire our developer community. As a DevRel Engineer, you'll be the public voice of Braintrust for developers, creating content, engaging with the community, and bringing insights back to our product and engineering t ...
1 month ago
We're building a team of innovators who are passionate about shaping the future of AI and technology. · We combine advanced AI models with user-first principles. · ...
2 weeks ago
We're looking for a DevRel Engineer to help grow, activate, and inspire our developer community. · Create content and engage with the community. · Bring insights back to our product and engineering teams. · ...
1 week ago
HUD (YC W25) is developing agentic evaluations for Computer Use Agents (CUAs) that browse the internet. Our CUA evaluation framework is the first comprehensive evaluation tool for CUAs. · Building new evaluations/eval environments for HUD's CUA evaluatio ...
1 week ago
The Support Automation team at OpenAI scales the organization by applying cutting-edge AI models to real-world challenges. · We're looking for a Backend Software Engineer with experience working in ML/LLM-heavy domains to help design and build an evals infrastructure that measure ...
1 month ago
+ Design eval pipelines that are reliable, reproducible, and extendable · + Build the infrastructure for continuous eval monitoring frameworks, regression and drift monitoring, and robust golden datasets, along with feedback loops that ultimately strengthen support automation. · + Co ...
5 days ago
This role sits at the intersection of AI research and engineering. · You will anticipate upcoming data needs at leading AI labs, translate research signals into executable data projects, and drive early pilots with teams working at the frontier. · Anticipate future human data re ...
1 week ago
We're passionate about crafting products that serve those around us, blending rapid prototyping with a focus on long-term quality and reliability. · In this role, you will: · • Design eval pipelines that are reliable, reproducible, and extendable · • Build robust systems and backend ...
1 week ago
We're looking for a technical leader who understands that building an AI agent is only 10% of the work—the real engineering challenge is measuring it. We need a thought leader who can solve the "problem nobody talks about": evaluating non-deterministic agentic systems in producti ...
1 week ago
At Dynamo AI, we are looking for an ML Research Engineer Intern to work at the intersection of machine learning research and production systems. · ...
2 days ago
Sesame believes in a future where computers are lifelike - with the ability to see, hear, and collaborate with us in ways that feel natural and human. · ...
1 month ago
We're Salesforce, the Customer Company, inspiring the future of business with AI + Data + CRM. · Leading with our core values, we help companies across every industry blaze new trails and connect with customers in a whole new way. And, we empower you to be a Trailblazer, too, driv ...
3 weeks ago
We are looking for a technical leader who understands that building an AI agent is only 10% of the work—the real engineering challenge is measuring it. · ...
3 weeks ago