Research Scientist Intern, LLM Evaluation - New York
11 hours ago

Job description
Similar jobs
· Design, implement, and maintain comprehensive evaluation protocols for large language models · Analyze model outputs to identify strengths, weaknesses, and failure modes · ...
1 month ago
$7,650/month to $12,134/month + benefits ...
3 weeks ago
· ...
3 weeks ago
We are looking for a Staff Machine Learning Engineer to join our Waymo AI Foundations team. · ...
1 week ago
We are looking for AI Evaluation & Data Engineering Specialists to design, curate, and operationalize datasets and evaluation frameworks for AI product performance assessment. This role involves working with large language models (LLMs), human raters, and automation tools to measure mod ...
1 month ago
We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. · ...
1 month ago
Job summary: Perplexity serves tens of millions of users daily with reliable, high-quality answers grounded in an LLM-first search engine and our specialized data sources. · Responsibilities: Architect and maintain automated evaluation pipelines to assess answer quality across P ...
6 days ago
We are building LLM evaluation and training datasets to train LLMs to work on realistic software engineering problems. One of our approaches in this project is to build verifiable SWE tasks from public repository histories, using a synthetic approach with a human in the loop; whil ...
4 weeks ago
Datamundi is looking for AI Model (Query Vetting Experts) to support Generative AI (GenAI) data evaluation and training across multiple languages. · ...
1 month ago
We are seeking a Senior AI/ML Engineer with deep expertise in agentic AI systems, LLM orchestration, and end-to-end model development. · ...
1 week ago
As we build to the next level, we're looking for a top-quality AI engineer with a strong focus on AI agents - someone who knows how to leverage LLMs, both open-source and closed, in combination with complex tool-calling hierarchies and operational patterns. This is an opportunity ...
2 weeks ago
The mandate: You will co-own the technical architecture and AI strategy end-to-end with the CEO. · We believe one engineer armed with the right AI stack can out-ship a traditional team of five. · Solve "Unstructured-to-Structured" at scale: Instead of writing brittle scrapers, y ...
2 weeks ago
As a Senior Data Scientist on the Medhub team, you will be the primary architect of our LLM evaluation framework. · You believe that LLMs should be held to the same (or higher) scientific standards as traditional supervised models · ...
1 month ago
You'll be responsible for making tech support and employee benefit articles LLM-friendly via NLP-aware editing. · Native or equivalent proficiency in English · Ownership of a content schema... · ...
4 weeks ago
We are looking for a Senior Data Scientist to join our AI Red Teaming efforts and focus on adversarial evaluation, failure analysis, and risk discovery in AI models and AI agents. · Design and execute AI red-teaming experiments against LLMs and AI agents to identify: prompt inject ...
2 weeks ago
We are looking for an AI Engineer to modernize and enhance our existing regex/keyword-based Elastic Search system by integrating state-of-the-art semantic search, dense retrieval, and LLM-powered ranking techniques. · Analyze limitations in current regex & keyword-only search imple ...
1 month ago
We're hiring an Applied AI Engineer to push the boundaries of our Cofounder agent. · You'll own core backend systems and applied LLM work: advancing agent reliability and autonomy, building evaluation pipelines, and shipping techniques that measurably improve agent performance. This is ...
1 month ago
The AI Security Engineer designs secure architectures for Large Language Model (LLM) and Agentic AI ecosystems across the enterprise. This includes securing platforms like ChatGPT Enterprise, Claude Enterprise, Gemini Enterprise, and Azure OpenAI environments. · Engineer secure envi ...
1 month ago
We are scaling intelligence to serve humanity. We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. · ...
1 week ago
The AI Security Engineer designs secure architectures for Large Language Model (LLM) ecosystems across the enterprise. · ...
1 month ago