Bilingual LLM Evaluator

Only for registered members United States

2 weeks ago

About The Job: Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey. · Posi ...

Similar jobs

  • Work in company

    LLM Model Evaluator

    Evaluate LLM outputs for accuracy, relevance, bias, and safety. Design test cases and evaluation benchmarks for AI models. Analyze model behavior and document findings. Collaborate with ML engineers and data scientists to improve models. Provide structured feedback to enhance model perf ...

    Austin

    3 weeks ago

  • Work in company Remote job

    LLM Evaluation, Benchmarking

    We're looking for an LLM Evaluation, Benchmarking & Experimentation Engineer to rigorously test our proprietary LLM API and build the infrastructure for systematic model improvement. · ...

    2 weeks ago

  • Work in company

    AI/LLM Evaluation

    At LeoTech, we are passionate about building software that solves real-world problems in the Public Safety sector. Our software has been used to help the fight against continuing criminal enterprises, drug trafficking organizations, identifying financial fraud, disrupting sex ...

    Austin, TX

    2 days ago

  • Work in company Remote job

    Bilingual LLM Evaluator

    Mercor connects elite creative and technical talent with leading AI research labs. Headquartered in San Francisco, our investors include Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey. · ...

    1 month ago

  • Work in company Remote job

    Bilingual LLM Evaluator

    Mercor connects elite creative and technical talent with leading AI research labs. · Evaluate LLM-generated responses on their ability to effectively answer user queries. · ...

    2 weeks ago

  • Work in company Remote job

    LLM Evaluation Specialist

    Job summary · Evaluate LLM-generated responses on their ability to effectively answer user queries. · Qualifications: Bachelor's degree · Native speaker or ILR 5/primary fluency (C2 on the CEFR scale) in Hindi · ...

    1 month ago

  • Work in company

    AI/LLM Evaluation

    We are passionate about building software that solves real-world problems in the Public Safety sector. · ...

    Austin $135,000 - $160,000 (USD) Full time

    1 month ago

  • Work in company

    LLM Evaluation Engineer

    Build the evaluation layer in the ThirdLaw platform—the part of the system that decides whether an LLM prompt, response, tool call, or agent behavior is acceptable. · ...

    Remote

    1 week ago

  • Work in company

    LLM Evaluation Engineering Lead

    We're partnering with a deep-tech AI company building autonomous agentic systems for complex physical and real-world environments. The team operates at the edge of what's possible today, designing AI systems that plan, act, recover, and improve over long horizons in high-stakes setti ...

    Redwood City

    1 month ago

  • Work in company

    LLM Security Evaluation Expert

    Tetrad Digital Integrity (TDI) is a leading-edge cybersecurity firm with a mission to safeguard and protect our customers from increasing threats and vulnerabilities in this digital age. · We are seeking a highly skilled LLM Security Evaluation Expert to join our team. In this ro ...

    Fort Meade

    2 days ago

  • Work in company

    Frontend Developer — LLM Evaluation

    Our client is a well-funded nonprofit research organization focused on measuring frontier AI capabilities, especially agentic / autonomous capabilities and the ability of models to conduct AI R&D, because those capabilities can create outsized societal and security risk if they sca ...

    San Francisco Full time

    1 month ago

  • Work in company

    LLM-GenAI Model Evaluator

    We are looking for an LLM-GenAI Model Evaluator. · ...

    Austin, TX

    2 weeks ago

  • Work in company

    LLM-GenAI Model Evaluator

    We are seeking an LLM-GenAI Model Evaluator for our team in Austin, TX and Sunnyvale. The ideal candidate will have a strong understanding of LLMs, generative AI, and transformer-based architectures. · Strong understanding of LLMs, generative AI, and transformer-based archite ...

    Austin

    2 weeks ago

  • Work in company

    Frontend Developer — LLM Evaluation

    You'd be building the UI that turns messy LLM evaluation outputs into clear, explorable artifacts that researchers can trust. · ...

    San Francisco

    1 month ago

  • Work in company

    LLM-GenAI Model Evaluator

    Immediate need for a talented LLM-GenAI Model Evaluator. · Evaluate LLM-GenAI models. · ...

    Austin

    2 weeks ago

  • Work in company Remote job

    Bilingual LLM Evaluation Analyst

    Mercor connects elite creative and technical talent with leading AI research labs. The company is headquartered in San Francisco and has investors including Benchmark, General Catalyst, Peter Thiel, Adam D'Angelo, Larry Summers, and Jack Dorsey. As a Bilingual LLM Evaluation Anal ...

    Israel, OH

    1 month ago

  • Work in company

    LLM Security Evaluation Expert

    SilverEdge Government Solutions is seeking a highly skilled LLM Security Evaluation Expert to join our team. · ...

    Columbia Full time

    1 month ago

  • Work in company

    LLM Evaluation Engineering Lead

    We're partnering with a deep-tech AI company building autonomous agentic systems for complex physical environments. · Build eval harnesses for agentic LLM systems offline + in-workflow ...

    Redwood City, CA

    1 month ago

  • Work in company

    LLM Security Evaluation Expert

    SilverEdge Government Solutions is seeking a highly skilled LLM Security Evaluation Expert. In this role, you will be responsible for rigorously testing the security and integrity of Large Language Models (LLMs). · TS/SCI with Polygraph level Clearance · Familiarity with prominen ...

    Columbia, MD

    4 weeks ago

  • Work in company Remote job

    LLM Evaluation and Benchmarking Mentor

    I'm seeking a technical mentor to help deepen my understanding of LLM evaluation and benchmarking, with particular attention to high-stakes applications (e.g., mental health), while developing a generalizable framework for reasoning about model performance across domains. · ...

    $50 - $150 (USD) per hour

    1 month ago