FACTS Grounding: A new benchmark for evaluating the factuality of large language models

DeepMind Blog
December 17, 2024
Our comprehensive benchmark and online leaderboard offer a much-needed measure of how accurately LLMs ground their responses in provided source material and avoid hallucinations.