FACTS Benchmark Suite: Systematically evaluating the factuality of large language models

DeepMind Blog

December 9, 2025

The FACTS Benchmark Suite systematically evaluates the factuality of large language models (LLMs) across three areas: Parametric, Search, and Multimodal reasoning.