Findings from a pilot Anthropic–OpenAI alignment evaluation exercise: OpenAI Safety Tests

OpenAI Blog

August 27, 2025

OpenAI and Anthropic share findings from a first-of-its-kind joint safety evaluation in which each lab tested the other’s models for misalignment, instruction following, hallucinations, jailbreaking, and more. The results highlight progress, challenges, and the value of cross-lab collaboration.