Evaluation on Probably Aligned

https://probablyaligned.ai/tags/evaluation/

Recent content in Evaluation on Probably Aligned

Why Passing a Safety Test Might Mean Nothing
https://probablyaligned.ai/safety/formal-methods/detectability-of-testing/
Wed, 07 Jan 2026

Simulated environments are our best tool for probing alignment. But if a model can distinguish test from deployment (and it almost certainly can), then testing tells you what the model does when it knows it's being watched. Nothing more.