Mesa-Optimization and the Optimization Pressure Spectrum

Prior reading: Gradient Descent and Backpropagation | Decision Theory for AI Safety | What Is RL? The Question Why does an AI system behave the way it does? And why do optimizers keep creating sub-optimizers with misaligned goals? Three frameworks give different (complementary) answers. But first — some terminology that trips people up. Mesa vs. Meta I've seen these two confused often enough that it's worth being explicit. I had to look up what "mesa" even meant the first time I encountered it. ...

August 6, 2025 · 13 min · Austin T. O'Quinn
.