Decision Theory for AI Safety

Prior reading: Game Theory for AI Safety

Why Decision Theory Matters

Every AI system that takes actions is implicitly using a decision theory, a framework for choosing among options given beliefs and preferences. The choice of decision theory determines:

- Whether the AI cooperates or defects in strategic situations
- Whether it's manipulable or manipulation-resistant
- Whether it takes catastrophic gambles or plays it safe
- How it reasons about its own future behavior

Decision theory isn't just philosophy. It's the operating system of agency. ...

April 9, 2025 · 5 min · Austin T. O'Quinn