The Specification and Language Problem

Prior reading: What Are Formal Methods? | Model Checking and Formal Specification The Problem Formal methods can prove a system satisfies a specification. The hard part is writing the specification. "Be helpful and don't cause harm" is not a formal spec. Turning it into one requires resolving ambiguity, edge cases, and value judgments that humans can't even agree on in natural language. Specification as Translation Every spec is a translation from human intent to formal language. Translation is lossy. The gap between what we mean and what we write is the specification problem. ...

June 4, 2025 · 2 min · Austin T. O'Quinn

Model Checking and Formal Specification

Prior reading: What Are Formal Methods? What Is Model Checking? Given a model of a system and a property expressed in formal logic, model checking exhaustively verifies whether the property holds in all reachable states. Temporal Logic Crash Course LTL (Linear Temporal Logic): "Eventually the system reaches a safe state." $\Diamond \text{safe}$ CTL (Computation Tree Logic): "On all paths, the system never enters an unsafe state." $\forall\Box \neg \text{unsafe}$ These let you express liveness, safety, and fairness properties. Classical Model Checking Works brilliantly for finite-state systems: protocols, hardware, control software. Tools like SPIN, NuSMV, and PRISM are mature. ...

May 21, 2025 · 1 min · Austin T. O'Quinn

What Is RL, Why Is It So Hard, and How Does It Go Wrong?

Prior reading: Gradient Descent and Backpropagation | Loss Functions and Spaces The Setup An agent takes actions in an environment, receives rewards, and learns a policy $\pi(a|s)$ that maximizes expected cumulative reward. $$\max_\pi \mathbb{E}\left[\sum_{t=0}^T \gamma^t r_t\right]$$ Where Are the Neural Nets? Neural networks serve as function approximators for things that are too complex to represent exactly: Policy network: $\pi_\theta(a|s)$ — maps states to action probabilities Value network: $V_\phi(s)$ — estimates expected future reward from a state Q-network: $Q_\psi(s,a)$ — estimates expected future reward from a state-action pair World model (optional): $p_\omega(s'|s,a)$ — predicts next state Without neural nets, RL only works for tiny state spaces (tabular methods). Neural nets let it scale to images, language, and continuous control. ...

February 26, 2025 · 3 min · Austin T. O'Quinn
.