Loss Functions, Decision Boundaries, Activation Spaces, and Why MSE

Prior reading: Gradient Descent and Backpropagation

Three Ways to Look at a Model

Loss surface: the landscape over parameter space. What the optimizer sees.
Decision boundary: the surface in input space that separates classes. What the user sees.
Activation space: the internal geometry of learned representations. What the model "thinks."

These are different views of the same object, but they behave differently.

Which Are Data-Dependent?

Loss surface: entirely data-dependent. Change the data, change the landscape.
Decision boundary: data-dependent through training, but fixed at inference.
Activation space: shaped by data and architecture jointly. The architecture constrains which representations are possible; the data selects among them.

How They Relate

The loss function defines the objective. Gradient descent reshapes the decision boundary to minimize loss. The activation space is the intermediate computation that makes the decision boundary expressible. ...
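The claim that the loss surface is entirely data-dependent can be sketched with a toy one-parameter model. Everything here (the linear model, the datasets, the function names) is illustrative, not taken from the post:

```python
import numpy as np

# Hypothetical 1-parameter model y_hat = w * x with MSE loss.
# The "surface" over a single parameter is just a curve, but the
# point carries over: the landscape is determined by the data.
def mse_loss(w, x, y):
    return np.mean((w * x - y) ** 2)

ws = np.linspace(-2.0, 4.0, 601)  # grid over the parameter w

x = np.array([1.0, 2.0, 3.0])
y_a = 2.0 * x    # dataset A: loss is minimized near w = 2
y_b = -1.0 * x   # dataset B: loss is minimized near w = -1

best_a = ws[np.argmin([mse_loss(w, x, y_a) for w in ws])]
best_b = ws[np.argmin([mse_loss(w, x, y_b) for w in ws])]
print(best_a, best_b)  # the minimum moves because the data changed
```

Same architecture, same optimizer's-eye view of a landscape over w; only the data differs, and the minimum moves with it.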

February 12, 2025 · 3 min · Austin T. O'Quinn

Gradient Descent, Backpropagation, and the Misconceptions That Tripped Me Up

This post starts from the ordinary derivative and builds to gradient descent for neural networks. If you already know multivariable calculus, you can skip to Why the Gradient is Steepest. If you're here for the ML connection, skip to Applied to Machine Learning. But I'd encourage reading the whole thing; several of the "obvious" steps are where my own misconceptions lived.

Starting from One Dimension

The Derivative as a Rate

You have a function $f(x)$. The derivative $f'(x)$ tells you: if you nudge $x$ by a tiny amount $\Delta x$, how much does $f$ change? ...
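The nudge-and-measure reading of the derivative can be checked numerically with a forward difference. This is a toy sketch (the function and names are my own, not from the post):

```python
def f(x):
    return x ** 2

def rate(f, x, dx):
    # Forward-difference approximation: nudge x by dx, measure the change
    # in f, and divide. As dx shrinks, this approaches f'(x).
    return (f(x + dx) - f(x)) / dx

# True derivative of x**2 at x = 3 is f'(3) = 6.
print(rate(f, 3.0, 0.1))     # about 6.1
print(rate(f, 3.0, 0.0001))  # about 6.0001, closer to 6
```

Shrinking $\Delta x$ makes the ratio converge on the true rate, which is exactly the limit definition the post builds from.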

January 15, 2025 · 36 min · Austin T. O'Quinn