
I have just released Part II of the "Thermodynamic Theory of Learning" series. It shows that the difficulty of continual learning and the emergence of catastrophic forgetting can be understood as a consequence of a "critical period closure," in which the learning process itself gradually restricts future adaptability.
In Part I, we formulated learning as an irreversible transport process occurring over finite time and demonstrated that there is a theoretical lower bound on the irreversible cost required to move from one state to another. We termed this bound the Epistemic Speed Limit (ESL), showing that finite-time learning inevitably incurs entropy production.
Part II investigates how this irreversibility constrains future reachability.
Learning can be described as a transport map over parameter distributions. Let Ψ_A denote the transport map corresponding to one stage of learning, and Ψ_B the subsequent stage. The overall learning process is then represented by the composition Ψ_B ∘ Ψ_A.
The Jacobian of this transport map characterizes how infinitesimal perturbations in the current neighborhood of parameters are stretched or contracted as the learning dynamics evolve.
When maps are composed, their Jacobians compose multiplicatively. Since matrix rank cannot increase under composition, and singular values obey submultiplicative bounds, collapsed directions are not generically restored by subsequent learning.
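The two linear-algebra facts invoked here can be checked numerically. The sketch below is my own illustration (the matrices are toy stand-ins, not the paper's actual Jacobians): it builds a rank-deficient Jacobian J_A for the first stage, composes it with a generic J_B, and verifies that the composed rank cannot exceed either factor's rank and that top singular values are submultiplicative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Jacobians of two successive learning stages.
# J_A is constructed rank-deficient: some directions have already collapsed.
J_A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 4))  # rank <= 2
J_B = rng.standard_normal((4, 4))                                # generically full rank

# Jacobian of the composed map Psi_B ∘ Psi_A (chain rule).
J_comp = J_B @ J_A

rank_A = np.linalg.matrix_rank(J_A)
rank_B = np.linalg.matrix_rank(J_B)
rank_comp = np.linalg.matrix_rank(J_comp)

def s_max(M):
    """Largest singular value (operator norm)."""
    return np.linalg.svd(M, compute_uv=False)[0]

# Rank cannot increase under composition ...
assert rank_comp <= min(rank_A, rank_B)
# ... and the top singular value is submultiplicative.
assert s_max(J_comp) <= s_max(J_B) * s_max(J_A) + 1e-12
```

Once J_A has collapsed a direction, no choice of J_B can restore it: the product's rank is capped by the smaller factor, which is the mechanism behind the claim that collapsed directions are not generically recovered.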
This structure implies that as learning progresses, the “dynamically usable degrees of freedom”—that is, the reachable set—monotonically shrink.
To continuously measure this contraction of the reachable set, we introduce the notion of effective rank. This quantity measures the log-volume of directions that can be reconfigured without degrading previously learned tasks, and thus represents the degrees of freedom that remain dynamically accessible under future learning.
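The paper's precise definition may differ, but a common continuous notion of effective rank is the entropy-based one of Roy and Vetterli: the exponential of the Shannon entropy of the normalized singular-value spectrum. A minimal sketch, under that assumption:

```python
import numpy as np

def effective_rank(J):
    """Entropy-based effective rank, exp(H(p)) with p_i = s_i / sum(s).

    Equals the number of nonzero singular values when they are all
    equal, and decreases continuously as the spectrum becomes
    anisotropic -- a smooth proxy for usable degrees of freedom.
    """
    s = np.linalg.svd(J, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))

# Isotropic map: all four directions equally usable.
iso = effective_rank(np.eye(4))  # close to 4.0

# Anisotropic map: one direction dominates. The integer rank is
# still 4, but the effective rank has silently collapsed.
aniso = effective_rank(np.diag([10.0, 0.1, 0.1, 0.1]))
```

The second case illustrates the point in the next paragraph: a map can remain full-rank in the integer sense (and leave current task performance untouched) while its effective rank, and hence future adaptability, has already dropped sharply.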
Importantly, effective rank may decrease even when task performance remains unchanged. In other words, future adaptability can be silently lost independently of current performance.
Furthermore, we formalize the local curvature structure required by a new task in terms of the stable rank of the Hessian. We prove a capacity-threshold theorem: when this required curvature dimension exceeds the remaining effective rank, adaptation without forgetting becomes impossible.
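The stable rank has a standard closed form, ||H||_F^2 / ||H||_2^2, which lower-bounds the matrix rank. The sketch below is a toy illustration of the threshold comparison only (the Hessian and the remaining-capacity number are invented stand-ins, not outputs of the paper's dynamics):

```python
import numpy as np

def stable_rank(H):
    """Stable rank ||H||_F^2 / ||H||_2^2: sum of squared singular
    values over the largest squared singular value."""
    s = np.linalg.svd(H, compute_uv=False)
    return float((s ** 2).sum() / s[0] ** 2)

# Toy Hessian of a hypothetical new task: curvature is spread over
# roughly two to three directions, plus one nearly flat direction.
H_new = np.diag([4.0, 3.0, 3.0, 0.01])
required = stable_rank(H_new)   # about 2.13

# Remaining effective rank after earlier learning stages
# (a stand-in number for illustration).
remaining = 2.0

# Capacity-threshold check in the spirit of the theorem: if the
# required curvature dimension exceeds the remaining effective
# rank, adaptation without forgetting is ruled out.
can_adapt = required <= remaining
```

Here `required` exceeds `remaining`, so `can_adapt` is `False`: the new task demands more reconfigurable directions than the contracted reachable set still offers.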
The key point is not that compatible multi-task solutions fail to exist. Rather, under finite-time non-equilibrium learning dynamics, even if compatible solutions exist in principle, they may become unreachable.
From this perspective, the difficulty of continual learning arises not merely from information loss, but from the irreversible disappearance of reconfigurable directions.
We refer to this phenomenon as critical period closure. As learning progresses and the reachable set contracts, a stage is eventually reached beyond which adapting to new tasks without disrupting existing structure becomes geometrically impossible. The term is inspired by its structural analogy to biological critical periods.
This framework also provides a geometric explanation for why widely used continual learning methods—such as replay buffers and curriculum learning—can be effective. Replay can be interpreted as suppressing reachable-set contraction and mitigating directional collapse. Curriculum learning can be viewed as a strategy to control early-stage anisotropic contraction, thereby preventing premature loss of effective degrees of freedom.
At the same time, because submultiplicative contraction under map composition is fundamentally unavoidable, these methods cannot eliminate irreversibility entirely; they can only delay critical period closure.
This work does not propose a new algorithm but rather analyzes structural constraints inherent in learning dynamics. We hope that this framework will serve as a theoretical foundation for future design principles in continual learning and for reconsidering update rules from a dynamical perspective.