How Overfitting Warps LLM Vectors

August 28, 2025

How overfitting reshapes high-dimensional spaces, collapsing variance, creating hubs, and boosting brittle features that memorize rather than generalize.

You can learn a lot about a language model by peeking at its vectors. When training goes past the sweet spot, the geometry inside those vectors starts acting weird. Think of a dance floor where everyone slowly drifts into the same corner. The music did not change. The space did.

This post stays inside the representation space: token embeddings, the LM head, and the contextual hidden states that a transformer produces for each position.

Diagram illustrating how overfitting warps embedding geometry: logit scale inflation, anisotropy, prototype locking, and hubness.

A 60-second refresher

Probabilities come from the three objects named above: the token embedding matrix $E$ turns input tokens into vectors, the transformer produces a contextual hidden state $h$ for each position, and the LM head $W$ scores the vocabulary via $z_t = W_t \cdot h$ and $p(t \mid h) = \operatorname{softmax}(z)_t$. Overfitting bends the geometry of $E$, of $W$, and of the distribution of $h$.
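
To make the symbols concrete, here is a tiny PyTorch sketch. The sizes and the random tensors are placeholders, and in a real model $h$ comes out of the transformer stack rather than `torch.randn`.

```python
import torch
import torch.nn.functional as F

V, d = 32000, 4096        # hypothetical vocabulary size and hidden width
E = torch.randn(V, d)     # token embedding matrix (input side)
W = torch.randn(V, d)     # LM head (output side; often weight-tied to E)
h = torch.randn(d)        # contextual hidden state for one position

z = W @ h                 # logits: z_t = W_t · h for every token t
p = F.softmax(z, dim=-1)  # p(t | h)
```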


How overfitting warps high-dimensional space


How to see the geometry move

You can spot these shifts without a single leaderboard score. Build a small internal dashboard and compare a shard of training data with a fresh, time-boxed shard.

  1. Covariance spectrum of hidden states: compute eigenvalues of $\operatorname{Cov}(h)$. Track effective rank or participation ratio. Overfitting shows up as swelling top eigenvalues and a shrinking effective rank (see the first sketch below).
  2. Isotropy checks: mean cosine of random hidden state pairs after unit normalization. Rising averages signal crowding into cones. Track the norm of the mean hidden state relative to per-token norms.
  3. Embedding and LM head norm histograms: watch for heavy tails and identify tokens whose columns change rapidly in scale or angle.
  4. Alignment margins: on held-out data measure $\max_t \cos(h, E_t)$ and the margin between the top two logits. Train-only growth is a red flag (see the second sketch below).
  5. Hubness metrics: build k-NN graphs. Count how often a point appears in someone else’s neighbor list. Long-tail growth means hubness is increasing.
  6. Attention and MLP outliers: track attention entropy per head and activation RMS per channel. Persistent outliers usually point to memorization hooks (see the third sketch below).

Tip: always compare the train shard and the fresh shard at the same checkpoint. Geometry that seems fine on train can look obviously distorted on new text.
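
Here is a minimal PyTorch sketch for monitors 1 and 2, assuming the hidden states of a shard have already been collected into a single `[n, d]` tensor `H`; the function names and the `n_pairs` sampling budget are illustrative, not from any particular library.

```python
import torch
import torch.nn.functional as F

def spectrum_stats(H):
    """Monitor 1: H is an [n, d] matrix of hidden states from one shard."""
    Hc = H - H.mean(dim=0, keepdim=True)
    cov = Hc.T @ Hc / (H.shape[0] - 1)
    eig = torch.linalg.eigvalsh(cov).clamp(min=0)
    p = eig / eig.sum()
    effective_rank = torch.exp(-(p * torch.log(p + 1e-12)).sum())   # entropy-based effective rank
    participation_ratio = eig.sum() ** 2 / (eig ** 2).sum()
    return effective_rank.item(), participation_ratio.item()

def isotropy_stats(H, n_pairs=10_000):
    """Monitor 2: mean cosine of random hidden-state pairs and mean-vector norm ratio."""
    Hn = F.normalize(H, dim=-1)
    i = torch.randint(0, H.shape[0], (n_pairs,))
    j = torch.randint(0, H.shape[0], (n_pairs,))
    mean_cos = (Hn[i] * Hn[j]).sum(dim=-1).mean()                   # rises as states crowd into a cone
    mean_norm_ratio = H.mean(dim=0).norm() / H.norm(dim=-1).mean()  # mean vector vs per-token norms
    return mean_cos.item(), mean_norm_ratio.item()
```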

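A companion sketch for monitors 4 and 5, under the same assumptions. Here `E_out` stands for whatever matrix you align against (the LM head `W`, or `E` when the weights are tied), and the dense `n x n` similarity matrix in `hubness_counts` is only meant for modest shard sizes.

```python
import torch
import torch.nn.functional as F

def alignment_margins(H, E_out):
    """Monitor 4: H is [n, d] held-out hidden states, E_out is the [V, d] matrix to align against."""
    max_cos = (F.normalize(H, dim=-1) @ F.normalize(E_out, dim=-1).T).max(dim=-1).values
    top2 = (H @ E_out.T).topk(2, dim=-1).values
    margin = top2[:, 0] - top2[:, 1]                 # gap between the top two logits
    return max_cos, margin

def hubness_counts(H, k=10):
    """Monitor 5: k-occurrence counts, i.e. how often each point lands in other points' k-NN lists."""
    Hn = F.normalize(H, dim=-1)
    sims = Hn @ Hn.T                                 # dense [n, n]; fine for modest n
    sims.fill_diagonal_(float("-inf"))               # exclude self-neighbors
    nn_idx = sims.topk(k, dim=-1).indices
    return torch.bincount(nn_idx.flatten(), minlength=H.shape[0])   # heavy right tail = growing hubness
```
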

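And a last sketch for monitor 6, assuming you have already captured post-softmax attention weights and MLP activations with hooks; the tensor layouts here are assumptions about how you store them.

```python
import torch

def attention_entropy(attn, eps=1e-12):
    """Monitor 6a: attn is [n_heads, n_query, n_key] post-softmax attention; mean entropy per head."""
    return -(attn * (attn + eps).log()).sum(dim=-1).mean(dim=-1)

def activation_rms(acts):
    """Monitor 6b: acts is [n_tokens, d_mlp] MLP activations; RMS per channel flags persistent outliers."""
    return acts.pow(2).mean(dim=0).sqrt()
```
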
If you train embedding encoders for RAG or search

Contrastive learning tries to balance two forces: alignment, which pulls matched query and document pairs close together, and uniformity, which keeps embeddings spread out over the unit sphere.

Overfitting tips the balance toward alignment without uniformity.

Simple monitors: track alignment (how close positive pairs sit) and uniformity (how evenly the embeddings cover the unit sphere) on both a training shard and a fresh shard, and watch the gap between them, as in the sketch below.

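A minimal sketch of the two monitors, using the standard alignment and uniformity definitions from the contrastive-learning literature; it assumes unit-normalized embeddings and uses hypothetical tensor names `q`, `pos`, and `x`.

```python
import torch

def alignment(q, pos):
    """Mean squared distance between matched (query, positive) pairs; lower means tighter alignment."""
    return (q - pos).norm(dim=-1).pow(2).mean()

def uniformity(x, t=2.0):
    """Log of the mean pairwise Gaussian potential; lower means embeddings are more spread out."""
    sq_dists = torch.cdist(x, x).pow(2)
    off_diag = ~torch.eye(x.shape[0], dtype=torch.bool, device=x.device)
    return torch.exp(-t * sq_dists[off_diag]).mean().log()
```

Overfitting tends to show up as alignment improving on the training shard while uniformity stalls or worsens on the fresh one.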

How to shape the space back into health


Closing thought

Overfitting is not just a training loss that dropped a bit too far. It is a quiet rearrangement of angles, spectra, and neighborhoods inside a very large space. Watch the geometry. When the cloud collapses, the outputs might still look confident, but the map they come from has lost its depth.