Sequence Modelling from Markov Chains to GLM-5.2

frederico.wieser@proton.me (Fred Wieser) — Sat, 20 Jun 2026 00:00:00 +0000

Open-weight models such as GLM-5.2 make the gap between closed and open models feel much smaller. The useful way to read the history behind them is not as a list of model names, but as a sequence-modelling story.

A sequence model first chooses a representation, then a dependency graph, then a way to spend compute. Raw text is mapped into tokens $z_1,\ldots,z_T$, tokens become vectors $X\in\mathbb{R}^{T\times d}$, and the model repeatedly mixes information across positions and across channels.
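To make that pipeline concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than how any particular model works: the character-level vocabulary, the random embedding table, the helper names (`tokenize`, `embed`, `mix_positions`, `mix_channels`), and the causal running mean, which stands in for whatever position-mixing rule (attention, recurrence, convolution) a real model would use.

```python
import random

def tokenize(text, vocab):
    """Map raw text to integer token ids z_1..z_T (character-level, an assumption)."""
    return [vocab[ch] for ch in text]

def embed(tokens, table):
    """Look up one d-dimensional vector per token, giving X with shape T x d."""
    return [list(table[z]) for z in tokens]

def mix_positions(X):
    """Causal mean: row t becomes the average of rows 0..t (mixing across positions)."""
    d = len(X[0])
    out, running = [], [0.0] * d
    for t, row in enumerate(X):
        running = [r + x for r, x in zip(running, row)]
        out.append([r / (t + 1) for r in running])
    return out

def mix_channels(X, W):
    """Apply the same d x d matrix W to every row (mixing across channels)."""
    d = len(W)
    return [[sum(row[i] * W[i][j] for i in range(d)) for j in range(d)] for row in X]

text = "abba"
vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}  # {'a': 0, 'b': 1}
d = 3                                                      # embedding width (hypothetical)
rng = random.Random(0)                                     # fixed seed for repeatability
table = [[rng.uniform(-1, 1) for _ in range(d)] for _ in vocab]
W = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d)]

z = tokenize(text, vocab)               # tokens z_1..z_T
X = embed(z, table)                     # T x d matrix of vectors
H = mix_channels(mix_positions(X), W)   # one round of position + channel mixing
```

The dependency graph lives entirely in `mix_positions`: because row t only averages rows 0 to t, changing a later token cannot affect earlier outputs. Swapping that one function for a different mixing rule changes the graph and the compute cost while leaving the representation and the channel mixing untouched.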
