You can violate the plan in the sampler by making an "unreasonable" choice of ne...

HarHarVeryFunny · 2025-10-20T21:03:07 1760994187

Yeah.

Karpathy recently referred to LLMs having more "working memory" than a human, apparently referring to these unchanging internal activations as "memory", but it's an odd sort of "working memory" if you can't actually update it to reflect progress on what you are working on, or update per new information (new unexpected token having been sampled).

sailingparrot · 2025-10-20T21:37:48 1760996268

I think a better mental framework of how those model work is that they keep an history of the state of their "memory" across time.

Where humans have a single evolving state of our memory LLMs have access to all the states of their "memories" across time, and while past state can't be changed, the new state can: This is the current token's hidden state, and to form this new state they look both at the history of previous states as well as the new information (last token having been sample, or external token from RAG or whatnot appended to the context).

This is how progress is stored.

HarHarVeryFunny · 2025-10-20T22:11:05 1760998265

Thanks, that's a useful way to think about it.

Presumably the internal state at any given token position must also be encoding information specific to that position, as well as this evolving/current memory... So, can this be seen in the internal embeddings - are they composed of a position-dependent part that changes a lot between positions, and an evolving memory part that is largely similar between positions only changing slowly?

Are there any papers or talks discussing this ?

sailingparrot · 2025-10-20T22:44:50 1761000290

I don't remember any paper looking at this specific question (thought it might be out there), but in general Anthropic's circuit threads series of article is very good on the broader subject: https://transformer-circuits.pub