Is there some documentation for this? The code is probably the simplest (Not So) Large Language Model implementation possible, but it is not straightforward to understand for developers not familiar with multi-head attention, ReLU FFN, LayerNorm and learned positional embeddings.
This project shares similarities with Minix. Minix is still used at universities as an educational tool for teaching operating system design. Minix is the operating system that taught Linus Torvalds how to design (monolithic) operating systems. Similarly, having students add capabilities to GuppyLM is a good way to learn LLM design.
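For anyone who wants a feel for the four pieces named above before opening the repo, here is a rough standard-library sketch of one transformer block. To be clear, this is not GuppyLM's code: every name (`transformer_block`, `embed`, `rand_matrix`, the `params` keys), the pre-norm layout, the causal mask, the missing LayerNorm gain/bias and the toy sizes are my own assumptions for illustration.

```python
import math
import random

def matmul(a, b):
    # Plain list-of-lists matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    # Element-wise sum, used for the residual connections.
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def layer_norm(x, eps=1e-5):
    # Normalize each token vector to zero mean and (roughly) unit variance.
    # Assumption: learned gain/bias are omitted to keep the sketch short.
    out = []
    for row in x:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out

def relu_ffn(x, w1, w2):
    # Two linear maps with a ReLU in between, applied to each token.
    hidden = [[max(0.0, v) for v in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)

def causal_head(q, k, v):
    # Scaled dot-product attention; token i only attends to tokens 0..i.
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        scores = [sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        weights = softmax(scores)
        out.append([sum(w * v[j][c] for j, w in enumerate(weights))
                    for c in range(len(v[0]))])
    return out

def cols(m, lo, hi):
    # Take columns lo..hi-1 of a matrix (one head's slice).
    return [row[lo:hi] for row in m]

def multi_head_attention(x, p, n_heads):
    q, k, v = matmul(x, p["wq"]), matmul(x, p["wk"]), matmul(x, p["wv"])
    d_head = len(q[0]) // n_heads
    heads = [causal_head(cols(q, h * d_head, (h + 1) * d_head),
                         cols(k, h * d_head, (h + 1) * d_head),
                         cols(v, h * d_head, (h + 1) * d_head))
             for h in range(n_heads)]
    # Concatenate the heads per token, then mix them with the output projection.
    concat = [sum((head[t] for head in heads), []) for t in range(len(x))]
    return matmul(concat, p["wo"])

def transformer_block(x, p, n_heads):
    # Pre-norm residual layout (an assumption, not taken from the repo).
    x = add(x, multi_head_attention(layer_norm(x), p, n_heads))
    return add(x, relu_ffn(layer_norm(x), p["w1"], p["w2"]))

def embed(token_ids, tok_emb, pos_emb):
    # Learned positional embeddings: one trainable vector per position,
    # added to the token embedding.
    return [[t + q for t, q in zip(tok_emb[tid], pos_emb[i])]
            for i, tid in enumerate(token_ids)]

def rand_matrix(rows, n_cols, rng):
    # Stand-in for trained weights: small random values.
    return [[rng.gauss(0, 0.02) for _ in range(n_cols)] for _ in range(rows)]

# Toy sizes, chosen only for the demo.
rng = random.Random(0)
d_model, n_heads, vocab, max_len = 8, 2, 10, 16
tok_emb = rand_matrix(vocab, d_model, rng)
pos_emb = rand_matrix(max_len, d_model, rng)
params = {name: rand_matrix(d_model, d_model, rng) for name in ("wq", "wk", "wv", "wo")}
params["w1"] = rand_matrix(d_model, 4 * d_model, rng)
params["w2"] = rand_matrix(4 * d_model, d_model, rng)

x = embed([3, 1, 4, 1], tok_emb, pos_emb)
out = transformer_block(x, params, n_heads)
print(len(out), len(out[0]))  # 4 8
```

A real model stacks several of these blocks and adds a final projection back to vocabulary logits, with trained weights instead of random ones. Comparing a sketch like this against the actual source is a reasonable first exercise.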
Give the code to an LLM and have a discussion about it.
Does this work? There is no more need for writing high-level docs?
> does this work?
Absolutely. If you loaded this into an agentic coding harness with a decent model, I can practically guarantee it would be able to help you figure out what's going on.
> there is no more need for writing high level docs?
Absolutely not. That would be like exploring a cave without a flashlight, knowing that you could just feel your way around in the dark instead.
Code is not always self-documenting, and can often tell you how it was written, but not why.
> If you loaded this into an agentic coding harness with a decent model, I can practically guarantee it would be able to help you figure out what's going on.
My non-coder but technically savvy boss has been doing this lately to great success. It's nice because I spend less time on it since the model has taken my place for the most part.
> since the model has taken my place for the most part
Hah, you realize the same thing is going on in your boss's head, right? The pie chart of Things-I-Need-stronglikedan-For just shrank a tiny bit...
There are so many blogs and tutorials about this stuff in particular that I wouldn't worry about it being outside the training data distribution for modern LLMs. If you have a scarce topic in some obscure language, I'd be more careful when learning from LLMs.
LLMs can tell you what the code does but not why the developer chose to do it that way.
Also, large codebases are harder to understand. But projects like these are simple to discuss with an LLM.
> LLMs can tell you what the code does but not why the developer chose to do it that way.
Do LLMs not take comments into consideration? (Serious question - I'm just getting into this stuff)
They do (it's just text), if they are there...