While I'm always happy to see more people using open models, I was hoping the "playing" would be a bit more about actually interacting with the models themselves, rather than just running them.
For anyone interested in playing around with the internals of LLMs without needing the hardware to train locally, here are a few projects I've found really fun and educational:
- Implement speculative decoding for two different sized models that share a tokenizer [0]
- Enforce structured outputs through constrained decoding (also a great way to dive deeper into regex parsing)
- Create a novel sampler using entropy or other information about token probabilities
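To make the first idea concrete, here's a minimal sketch of speculative decoding in the greedy setting, with toy stand-in "models" (plain functions over a shared character vocabulary, which is what sharing a tokenizer buys you). All names here are illustrative; real draft/target models would return logits, and the standard algorithm uses rejection sampling rather than exact-match verification:

```python
# Toy target model: deterministically continues a fixed string.
TARGET = list("the quick brown fox")

def target_next(ctx):
    return TARGET[len(ctx)] if len(ctx) < len(TARGET) else None

def draft_next(ctx):
    # Toy draft model: agrees with the target except on vowels,
    # which it guesses wrong on purpose (a deliberate mismatch).
    tok = target_next(ctx)
    if tok is None:
        return None
    return '_' if tok in "aeiou" else tok

def speculative_decode(ctx=(), k=4):
    """Draft k tokens cheaply, verify them against the target model,
    keep the longest agreeing prefix, then take one corrected token
    from the target and repeat."""
    out = list(ctx)
    while True:
        # 1. The small draft model proposes up to k tokens.
        proposal = []
        for _ in range(k):
            tok = draft_next(tuple(out) + tuple(proposal))
            if tok is None:
                break
            proposal.append(tok)
        # 2. The big target model verifies the whole proposal "in one pass".
        for i, tok in enumerate(proposal):
            if target_next(tuple(out) + tuple(proposal[:i])) == tok:
                out.append(tok)
            else:
                break
        # 3. On mismatch (or an exhausted draft), take one target token.
        fix = target_next(tuple(out))
        if fix is None:
            return "".join(out)
        out.append(fix)
```

The payoff in the real version is that step 2 is a single batched forward pass of the expensive model, so accepted draft tokens come almost for free.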
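For the second idea, the core trick is masking the model's next-token scores so that only tokens that keep the output valid are eligible. Here's a tiny trie-style sketch where the "structure" is just a fixed set of allowed strings and the model is a toy scoring function standing in for real logits (all names are made up for illustration):

```python
ALLOWED = ["true", "false", "null"]  # the structured outputs we permit

def allowed_next(prefix):
    """Characters that keep `prefix` a valid prefix of some allowed string."""
    return {s[len(prefix)] for s in ALLOWED
            if s.startswith(prefix) and len(s) > len(prefix)}

def toy_logits(prefix):
    # Toy model: prefers letters late in the alphabet, so an
    # *unconstrained* greedy decoder would just emit 'z' forever.
    return {c: ord(c) for c in "abcdefghijklmnopqrstuvwxyz"}

def constrained_decode(prefix=""):
    while prefix not in ALLOWED:
        mask = allowed_next(prefix)
        scores = toy_logits(prefix)
        # Constrained step: argmax over the allowed tokens only.
        prefix += max(mask, key=lambda c: scores[c])
    return prefix
```

Swapping the fixed string set for a regex or grammar means computing `allowed_next` from automaton states instead of a trie, which is where the regex-parsing rabbit hole starts.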
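And for the third idea, a sampler only needs the next-token probabilities, so it's easy to prototype without any model at all. This sketch uses one illustrative policy (fall back to greedy when entropy is high); the threshold and the policy itself are assumptions to play with, not from any particular paper:

```python
import math
import random

def entropy(probs):
    """Shannon entropy (in nats) of a token -> probability dict."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)

def entropy_sampler(probs, threshold=1.0, rng=random):
    if entropy(probs) > threshold:
        # The model is uncertain: take the argmax rather than
        # amplifying noise by sampling from a flat distribution.
        return max(probs, key=probs.get)
    # The model is confident: sample proportionally to probability.
    toks, weights = zip(*probs.items())
    return rng.choices(toks, weights=weights)[0]
```

Plugging something like this into a local model's generation loop (e.g. as a custom sampling step) and watching how the output character changes is exactly the kind of poking around that's only possible with open weights.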
The real value of open LLMs, at least for me, is that they aren't black boxes: you can open them up and take a look inside. For all the AI hype, it's a bit of a shame that so few people seem to be really messing around with the insides of LLMs.