Yash Narwal turned Andrej Karpathy's LLM lecture into something you can actually click through. The GitHub project, 'how-llms-work,' visualizes the whole pipeline from raw internet text to conversational AI, much like GuppyLM demonstrates how small models learn.

Start with data. Common Crawl indexes billions of web pages, filtered down to roughly 44TB of training data, about 15 trillion tokens. From there, a live tokenizer lets you type text and watch it split into sub-word chunks. Training loss drops in real time. Tweak the temperature slider and see which tokens the model samples differently.
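What the temperature slider does can be sketched in a few lines: temperature divides the model's raw scores (logits) before they're turned into probabilities, so low values sharpen the distribution toward the top token and high values flatten it. This is a generic illustration, not the project's code; the logit values are made up.

```python
import math

def temperature_softmax(logits, temperature):
    """Turn raw next-token scores into probabilities, scaled by
    temperature. temperature < 1 sharpens the distribution
    (near-greedy picks); temperature > 1 flattens it (more
    surprising picks)."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.1]
cold = temperature_softmax(logits, 0.5)  # top token dominates
hot = temperature_softmax(logits, 2.0)   # probabilities spread out
```

At temperature 0.5 the top-scoring token takes most of the probability mass; at 2.0 the other candidates become live options, which is why high-temperature generations read as more creative and more error-prone.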

The payoff is the mental model. After pre-training, you don't get an assistant. You get what the guide calls an "internet simulator": a sophisticated autocomplete that continues patterns scraped from the web. Ask "What is 2+2?" and it might return a math textbook page or a quiz answer key, because that's what was statistically common in its training data. The assistant part comes later, through supervised fine-tuning and RLHF; Nanocode adapts Karpathy's nanochat to train a coding agent on these same principles. This distinction, between pattern matcher and helper, is something most coverage glosses over. Narwal's guide makes it stick.
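The "internet simulator" idea shrinks down to a toy you can run. A bigram model counts which token follows which, then continues text with the statistically most common successor, which is pattern continuation, not answering. This is a deliberately crude sketch (real models use neural networks over billions of parameters); the corpus string and function names are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count which token follows which -- a crude stand-in for
    what pre-training learns at web scale."""
    model = defaultdict(Counter)
    tokens = corpus.split()
    for current, following in zip(tokens, tokens[1:]):
        model[current][following] += 1
    return model

def continue_text(model, start, steps=5):
    """Greedily extend the text with the most common next token.
    The model isn't answering; it's continuing a pattern."""
    out = [start]
    for _ in range(steps):
        successors = model.get(out[-1])
        if not successors:
            break
        out.append(successors.most_common(1)[0][0])
    return " ".join(out)

# A tiny made-up "internet" of quiz-like text.
corpus = "what is 2+2 ? 2+2 = 4 . see answer key : 4"
model = train_bigram(corpus)
```

Feed it "what" and it regurgitates the quiz phrasing it saw, exactly the behavior the guide attributes to a raw pre-trained model before fine-tuning turns it into a helper.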

Built with Next.js and Tailwind. Open source.