
Seb Zuddas 🦄

@SebZuddas
Thinking in Systems. Building Resilience. Engineering Excellence.

Training an LLM from scratch is easier to study when the whole path is in one repo.

Train LLM From Scratch is a PyTorch repository for learning how a transformer language model is built, trained, saved, and used for text generation. It helps you move from “I understand attention on paper” to a runnable training pipeline by pairing model code with data download, preprocessing, config, training, and generation scripts.

Key features:
• Transformer components from scratch – separate PyTorch modules for MLP, attention, transformer blocks, and the final model
• Pile-based data path – scripts download The Pile files and preprocess JSONL.ZST text into tokenized HDF5 datasets
• Configurable training setup – model size, context length, heads, blocks, batch size, learning rate, and file paths live in config.py
• Hardware guidance – README compares common GPUs for 13M and 2B-class training runs
• Generation workflow included – generate_text.py loads trained checkpoints and produces sample text outputs

It’s open-source (MIT license). Link in the reply 👇
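
To make the "configurable training setup" idea concrete, here is a minimal sketch of what a config object covering those settings could look like. This is not the repo's actual config.py: every field name, default value, and file path below is a hypothetical placeholder. The parameter estimate uses the common rule of thumb of roughly 12 × d² weights per transformer block, and assumes a 4x MLP hidden size and an output head tied to the token embedding.

```python
from dataclasses import dataclass


@dataclass
class TrainConfig:
    # Hypothetical names and defaults for illustration only;
    # the real config.py in the repo may look quite different.
    vocab_size: int = 50_304
    context_length: int = 512
    n_embed: int = 384
    n_head: int = 6
    n_blocks: int = 6
    batch_size: int = 32
    learning_rate: float = 3e-4
    train_path: str = "data/train.h5"          # placeholder path
    checkpoint_path: str = "models/model.pt"   # placeholder path

    def head_dim(self) -> int:
        # Each attention head gets an equal slice of the embedding width.
        if self.n_embed % self.n_head != 0:
            raise ValueError("n_embed must be divisible by n_head")
        return self.n_embed // self.n_head

    def approx_params(self) -> int:
        # Rough weight count, ignoring biases and layer norms:
        #   attention: 4 * d^2 (query, key, value, output projections)
        #   MLP with 4x hidden size: 8 * d^2
        per_block = 12 * self.n_embed ** 2
        # Token and learned position embeddings; output head assumed tied.
        embeddings = (self.vocab_size + self.context_length) * self.n_embed
        return self.n_blocks * per_block + embeddings


if __name__ == "__main__":
    cfg = TrainConfig()
    print(f"head_dim={cfg.head_dim()}, approx_params={cfg.approx_params():,}")
```

An estimate like this is only a sanity check on whether a chosen width and depth land near a target size (for example a 13M-class run) before any GPU time is spent.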

.@ylecun’s definition of a world model.



🚨 Approved: 1 Silk Street redevelopment
✅ 86,000 sqm of Grade A workspace
✅ New public plaza by the Barbican
✅ Cultural, retail & community spaces
✅ Greener, lower-carbon design

A major step in delivering a more vibrant, inclusive, 7-day Square Mile. #DestinationCity

Read more here: news.cityoflondon.gov.uk/city-of-london…
