Seb Zuddas
@SebZuddas
Thinking in Systems. Building Resilience. Engineering Excellence.




Training an LLM from scratch is easier to study when the whole path is in one repo.

Train LLM From Scratch is a PyTorch repository for learning how a transformer language model is built, trained, saved, and used for text generation.

It helps you move from "I understand attention on paper" to a runnable training pipeline by pairing model code with data download, preprocessing, config, training, and generation scripts.

Key features:
• Transformer components from scratch: separate PyTorch modules for MLP, attention, transformer blocks, and the final model
• Pile-based data path: scripts download The Pile files and preprocess JSONL.ZST text into tokenized HDF5 datasets
• Configurable training setup: model size, context length, heads, blocks, batch size, learning rate, and file paths live in config.py
• Hardware guidance: README compares common GPUs for 13M and 2B-class training runs
• Generation workflow included: generate_text.py loads trained checkpoints and produces sample text outputs

It's open-source (MIT license). Link in the reply.
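
To make the "configurable training setup" point concrete, here is a minimal sketch of how settings like context length, heads, and blocks can live in one config object and be sanity-checked before training. Everything in it is an assumption for illustration: the class name, field names, and the rough parameter formula are hypothetical and are not taken from the repository's config.py. The estimate assumes a GPT-style model with learned positional embeddings, a 4x MLP expansion, and tied input/output embeddings, and it ignores biases and layer norms.

```python
from dataclasses import dataclass


@dataclass
class TrainConfig:
    """Hypothetical config; field names are illustrative, not the repo's."""
    vocab_size: int
    context_length: int
    d_model: int
    n_heads: int
    n_blocks: int
    batch_size: int = 32
    learning_rate: float = 3e-4

    def validate(self) -> None:
        # Multi-head attention splits d_model evenly across heads.
        if self.d_model % self.n_heads != 0:
            raise ValueError("d_model must be divisible by n_heads")


def approx_param_count(cfg: TrainConfig) -> int:
    """Rough weight count under the assumptions stated above."""
    token_emb = cfg.vocab_size * cfg.d_model
    pos_emb = cfg.context_length * cfg.d_model
    # Per block: 4*d^2 for attention (Q, K, V, output projection)
    # plus 8*d^2 for an MLP with 4x expansion (up and down projections).
    per_block = 12 * cfg.d_model ** 2
    return token_emb + pos_emb + cfg.n_blocks * per_block


if __name__ == "__main__":
    cfg = TrainConfig(vocab_size=1000, context_length=128,
                      d_model=64, n_heads=4, n_blocks=2)
    cfg.validate()
    print(f"approx. parameters: {approx_param_count(cfg):,}")
```

A quick estimate like this is a handy check on whether a chosen width and depth land near the intended model size before any GPU time is spent.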























