

Tilde
@tilderesearch
We build foundational understanding of models to advance the frontier of intelligence.


we let opus 4.7 and gpt 5.5 run on the nanogpt optimizer speedrun: ~10k runs, 14k H200 hours, 23.9B tokens. opus hits 2930, codex 2950, both beating the human baseline of 2990. we cover claude's autonomy failures, codex's high compute usage, and much more: primeintellect.ai/auto-nanogpt


Ok directly relevant for ongoing work (on memorization): avoiding a "huge percentage of neurons to effectively die early in training (…) so that many parameters no longer meaningfully contribute to network outputs". This optimizer is going to see some SYNTH data.







~3/8~ We introduce Nitrobrew to solve these issues. Nitrobrew stems from a very simple observation: the unembedding matrix is low-rank. It consists of two steps:
1. Sending hidden states as a lossless compression of logits
2. A lightweight, chunked online KL divergence implementation
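
To make the two steps concrete, here is a rough NumPy sketch of the idea. It is not Nitrobrew's actual code. It assumes a teacher/student setting where logits are `hidden @ W`, so shipping the `(n, d)` hidden states reproduces the `(n, V)` logits exactly, and it computes KL(teacher || student) one vocabulary chunk at a time with running log-sum-exp accumulators. The function name, argument names, and chunk size are all hypothetical.

```python
import numpy as np

def chunked_kl_from_hidden(h_t, W_t, h_s, W_s, chunk=1024):
    """Per-token KL(teacher || student) without building full (n, V) logits.

    h_t, h_s: (n, d_t), (n, d_s) hidden states (hypothetical names).
    W_t, W_s: (d_t, V), (d_s, V) unembedding matrices.
    Uses KL = sum_v p_t(v) * (t_v - s_v) - lse(t) + lse(s).
    """
    n, vocab = h_t.shape[0], W_t.shape[1]
    m_t = np.full(n, -np.inf)  # running max of teacher logits
    z_t = np.zeros(n)          # running sum of exp(t - m_t)
    a = np.zeros(n)            # running sum of exp(t - m_t) * (t - s)
    m_s = np.full(n, -np.inf)  # running max of student logits
    z_s = np.zeros(n)          # running sum of exp(s - m_s)

    for start in range(0, vocab, chunk):
        # Rebuild only this slice of the logits from the hidden states.
        t = h_t @ W_t[:, start:start + chunk]
        s = h_s @ W_s[:, start:start + chunk]

        # Teacher side: rescale old accumulators to the new running max.
        new_m_t = np.maximum(m_t, t.max(axis=1))
        scale = np.exp(m_t - new_m_t)
        e = np.exp(t - new_m_t[:, None])
        z_t = z_t * scale + e.sum(axis=1)
        a = a * scale + (e * (t - s)).sum(axis=1)
        m_t = new_m_t

        # Student side: only the log-sum-exp is needed.
        new_m_s = np.maximum(m_s, s.max(axis=1))
        z_s = z_s * np.exp(m_s - new_m_s) + np.exp(s - new_m_s[:, None]).sum(axis=1)
        m_s = new_m_s

    lse_t = m_t + np.log(z_t)
    lse_s = m_s + np.log(z_s)
    return a / z_t - lse_t + lse_s
```

Peak memory here scales with `n * chunk` rather than `n * V`, and only the hidden states need to cross the wire. A real implementation would presumably fuse this into a GPU kernel and handle the backward pass, which this sketch leaves out.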





