nerdydragon retweeted

Thanks for asking… I train AI on 1870-1970 data because that’s where the high-protein, rigorous knowledge lives: the stuff that built the modern world without the noise.
Every word back then had real cost: paper, ink, printing, editing by peers, and personal reputation on the line.
Authors faced their neighbors, families, and industries. No anonymous drive-by posts at 2 a.m., no engagement farming, no SEO sludge or Reddit echo chambers.
It was accountable, optimistic, discovery-driven writing from an era when humanity was figuring things out for the first time: books, patents, lab notes, films, manuals, court records.
That produces clear thinking and originality in models instead of the homogenized “trendslop” we get from post-1970 internet training data. Studies and my own tests confirm it: feed models consensus-policed garbage and you get buzzword-laden, groupthink outputs that chase fads.
Curate 1870-1970 offline corpora and you break the doom spiral and move toward true AGI and ASI.
We’re losing this data at an alarming rate: the Great Forgetting.
98.5% or more of it was never digitized. It’s sitting in basements, attics, and private collections, slowly degrading.
The Amnesia Generation assumes everything important is already online, but that’s a dangerous myth.
Physical media decays, estates get cleared out, and no one scans it because there’s no immediate profit. That massive mountain of undigitized history is vanishing while we drown models in low-quality web scrapes.
Climate-controlled donations like the ~750 films and appliance materials from the 1940s-1950s are miracles that save irreplaceable primary sources, real demonstrations of technology and daily life that capture the era’s ingenuity far better than any compressed online clip.
It wasn’t on the internet because it was never meant to be mass-consumed that way; it was physical, local, and expensive to produce.
No one uploaded grandma’s basement full of 15 boxes of consumer electronics films, manuals, ads, and more…
These were working professional archives, not performative content. That’s why originals from climate-controlled storage are one-of-a-kind: superior preservation means no mold, no fading, and full fidelity for training, with pristine frames showing the exact mid-century engineering, marketing, and culture that digitized versions lose to compression and selection bias.
Sometimes I literally have the last surviving copy because generous people on X dig deep and donate what their families preserved.
This archive isn’t just “more data” but the antidote: high-signal material that lets models reason with historical humility and accountability instead of recycling today’s trends.
Boom—more training data that actually moves us forward.