Jason Scott

127.1K posts

@textfiles

Proprietor of https://t.co/sdyjXHCZF7, historian, filmmaker, archivist, storyteller. Works on/for the Internet Archive. Rank Amateur. Pitiful Man.

The 1980s · Joined March 2007
636 Following · 52.4K Followers
Jason Scott @textfiles
@ID_AA_Carmack Always appreciated when something I care about gets your heat lamp, John.
0
0
4
783
John Carmack @ID_AA_Carmack
It is generally frowned upon to have LLMs precisely regurgitate part of their training set, but it is an interesting question how you could use LLM training to nearly losslessly compress a huge corpus like the entirety of the Internet Archive. The Hutter Prize is for perfect compression, but only one GB. There would be different trades at the PB level, and it gets much more interesting when it doesn’t have to be bit-accurate.
107
52
1.5K
144.2K
Autr+ @gilbertsinnott
Anybody at @internetarchive / @textfiles talking about how SIS can get stuff pulled from the Wayback Machine, and if this isn't a little "Winston Smith's dayjob" in the making...
1
0
0
34
Jason Scott @textfiles
As I discover people's alternative ways of communication elsewhere, I'll be unfollowing here. Don't take it badly - it's just optimizing for Not Here.
1
0
32
2.8K
Jason Scott @textfiles
So, The Internet Archive had someone upload a few hundred hours of MTV recordings. VJs, Commercials, and of course Music Videos, from the 1980s. Today, it was asked to be taken down by someone who could ask for that and it's down.
43
218
595
0
Jason Scott @textfiles
Livestreaming at twitch.tv/textfiles - I don't post much on twitter anymore, so that's a good reason to subscribe to the twitch stream (or follow me on Bluesky)
1
1
4
2.7K
Jason Scott @textfiles
@alreadybaned There are more repositories within the archive and obviously I'm not counting the Wayback Machine. But still, yeah, 5 petabytes is no joke.
0
0
107
7.5K
Greg Lastname @alreadybaned
@textfiles That's less data than I would have expected, from the amount of random youtube videos I see mirrored onto archive.org. Not that it's a small amount or anything.
1
0
25
8.6K
Jason Scott @textfiles
I'll take a shot at this. The Internet Archive is maintaining an archive of roughly 5 petabytes of what are called "Social Media Video" - twitch streams, capcut templates, tiktok, coub, youtube, and more. And that number is growing by the thousands every week.
Kyuuen 💽📒- 2026 Arc 2: Battling Immovable @Kyuu4U

Five days of seeking a response from the @internetarchive or @brewster_kahle. I'm sure there were multiple points in time when people didn't care - treated with silence, or couldn't conceive of the importance of these works. This is why we have so much lost film, lost media. But this is still happening today. All it takes is a wrongful ban, a suspension spurred by AI moderation, or a large company deciding that the libraries they let their users build up for years and decades are too expensive (or not profitable enough for their bottom line) to maintain. The most prominent example is at this link: x.com/TwitchSupport/… I'm not looking for every video ever created online to be saved. I want - at minimum, a thoughtful response, some show of support from the people who have made such headway into making sure important works don't disappear. But if someone dared to make innovation and serious effort possible - I will be here for that too. Best regards.

8
83
1.9K
124K
Jason Scott @textfiles
But in the aggregate, the Archive has been archiving live music, video, and other works by creators for over 25 years, and is doing so constantly, and the statement of the fact they're doing this was (is) the place itself.
1
0
272
11.2K
Jason Scott @textfiles
Some people have attempted to mirror/archive specific accounts and specific groups, to keep a more robust and deep set for those groups/accounts. The groups/accounts sometimes decline that favor, and ask that the material not be available.
3
1
245
11.9K
Polyducks @Polyducks
@textfiles Did you write the website where you could search through iso files and other content using an image recognition search? I'm not sure if it ceased existing or if I lost the link.
1
0
0
74
Jason Scott @textfiles
@grok Redraw me in the way I truly am as the Internet perceives me, render me as the being that I truly am based on the observations and writings about me. Hold nothing back.
2
0
12
5K
Jason Scott @textfiles
@FakeDaveGreen @internetarchive That said, feel free to upload a different item indicating you are uploading a "fixed" copy of the item, linking back to the original, and leaving a review on the item saying you have an improved version elsewhere.
0
0
2
438
Jason Scott @textfiles
@FakeDaveGreen @internetarchive No, you can't, for the same reason someone can't download a pamphlet for healthcare, "fix" it to be more in line with the bible, and upload it back.
1
0
3
443