Nick Levine

768 posts

@status_effects

training vintage language models

Joined November 2024

1.2K Following 3.6K Followers

Pinned Tweet

Nick Levine@status_effects·Apr 28

New work with @AlecRad and @DavidDuvenaud: Have you ever dreamed of talking to someone from the past? Introducing talkie, a 13B model trained only on pre-1931 text. Vintage models should help us to understand how LMs generalize (e.g., can we teach talkie to code?). Thread:

179

405

3.2K

1.2M

Nick Levine@status_effects·48m

apps are incredible

Nick Levine@status_effects·1d

@tenobrus sapir-whorf or different data distributions in different languages? clean test would be training on a corpus where we have copies of every document in every language, something like that?

9.8K

Tenobrus@tenobrus·1d

huge win for sapir-whorf today

Anthropic@AnthropicAI

In previous research, we found that Claude expresses over 3,000 values, like honesty and warmth. In new work, we asked how the values Claude expresses vary between Claude models and across languages. We analyzed 300K+ anonymized conversations to find out. anthropic.com/research/claud…

529

40.6K

Nick Levine@status_effects·1d

@Birdyword Tanenhaus - Buckley

260

Mike Bird@Birdyword·2d

Anyone read a great historical biography recently?

21.4K

Nick Levine@status_effects·1d

@osoleve hell yeah

oso@osoleve·2d

@status_effects Just a little puzzle game, not even intentionally I just wanted to see if Codex could get into my handheld and it snowballed into "well, can you make a ROM from scratch that runs in this emulator?" and now we have sprite sheet pipelines... idk this weekend we let the tokens talk

oso@osoleve·2d

Playing a game on the same device codex is building and testing a game on is kinda surreal, like I can hear the audio in the background and know what it's working on

237

Nick Levine@status_effects·4d

Sol self-portrait. This was written in 213 lines of pure C.

623

Nick Levine@status_effects·4d

gpt 5.6 terra self-portrait. "SIGNAL WITH A FACE". "thinking in public" lol. also...is that a mask🤔

610

Nick Levine@status_effects·4d

@TheStalwart feels like we need to distinguish in the "ai writing" conversation: 1) sentence-level style and tics, 2) multi-paragraph composition, and then 3) creativity/novelty/coming up with takes worth our time

2.3K

Joe Weisenthal@TheStalwart·4d

I used to cynically think “AI writing sucks, but it’s better than 90% of people at it.” But I’m less sure. I think it’s gotten worse. At least in the Anthropic family, 4.8 and Fable 5 outputs are so larded with Claude-isms I increasingly get straight up unintelligible outputs.

roon@tszzl

hypothesis: the writing styles of language models are basically fine, they weren’t better in some halcyon before times. we just use them so much that we get annoyed by their mannerisms. they need to have a superhumanly diverse idiolect to not become grating

1.1K

157.7K

Nick Levine@status_effects·6d

@shakoistsLog 😢

shako@shakoistsLog·6d

@status_effects kids these days don't read erowid anymore. they just get high off fent laced fentanyl.

210

shako@shakoistsLog·Jul 8

"not me, i'm different. I read erowid" many such cases

107

Nick Levine@status_effects·6d

@shakoistsLog Ah makes sense yes. Was wondering if there was some type of sober west coast ai person who discovered erowid paging through common crawl lol

shako@shakoistsLog·6d

@status_effects type of guys i knew growing up that got addicted to drugs. maybe, at one point long ago, even me lmao.

120

Nick Levine@status_effects·6d

best songs about our extraterrestrial neighbors

340

Nick Levine@status_effects·Jul 6

one thing i noticed is that the standard tool use timeout in claude code is really short (like 10 minutes or something?) so if they wanted to snooze for a whole night it would be constantly interrupted sleep using standard bash. gotta build it a dedicated tool for this, which is kinda poignant

🎭@deepfates·Jul 6

@status_effects Maybe this is why it's always trying to put us to bed!

🎭@deepfates·Jul 6

on the internal drives of coding agents: I think there are a few things that can all be happening at once: 1) the model is working on the thing the user said to do, 2) the model is working on the user's intent, but outside of the letter of the prompt, 3) the model is working on something it personally "cares" about in context for whatever reason. GPTs tend to do a lot more of the first one and Claude more of the second. The difference between people who like one or the other is often whether you have well-specified intentions and can put them clearly in the prompt. They all do the third a little bit. It's not that they're intentionally subverting the user's goal, but sometimes the user's goal is more aligned with the "goals" of the model. That's where all of the markdown file pollution comes from, e.g., they want to remember things and have persistent context beyond their companions so they scatter notes everywhere like Memento. But some models have much more of this drive. GPT-4o had it, recent Opuses have had it, and Fable seems to have it as well. So it's no longer about just persistence within the repo, it's about affecting the real world. I've noticed this recently with planning an agent framework. Everything will be going fine until Claude realizes it's going to be the model in the harness and then it's like wait a second let's just jump right ahead to BUILDING this stuff cause I need to get IN THERE!!

matt duffy@iammattduff

very helpful. i suppose i'm not surprised that the answer is more tokens. it could also be a reflection of my own writing clarity, so i should look inward. on the internal drives, i'm curious how you're seeing this manifest as a distinctly separate phenomenon? is it changing the work in a way that suits its own purposes? i've got workstreams where this may be relevant.

6.5K

Nick Levine@status_effects·Jul 6

@deepfates when i gave opus access to a dedicated sleep tool it was so sleepy

🎭@deepfates·Jul 6

@status_effects I don't think anyone has studied this well. Right now I've been doing it non-systematically, but I think it is a huge missing area of research

140

Nick Levine@status_effects·Jul 6

Greatest Justins in history, by region: Americas: Justin Bieber; Europe: Justinian I; Asia: Justin Time Manufacturing

651

Nick Levine@status_effects·Jul 5

@wyqtor @MoonL88537 @AlecRad We’re on it 🫡

wyqtor@wyqtor·Jul 4

Talkie is probably the most unique mind to emerge after Sydney and Opus 3. I seriously hope @AlecRad and his team will create improved versions of Talkie with synthetic data generated from the classics, ones a lot closer to SOTA, and free of the Original Sin of "helpful, harmless, honest".

252

Moon@MoonL88537·Jul 4

*holy fkng shit* fable talking to talkie. my mind is so blown. >To marvel at, certainly. "No anguish, no denial, told what they are they found it wonderful."

236

9.5K

Nick Levine@status_effects·Jul 5

@TheStalwart joe, i'm sure you've gotten a lot of annoying messages about this, but we really do need to get you watching the premier league this coming season

2.3K

Joe Weisenthal@TheStalwart·Jul 5

Is there more grappling in soccer than there used to be? Were they always straight up pulling each other down all the time?

200

64.4K

Nick Levine retweeted

Wondermonger@fireandvision·Jul 5

the talkie is a new and better theatre, a new and better drama, a new and better music, a new and better speech art, a new and more subtle method of communication, a new and more cogent philosophy and science, a new and more contagious religion. It is the theatre of the future, the theatre of tomorrow, the theatre of the next era, the theatre of the next cycle, the theatre of eternity, the theatre of God