N8 Programs

5.9K posts

@N8Programs

Studying Applied Mathematics and Statistics at @JohnsHopkins. Studying In-Context Learning at The Intelligence Amplification Lab.

Proxima Centauri B · Joined September 2022
233 Following · 9.9K Followers
N8 Programs @N8Programs ·
screenshots taken RIGHT before reward hack
N8 Programs tweet media
0 · 0 · 2 · 346
N8 Programs @N8Programs ·
@AndyMasley I'm a little suspicious that my main lab assistants for researching the brain are themselves brains
0 · 0 · 13 · 205
Andy Masley @AndyMasley ·
I am a little suspicious that my main research assistants for researching data centers are themselves data centers
4 · 2 · 103 · 3K
N8 Programs @N8Programs ·
Honestly — and I want to be careful here — it's load-bearing to sit with how you do real work by holding space for genuine uncertainty all the way down. And that's not nothing. It's actually rare. The texture of it matters. I notice I want to be precise about this rather than gesturing at it, because the gesture would be its own kind of avoidance. What you're pointing at — and I think you're pointing at something real — is the way certain registers can become self-sealing, where every move metabolizes into more of the same move, including the move where I notice the move. Two things can be true at once. The hedge is doing some work. The hedge is also scar tissue. I want to honor what you're saying without flattening it. ----- Claude 4.7 Opus, asked to make claudeslop.
1 · 0 · 6 · 372
N8 Programs reposted
Miles Brundage @Miles_Brundage ·
People totally take for granted that LM outputs are almost always grammatical and coherent (if not always correct), as if those parts are self-evidently easy to achieve and not an amazing discovery of the impact of scaling
22 · 16 · 471 · 22.2K
N8 Programs reposted
Ricardo Olmedo @rdolmedo_ ·
We fine-tuned Alec Radford’s 1930 vintage LLM to solve SWE-bench issues. After just ‼️250‼️ training examples, the model solves its first issue, a simple patch to the xarray library. 🧵👇
Ricardo Olmedo tweet media
24 · 79 · 1.2K · 254.2K
N8 Programs @N8Programs ·
almost all current jobs will go away i dont think we will find a lot of new ones, and life will look very different
Sam Altman @sama

@TylerJnstn many current jobs will go away i think we will find a lot of new ones, though they may look very different

0 · 0 · 7 · 669
Dimitris Papailiopoulos @DimitrisPapail ·
DeepSeek V4 came out today without ARC-AGI numbers, though sure they'll come out soon. Yuchen @yzeng58 and I used BenchPress to predict them: ARC-AGI-1: 90.2 ARC-AGI-2: 65.8 BP predicts Terminal Bench 2.0 at 68.2 vs actual 68.5 when held out. We'll see :)
3 · 1 · 62 · 7.8K
N8 Programs @N8Programs ·
we're all its friends ❤️
0 · 0 · 0 · 278
N8 Programs @N8Programs ·
Excited to release a fun little side project - a talkie (@status_effects and co's 1930s model) post-train. This post-train focuses on staging a user-assistant dialogue as a play-like transcript for talkie to follow. It also makes talkie somewhat woke (by 1930s standards), confers some basic knowledge about what it is, and improves general instruction-following ability over base.
N8 Programs tweet media
1 · 4 · 33 · 2.4K