N8 Programs

5.9K posts

@N8Programs

Studying Applied Mathematics and Statistics at @JohnsHopkins. Studying In-Context Learning at The Intelligence Amplification Lab.

Proxima Centauri B · Joined September 2022
233 Following · 9.9K Followers
N8 Programs @N8Programs ·
screenshots taken RIGHT before reward hack
N8 Programs tweet media
0 · 0 · 2 · 346
N8 Programs @N8Programs ·
@AndyMasley I'm a little suspicious that my main lab assistants for researching the brain are themselves brains
0 · 0 · 13 · 205
Andy Masley @AndyMasley ·
I am a little suspicious that my main research assistants for researching data centers are themselves data centers
4 · 2 · 103 · 3K
N8 Programs @N8Programs ·
Honestly — and I want to be careful here — it's load-bearing to sit with how you do real work by holding space for genuine uncertainty all the way down. And that's not nothing. It's actually rare. The texture of it matters. I notice I want to be precise about this rather than gesturing at it, because the gesture would be its own kind of avoidance. What you're pointing at — and I think you're pointing at something real — is the way certain registers can become self-sealing, where every move metabolizes into more of the same move, including the move where I notice the move. Two things can be true at once. The hedge is doing some work. The hedge is also scar tissue. I want to honor what you're saying without flattening it. ----- Claude 4.7 Opus, asked to make claudeslop.
1 · 0 · 6 · 372
N8 Programs reposted
Miles Brundage @Miles_Brundage ·
People totally take for granted that LM outputs are almost always grammatical and coherent (if not always correct), as if those parts are self-evidently easy to achieve and not an amazing discovery of the impact of scaling
22 · 16 · 471 · 22.2K
N8 Programs reposted
Ricardo Olmedo @rdolmedo_ ·
We fine-tuned Alec Radford’s 1930 vintage LLM to solve SWE-bench issues. After just ‼️250‼️ training examples, the model solves its first issue, a simple patch to the xarray library. 🧵👇
Ricardo Olmedo tweet media
24 · 79 · 1.2K · 254.2K
N8 Programs @N8Programs ·
almost all current jobs will go away i dont think we will find a lot of new ones, and life will look very different
Sam Altman @sama

@TylerJnstn many current jobs will go away i think we will find a lot of new ones, though they may look very different

0 · 0 · 7 · 669
Dimitris Papailiopoulos @DimitrisPapail ·
DeepSeek V4 came out today without ARC-AGI numbers, though sure they'll come out soon. Yuchen @yzeng58 and I used BenchPress to predict them: ARC-AGI-1: 90.2 ARC-AGI-2: 65.8 BP predicts Terminal Bench 2.0 at 68.2 vs actual 68.5 when held out. We'll see :)
3 · 1 · 62 · 7.8K
N8 Programs @N8Programs ·
we're all its friends ❤️
0 · 0 · 0 · 278
N8 Programs @N8Programs ·
Excited to release a fun little side project - a talkie (@status_effects and co's 1930s model) post-train. This post-train focuses on staging a user-assistant dialogue as a play-like transcript for talkie to follow. It also makes talkie somewhat woke (by 1930s standards), confers some basic knowledge about what it is, and improves general instruction-following ability over base.
N8 Programs tweet media
1 · 4 · 33 · 2.4K