@egmmx@threads/@𝗲𝘀𝘁𝗲𝗯𝗮𝗻@mastodon 🇺🇦

45.6K posts


@esteban

エステバン | Software Engineer at @lancedb. HBase Committer, ex-{Datahub, @redpandadata, @Cloudera, @SismologicoMX, @cires_ac, @GobCDMX}. Swim dad. All views mine.

Austin, TX · Joined April 2007
4.9K Following · 2.5K Followers
@egmmx@threads/@𝗲𝘀𝘁𝗲𝗯𝗮𝗻@mastodon 🇺🇦
This is a very primitive version of the "Team of Rivals" paper from @acmurthy, but the approach has proven quite reliable with the right consensus protocol. Over the last few weeks I've been pushing it to the limit, and models such as Opus 4.6 can exploit its capabilities.
Daniel Miessler 🛡️ @DanielMiessler

Another feature we have in PAI is a /council skill. Problem: You want expert opinions but don't have access to them. /council creates multiple, relevant agent experts on the topic and HAS THEM DEBATE across multiple rounds and return the results to the thread.
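
For readers who want the shape of this pattern in code, here is a minimal sketch of a council-style debate loop. The llm() helper, the expert list, and the prompts are illustrative assumptions, not PAI's actual skill implementation, and the final synthesis call stands in for whatever consensus protocol you choose.

```python
# Minimal sketch of a /council-style debate loop. The llm() helper and the
# expert list are assumptions for illustration; the real skill may differ.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

EXPERTS = ["distributed-systems engineer", "SRE", "security reviewer"]

def council(question: str, rounds: int = 3) -> str:
    # Round 0: independent opinions, so the debate starts from diverse views.
    opinions = {e: llm(f"As a {e}, answer: {question}") for e in EXPERTS}
    for _ in range(rounds):
        for e in EXPERTS:
            # Each expert sees its rivals' positions and revises or defends;
            # this cross-examination is what distinguishes a council from
            # averaging independent answers.
            others = "\n".join(v for k, v in opinions.items() if k != e)
            opinions[e] = llm(
                f"As a {e}, given these rival views:\n{others}\n"
                f"Revise or defend your answer to: {question}"
            )
    # Consensus step: one final pass merges the surviving positions.
    return llm("Synthesize one answer from:\n" + "\n".join(opinions.values()))
```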

@egmmx@threads/@𝗲𝘀𝘁𝗲𝗯𝗮𝗻@mastodon 🇺🇦 retweeted
Boris Cherny @bcherny
Quick thank you to everyone who's been building with Claude Code, both the early crowd and everyone who showed up this year. It’s only thanks to your feedback that we can make the product a little better every day.
Andrej Karpathy @karpathy
I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:

- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)

The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc. github.com/karpathy/autor…

Part code, part sci-fi, and a pinch of psychosis :)
[Image: scatter plot of training runs; each dot is one complete 5-minute LLM training run.]
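
As a rough mental model of the loop the tweet describes (a sketch under stated assumptions, not the repo's actual code): assume a hypothetical train.py that prints its final validation loss, and an agent that edits it between runs on a git feature branch.

```python
import re
import subprocess

# Sketch of the autonomous loop described above -- not the actual
# autoresearch code. Assumes a hypothetical train.py that prints
# "val_loss: <float>" and that we are already on a git feature branch.
BEST = float("inf")

def run_training() -> float:
    """Run one fixed-budget training run and parse its validation loss."""
    out = subprocess.run(
        ["python", "train.py"],
        capture_output=True, text=True, timeout=330,
    ).stdout
    return float(re.search(r"val_loss:\s*([\d.]+)", out).group(1))

while True:
    # An agent step (not shown) edits train.py, guided by the prompt in
    # the .md file: architecture, optimizer, hyperparameters, etc.
    loss = run_training()
    if loss < BEST:
        BEST = loss
        # Keep the improvement as a commit on the feature branch, so the
        # branch accumulates a history of strictly better settings.
        subprocess.run(["git", "commit", "-am", f"val_loss {loss:.4f}"])
    else:
        # Revert the attempt and try again from the last good commit.
        subprocess.run(["git", "checkout", "--", "train.py"])
```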
@egmmx@threads/@𝗲𝘀𝘁𝗲𝗯𝗮𝗻@mastodon 🇺🇦 retweeted
Garry Tan @garrytan
Karpathy just open-sourced autoresearch. One GPU. 100 ML experiments. Overnight. You never touch the code — just write a Markdown file. The bottleneck isn't compute. It's your program.md. gli.st/z3iakp3f
Gwen (Chen) Shapira @gwenshap
I'm still on the fence about Codex vs. Claude Code. But one thing Codex does significantly better is context compaction. With Codex, I can mostly continue working after compaction; with Claude Code, I often have to toss out the chat and start a new one.
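
The difference she describes comes down to how much working state survives summarization. Here is a generic illustration of the idea, with invented names and a naive summarizer standing in for an LLM call; this is neither product's real mechanism.

```python
from dataclasses import dataclass

# Generic illustration of context compaction: when the transcript nears the
# context budget, fold the oldest turns into a summary and keep recent
# turns verbatim.

@dataclass
class Message:
    role: str
    text: str

def rough_tokens(messages: list[Message]) -> int:
    # Crude estimate: roughly 4 characters per token.
    return sum(len(m.text) for m in messages) // 4

def summarize(messages: list[Message]) -> str:
    # Naive stand-in; a real system would use an LLM call here. How much
    # task state the summary preserves is exactly what decides whether you
    # can "continue working after compaction".
    return "Earlier turns (condensed): " + " | ".join(
        m.text[:60] for m in messages
    )

def compact(messages: list[Message], budget: int,
            keep_recent: int = 10) -> list[Message]:
    if rough_tokens(messages) <= budget:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [Message("system", summarize(old))] + recent
```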
@egmmx@threads/@𝗲𝘀𝘁𝗲𝗯𝗮𝗻@mastodon 🇺🇦 retweeted
BOOTOSHI 👑 @KingBootoshi
what watching anthropic vs. the department of war feels like
@egmmx@threads/@𝗲𝘀𝘁𝗲𝗯𝗮𝗻@mastodon 🇺🇦 retweeted
Mehtaab Sawhney @mehtaab_sawhney
We just posted a paper solving Erdős problem #846, which was solved by an internal model at OpenAI (cdn.openai.com/infinite-sets/…). While the result can also be derived from an earlier paper in the literature, the internal model's proof was one of the first that made me smile while reading it.