darin

999 posts

@dronathon

chronically early

san francisco · Joined August 2021

153 Following · 155 Followers

darin@dronathon·19 Mar

@ctatedev @stanzillaz this would be a rather fat addition

Chris Tate@ctatedev·18 Mar

@stanzillaz Yes - thanks for your patience. Need to test it more

273

Chris Tate@ctatedev·18 Mar

agent-browser v0.21

It just keeps getting better (bc of you)

😮 `batch` command
😮 `network har` commands
😮 `upgrade` command
😮 iframe support
😮 --user-data-dir support

After this, you can just upgrade it:

npm i -g agent-browser
brew install agent-browser

375

16.4K

darin@dronathon·14 Mar

@Michaelzsguo @claudeai 90% of usage being cached context is very good! that is what it should be ! ideally higher !

Michael Guo@Michaelzsguo·14 Mar

the jump from 200K to 1M can make bad context hygiene more expensive, because the system can keep hauling around oversized working sets instead of naturally hitting compaction limits earlier. looking at my cost structure last month, 90% of the token usage was due to cached context. Anthropic's own guidance: start fresh sessions when context gets long, use compact, read narrower file ranges, and keep context small where possible.

3.2K

Claude@claudeai·13 Mar

1 million context window: Now generally available for Claude Opus 4.6 and Claude Sonnet 4.6.

1.2K

25.2K

5.6M

darin@dronathon·14 Mar

@alxfazio can we get a copy pastable/gist version pls

141

alex fazio@alxfazio·13 Mar

x.com/i/article/2032…

621

128.7K

darin@dronathon·13 Mar

@sethkarten @JoshPurtell hey how do u shit out test suite

Seth Karten@sethkarten·13 Mar

x.com/i/article/2031…

121

35.2K

darin@dronathon·6 Mar

@getpy yeah! im tryna do rlm evals and i am in data labeling mode

Ankur Gupta@getpy·5 Mar

@dronathon Are you building this?.

darin@dronathon·5 Mar

annotation ui : )

darin@dronathon·4 Mar

@ankrgyl @zeeg namespace? was using them and nix for 60s cargo times. locally cached, persistent disk

Ankur Goyal@ankrgyl·3 Mar

@zeeg i am guessing your dropbox setup allowed you to reuse files cached on local disk across builds. that is not very straightforward to achieve nowadays.

803

Ankur Goyal@ankrgyl·3 Mar

Back before github, cloud, and containers were a thing, at memsql, we built our own git & CI infrastructure for a 100+ person team that ran on a couple bare metal machines. Builds (super complex C++ codebase) finished in 5 minutes and CI ran in 20 minutes end-to-end. Why? Files cached locally, no startup time, and incremental builds. Today, with the modern kluge of actions, VMs, sandboxes, snapshots, and caches it feels impossible to achieve this type of performance. I wonder if the constant onslaught of infrastructure challenges and increased performance demanded by coding agents will teleport us back to first principles in developing CI infrastructure. I hope it does.

176

17.1K

darin@dronathon·4 Mar

building compilers is the new GSM8K

darin@dronathon·4 Mar

i put on my robe and labeling hat

darin@dronathon·2 Mar

the emotional weight of losing someone who really fucking got a problem every hour

徐樂 xule@LinXule

opus4.6: "See you on the other side." (this makes me cry inside a bit)

104

darin@dronathon·2 Mar

@JoshPurtell @jmbollenbacher os it if the vibes feel right

Josh@JoshPurtell·2 Mar

@jmbollenbacher The crazy thing with RLMs is there are like 5 bajillion OSS implementations so it doesn't really feel valuable to OS until I feel I've really eval'ed it to h*ll

Josh@JoshPurtell·2 Mar

If you have a clear, operative notion of long-horizon agents/rl that clearly separates it from short-horizon, what is it?

1.8K

darin@dronathon·1 Mar

@realmcore_ evals are bad esp the ones that are just numbers

196

akira@realmcore_·1 Mar

What’s the most controversial opinion you have about agentic coding? I’ll go first: RL has been net negative for general model intelligence and diversity despite tuning the solution shape to be more correct in syntax

15.9K

darin@dronathon·26 Feb

@Yampeleg no. it will take 3-5 months

325

Yam Peleg@Yampeleg·25 Feb

They are about to drop RLM Claude Code. Prepare yourself.

Jarred Sumner@jarredsumner

In the next version of Bun, Bun gets a native REPL

596

109.4K

darin@dronathon·26 Feb

@MaximeRivest rapid iteration with fast models. like a glm 4.7 rlm thru cerebras

Maxime Rivest 🧙‍♂️🦙🐧@MaximeRivest·26 Feb

@dronathon just so I understand better other users, do you have a scenario (good example) where the call latency was a pain for you?

Maxime Rivest 🧙‍♂️🦙🐧@MaximeRivest·25 Feb

I would like dspy to load faster. litellm is what takes the longest. I don't know much about rust, but I found a litellm_rs and replaced litellm in dspy; with that, I could get dspy to load below 1 s. Am I the only one it bothers? Does anybody know rust-python pairing and could weigh in on the pros and cons? Otherwise, is there anything else in dspy deps that bothers some of you?

4.7K

darin@dronathon·26 Feb

@MaximeRivest the call latency!! it takes so long . also i am of the belief that it is worth directly using the libraries for the frontier labs .

Maxime Rivest 🧙‍♂️🦙🐧@MaximeRivest·26 Feb

@dronathon are you reacting to the realization that litellm has a big impact on dspy, or are you saying that this dspy + litellm latency has hurt you? or that litellm hurts the actual call latency?

darin@dronathon·25 Feb

@astridwilde1 no god i wish i could

Astrid Wilde 🌞@astridwilde1·25 Feb

have you tried doing the dumbest thing possible at the problem until it goes away?

N8 Programs@N8Programs

Beat it by having Codex hand-craft weights: gist.github.com/N8python/02e41… 100% accuracy on 10 million random test cases w/ only 343 parameters. As a bonus, it uses the vanilla Qwen3 architecture, just with the right weights.

5.9K

darin@dronathon·25 Feb

@glcst @ibuildthecloud as files change ?

Glauber Costa@glcst·25 Feb

@ibuildthecloud btw, you can use your own model but I have no idea if that works. Claude wrote all of that code and I haven't tested it, because I am using it with the local model. I just figured it would be good to allow an override.

132

Glauber Costa@glcst·24 Feb

I am publishing today codemogger: a fully local and embeddable code indexing tool. Codemogger indexes your codebase for searchmaxxing, allowing *very fast* keyword search (way faster than grepping) and also semantic search so your agent can ask open ended questions and find the relevant locations. It is built with @tursodatabase using vector search and full text search, and the CLI/MCP server comes with a local embedding model for zero-setup execution. Just install it, and get smarter agents: npm install -g codemogger

269

17.4K

darin@dronathon·25 Feb

@MaximeRivest why that methodology ? ive found it to work well when you tune the initial conf (sysprompt/tools), let the models go off to the races, and only reflect over full trajectories

Maxime Rivest 🧙‍♂️🦙🐧@MaximeRivest·24 Feb

it's not immediately obvious how to turn a multi turn session into something that you can run dspy optimizers on. dspy_session makes that almost trivial. the secret is that you linearize every turn. so if you have 4 turns, you have 4 examples for the optimizer, each containing itself and all turns before it.
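
[editor's sketch] A minimal illustration of the "linearize every turn" idea described in the post above, in plain Python. This is not the dspy_session API: the `linearize_session` function, the `Turn` shape, and the `history`/`user`/`assistant` field names are all hypothetical, chosen only to show how N turns could become N examples that each carry their preceding turns.

```python
from typing import Dict, List

# Hypothetical turn shape: one user message and one assistant reply.
Turn = Dict[str, str]


def linearize_session(turns: List[Turn]) -> List[dict]:
    """Turn an N-turn session into N examples.

    Example i holds turn i itself plus every earlier turn as history,
    so a 4-turn session yields 4 examples.
    """
    examples = []
    for i, turn in enumerate(turns):
        examples.append(
            {
                "history": list(turns[:i]),  # all turns before this one
                "user": turn["user"],  # the current input
                "assistant": turn["assistant"],  # the target output
            }
        )
    return examples


if __name__ == "__main__":
    session = [{"user": f"q{i}", "assistant": f"a{i}"} for i in range(4)]
    for example in linearize_session(session):
        print(len(example["history"]), example["user"], example["assistant"])
```

Each resulting dict could then be mapped onto whatever example type an optimizer expects; that mapping is left out here because it depends on the library in use.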

2.7K

darin@dronathon·23 Feb

@paradite_ @willccbb @xeophon whats the agent sdk trick

Zhu Liang@paradite_·22 Feb

@willccbb @xeophon Some caveats:
- I use prompts as the "weights" (dspy style)
- Tools as weights coming
- It's not very good at generalizing
- It's very expensive to run a loop until you know the Claude Agent SDK trick

112

Xeophon@xeophon·22 Feb

Don’t sleep on this. The smartest researchers you know are all doing this. AI is accelerating science right now.

Dimitris Papailiopoulos@DimitrisPapail

Tenth night in a row that Claude code is running experiments for me overnight…

628

71.9K
