darin

999 posts

@dronathon

chronically early

san francisco · Joined August 2021
153 Following · 155 Followers
Chris Tate
Chris Tate@ctatedev·
@stanzillaz Yes - thanks for your patience. Need to test it more
2
0
2
273
Chris Tate
Chris Tate@ctatedev·
agent-browser v0.21
It just keeps getting better (bc of you)
😮 `batch` command
😮 `network har` commands
😮 `upgrade` command
😮 iframe support
😮 --user-data-dir support
After this, you can just upgrade it:
npm i -g agent-browser
brew install agent-browser
22
10
375
16.4K
darin
darin@dronathon·
@Michaelzsguo @claudeai 90% of usage being cached context is very good! that is what it should be ! ideally higher !
1
0
0
42
Michael Guo
Michael Guo@Michaelzsguo·
the jump from 200K to 1M can make bad context hygiene more expensive, because the system can keep hauling around oversized working sets instead of naturally hitting compaction limits earlier. looking at my cost structure last month, 90% of the token usage was due to cached context. Anthropic’s own guidance: start fresh sessions when context gets long, use compact, read narrower file ranges, and keep context small where possible.
2
1
12
3.2K
Claude
Claude@claudeai·
1 million context window: Now generally available for Claude Opus 4.6 and Claude Sonnet 4.6.
1.2K
2K
25.2K
5.6M
darin
darin@dronathon·
@alxfazio can we get a copy pastable/gist version pls
1
0
1
141
darin
darin@dronathon·
@getpy yeah! im tryna do rlm evals and i am in data labeling mode
1
0
1
12
darin
darin@dronathon·
annotation ui : )
darin tweet media
1
0
1
89
darin
darin@dronathon·
@ankrgyl @zeeg namespace? was using them and nix for 60s cargo times. locally cached, persistent disk
0
0
0
19
Ankur Goyal
Ankur Goyal@ankrgyl·
@zeeg i am guessing your dropbox setup allowed you to reuse files cached on local disk across builds. that is not very straightforward to achieve nowadays.
2
0
0
803
Ankur Goyal
Ankur Goyal@ankrgyl·
Back before github, cloud, and containers were a thing, at memsql, we built our own git & CI infrastructure for a 100+ person team that ran on a couple bare metal machines. Builds (super complex C++ codebase) finished in 5 minutes and CI ran in 20 minutes end-to-end. Why? Files cached locally, no startup time, and incremental builds. Today, with the modern kluge of actions, VMs, sandboxes, snapshots, and caches it feels impossible to achieve this type of performance. I wonder if the constant onslaught of infrastructure challenges and increased performance demanded by coding agents will teleport us back to first principles in developing CI infrastructure. I hope it does.
11
4
176
17.1K
darin
darin@dronathon·
building compilers is the new GSM8K
0
0
1
31
darin
darin@dronathon·
i put on my robe and labeling hat
0
0
0
16
Josh
Josh@JoshPurtell·
@jmbollenbacher The crazy thing with RLMs is there are like 5 bajillion OSS implementations so it doesn't really feel valuable to OS until I feel I've really eval'ed it to h*ll
1
0
1
25
Josh
Josh@JoshPurtell·
If you have a clear, operative notion of long-horizon agents/rl that clearly separates it from short-horizon, what is it?
5
0
14
1.8K
darin
darin@dronathon·
@realmcore_ evals are bad esp the ones that are just numbers
1
0
1
196
akira
akira@realmcore_·
What’s the most controversial opinion you have about agentic coding? I’ll go first: RL has been net negative for general model intelligence and diversity despite tuning the solution shape to be more correct in syntax
16
1
28
15.9K
darin
darin@dronathon·
@Yampeleg no. it will take 3-5 months
0
0
1
325
darin
darin@dronathon·
@MaximeRivest rapid iteration with fast models. like a glm 4.7 rlm thru cerebras
1
0
0
47
Maxime Rivest 🧙‍♂️🦙🐧
I would like dspy to load faster. litellm is what takes the longest. I don't know much about rust, but I found a litellm_rs and replaced litellm in dspy; with that, I could get dspy to load below 1 s. Am I the only one it bothers? Does anybody know rust-python pairing and could weigh in on the pros and cons? Otherwise, is there anything else in dspy deps that bothers some of you?
10
1
70
4.7K
darin
darin@dronathon·
@MaximeRivest the call latency!! it takes so long . also i am of the belief that it is worth directly using the libraries for the frontier labs .
2
0
1
13
Maxime Rivest 🧙‍♂️🦙🐧
@dronathon are you reacting to the realization that litellm has a big impact on dspy, or are you saying that this dspy + litellm latency has hurt you? or that litellm hurts the actual call latency?
1
0
0
52
Glauber Costa
Glauber Costa@glcst·
@ibuildthecloud btw, you can use your own model but I have no idea if that works. Claude wrote all of that code and I haven't tested it, because I am using it with the local model. I just figured it would be good to allow an override.
1
0
1
132
Glauber Costa
Glauber Costa@glcst·
Today I am publishing codemogger: a fully local and embeddable code indexing tool. Codemogger indexes your codebase for searchmaxxing, allowing *very fast* keyword search (way faster than grepping) and also semantic search so your agent can ask open-ended questions and find the relevant locations. It is built with @tursodatabase using vector search and full text search, and the CLI/MCP server comes with a local embedding model for zero-setup execution. Just install it, and get smarter agents:
npm install -g codemogger
44
22
269
17.4K
darin
darin@dronathon·
@MaximeRivest why that methodology ? ive found it to work well when you tune the initial conf (sysprompt/tools), let the models go off to the races, and only reflect over full trajectories
0
0
0
18
Maxime Rivest 🧙‍♂️🦙🐧
it's not immediately obvious how to turn a multi-turn session into something that you can run dspy optimizers on. dspy_session makes that almost trivial. the secret is that you linearize every turn. so if you have 4 turns, you have 4 examples for the optimizer, each containing itself and all turns before it.
4
0
45
2.7K
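The linearization idea in the tweet above can be sketched in a few lines. This is a minimal illustration only, not dspy_session's actual API: the names `Turn`, `Example`, and `linearize` are hypothetical, and the real library may represent history differently.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One exchange in a multi-turn session (hypothetical structure)."""
    user: str
    assistant: str


@dataclass
class Example:
    """One training example: the current turn plus every turn before it."""
    history: list[Turn]
    user: str
    assistant: str


def linearize(session: list[Turn]) -> list[Example]:
    """Turn an N-turn session into N examples, one per turn.

    Example i carries turns 0..i-1 as history and turn i as the target.
    """
    return [
        Example(history=session[:i], user=turn.user, assistant=turn.assistant)
        for i, turn in enumerate(session)
    ]


session = [Turn(f"q{i}", f"a{i}") for i in range(4)]
examples = linearize(session)
```

With a 4-turn session this yields 4 examples; the last one has the first three turns as its history.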
Zhu Liang
Zhu Liang@paradite_·
@willccbb @xeophon Some caveats:
- I use prompts as the "weights" (dspy style)
- Tools as weights coming
- It's not very good at generalizing
- It's very expensive to run a loop until you know the Claude Agent SDK trick
1
0
1
112