Jack Friedson

125 posts

@JackFriedson

building something new · prev infra/product eng @haizelabs, applied AI @datadog

New York · Joined November 2023
481 Following · 81 Followers
Jack Friedson
Jack Friedson@JackFriedson·
another day of wondering whether the efficiency gains from coding agents really justify the amount of time I spend undoing their bullshit design choices
English
2
0
3
184
Jack Friedson
Jack Friedson@JackFriedson·
one must imagine sisyphus happy
Jack Friedson tweet media
English
0
0
0
26
Jack Friedson
Jack Friedson@JackFriedson·
listen I'm not gonna say you shouldn't dunk on someone in a prototypical leopard face eating situation, but don't pretend that doing so is somehow constructive. you're doing it because it feels good. that's it. end of story.
William B. Fuckley@opinonhaver

A million people have rightly dunked on this guy, & I don’t care, I’m going to do it too, bc these people should have their catastrophic and massively consequential failures in judgement shoved in their faces forever. Sometimes a dog doesn’t learn unless you rub its face in it.

English
0
0
1
93
Jack Friedson
Jack Friedson@JackFriedson·
really don't get what all the hype is about, like this is supposed to be hard? I even got one with colors
Jack Friedson tweet media
English
0
0
0
27
Jack Friedson reposted
Pay Roll Manager Here
Pay Roll Manager Here@UsingLyft·
He’s typing in a search bar, quick show him the search option he’s looking for. Perfect. He typed the next letter that is also the next letter in the option we just showed him so take that option away and show him an option that doesn’t match at all
Pay Roll Manager Here tweet media
English
155
1.8K
34.5K
521.7K
Trash Panda 🦝
Trash Panda 🦝@trashpandaemoji·
I'm quite pleased with how Neon Pilot's chat interface is coming out. Some neat features:
- Uses @pierrecomputer's diffs everywhere (amazing library)
- Tries to keep everything in the chat transcript inspect-able, including the system prompt. A lot of harnesses try to hide things. I believe you should be able to inspect everything.
- Tool calls get hidden in a shelf on every turn, except for tools that have things you might want to see (like diffs). These are pinned under the tool shelf. You can still open it up and see everything.
- Tool calls stream in and you can see the tool output but it collapses once it's done. You get the sensation stuff is happening, but it's all tucked away once the agent is done running so you can actually focus on the agent output and your original request.
Trash Panda 🦝 tweet mediaTrash Panda 🦝 tweet mediaTrash Panda 🦝 tweet media
English
1
0
3
576
Trash Panda 🦝
Trash Panda 🦝@trashpandaemoji·
@pierrecomputer Assistant output. Also notice you can fork or rewind to any user/assistant message. I really like being able to fork into a new conversation, use this feature a lot.
Trash Panda 🦝 tweet media
English
1
0
0
62
Jack Friedson
Jack Friedson@JackFriedson·
"You're right to push back" YES I KNOW
English
0
0
1
31
Jack Friedson
Jack Friedson@JackFriedson·
Rip David Lynch you would've loved this
Jack Friedson tweet media
English
0
0
2
58
Jack Friedson
Jack Friedson@JackFriedson·
"Finding the right footing at the very beginning matters more than ever. In a world where it’s cheap to build almost anything, the real edge is choosing what is actually worth building and staying with it long enough to learn something the market doesn’t know yet."
Aditya Agarwal@adityaag

x.com/i/article/2054…

English
0
0
0
73
Jack Friedson
Jack Friedson@JackFriedson·
codex: I did the refactor but kept the previous interface as a shim to avoid import churn
me: I will shoot you with a gun
English
0
0
1
47
Jack Friedson reposted
Lakshya A Agrawal
Lakshya A Agrawal@LakshyAAAgrawal·
Learning from rich textual feedback (errors, traces, partial reasoning) beats scalar reward alone for LLM optimization. GEPA demonstrated this for context-space optimization (prompts and agent harnesses), delivering frontier results at a fraction of the cost of RL. But context-only optimization is bounded by the base model's capability ceiling; weight updates can reach further.
Very excited about this new line of work on Fast-Slow Training (FST), which interleaves context and model weight optimization!
The idea is a clean division of labor between two interleaved loops:
🔹 Fast loop (context): GEPA reads rich rollout feedback updating the context layer. The context becomes a fast-updating scratchpad of what the model needs to know about this task, right now.
🔹 Slow loop (model parameters): RL updates the model's parameters conditioned on the evolving context. Because the prompt already carries task-specific nuances, the model parameters are freed from absorbing them and focus on what actually generalizes across tasks and pushes the frontier.
⦁ 3× more sample-efficient than RL on math, code, and physics reasoning
⦁ ~70% lower KL divergence from base at matched accuracy
⦁ Plasticity preserved: FST checkpoints respond better to additional RL on new tasks than RL-only ones
⦁ Continual learning across changing tasks (HoVer → CodeIO → Physics) where RL stalls the moment the task switches
FST is a direction towards:
⦁ Addressing RL's pain points: entropy collapse, sparse rewards, long-horizon exploration
⦁ Providing a clean channel for rich feedback into weight updates
⦁ Demonstrating model-harness co-evolution
⦁ Discovery: Using fast context updates for broad exploration, while leveraging a continually improving model.
Check out the full thread below:
Kusha Sareen@KushaSareen

Can LLMs adapt continually without losing base skills? Fast-Slow Training (FST) pairs "slow" weights with "fast" context.
FST vs. RL:
• 3x more sample-efficient
• Higher performance ceiling
• Less KL drift (better plasticity)
• Continual learning: succeeds where RL stalls

English
13
43
186
33.1K
Jack Friedson
Jack Friedson@JackFriedson·
Great example of why you should
1. Do the tiniest bit of diligence to check if the tweet you're quoting to shill your product is in fact a real example of the problem your product solves or if it's just an elaborate joke
2. That's it, just the first one
Brace@BraceSproul

Great example of why you should
1. Run your agent on a separate machine from the sandbox it uses (e.g. sandbox as a tool)
2. Never set env vars in your sandbox. Instead, use something like LangSmith’s sandbox proxy auth (reqs are intercepted as they leave the sandbox and secrets are injected, that way the secret never enters the sandbox)

English
0
0
0
49
Jack Friedson
Jack Friedson@JackFriedson·
as someone who is (boldly, courageously) building the exact same software factory app as everyone else on this website, I tend to agree but I think there's some nuance. namely that building a bespoke harness can teach you how to use existing ones much more effectively. like if I wrote a custom rdbms I probably wouldn't use it in production, but I'd get a hell of a lot better at using postgres
but yeah if the thing you're optimizing for is doing your current work faster or hitting RSI escape velocity before big labs, then building a bespoke harness is not going to get you there (the only notable exception to this is, of course, me, who is special and different)
English
0
0
0
40
Tenobrus
Tenobrus@tenobrus·
this doesn't feel rational to me at all. ime gains from "agentic coding techniques" or scaffolding or orchestrators are *incredibly diminishing*, whatever u settle on the first week is ~as good as its gonna get. and the actual productivity increases come from new model releases, plus first party or well funded third party scaffoldings often timed pretty close to those releases. i think the rational thing to do is mostly to just actually work on shit with the tools you have, not try to build your own half functioning software factory that doesn't actually make you much more productive, and after two more model iterations just start using whatever the standard swarm scaffolds are that are actually battle-tested. people who spent the last three months making their own hypertuned custom openclaw setups to respond to every inbound email with a PR that gets shipped to prod didn't "build a factory", they kinda just wasted their time
Justin Murphy@jmrphy

The big problem with getting any kind of substantive work done right now is that it feels so much more rational to work on the factory that produces the substantive work, instead of the work itself. It’s like the woodsman who was once asked: “What would you do if you had just five minutes to chop down a tree?" The woodsman: "I would spend the first two and a half minutes sharpening my axe.” Well, what if you only had 30 years left to achieve greatness? "You should probably spend 15 years building the most effectively calibrated and defensible, recursively self-improving systems for the production of great work.” AI psychosis or a perfectly rational strategy?

English
29
9
255
16.1K