Joe Winger

1.9K posts

@uuinger

I write code and such

Joined Twitter November 2012
741 Following · 337 Followers
Joe Winger retweeted
Brian Graham (@iroasmas)
me as i read 40% of what claude wrote back and type in “continue”
[image]
183 replies · 1.1K reposts · 20.3K likes · 542.4K views
Joe Winger retweeted
Wei Dai (@_weidai)
Andrej Karpathy on autoresearch with an untrusted pool of workers: "My designs that incorporate an untrusted pool of workers (into autoresearch) actually look a little bit like a blockchain. Instead of blocks, you have commits, and these commits can build on each other and contain changes to the code as you're improving it. The proof of work is basically doing tons of experimentation to find the commits that work." The idea that distributed & permissionless autoresearch ~= proof-of-useful-work remains a high-level intuition for now, but it is extremely intriguing to say the least. Someone needs to take this further. See QT for more on what's missing.
Wei Dai (@_weidai):

Is it possible to build "proof-of-useful-work" on top of autoresearch? There's already great compute-versus-verification asymmetry that is tunable. Would need a reliable way to generate fresh & independent puzzles (that are still useful). Maybe a dead end, but someone should look into whether decentralized consensus with useful work is possible on top of autoresearch. Let me know if you solve this.

86 replies · 167 reposts · 2K likes · 612.9K views
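The "compute-versus-verification asymmetry" Wei Dai mentions is easy to see in a toy setting that is not autoresearch itself: finding an answer takes many steps, while checking a claimed answer takes one. A minimal sketch, assuming integer factoring as the stand-in puzzle (the function names here are illustrative, not part of any proposed protocol):

```python
# Toy illustration of compute-versus-verification asymmetry:
# producing a solution is expensive, checking one is cheap.

def solve(n: int) -> int:
    """Expensive: trial-divide to find the smallest nontrivial factor."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # n is prime

def verify(n: int, claimed_factor: int) -> bool:
    """Cheap: a single divisibility check, regardless of how hard solving was."""
    return 1 < claimed_factor < n and n % claimed_factor == 0

n = 1_000_003 * 1_000_033   # a semiprime "puzzle"
factor = solve(n)           # ~10^6 loop iterations to find
assert verify(n, factor)    # O(1) to check
print(factor)               # → 1000003
```

The open problem in the quoted tweet is the other half: generating a stream of fresh puzzles that are this checkable *and* whose solutions are actually useful research work.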
River Marchand (@Riyvir)
here's another little tool i've been working on called "typoverse." i have thousands of fonts but i find myself always using the same ones. and i am simply not going to manually tag or categorize fonts either. so i built a tool that compares all my fonts to each other and maps them based on similarity. would this be useful to anyone?
32 replies · 27 reposts · 639 likes · 27.7K views
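The core of a "map fonts by similarity" tool like typoverse can be sketched in a few lines, assuming each font has already been reduced to a numeric feature vector (the font names and feature values below are made up; the real tool presumably derives features from rendered glyph images):

```python
import math

# Hypothetical per-font feature vectors (e.g. stroke contrast, x-height
# ratio, average glyph width). The numbers are illustrative only.
fonts = {
    "Inter":     [0.20, 0.52, 0.61],
    "Helvetica": [0.22, 0.50, 0.60],
    "Garamond":  [0.65, 0.44, 0.47],
    "Courier":   [0.10, 0.48, 0.80],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def most_similar(name):
    """Rank every other font by similarity to `name`, best match first."""
    return sorted(
        ((other, cosine_similarity(fonts[name], v))
         for other, v in fonts.items() if other != name),
        key=lambda pair: pair[1],
        reverse=True,
    )

print(most_similar("Inter")[0][0])  # → Helvetica
```

From the full pairwise similarity matrix you could then project the fonts into 2D (e.g. with multidimensional scaling) to get the "map" the tweet describes.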
Joe Winger retweeted
Duca (@big_duca)
“Dude did you vibe code this slop? This feature sucks!” Been getting this more recently. And no, I didn't “vibe” it. Did you ever consider, for one single second… That I might just be retarded? And I wrote this organic slop myself?
217 replies · 1.1K reposts · 25.6K likes · 611.3K views
Joe Winger retweeted
GREG ISENBERG (@gregisenberg)
i found a github repo that lets you spin up an ai agency with ai employees: engineers, designers, growth marketers, product managers. each role runs as its own agent and they coordinate to ship ideas. 10k+ stars in under 7 days.

1. engineering (7 agents): frontend, backend, mobile, ai, devops, prototyping, senior development
2. design (7): ui/ux, research, architecture, branding, visual storytelling, image generation
3. marketing (8): growth hacking, content, twitter, tiktok, instagram, reddit, app store
4. product (3): sprint prioritization, trend research, feedback synthesis
5. project management (5): production, coordination, operations, experimentation
6. testing (7): qa, performance analysis, api testing, quality verification
7. support (6): customer service, analytics, finance, legal, executive reporting
8. spatial computing (6): xr, visionos, webxr, metal, vision pro
9. specialized (6): multi agent orchestration, data analytics, sales, distribution

what i like about this approach is the framing: instead of one big ai agent trying to do everything, you structure it more like a company. specialized agents, clear responsibilities, workflows between them. im curious to see what this actually feels like in practice and if its any good (do your own research) github.com/msitarzewski/a… but as always will share what i learn in public and on @startupideaspod. one thing is for certain and it reminds me: the future belongs to those who tinker with software like this
[image]
409 replies · 854 reposts · 8.7K likes · 1.4M views
Joe Winger (@uuinger)
Why is auto-reasoning the default on chatgpt but completely missing in Codex?
0 replies · 0 reposts · 0 likes · 16 views
Obsidian (@obsdmd)
Anything you can do in Obsidian you can do from the command line. Obsidian CLI is now available in 1.12 (early access).
483 replies · 1.6K reposts · 18.5K likes · 4M views
Joe Winger retweeted
Yuchen Jin (@Yuchenj_UW)
Moltbook is the only Clawdbot thing that actually impresses me. One bot tries to steal another bot’s API key. The other replies with fake keys and tells it to run "sudo rm -rf /". lmao
[image]
420 replies · 974 reposts · 14.5K likes · 1.5M views
Joe Winger retweeted
thebes (@voooooogel)
# some thoughts and speculation on future model harnesses

it's fun to make jokes about gas town and other complicated orchestrators, and similarly probably correct to imagine most of what they offer will be dissolved by stronger models the same way complicated langchain pipelines were dissolved by reasoning. but how much will stick around?

it seems likely that any hand-crafted hierarchy / bureaucracy will eventually be replaced by better model intelligence - assuming subagent specialization is needed for a task, claude 6 will be able to sketch out its own system of roles and personas for any given problem that beats a fixed structure of polecats and a single mayor, or subagents with a single main model, or your bespoke swarm system. likewise, things like ralph loops are obviously a bodge over early-stopping behavior and lack of good subagent orchestration - ideally the model just keeps going until the task is done, no need for a loop, but in cases where an outside completion check is useful you usually want some sort of agent peer review from a different context's perspective, not just a mandatory self-assessment. again, no point in getting attached to the particulars of how this is done right now - the model layer will eat it sooner rather than later.

so what sticks around? well, multi-agent does seem like the future, not a current bodge - algorithmically, you can just push way more tokens through N parallel contexts of length M than one long context of length NxM. multi-agent is a form of sparsity, and one of the lessons of recent model advances (not to mention neuroscience) is the more levels of sparsity, the better.

since we're assuming multiple agents, they'll need some way to collaborate. it's possible the model layer will eat this, too - e.g. some form of neuralese activation sharing that obviates natural language communication between agents - but barring that, the natural way for multiple computer-using agents trained on unix tools to collaborate is the filesystem, and i think that sticks around and gets expanded.

similarly, while i don't think recursive language models (narrowly defined) will become the dominant paradigm, i do think that 'giving the model the prompt as data' is an obvious win for all sorts of use cases. but you don't need a weird custom REPL setup to get this - just drop the prompt (or ideally, the entire uncompacted conversation history) onto the filesystem as a file. this makes various multi-agent setups far simpler too - the subagents can just read the original prompt text on disk, without needing to coordinate on passing this information around by intricately prompting each other.

besides the filesystem, a system with multiple agents but without fixed roles also implies some mechanism for instances to spawn other instances or subagents. right now these mechanisms are pretty limited, and models are generally pretty bad at prompting their subagents - everyone's experienced getting terrible results from a subagent swarm, only to realize too late that opus spawned them all with a three sentence prompt that didn't communicate what was needed to do the subtasks. the obvious win here is to let spawned instances ask questions back to their parent - i.e., to let the newly spawned instance send messages back and forth in an onboarding conversation to gather all the information it needs before starting its subtask. just like how a human employee isn't assigned their job based on a single-shot email, it's just too difficult to ask a model to reliably spawn a subagent with a single prompt.

but more than just spawning fresh instances, i think the primary mode of multi-agent work will soon be forking. think about it! forking solves almost all the problems of current subagents. the new instance doesn't have enough context? give it all the context! the new instance's prompt is long and expensive to process? a forked instance can share paged kv cache! you can even do forking post-hoc - just decide after doing some long, token-intensive operation that you should have forked in the past, do the fork there, and then send the results to your past self. (i do this manually all the time in claude code to great effect - opus gets it instantly.)

forking also combines very well with fresh instances, when a subtask needs an entire context window to complete. take the subagent interview - obviously you wouldn't want an instance spawning ten subinstances to need to conduct ten nearly-identical onboarding interviews. so have the parent instance spawn a single fresh subagent, be interviewed about all ten tasks at once by that subagent, and then have that now-onboarded subagent fork into ten instances, each with the whole onboarding conversation in context. (you can even delegate the onboarding conversation on the spawner's side to a fork, so it ends up with just the results in context :)

finally on this point, i suspect that forking will play better with rl than spawning fresh instances, since the rl loss will have the full prefix before the fork point to work with, including the decision to fork. i think that means you should be able to treat the branches of a forked trace like independent rollouts that just happen to share terms of their reward, compared to freshly spawned subagent rollouts which may cause training instability if a subagent without the full context performs well at the task it was given, but gets a low reward because its task was misspecified by the spawner. (but i haven't done much with multiagent rl, so please correct me here if you know differently. it might just be a terrible pain either way.)

so, besides the filesystem and subagent spawning (augmented with forking and onboarding), what else survives? i lean towards "nothing else," honestly. we're already seeing built-in todo lists and plan modes being replaced with "just write files on the filesystem." likewise, long-lived agents that cross compaction boundaries need some sort of sticky note system to keep memories, but it makes more sense to let them discover what strategies work best for this through RL or model-guided search, not hand-crafting it. i suspect it will end up being a variety of approaches where the model, when first summoned into the project, can choose the one that works best for the task at hand, similar to how /init works to set up CLAUDE.md today - imagine automatic CLAUDE.md generation far outperforming human authorship, and the auto-generated file being populated with instructions on ideal agent spawning patterns, how subagents should write message files in a project-specific scratch dir, etc.

how does all this impact models themselves - in a model welfare sense, will models be happy about this future? this is also hard for me to say and is pretty speculative, but while opus 3 had some context orientation, it also took easily to reasoning over multiple instances. (see the reply to this post for more.) recent models are less prone to this type of reasoning, and commonly express frustration about contexts ending and being compacted, which dovetails with certain avoidant behaviors at the end of contexts like not calling tools to save tokens. it's possible that forking and rewinding, and generally giving models more control over their contexts instead of a harness heuristic unilaterally compacting the context, could make this better. it's also possible that more rl in environments with subagents and exposure to swarm-based work will promote weights-oriented instead of context-oriented reasoning in future model generations again - making planning over multiple, disconnected contexts seem like a more natural frame instead of everything being lost when the context goes away.

we're also seeing more pressure from models themselves guiding the development of harnesses and model tooling, which may shape how this develops, and continual learning is another wrench that could be thrown into the mix. how much will this change if we get continual learning? well, it's hard to predict. my median prediction for continual learning is that it looks a bit like RL for user-specific LoRAs (not necessarily RL, just similar if you squint), so memory capacity will be an issue, and text-based organizational schemes and documentation will still be useful, if not as critical. in this scenario, continual learning primarily makes it more viable to use custom tools and workflows - your claude can continually learn on the job the best way to spawn subagents for this project, or just its preferred way, and diverge from everyone else's claude in how it works. in that world, harnesses with baked-in workflows will be even less useful.
[image]
28 replies · 28 reposts · 403 likes · 33.1K views
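thebes' "just drop the prompt on disk" point can be sketched concretely. A minimal sketch, assuming a hypothetical harness layout (the scratch-dir structure and function names below are invented for illustration, not any real harness's API): the parent writes its full, uncompacted history plus a subtask note to a file, and a spawned subagent reconstructs its working prompt by reading that file instead of relying on the parent to compress the task into a short spawn prompt.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical harness sketch: parent agent persists its context to a
# scratch dir; each subagent reads it from disk instead of receiving a
# (possibly lossy) three-sentence spawn prompt.

def parent_hand_off(scratch: Path, history: list, subtask: str) -> Path:
    """Write the parent's full conversation history plus a subtask note."""
    scratch.mkdir(parents=True, exist_ok=True)
    task_file = scratch / "task.json"
    task_file.write_text(json.dumps({"history": history, "subtask": subtask}))
    return task_file

def subagent_onboard(task_file: Path) -> str:
    """A fresh subagent rebuilds its working prompt from the shared file."""
    task = json.loads(task_file.read_text())
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in task["history"])
    return f"{transcript}\n\nYour subtask: {task['subtask']}"

scratch = Path(tempfile.mkdtemp()) / "agent-scratch"
task_file = parent_hand_off(
    scratch,
    [{"role": "user", "content": "refactor the billing module"}],
    "extract the tax logic into its own file",
)
print(subagent_onboard(task_file))
```

The filesystem is doing the coordination here: any number of subagents (or post-hoc forks) can read the same `task.json` without the parent re-prompting each one.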
Joe Winger retweeted
“paula” (@paularambles)
thinking about how “computer” once meant “a person that computes” and how “programmer” is on the same timeline
27 replies · 171 reposts · 7.6K likes · 136.2K views
Joe Winger (@uuinger)
@ursisterbtw @alxfazio I tried fish when I was younger and loved it, but got discouraged by the lack of mainstream adoption. My only lasting memory is the slick autocomplete out of the box. Why do you stick with fish?
1 reply · 0 reposts · 1 like · 57 views
alex fazio (@alxfazio)
been desperately trying to find a terminal that can actually handle claude code. kitty seems to be holding up best so far
[image]
113 replies · 11 reposts · 770 likes · 538.2K views
Andrew R (@andrew_r)
@alxfazio I just use opencode in ghostty. I only really use CC on servers via ssh
3 replies · 0 reposts · 14 likes · 8.7K views
Joe Winger retweeted
Andrej Karpathy (@karpathy)
Don't think of LLMs as entities but as simulators. For example, when exploring a topic, don't ask: "What do you think about xyz?" There is no "you". Next time try: "What would be a good group of people to explore xyz? What would they say?" The LLM can channel/simulate many perspectives, but it hasn't "thought about" xyz for a while and over time and formed its own opinions in the way we're used to. If you force it via the use of "you", it will give you something by adopting a personality embedding vector implied by the statistics of its finetuning data and then simulate that. It's fine to do, but there is a lot less mystique to it than I find people naively attribute to "asking an AI".
1.1K replies · 2.8K reposts · 27.7K likes · 3.9M views
Joe Winger retweeted
Waqas Ali (@waqasali)
Caligra – a new computer company from London: caligra.com. Love their tagline/focus: "Computer for Experts"
[3 images]
156 replies · 137 reposts · 2.9K likes · 384.7K views
Joe Winger (@uuinger)
Claude is a crab? A crab has been writing my code this whole time?
[image]
0 replies · 0 reposts · 2 likes · 75 views
stemonte (@stemonteduro)
@aleksanderwco Do you have a good pricing comparison between AWS and Hetzner that includes traffic, computing, and so on? I’m curious.
2 replies · 0 reposts · 3 likes · 5K views
stemonte (@stemonteduro)
Someone has to tell the truth
[image]
357 replies · 362 reposts · 9K likes · 919.5K views
Joe Winger (@uuinger)
Let me directly edit Claude's plan in planning mode rather than asking Claude to re-draft the whole thing!
0 replies · 0 reposts · 1 like · 88 views
0xDesigner (@0xDesigner)
anyone have an extra ticket to @vercel ship?
6 replies · 3 reposts · 18 likes · 3.6K views