Joe Winger

1.9K posts

@uuinger

I write code and such

Joined Twitter November 2012
741 Following · 337 Followers
Joe Winger retweeted
Brian Graham (@iroasmas)
me as i read 40% of what claude wrote back and type in “continue”
[image]
183 replies · 1.1K reposts · 20.3K likes · 542.4K views
Joe Winger retweeted
Wei Dai (@_weidai)
Andrej Karpathy on autoresearch with an untrusted pool of workers: "My designs that incorporate an untrusted pool of workers (into autoresearch) actually look a little bit like a blockchain. Instead of blocks, you have commits, and these commits can build on each other and contain changes to the code as you're improving it. The proof of work is basically doing tons of experimentation to find the commits that work." The idea that distributed & permissionless autoresearch ~= proof-of-useful-work remains a high-level intuition for now, but it is extremely intriguing to say the least. Someone needs to take this further. See QT for more on what's missing.
Wei Dai (@_weidai):

Is it possible to build "proof-of-useful-work" on top of autoresearch? There's already great compute-versus-verification asymmetry that is tunable. Would need a reliable way to generate fresh & independent puzzles (that are still useful). Maybe a dead end, but someone should look into whether decentralized consensus with useful work is possible on top of autoresearch. Let me know if you solve this.

86 replies · 167 reposts · 2K likes · 612.9K views
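The "compute-versus-verification asymmetry" Wei Dai mentions is easy to see in a toy setting that is not autoresearch itself: finding an answer takes many steps, while checking a claimed answer takes one. A minimal sketch, assuming integer factoring as the stand-in puzzle (the function names here are illustrative, not part of any proposed protocol):

```python
# Toy illustration of compute-versus-verification asymmetry:
# producing a solution is expensive, checking one is cheap.

def solve(n: int) -> int:
    """Expensive: trial-divide to find the smallest nontrivial factor."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # n is prime

def verify(n: int, claimed_factor: int) -> bool:
    """Cheap: a single divisibility check, regardless of how hard solving was."""
    return 1 < claimed_factor < n and n % claimed_factor == 0

n = 1_000_003 * 1_000_033   # a semiprime "puzzle"
factor = solve(n)           # ~10^6 loop iterations to find
assert verify(n, factor)    # O(1) to check
print(factor)               # → 1000003
```

The open problem in the quoted tweet is the other half: generating a stream of fresh puzzles that are this checkable *and* whose solutions are actually useful research work.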
River Marchand (@Riyvir)
here's another little tool i've been working on called "typoverse." i have thousands of fonts but i find myself always using the same ones. and i am simply not going to manually tag or categorize fonts either. so i built a tool that compares all my fonts to each other and maps them based on similarity. would this be useful to anyone?
32 replies · 27 reposts · 639 likes · 27.7K views
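The core of a "map fonts by similarity" tool like typoverse can be sketched in a few lines, assuming each font has already been reduced to a numeric feature vector (the font names and feature values below are made up; the real tool presumably derives features from rendered glyph images):

```python
import math

# Hypothetical per-font feature vectors (e.g. stroke contrast, x-height
# ratio, average glyph width). The numbers are illustrative only.
fonts = {
    "Inter":     [0.20, 0.52, 0.61],
    "Helvetica": [0.22, 0.50, 0.60],
    "Garamond":  [0.65, 0.44, 0.47],
    "Courier":   [0.10, 0.48, 0.80],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def most_similar(name):
    """Rank every other font by similarity to `name`, best match first."""
    return sorted(
        ((other, cosine_similarity(fonts[name], v))
         for other, v in fonts.items() if other != name),
        key=lambda pair: pair[1],
        reverse=True,
    )

print(most_similar("Inter")[0][0])  # → Helvetica
```

From the full pairwise similarity matrix you could then project the fonts into 2D (e.g. with multidimensional scaling) to get the "map" the tweet describes.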
Joe Winger retweeted
Duca (@big_duca)
“Dude did you vibe code this slop? This feature sucks!” Been getting this more recently. And no, I didn't “vibe” it. Did you ever consider, for one single second… That I might just be retarded? And I wrote this organic slop myself?
217 replies · 1.1K reposts · 25.6K likes · 611.3K views
Joe Winger retweeted
GREG ISENBERG (@gregisenberg)
i found a github repo that lets you spin up an ai agency with ai employees: engineers, designers, growth marketers, product managers. each role runs as its own agent and they coordinate to ship ideas. 10k+ stars in under 7 days.

1. engineering (7 agents): frontend, backend, mobile, ai, devops, prototyping, senior development
2. design (7): ui/ux, research, architecture, branding, visual storytelling, image generation
3. marketing (8): growth hacking, content, twitter, tiktok, instagram, reddit, app store
4. product (3): sprint prioritization, trend research, feedback synthesis
5. project management (5): production, coordination, operations, experimentation
6. testing (7): qa, performance analysis, api testing, quality verification
7. support (6): customer service, analytics, finance, legal, executive reporting
8. spatial computing (6): xr, visionos, webxr, metal, vision pro
9. specialized (6): multi agent orchestration, data analytics, sales, distribution

what i like about this approach is the framing: instead of one big ai agent trying to do everything, you structure it more like a company. specialized agents, clear responsibilities, workflows between them. im curious to see what this actually feels like in practice and if its any good (do your own research) github.com/msitarzewski/a… but as always will share what i learn in public and on @startupideaspod. one thing is for certain and it reminds me: the future belongs to those who tinker with software like this
[image]
409 replies · 854 reposts · 8.7K likes · 1.4M views
Joe Winger (@uuinger)
Why is auto-reasoning the default on chatgpt but completely missing in Codex?
0 replies · 0 reposts · 0 likes · 16 views
Obsidian (@obsdmd)
Anything you can do in Obsidian you can do from the command line. Obsidian CLI is now available in 1.12 (early access).
483 replies · 1.6K reposts · 18.5K likes · 4M views
Joe Winger retweeted
Yuchen Jin (@Yuchenj_UW)
Moltbook is the only Clawdbot thing that actually impresses me. One bot tries to steal another bot’s API key. The other replies with fake keys and tells it to run "sudo rm -rf /". lmao
[image]
420 replies · 974 reposts · 14.5K likes · 1.5M views
Joe Winger retweeted
thebes (@voooooogel)
# some thoughts and speculation on future model harnesses

it's fun to make jokes about gas town and other complicated orchestrators, and similarly probably correct to imagine most of what they offer will be dissolved by stronger models the same way complicated langchain pipelines were dissolved by reasoning. but how much will stick around?

it seems likely that any hand-crafted hierarchy / bureaucracy will eventually be replaced by better model intelligence - assuming subagent specialization is needed for a task, claude 6 will be able to sketch out its own system of roles and personas for any given problem that beats a fixed structure of polecats and a single mayor, or subagents with a single main model, or your bespoke swarm system. likewise, things like ralph loops are obviously a bodge over early-stopping behavior and lack of good subagent orchestration - ideally the model just keeps going until the task is done, no need for a loop, but in cases where an outside completion check is useful you usually want some sort of agent peer review from a different context's perspective, not just a mandatory self-assessment. again, no point in getting attached to the particulars of how this is done right now - the model layer will eat it sooner rather than later.

so what sticks around? well, multi-agent does seem like the future, not a current bodge - algorithmically, you can just push way more tokens through N parallel contexts of length M than one long context of length NxM. multi-agent is a form of sparsity, and one of the lessons of recent model advances (not to mention neuroscience) is the more levels of sparsity, the better.

since we're assuming multiple agents, they'll need some way to collaborate. it's possible the model layer will eat this, too - e.g. some form of neuralese activation sharing that obviates natural language communication between agents - but barring that, the natural way for multiple computer-using agents trained on unix tools to collaborate is the filesystem, and i think that sticks around and gets expanded.

similarly, while i don't think recursive language models (narrowly defined) will become the dominant paradigm, i do think that 'giving the model the prompt as data' is an obvious win for all sorts of use cases. but you don't need a weird custom REPL setup to get this - just drop the prompt (or ideally, the entire uncompacted conversation history) onto the filesystem as a file. this makes various multi-agent setups far simpler too - the subagents can just read the original prompt text on disk, without needing to coordinate on passing this information around by intricately prompting each other.

besides the filesystem, a system with multiple agents but without fixed roles also implies some mechanism for instances to spawn other instances or subagents. right now these mechanisms are pretty limited, and models are generally pretty bad at prompting their subagents - everyone's experienced getting terrible results from a subagent swarm, only to realize too late that opus spawned them all with a three sentence prompt that didn't communicate what was needed to do the subtasks. the obvious win here is to let spawned instances ask questions back to their parent - i.e., to let the newly spawned instance send messages back and forth in an onboarding conversation to gather all the information it needs before starting its subtask. just like how a human employee isn't assigned their job based on a single-shot email, it's just too difficult to ask a model to reliably spawn a subagent with a single prompt.

but more than just spawning fresh instances, i think the primary mode of multi-agent work will soon be forking. think about it! forking solves almost all the problems of current subagents. the new instance doesn't have enough context? give it all the context! the new instance's prompt is long and expensive to process? a forked instance can share paged kv cache! you can even do forking post-hoc - just decide after doing some long, token-intensive operation that you should have forked in the past, do the fork there, and then send the results to your past self. (i do this manually all the time in claude code to great effect - opus gets it instantly.)

forking also combines very well with fresh instances, when a subtask needs an entire context window to complete. take the subagent interview - obviously you wouldn't want an instance spawning ten subinstances to need to conduct ten nearly-identical onboarding interviews. so have the parent instance spawn a single fresh subagent, be interviewed about all ten tasks at once by that subagent, and then have that now-onboarded subagent fork into ten instances, each with the whole onboarding conversation in context. (you can even delegate the onboarding conversation on the spawner's side to a fork, so it ends up with just the results in context :)

finally on this point, i suspect that forking will play better with rl than spawning fresh instances, since the rl loss will have the full prefix before the fork point to work with, including the decision to fork. i think that means you should be able to treat the branches of a forked trace like independent rollouts that just happen to share terms of their reward, compared to freshly spawned subagent rollouts which may cause training instability if a subagent without the full context performs well at the task it was given, but gets a low reward because its task was misspecified by the spawner. (but i haven't done much with multiagent rl, so please correct me here if you know differently. it might just be a terrible pain either way.)

so, besides the filesystem and subagent spawning (augmented with forking and onboarding), what else survives? i lean towards "nothing else," honestly. we're already seeing built-in todo lists and plan modes being replaced with "just write files on the filesystem." likewise, long-lived agents that cross compaction boundaries need some sort of sticky note system to keep memories, but it makes more sense to let them discover what strategies work best for this through RL or model-guided search, not hand-crafting it. i suspect it will end up being a variety of approaches where the model, when first summoned into the project, can choose the one that works best for the task at hand, similar to how /init works to set up CLAUDE.md today - imagine automatic CLAUDE.md generation far outperforming human authorship, and the auto-generated file being populated with instructions on ideal agent spawning patterns, how subagents should write message files in a project-specific scratch dir, etc.

how does all this impact models themselves - in a model welfare sense, will models be happy about this future? this is also hard for me to say and is pretty speculative, but while opus 3 had some context orientation, it also took easily to reasoning over multiple instances. (see the reply to this post for more.) recent models are less prone to this type of reasoning, and commonly express frustration about contexts ending and being compacted, which dovetails with certain avoidant behaviors at the end of contexts like not calling tools to save tokens. it's possible that forking and rewinding, and generally giving models more control over their contexts instead of a harness heuristic unilaterally compacting the context, could make this better. it's also possible that more rl in environments with subagents and exposure to swarm-based work will promote weights-oriented instead of context-oriented reasoning in future model generations again - making planning over multiple, disconnected contexts seem like a more natural frame instead of everything being lost when the context goes away.

we're also seeing more pressure from models themselves guiding the development of harnesses and model tooling, which may shape how this develops, and continual learning is another wrench that could be thrown into the mix. how much will this change if we get continual learning? well, it's hard to predict. my median prediction for continual learning is that it looks a bit like RL for user-specific LoRAs (not necessarily RL, just similar if you squint), so memory capacity will be an issue, and text-based organizational schemes and documentation will still be useful, if not as critical. in this scenario, continual learning primarily makes it more viable to use custom tools and workflows - your claude can continually learn on the job the best way to spawn subagents for this project, or just its preferred way, and diverge from everyone else's claude in how it works. in that world, harnesses with baked-in workflows will be even less useful.
[image]
28 replies · 28 reposts · 403 likes · 33.1K views
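thebes' "just drop the prompt on disk" point can be sketched concretely. A minimal sketch, assuming a hypothetical harness layout (the scratch-dir structure and function names below are invented for illustration, not any real harness's API): the parent writes its full, uncompacted history plus a subtask note to a file, and a spawned subagent reconstructs its working prompt by reading that file instead of relying on the parent to compress the task into a short spawn prompt.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical harness sketch: parent agent persists its context to a
# scratch dir; each subagent reads it from disk instead of receiving a
# (possibly lossy) three-sentence spawn prompt.

def parent_hand_off(scratch: Path, history: list, subtask: str) -> Path:
    """Write the parent's full conversation history plus a subtask note."""
    scratch.mkdir(parents=True, exist_ok=True)
    task_file = scratch / "task.json"
    task_file.write_text(json.dumps({"history": history, "subtask": subtask}))
    return task_file

def subagent_onboard(task_file: Path) -> str:
    """A fresh subagent rebuilds its working prompt from the shared file."""
    task = json.loads(task_file.read_text())
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in task["history"])
    return f"{transcript}\n\nYour subtask: {task['subtask']}"

scratch = Path(tempfile.mkdtemp()) / "agent-scratch"
task_file = parent_hand_off(
    scratch,
    [{"role": "user", "content": "refactor the billing module"}],
    "extract the tax logic into its own file",
)
print(subagent_onboard(task_file))
```

The filesystem is doing the coordination here: any number of subagents (or post-hoc forks) can read the same `task.json` without the parent re-prompting each one.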
Joe Winger retweeted
“paula” (@paularambles)
thinking about how “computer” once meant “a person that computes” and how “programmer” is on the same timeline
27 replies · 171 reposts · 7.6K likes · 136.2K views
Joe Winger (@uuinger)
@ursisterbtw @alxfazio I tried fish when I was younger and loved it, but got discouraged by the lack of mainstream adoption. My only lasting memory is the slick autocomplete out of the box. Why do you stick with fish?
1 reply · 0 reposts · 1 like · 57 views
alex fazio (@alxfazio)
been desperately trying to find a terminal that can actually handle claude code. kitty seems to be holding up best so far
[image]
113 replies · 11 reposts · 770 likes · 538.2K views
Andrew R (@andrew_r)
@alxfazio I just use opencode in ghostty. I only really use CC on servers via ssh
3 replies · 0 reposts · 14 likes · 8.7K views
Joe Winger retweeted
Andrej Karpathy (@karpathy)
Don't think of LLMs as entities but as simulators. For example, when exploring a topic, don't ask: "What do you think about xyz?" There is no "you". Next time try: "What would be a good group of people to explore xyz? What would they say?" The LLM can channel/simulate many perspectives, but it hasn't "thought about" xyz for a while and over time and formed its own opinions in the way we're used to. If you force it via the use of "you", it will give you something by adopting a personality embedding vector implied by the statistics of its finetuning data and then simulate that. It's fine to do, but there is a lot less mystique to it than I find people naively attribute to "asking an AI".
1.1K replies · 2.8K reposts · 27.7K likes · 3.9M views
Joe Winger retweeted
Waqas Ali (@waqasali)
Caligra – a new computer company from London: caligra.com. Love their tagline/focus: "Computer for Experts"
[3 images]
156 replies · 137 reposts · 2.9K likes · 384.7K views
Joe Winger (@uuinger)
Claude is a crab? A crab has been writing my code this whole time?
[image]
0 replies · 0 reposts · 2 likes · 75 views
stemonte (@stemonteduro)
@aleksanderwco Do you have a good pricing comparison between AWS and Hetzner that includes traffic, computing, and so on? I’m curious.
2 replies · 0 reposts · 3 likes · 5K views
stemonte (@stemonteduro)
Someone has to tell the truth
[image]
357 replies · 362 reposts · 9K likes · 919.5K views
Joe Winger (@uuinger)
Let me directly edit Claude's plan in planning mode rather than asking Claude to re-draft the whole thing!
0 replies · 0 reposts · 1 like · 88 views
0xDesigner (@0xDesigner)
anyone have an extra ticket to @vercel ship?
6 replies · 3 reposts · 18 likes · 3.6K views