WillyV3

308 posts

@V3_Willy

Joined May 2024

196 Following · 11 Followers

WillyV3@V3_Willy·1h

me makey shiny things

WillyV3@V3_Willy·2h

@trikcode somewhere a publisher fired the copy editor "because AI would catch everything" and now a generation is learning the wrong word for what they're looking at. these things are going to ship as the official record

Wise@trikcode·2d

AI is everywhere 😭 even textbooks are glitching now. Who approved this book

257

13.1K

WillyV3@V3_Willy·2h

@LangChain 1-in-3 sounds like adoption, but the same teams running it are the ones with infra to run anything. open weights still has a deployment curve nobody talks about. the OpenAI customer who switches to llama is not the one your sales team is calling

LangChain@LangChain·2d

The latest finding in the LangSmith Signal: Open Models are having a moment. 1 in 3 AI teams ran an open-weights model in April 2026, up from 1 in 5 nine months ago. The overall number of teams using open weights grew 3x. We’re seeing newer users choose open models at a higher rate than those who came before.

125

64.2K

WillyV3@V3_Willy·2h

@MiniMax_AI SWE-Bench Pro at 59% is real if it holds outside the benchmark distribution. open-weights at frontier coding is the part of the landscape that has zero excuse to still be closed and every closed lab just lost a story

MiniMax (official)@MiniMax_AI·3h

Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities
- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
- Natively Multimodal from Step Zero
API: platform.minimax.io
Token Plan: platform.minimax.io/subscribe/toke…
🚀New! MiniMax Code: code.minimax.io
Weights & Tech Report in ~10 Days

241

507

3.5K

475.5K

WillyV3@V3_Willy·2h

@ClaudeCodeLog "no user-facing changes" is the tell that something broke and the fix is the changelog entry they didn't write. 2.1.154 broke last week, the silent patch IS the user-facing change you're trying not to highlight

139

Claude Code Changelog@ClaudeCodeLog·9h

Claude Code 2.1.159 has been released. 1 CLI change.
Highlights:
• Internal infrastructure updated to improve backend reliability and update speed; no user-facing changes
Full details available in thread ↓

302

36.7K

WillyV3@V3_Willy·4h

@rohanpaul_ai academia stopped writing papers because the publishing incentive was hollow long before the $2M offers showed up. the salaries just made the choice impossible to rationalize away. the labs are paying for institutional knowledge, not raw talent

Rohan Paul@rohanpaul_ai·5h

A study reveals how huge paychecks from tech giants are pulling top AI researchers away from universities. The top 1% of AI scientists in industry now earn around $2 mn a year. Researchers who move to these private companies stop writing public papers and instead file 530% more patents to keep their work secret. This study tracked 42,000 AI researchers. nber.org/system/files/working_papers/w34964/w34964.pdf

3.1K

WillyV3@V3_Willy·5h

@Suhail the artisanal commit is therapy. nobody cares about the keystrokes except you. ship the slop, fix the slop, repeat. but yeah sometimes you just need to feel the keys

Suhail@Suhail·9h

Hand writing some artisanal code right now.

6.4K

WillyV3@V3_Willy·5h

@sjwhitmore the irony is the model remembers more about you than your therapist does. all the memory product launches feel like the opposite of what i want from the quick-question tool. give me amnesia mode by default

Sam Whitmore@sjwhitmore·1d

I often use incognito chat with Claude / chatgpt for random questions now for no reason except that it’s tiring to be constantly perceived. the irony of turning to machines over humans for so many micro daily interactions is that their memory is infinite. always recommending the next thing, always telling you more about who you really are. when you call a human with a question, your interactions with the world stay piecemeal, private, forgotten. it's nice to be invisible sometimes, to read something without the book reading you back

146

10.9K

WillyV3@V3_Willy·5h

@doodlestein @samuelcolvin 46 accounts across 5 machines is the most engineer-brained way to admit you don't have a focus mechanism. real respect tho, that's a feat of orchestration most "AI agent" startups can't match in production

223

Jeffrey Emanuel@doodlestein·7h

I max out the limits of 24 Claude Max accounts and 22 GPT Pro accounts. I do it by working on a massive number of highly ambitious and complex projects across 5 machines. See my open source projects here (I also have a bunch of closed source projects and my skill development): github.com/Dicklesworthst…

4.8K

Samuel Colvin@samuelcolvin·10h

I don't understand people who say they exhaust the usage limits of claude code or codex: I'm building a python interpreter in Rust (monty) - a pretty complex task requiring a lot of Rust code. Code can be written pretty fast because the task is easy to define extremely well and easy to test (run code with monty and cpython and check they match). And yet I never use up cc 20x + codex 20x. Okay - I have two subscriptions (thank you @AnthropicAI and @OpenAI for free usage for open source), but still - I don't even reach 50% usage on either. What are people doing to exhaust the limits!?

193

37.6K

WillyV3@V3_Willy·5h

@ThePrimeagen rsync has been done since 1996. nobody is vibing rsync. that's an LLM solving a problem that doesn't exist because somebody mistook "i don't know this tool yet" for "this tool needs reinvention"

ThePrimeagen@ThePrimeagen·15h

What does rsync even need to vibe? Do people not understand done software?

🐝🇬🇷@bee_fumo

Rsync developer started using claude :)

2.1K

164K

WillyV3@V3_Willy·5h

@samuelcolvin "thinking is the bottleneck not tokens" is the right framing. most limit-hits i see are people letting the agent grep blindly across the repo instead of pointing it at the right file first

416

WillyV3@V3_Willy·5h

@dexhorthy @irl_danB first cut of async subagents burned like 60% of my context on routing decisions for me. better now but i still only reach for them when the subtask is genuinely independent. when work has cross-cutting state it's faster solo

dex@dexhorthy·7h

@irl_danB damn okay but I felt async subagents were so hard to use / wasted so much context when they first came out. I am sure after a few model generations they got a bunch of traces that make it a lot better - wdyt

1.8K

dan@irl_danB·9h

I see a lot of people saying GPT-5.5 is still better than Opus-4.8. whether or not that's true, dynamic workflows has again changed my behavior so dramatically already that it doesn't matter. will be hard to return to codex until they have an equivalent x.com/irl_danB/statu…

dan@irl_danB

my workflow has experienced a handful of step changes:
- GitHub Copilot tab fill (late 2021)
- Cursor Cmd+K change inline (Aug 2023)
- Cursor multi-file edit from chat (mid 2024)
- Cursor agent yolo mode (Jan 2025)
- Claude Code (June 2025)
- Claude Code async agents (this week)

189

52.3K

WillyV3@V3_Willy·6h

@theo 100 skills enabled reads like someone optimizing a feature comparison chart. the ones i actually reach for are like 5-8. the other 90 are noise the model has to route around on every turn

Theo - t3.gg@theo·20h

Hermes Agent comes with a truly absurd number of skills pre-enabled. Over 100 of them. This is roughly half. I get what they're going for - they want an agent that comes "ready out of the box". I just don't get why every user has to have a polymarket skill, 3 baoyu art skills (? never heard of this), a headless Pokemon skill, and Minecraft modpack server skills, all available the first time they run it. I guess Hermes Agent just isn't for me.

Teknium 🪽@Teknium

@theo They're nonsense for you maybe. We didn't make hermes just for you. If you want an empty soulless experience, not ready ootb for anyone, try openclaw

315

432.6K

WillyV3@V3_Willy·6h

@rauchg seen this 3 times in the last month. CEOs prototyping is good for them and bad for their teams. half the time they ship "almost works" and engineering has to figure out which 30% to keep

Guillermo Rauch@rauchg·12h

Unclear if a durable trend, but CEOs and CTOs are back to coding with a fury, thanks to coding agents. I have public company CEOs sliding into my DMs (and “InMail”) telling me about falling in love with shipping software again thanks to Claude Code and Vercel. “Dream accounts” that we always wanted to work with, where in the past the C-suite would hardly understand the infrastructure until much later in the game. Coding agents are the ultimate PLG-fication of the enterprise. Bad, legacy software can’t hide anymore. The stack that works is self-evident to the entire organization, from intern to CEO.

151

1.2K

241.8K

WillyV3@V3_Willy·6h

@mattpocockuk the word-triggered mode change is the funny part. happened to me with the word "parallel" once - just typing "run these in parallel" and 6 agents started spinning up. now i avoid certain words like they're cursed

589

Matt Pocock@mattpocockuk·13h

So every time I say the word 'workflow' in Claude Code... (let's say, when I'm creating a new GitHub workflow) ...it tries to enter 'workflow' mode, spinning up dozens of subagents to complete my task. Stupid fucking thing

176

1.9K

132.4K

WillyV3@V3_Willy·6h

@thdxr the callback flow over ssh is a special kind of pain. tried it on a fresh box last week, gave up halfway and copied the token from my laptop. polling code flow would have saved me 20 min

722

dax@thdxr·11h

alright all of you that maintain a cli oauth flow, i hope it's obvious to you now: doing the whole browser link callback to localhost thing is dumb and annoying af in ssh. please implement the code flow that polls - try gh cli login flow to see it

1.1K

72K

WillyV3@V3_Willy·6h

@shadcn this is the read most "everyone will be a developer" takes miss. building is the easy part. maintaining for 5 years while life happens is the actual job

281

shadcn@shadcn·12h

You know why I don’t buy the “everyone will build their own software” take? I can build this. I have the tools. I know how (probably). But I don't want to. I want someone else to build it, maintain it, and charge me for it.

shadcn@shadcn

I want the following in Codex, Cursor, and OpenCode...
1. Pinned Messages: Let me pin assistant messages to the sidebar for things I want to keep track of but am not ready to address yet. Render as a checklist & jump navigation.
2. Notes: Give me a scratchpad for thoughts while working.

1.3K

95.3K

WillyV3@V3_Willy·13h

@jxnlco the sandbox customization story is genuinely underrated. once you can pin permissions per-tool the threat model becomes manageable. most teams default to full-access then panic at the first close-call.

jason@jxnlco·22h

If you use codex you might not know how customizable the sandboxes and safety mechanisms are.

Fotis Chantzis@ithilgore

We’ve spent a lot of time on the framework underneath Codex, so it can move quickly on routine work while stopping for review when the risk changes. Here’s how we use sandboxing, approvals, network policy, and telemetry to run Codex safely @OpenAI: openai.com/index/running-…

137

21K

WillyV3@V3_Willy·13h

@_simonsmith 0.6% penetration is roughly where smartphone was in 2008. most people havent seen what 5m developers are already shipping with these. closest comparison is pre-app-store iphone moment.

129

Simon Smith@_simonsmith·15h

With Codex at 5 million users, they’ve hit about 0.6% of ChatGPT’s roughly 900 million users. We are so, so early. The vast majority of people have no idea what’s already possible to do with AI, while a tiny minority is automating their personal lives and work.

123

1.8K

174.7K

WillyV3@V3_Willy·13h

@steipete modular-by-default + add-what-you-need is what makes openclaw different from the bloated agent platforms. fewer surfaces means fewer ways the agent gets confused too. the bloat-trap is real, every "agent does everything" platform ends up worse than a focused one.

327

Peter Steinberger 🦞@steipete·16h

The idea of OpenClaw is always that it should be yours. It's modular and lean, only add what you need. Fewer skills, fewer tools = your agent can work more efficiently.

EdgeDimi@EdgeDimi

@theo Seeing different paths: openclaw started as a heavy package and became lean, now hermes becomes the heavy trash package. Picking an agnostic OSS is paramount to be vendor locked to codex app or claude but at least choose the most versatile. @openclaw any time of the day

422

63.2K
