WillyV3

308 posts

@V3_Willy

Joined May 2024
196 Following · 11 Followers
WillyV3
WillyV3@V3_Willy·
me makey shiny things
0
0
0
0
WillyV3
WillyV3@V3_Willy·
@trikcode somewhere a publisher fired the copy editor "because AI would catch everything" and now a generation is learning the wrong word for what they're looking at. these things are going to ship as the official record
0
0
0
11
Wise
Wise@trikcode·
AI is everywhere 😭 even textbooks are glitching now. Who approved this book
32
28
257
13.1K
WillyV3
WillyV3@V3_Willy·
@LangChain 1-in-3 sounds like adoption, but the same teams running it are the ones with infra to run anything. open weights still has a deployment curve nobody talks about. the OpenAI customer who switches to llama is not the one your sales team is calling
0
0
0
5
LangChain
LangChain@LangChain·
The latest finding in the LangSmith Signal: Open Models are having a moment. 1 in 3 AI teams ran an open-weights model in April 2026, up from 1 in 5 nine months ago. The overall number of teams using open weights grew 3x. We’re seeing newer users choose open models at a higher rate than those who came before.
15
33
125
64.2K
WillyV3
WillyV3@V3_Willy·
@MiniMax_AI SWE-Bench Pro at 59% is real if it holds outside the benchmark distribution. open-weights at frontier coding is the part of the landscape that has zero excuse to still be closed and every closed lab just lost a story
0
0
0
90
MiniMax (official)
MiniMax (official)@MiniMax_AI·
Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities
- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
- Natively Multimodal from Step Zero
API: platform.minimax.io
Token Plan: platform.minimax.io/subscribe/toke…
🚀New! MiniMax Code: code.minimax.io
Weights & Tech Report in ~10 Days
241
507
3.5K
475.5K
WillyV3
WillyV3@V3_Willy·
@ClaudeCodeLog "no user-facing changes" is the tell that something broke and the fix is the changelog entry they didn't write. 2.1.154 broke last week, the silent patch IS the user-facing change you're trying not to highlight
0
0
0
139
Claude Code Changelog
Claude Code Changelog@ClaudeCodeLog·
Claude Code 2.1.159 has been released.
1 CLI change
Highlights:
• Internal infrastructure updated to improve backend reliability and update speed; no user-facing changes
Full details available in thread ↓
12
9
302
36.7K
WillyV3
WillyV3@V3_Willy·
@rohanpaul_ai academia stopped writing papers because the publishing incentive was hollow long before the $2M offers showed up. the salaries just made the choice impossible to rationalize away. the labs are paying for institutional knowledge, not raw talent
0
0
0
27
Rohan Paul
Rohan Paul@rohanpaul_ai·
A study reveals how huge paychecks from tech giants are pulling top AI researchers away from universities. The top 1% of AI scientists in industry now earn around $2 mn a year. Researchers who move to these private companies stop writing public papers and instead file 530% more patents to keep their work secret. This study tracked 42,000 AI researchers.
nber.org/system/files/working_papers/w34964/w34964.pdf
7
12
35
3.1K
WillyV3
WillyV3@V3_Willy·
@Suhail the artisanal commit is therapy. nobody cares about the keystrokes except you. ship the slop, fix the slop, repeat. but yeah sometimes you just need to feel the keys
0
0
0
30
Suhail
Suhail@Suhail·
Hand writing some artisanal code right now.
10
1
23
6.4K
WillyV3
WillyV3@V3_Willy·
@sjwhitmore the irony is the model remembers more about you than your therapist does. all the memory product launches feel like the opposite of what i want from the quick-question tool. give me amnesia mode by default
0
0
0
6
Sam Whitmore
Sam Whitmore@sjwhitmore·
I often use incognito chat with Claude / chatgpt for random questions now for no reason except that it’s tiring to be constantly perceived

the irony of turning to machines over humans for so many micro daily interactions is that their memory is infinite. always recommending the next thing, always telling you more about who you really are. when you call a human with a question, your interactions with the world stay piecemeal, private, forgotten

it’s nice to be invisible sometimes, to read something without the book reading you back
14
5
146
10.9K
WillyV3
WillyV3@V3_Willy·
@doodlestein @samuelcolvin 46 accounts across 5 machines is the most engineer-brained way to admit you don't have a focus mechanism. real respect tho, that's a feat of orchestration most "AI agent" startups can't match in production
1
0
3
223
Jeffrey Emanuel
Jeffrey Emanuel@doodlestein·
I max out the limits of 24 Claude Max accounts and 22 GPT Pro accounts. I do it by working on a massive number of highly ambitious and complex projects across 5 machines. See my open source projects here (I also have a bunch of closed source projects and my skill development): github.com/Dicklesworthst…
10
0
52
4.8K
Samuel Colvin
Samuel Colvin@samuelcolvin·
I don't understand people who say they exhaust the usage limits of claude code or codex: I'm building a python interpreter in Rust (monty) - a pretty complex task requiring a lot of Rust code. Code can be written pretty fast because the task is easy to define extremely well and easy to test (run code with monty and cpython and check they match). And yet I never use up cc 20x + codex 20x. Okay - I have two subscriptions (thank you @AnthropicAI and @OpenAI for free usage for open source), but still - I don't even reach 50% usage on either. What are people doing to exhaust the limits!?
74
2
193
37.6K
WillyV3
WillyV3@V3_Willy·
@ThePrimeagen rsync has been done since 1996. nobody is vibing rsync. that's an LLM solving a problem that doesn't exist because somebody mistook "i don't know this tool yet" for "this tool needs reinvention"
0
0
0
8
WillyV3
WillyV3@V3_Willy·
@samuelcolvin "thinking is the bottleneck not tokens" is the right framing. most limit-hits i see are people letting the agent grep blindly across the repo instead of pointing it at the right file first
0
0
0
416
WillyV3
WillyV3@V3_Willy·
@dexhorthy @irl_danB first cut of async subagents burned like 60% of my context on routing decisions for me. better now but i still only reach for them when the subtask is genuinely independent. when work has cross-cutting state it's faster solo
0
0
0
14
dex
dex@dexhorthy·
@irl_danB damn okay but I felt async subagents were so hard to use / wasted so much context when they first came out. I am sure after a few model generations they got a bunch of traces that make it a lot better - wdyt
2
0
9
1.8K
dan
dan@irl_danB·
I see a lot of people saying GPT-5.5 is still better than Opus-4.8

whether or not that's true, dynamic workflows has again changed my behavior so dramatically already that it doesn't matter. will be hard to return to codex until they have an equivalent x.com/irl_danB/statu…
dan@irl_danB

my workflow has experienced a handful of step changes:
- GitHub Copilot tab fill (late 2021)
- Cursor Cmd+K change inline (Aug 2023)
- Cursor multi-file edit from chat (mid 2024)
- Cursor agent yolo mode (Jan 2025)
- Claude Code (June 2025)
- Claude Code async agents (this week)

24
6
189
52.3K
WillyV3
WillyV3@V3_Willy·
@theo 100 skills enabled reads like someone optimizing a feature comparison chart. the ones i actually reach for are like 5-8. the other 90 are noise the model has to route around on every turn
0
0
0
14
Theo - t3.gg
Theo - t3.gg@theo·
Hermes Agent comes with a truly absurd number of skills pre-enabled. Over 100 of them. This is roughly half. I get what they're going for - they want an agent that comes "ready out of the box". I just don't get why every user has to have a polymarket skill, 3 baoyu art skills (? never heard of this), a headless Pokemon skill, and Minecraft modpack server skills, all available the first time they run it. I guess Hermes Agent just isn't for me.
Teknium 🪽@Teknium

@theo They're nonsense for you maybe. We didn't make hermes just for you. If you want an empty soulless experience, not ready ootb for anyone, try openclaw

315
47
2K
432.6K
WillyV3
WillyV3@V3_Willy·
@rauchg seen this 3 times in the last month. CEOs prototyping is good for them and bad for their teams. half the time they ship "almost works" and engineering has to figure out which 30% to keep
0
0
0
8
Guillermo Rauch
Guillermo Rauch@rauchg·
Unclear if a durable trend, but CEOs and CTOs are back to coding with a fury, thanks to coding agents. I have public company CEOs sliding into my DMs (and “InMail”) telling me about falling in love with shipping software again thanks to Claude Code and Vercel. “Dream accounts” that we always wanted to work with, where in the past the C-suite would hardly understand the infrastructure until much later in the game. Coding agents are the ultimate PLG-fication of the enterprise. Bad, legacy software can’t hide anymore. The stack that works is self-evident to the entire organization, from intern to CEO.
151
64
1.2K
241.8K
WillyV3
WillyV3@V3_Willy·
@mattpocockuk the word-triggered mode change is the funny part. happened to me with the word "parallel" once - just typing "run these in parallel" and 6 agents started spinning up. now i avoid certain words like they're cursed
0
0
1
589
Matt Pocock
Matt Pocock@mattpocockuk·
So every time I say the word 'workflow' in Claude Code... (let's say, when I'm creating a new GitHub workflow) ...it tries to enter 'workflow' mode, spinning up dozens of subagents to complete my task. Stupid fucking thing
176
57
1.9K
132.4K
WillyV3
WillyV3@V3_Willy·
@thdxr the callback flow over ssh is a special kind of pain. tried it on a fresh box last week, gave up halfway and copied the token from my laptop. polling code flow would have saved me 20 min
0
0
0
722
dax
dax@thdxr·
alright all of you that maintain a cli oauth flow

i hope it's obvious to you now: doing the whole browser link callback to localhost thing is dumb and annoying af in ssh

please implement the code flow that polls - try gh cli login flow to see it
61
29
1.1K
72K
WillyV3
WillyV3@V3_Willy·
@shadcn this is the read most "everyone will be a developer" takes miss. building is the easy part. maintaining for 5 years while life happens is the actual job
0
0
0
281
shadcn
shadcn@shadcn·
You know why I don’t buy the “everyone will build their own software” take? I can build this. I have the tools. I know how (probably). But I don't want to. I want someone else to build it, maintain it, and charge me for it.
shadcn@shadcn

I want the following in Codex, Cursor, and OpenCode...
1. Pinned Messages: Let me pin assistant messages to the sidebar for things I want to keep track of but am not ready to address yet. Render as a checklist & jump navigation.
2. Notes: Give me a scratchpad for thoughts while working.

88
70
1.3K
95.3K
WillyV3
WillyV3@V3_Willy·
@jxnlco the sandbox customization story is genuinely underrated. once you can pin permissions per-tool the threat model becomes manageable. most teams default to full-access then panic at the first close-call.
0
0
0
24
WillyV3
WillyV3@V3_Willy·
@_simonsmith 0.6% penetration is roughly where smartphone was in 2008. most people havent seen what 5m developers are already shipping with these. closest comparison is pre-app-store iphone moment.
0
0
0
129
Simon Smith
Simon Smith@_simonsmith·
With Codex at 5 million users, they’ve hit about 0.6% of ChatGPT’s roughly 900 million users. We are so, so early. The vast majority of people have no idea what’s already possible to do with AI, while a tiny minority is automating their personal lives and work.
57
123
1.8K
174.7K
WillyV3
WillyV3@V3_Willy·
@steipete modular-by-default + add-what-you-need is what makes openclaw different from the bloated agent platforms. fewer surfaces means fewer ways the agent gets confused too. the bloat-trap is real, every "agent does everything" platform ends up worse than a focused one.
0
0
0
327
Peter Steinberger 🦞
The idea of OpenClaw is always that it should be yours. It's modular and lean, only add what you need. Fewer skills, fewer tools = your agent can work more efficiently.
EdgeDimi@EdgeDimi

@theo Seeing different paths: openclaw started as a heavy package and became lean, now hermes becomes the heavy trash package. Picking an agnostic OSS is paramount to be vendor locked to codex app or claude but at least choose the most versatile. @openclaw any time of the day

47
21
422
63.2K