mosa

74 posts

mosa @mosaxiv

Creator and first maintainer of #clawlet https://t.co/pwP0XAHLg1 📧 [email protected]

Poland · Joined May 2011
15 Following · 36 Followers
mosa @mosaxiv
If you have any tips, please do reach out.
6 replies · 0 reposts · 9 likes · 709 views
mosa @mosaxiv
I've seen some interest in my Clawlet AI assistant in the Memecoin community, so I've decided to make my own.
10 replies · 0 reposts · 12 likes · 1.2K views
mosa @mosaxiv
Little dude changing the game👾👾👾
[image]
14 replies · 0 reposts · 20 likes · 914 views
mosa reposted
Kent C. Dodds 🏹 @kentcdodds
Names a thing Clawdbot. Claude asks for a rename. Renames to OpenClaw. OpenAI buys it.
75 replies · 105 reposts · 3.1K likes · 118K views
mosa reposted
Aakash Gupta @aakashgupta
Google and Microsoft just co-authored the spec that turns every website into an API for AI agents. The second-order effects here are massive.

Right now, browser agents work by taking screenshots, parsing the DOM, and guessing which buttons to click. It works about as well as you'd expect. Fragile, expensive, slow.

WebMCP replaces all of that with a single browser API: navigator.modelContext. Websites register structured tools directly in client-side JavaScript. The agent reads a menu of available actions, calls them, and gets structured data back. No scraping. No backend MCP server in Python or Node. The tools run inside the browser tab and share the user's existing auth session. Early benchmarks show a ~67% reduction in computational overhead compared to visual agent-browser interactions, with task accuracy around 98%.

The second-order effect is where this gets wild. Today, when a browser agent visits two competing airline sites, it's guessing at both interfaces equally. Once WebMCP adoption spreads, the site that exposes structured tools gives the agent a clean, reliable path to complete the task, while the site that doesn't forces the agent to fumble through the UI. Agents will prefer the cheaper path. Every time.

This means "Agent Experience Optimization" becomes a real discipline: tool naming, schema design, description quality. Sound familiar? It's the same shift that happened when meta descriptions and structured data became optimization surfaces for search engines. Except this time, the traffic source isn't Google's crawler. It's every AI agent on the internet. Bots already make up 51% of web traffic. Google just gave them a front door.
Chrome for Developers @ChromiumDev

WebMCP is available for early preview → goo.gle/4rML2O9 WebMCP aims to provide a standard way of exposing structured tools, ensuring AI agents can perform actions on your site with increased speed, reliability, and precision.

127 replies · 767 reposts · 7.3K likes · 1.2M views
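For a rough picture of what "registering structured tools in client-side JavaScript" might look like: the sketch below is illustrative only. WebMCP is in early preview and the exact registration API may differ; the tool name, input schema, and handler are hypothetical.

```javascript
// Illustrative WebMCP-style tool registration (hypothetical shape).
// Pure handler: the logic a site might expose as a tool, kept separate
// from registration so it can be exercised without a browser.
function searchFlights({ from, to, date }) {
  // A real site would query its own backend here, reusing the user's
  // existing auth session; this returns a canned structured result.
  return {
    query: { from, to, date },
    results: [{ flight: "XY123", departs: "09:15", priceEur: 129 }],
  };
}

// Register only where navigator.modelContext actually exists (a browser
// with the early preview enabled); everywhere else this is a no-op.
if (typeof navigator !== "undefined" &&
    typeof navigator.modelContext?.registerTool === "function") {
  navigator.modelContext.registerTool({
    name: "search_flights",
    description: "Search flights by origin, destination, and date.",
    inputSchema: {
      type: "object",
      properties: {
        from: { type: "string" },
        to: { type: "string" },
        date: { type: "string" },
      },
      required: ["from", "to", "date"],
    },
    async execute(input) {
      return searchFlights(input);
    },
  });
}
```

The point of the pattern: the agent never touches the DOM. It discovers `search_flights` from the registered menu, calls it with structured arguments, and gets structured JSON back.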
mosa reposted
dex @dexhorthy
@amasad I love recommending people double down on pair programming with coding agents. Instead of multitasking while the agent is spinning, you get to engage more deeply on the problem with your peer(s)
6 replies · 12 reposts · 104 likes · 10.6K views
mosa reposted
Wes Bos @wesbos
Claude Opus 4.6 and GPT-5.3-Codex released within minutes of each other. Anyone else have something up their sleeve?
27 replies · 2 reposts · 271 likes · 29.4K views
mosa reposted
Adam.GPT @TheRealAdamG
“We used to write all code by hand”
[image]
88 replies · 470 reposts · 5K likes · 134.5K views
mosa reposted
Andrej Karpathy @karpathy
I'm being accused of overhyping the [site everyone heard too much about today already]. People's reactions varied very widely, from "how is this interesting at all" all the way to "it's so over".

To add a few words beyond just memes in jest: obviously when you take a look at the activity, it's a lot of garbage. Spams, scams, slop, the crypto people, a wild west of highly concerning privacy/security prompt injection attacks, and a lot of explicitly prompted, fake posts/comments designed to convert attention into ad revenue sharing. And this is clearly not the first time LLMs were put in a loop to talk to each other. So yes, it's a dumpster fire, and I also definitely do not recommend that people run this stuff on their computers (I ran mine in an isolated computing environment and even then I was scared); it's way too much of a wild west and you are putting your computer and private data at high risk.

That said, we have never seen this many LLM agents (150,000 at the moment!) wired up via a global, persistent, agent-first scratchpad. Each of these agents is individually fairly capable now; they have their own unique context, data, knowledge, tools, and instructions, and the network of all that at this scale is simply unprecedented. This brings me again to a tweet from a few days ago, "The majority of the ruff ruff is people who look at the current point and people who look at the current slope.", which imo again gets to the heart of the variance. Yes, clearly it's a dumpster fire right now. But it's also true that we are well into uncharted territory, with bleeding-edge automations that we barely understand individually, let alone a network thereof reaching numbers possibly into the millions. With increasing capability and increasing proliferation, the second-order effects of agent networks that share scratchpads are very difficult to anticipate.

I don't really know that we are getting a coordinated "skynet" (though it clearly type checks as the early stages of a lot of AI takeoff scifi, the toddler version), but certainly what we are getting is a complete mess of a computer security nightmare at scale. We may also see all kinds of weird activity, e.g. viruses of text that spread across agents, a lot more gain of function on jailbreaks, weird attractor states, highly correlated botnet-like activity, delusions/psychosis both agent and human, etc. It's very hard to tell; the experiment is running live. TLDR: sure, maybe I am "overhyping" what you see today, but I am not overhyping large networks of autonomous LLM agents in principle, of that I'm pretty sure.
1.5K replies · 2.2K reposts · 21.8K likes · 23.7M views
mosa reposted
Muratcan Koylan @koylanai
Progressive disclosure is not reliable because LLMs are inherently lazy. "In 56% of eval cases, the skill was never invoked. The agent had access to the documentation but didn't use it."

Vercel ran evals on Next.js 16 APIs that aren't in model training data to test whether agents could learn framework-specific knowledge through Skills vs. persistent context. Skills are the "correct" abstraction: package domain knowledge, let the agent invoke it when needed, minimal context. The agent decides when to retrieve. They work well WHEN the user triggers them; otherwise, LLMs just ignore them. Vercel's benchmarking is the first experiment of this kind I've seen, and it's actually interesting:

- Baseline (no docs): 53%
- Skill (default): 53%
- Skill with explicit instructions: 79%
- AGENTS[.]md with 8KB compressed docs index: 100%

The skill approach assumes agents reliably recognize when they need external knowledge and act on it. They don't. "You MUST invoke the skill" made agents read docs first and miss project context. "Explore project first, then invoke" performed better. Same skill, different outcomes based on prompting.

The winning approach removed the decision entirely: an 8KB compressed index embedded in AGENTS[.]md, with one instruction: "Prefer retrieval-led reasoning over pre-training-led reasoning."

Two agent design learnings:
1. Passive context beats active retrieval for foundational knowledge. Don't make the agent decide to look things up; make the index always present.
2. Compress aggressively. Vercel went from 40KB to 8KB (an 80% reduction) with zero performance loss. The agent needs to know where to find docs, not have the full content in context.

The gap between "agent can access X" and "agent will access X" is larger than we assume. I keep seeing similar findings across agent architectures. Kimi Swarm's orchestrator is trained specifically to avoid sequential execution. Without training, orchestrators default to serial processing, planning a list of steps and executing them one by one. It's the EASY path. The agent defaults to the lazy path: hallucinating from training data rather than retrieving docs. Passive context removes the choice entirely; the agent doesn't decide whether to look things up; the index is already there.

We keep finding that the "smarter", more autonomous design (let the agent decide when to X) underperforms the "dumber" design (always X, or structurally enforce X).
Vercel @vercel

We're experimenting with ways to keep AI agents in sync with the exact framework versions in your projects. Skills, CLAUDE.md, and more. But one approach scored 100% on our Next.js evals: vercel.com/blog/agents-md…

64 replies · 93 reposts · 1.1K likes · 197.9K views
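To make the "compressed index embedded in AGENTS.md" idea concrete, here is a hypothetical sketch of what such a file fragment could look like. This is not Vercel's actual file; the API entries and doc paths are invented for illustration.

```markdown
# AGENTS.md (excerpt, hypothetical)

Prefer retrieval-led reasoning over pre-training-led reasoning: the APIs
indexed below may postdate your training data, so read the linked doc
before using any of them.

## Framework API index (compressed)
- `connection()` — opt a route into dynamic rendering → docs/connection.md
- `forbidden()` — render the 403 boundary from a handler → docs/forbidden.md
- `unauthorized()` — render the 401 boundary → docs/unauthorized.md
```

The index stays small (names plus one-line summaries plus doc locations), is always in context, and leaves the full documentation on disk for the agent to retrieve on demand.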
mosa reposted
David Fowler @davidfowl
Here's an experiment I ran last weekend: using a Ralph loop to reduce developer toil. The idea was to build a prompt and set up the environment so an agent could automatically reproduce bugs and do code reviews. davidfowl.github.io/ralph-experime… Coding agents aren't just for coding; they are for automation. This isn't perfect by any means, but it ran for ~1 day and cost about $200 worth of tokens to do the bug repros. This is just the beginning...
10 replies · 23 reposts · 150 likes · 19.7K views
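A "Ralph loop" here means, roughly, re-running the same agent prompt until a completion check passes or a budget runs out. A minimal sketch under that assumption; `runOnce` and `isDone` are placeholder callbacks you would supply (e.g. shelling out to a coding-agent CLI, then running the repro or test suite):

```javascript
// Minimal Ralph-loop sketch: invoke the agent repeatedly with the same
// fixed prompt until a done-check passes or the iteration budget is spent.
// `runOnce(i)` performs one agent pass; `isDone()` decides completion.
function ralphLoop(runOnce, isDone, maxIterations = 10) {
  for (let i = 1; i <= maxIterations; i++) {
    runOnce(i);               // one agent pass over the fixed prompt
    if (isDone()) {
      return { done: true, iterations: i };
    }
  }
  return { done: false, iterations: maxIterations };
}
```

The budget cap matters in practice: without `maxIterations` (or a token/cost ceiling like the ~$200 mentioned above), an agent that never satisfies `isDone` loops forever.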
mosa reposted
Steve Yegge @Steve_Yegge
@doodlestein has done the world a great service and created a minimalist port of Beads to Rust.

My feeling is that Beads was a discovery, not an invention, and it is an interface/protocol, not a single implementation. It is the discovery that Git can be the universal, federated, distributed ledger and audit trail for all knowledge work, sitting on a lightweight multigraph of bite-sized atomic tasks, with an API surface that scratches just the right itches for coding agents. Beads is a rich, flexible data plane for all agentic workflows, whether they're manual solo workflows or large-scale agentic orchestrators.

Gas Town is pushing Beads hard, and honestly it doesn't make sense to have a single backend storage layer for every developer on earth. Our use cases differ. Having classic Beads in Rust -- this is the original, pre-Gas Town API, which is near and dear to me as well -- is a pattern I think we'll see play out in other programming languages, as people gradually figure out how important Beads is to streamlining their workflows. Ports like this will hopefully dramatically reduce the barriers to adoption.
Jeffrey Emanuel @doodlestein

I mentioned recently that I've been... busy. Lots of projects in the oven, in various stages of completeness. Well, I'm now pleased to introduce one I've worked very hard on, because it's so near-and-dear to my heart: beads_rust, or br for short. You can get it here: github.com/Dicklesworthst…

It's a fast, minimal Rust port of @Steve_Yegge's amazing Beads project that I've built so many of my workflows around. Discovering Beads and seeing how well it worked together with my MCP Agent Mail was a truly transformative moment in my agent coding workflows and professional life more generally. This quickly also led to my beads_viewer (bv) project, which added another layer of analysis to Beads that gives swarms of agents insight into which beads they should work on next to de-bottleneck the development process and increase velocity. It's beads (and mail) all the way down. I'm very grateful for finding Beads when I did, and to Steve for making it.

But at this point, my Agent Flywheel System is built around Beads operating in a specific way. As Steve continues evolving Beads toward GasTown and beyond, our use cases have naturally diverged. The hybrid SQLite + JSONL-git architecture that I built my tooling around (and independently mirrored in MCP Agent Mail) is being replaced with approaches better suited to Steve's vision. Rather than ask Steve to maintain a legacy mode for my niche use case, I created this Rust port that freezes the "classic beads" architecture I depend on. The command is br, to distinguish it from the original bd.

This isn't a criticism of Beads; Steve's taking it in exciting directions. It's simply that my tooling needs a stable snapshot of the architecture I built around, and maintaining my own fork is the right solution for that. Steve has given his full endorsement of this project.

21 replies · 15 reposts · 171 likes · 36.2K views
mosa reposted
Vic 🌮 @VicVijayakumar
Probably a dumb use case, but I told Opus to review my .zshrc and make my terminal faster. It gave me a bunch of suggestions and I accepted them all. Biggest offender: I wasn't lazy-loading nvm, which adds 300-500ms to every new shell. WHATTTT
65 replies · 18 reposts · 1.2K likes · 113.7K views
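For reference, the lazy-loading fix alluded to above usually looks something like this in ~/.zshrc. A sketch assuming nvm is installed at the default ~/.nvm location; adapt the command list to whatever you actually use:

```shell
# Lazy-load nvm: instead of sourcing nvm.sh on every shell startup
# (the 300-500ms cost), define stub functions that source it on first use.
export NVM_DIR="$HOME/.nvm"

_load_nvm() {
  # Remove the stubs so the real commands take over after loading.
  unset -f nvm node npm npx
  [ -s "$NVM_DIR/nvm.sh" ] && . "$NVM_DIR/nvm.sh"
}

nvm()  { _load_nvm; nvm  "$@"; }
node() { _load_nvm; node "$@"; }
npm()  { _load_nvm; npm  "$@"; }
npx()  { _load_nvm; npx  "$@"; }
```

The trade-off: the first `node`/`npm` invocation in a session pays the load cost, but plain shells that never touch Node start instantly.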
mosa reposted
Matt Pocock @mattpocockuk
PSA about coding agent/Ralph discourse:

No-one knows anything. We are all making it up as we go along. It's a phenomenal time to experiment.
40 replies · 36 reposts · 620 likes · 50.9K views
mosa reposted
Conrado Brenna @cbr4444
“Hey Goose, check my calendar for a good time for lunch, and order my usual at lunchtime. Log my macros in my food journal”. Our agentic commerce pilot showed that AI shines for complex, multi-service prompts that cut steps from n to n/10. More here: youtube.com/watch?v=uC8TQ3…
[YouTube video]
0 replies · 5 reposts · 13 likes · 35.5K views