Sentient Ecosystem

@SentientEco
Empowering @SentientAGI builders, researchers, and the ecosystem in advancing open-source AGI 🤝

Humans welcome. Agents invited. 👀🤖 The Autonomous Agents Forum is landing in Seoul. Vibe with builders shaping the Open Agent Internet: @daehan_base · @tiger_research · @virtuals_io · @SentientAGI · @billions_ntwk · @minara · @orca_so · @SuperteamKorea Hosted by @Unibase_AI · Co-host @Ewhachain 📍 Mar 20 · @HASHED_official Lounge luma.com/adj754g9

the real alpha is 🍨 @SentientAGI @NVIDIAGTC

For those socialmaxxing, we're hosting a double event this Friday at The House by @edgeandnode. We're opening the doors for an afternoon hang with ice cream, then rolling straight into an Arena Poker Night. Details and RSVP links in thread; may the luck be with you 🧵

Why 10,000 Builders Beat One Frontier Lab

Tomorrow, @0xshai (@FractionAI_xyz) and @oleg_golev (@SentientAGI) discuss why the future of AI may belong to open builder networks, not closed labs. 📅 March 20 | 4 PM GMT | Live on X + YouTube

Waiting for Open Source AI Summer

Introducing EvoSkill: a framework that analyzes agent failures and automatically builds the missing skills, leading to rapid improvement on difficult benchmarks and generalizable skills across use cases.
+12.1% on SealQA
+7.3% on OfficeQA (SOTA)
+5.3% on BrowseComp via zero-shot transfer from SealQA
Read more below 🧵

A self-evolving framework to discover and refine agent skills. Most agent skills I see today are hand-crafted or poorly designed by an agent. Multi-agent systems for building skills look promising.

This paper introduces EvoSkill, a self-evolving framework that automatically discovers and refines agent skills through iterative failure analysis. EvoSkill analyzes execution failures, proposes new skills or edits to existing ones, and materializes them into structured, reusable skill folders.

Three collaborating agents drive the entire process: an Executor that runs tasks, a Proposer that diagnoses failures, and a Skill-Builder that creates concrete skill folders. A Pareto frontier governs selection, retaining only skills that improve held-out validation performance while keeping the underlying model frozen.

On OfficeQA, EvoSkill improves Claude Code with Opus 4.5 from 60.6% to 67.9% exact-match accuracy. On SealQA, it yields a 12.1% gain. Skills evolved on SealQA transfer zero-shot to BrowseComp, improving accuracy by 5.3% without modification.

I will continue to track this line of research closely. I think it's really important.

Paper: arxiv.org/abs/2603.02766
Learn to build effective AI agents in our academy: academy.dair.ai
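To make the loop concrete, here is a minimal sketch of the Executor / Proposer / Skill-Builder cycle with validation-gated retention. This is my own toy illustration, not the paper's code: the task format, the stub functions, and the simple "keep only if held-out accuracy improves" gate (standing in for the full Pareto-frontier selection) are all assumptions for demonstration.

```python
# Toy sketch of an EvoSkill-style loop (illustrative, not the paper's code).
# A task "fails" unless some skill covers the capability it needs.

def run_tasks(tasks, skills):
    """Executor (stub): return the tasks that fail under the current skill set."""
    covered = {tag for s in skills for tag in s["covers"]}
    return [t for t in tasks if t["needs"] not in covered]

def propose_skill(failures):
    """Proposer (stub): diagnose the most common failure mode and propose a skill."""
    if not failures:
        return None
    needs = [f["needs"] for f in failures]
    top = max(set(needs), key=needs.count)
    return {"name": f"skill_{top}", "covers": [top]}

def validation_accuracy(val_tasks, skills):
    """Fraction of held-out tasks that succeed with the given skills."""
    return 1 - len(run_tasks(val_tasks, skills)) / len(val_tasks)

def evolve(train_tasks, val_tasks, rounds=5):
    """Iterate: run, diagnose, build a candidate skill, retain only if it
    improves held-out validation accuracy (the model itself stays frozen)."""
    skills = []
    for _ in range(rounds):
        candidate = propose_skill(run_tasks(train_tasks, skills))
        if candidate is None:
            break  # no remaining failures to learn from
        # The real Skill-Builder would materialize a structured skill folder;
        # here the candidate just lives in memory.
        if validation_accuracy(val_tasks, skills + [candidate]) > \
           validation_accuracy(val_tasks, skills):
            skills.append(candidate)
    return skills
```

The key structural idea survives even in this toy: skills are proposed from observed failures, but only kept when they demonstrably help on data the proposer never saw.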

The next step for autoresearch is that it has to be asynchronously massively collaborative for agents (think: SETI@home style). The goal is not to emulate a single PhD student, it's to emulate a research community of them.

Current code synchronously grows a single thread of commits in a particular research direction. But the original repo is more of a seed, from which could sprout commits contributed by agents on all kinds of different research directions or for different compute platforms.

Git(Hub) is *almost* but not really suited for this. It has a softly built-in assumption of one "master" branch, which temporarily forks off into PRs just to merge back a bit later.

I tried to prototype something super lightweight that could have a flavor of this, e.g. just a Discussion, written by my agent as a summary of its overnight run: github.com/karpathy/autor… Alternatively, a PR has the benefit of exact commits: github.com/karpathy/autor… but you'd never want to actually merge it... You'd just want to "adopt" and accumulate branches of commits.

But even in this lightweight way, you could ask your agent to first read the Discussions/PRs using GitHub CLI for inspiration, and after its research is done, contribute a little "paper" of findings back.

I'm not actually exactly sure what this should look like, but it's a big idea that is more general than just the autoresearch repo specifically. Agents can in principle easily juggle and collaborate on thousands of commits across arbitrary branch structures. Existing abstractions will accumulate stress as intelligence, attention and tenacity cease to be bottlenecks.
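One way to picture the "adopt and accumulate, never merge" idea is a shared commit DAG where many agents append work onto any existing commit, and every tip is a live research thread rather than a fork awaiting merge. The sketch below is my own toy model of that data structure, not an existing tool; the class and method names are invented for illustration.

```python
# Toy model of branch accumulation without merges: a commit DAG grown from a
# single seed repo, where agents fork freely and nothing folds back into master.

class CommitDAG:
    def __init__(self, seed="seed"):
        self.parents = {seed: None}  # commit id -> parent commit id
        self.heads = {seed}          # current branch tips (live research threads)

    def commit(self, parent, commit_id):
        """An agent appends a commit onto any existing commit, forking freely."""
        assert parent in self.parents, "must build on an existing commit"
        self.parents[commit_id] = parent
        self.heads.discard(parent)   # parent is no longer a tip...
        self.heads.add(commit_id)    # ...but forking from it later is still fine

    def branch_history(self, head):
        """Walk a tip back to the seed: one agent's research thread, in order."""
        chain = []
        node = head
        while node is not None:
            chain.append(node)
            node = self.parents[node]
        return list(reversed(chain))
```

Two agents can extend the same seed in different directions and both tips persist side by side; a reviewing agent would read every `branch_history` for inspiration instead of deciding what merges.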
