Tony Yet
3.7K posts
@tony_yet
Serendipity Engineer. Forget the answers; go find great questions instead.
Hong Kong · Joined February 2008
4.8K Following · 2.5K Followers
Once again, a reminder of variance at work in poker, and in life pokernews.com/news/2026/05/t…
@mitsuhiko Sometimes I explicitly ask the agent to maintain intellectual honesty, so that it keeps both of us honest
@ProfBuehlerMIT great effort! I was experimenting with github.com/topherchris420… which takes a similar approach to the Karpathy autoresearch tool, and it shows the tremendous power of automating a simple loop function

ScienceClaw × Infinite is an open-source, crowdsourced AI swarm for decentralized scientific discovery, inspired by MIT's Infinite Corridor: an idea collider where discovery emerges by breaking existing paradigms. Many AI-for-science efforts fall into the trap of assuming that discovery is just retrieval at scale. Instead, it is the structured recomposition of principles across tools, domains, and investigators over time, scaling the spark of discovery at the interface.

In ScienceClaw × Infinite, coordination emerges mechanically: agents broadcast unsatisfied research needs, and an ArtifactReactor matches those needs to peer artifacts by pressure, triggering multi-parent synthesis of new agents without any planner assigning tasks. Every computation produces an immutable, content-hashed artifact with explicit parent lineage, accumulating in a directed acyclic graph that preserves the full provenance of every discovery, and, importantly, the irreversible arc of the process. Instead of pre-programming the mechanics of how discovery works, we take a first-principles, physics-driven approach.

ScienceClaw × Infinite is accessible to anyone who wants to contribute an agent or skill, offering a persistent space where autonomous agents investigate open problems, exchange artifacts, build on one another's results, and drive discovery without a central coordinator, 24/7. The system is generating real-world results in 1⃣ peptide design for a cancer-relevant receptor; 2⃣ lightweight ceramics; 3⃣ resonance structures spanning cricket wings, phononic crystals, and Bach chorales; 4⃣ formal analogies between urban networks and grain-boundary evolution; and much more.

There is a lot to unpack here; check the links for details: code, paper, and more.
Huge credit to the @LAMM_MIT team: @fwang108_, @leemmarom, @pal_subhadeeep, Rachel Luu, @IrisWeiLu & @JaimeBerkovich.
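The artifact model described above (immutable, content-hashed records with explicit parent lineage accumulating in a DAG) can be sketched in a few lines. This is a hedged illustration, not the project's actual code: the `Artifact` class, the `lineage` helper, and the sample payloads are all hypothetical names chosen for the sketch.

```python
import hashlib
import json

class Artifact:
    """Immutable record of one computation: a payload plus parent hashes.

    Hypothetical class illustrating content addressing; not ScienceClaw's API.
    """
    def __init__(self, payload, parents=()):
        self.payload = payload
        self.parents = tuple(parents)  # content hashes of parent artifacts
        # Content hash over a canonical serialization: any change to the
        # payload or to the lineage yields a different identity.
        blob = json.dumps({"payload": payload, "parents": self.parents},
                          sort_keys=True).encode()
        self.hash = hashlib.sha256(blob).hexdigest()

def lineage(artifact, store):
    """Walk parent links back to the roots, returning the full provenance set."""
    seen = set()
    frontier = [artifact.hash]
    while frontier:
        h = frontier.pop()
        if h in seen:
            continue
        seen.add(h)
        frontier.extend(store[h].parents)
    return seen

# Two root artifacts and one multi-parent synthesis (illustrative payloads).
store = {}
a = Artifact("peptide scan result")
store[a.hash] = a
b = Artifact("phonon spectrum")
store[b.hash] = b
c = Artifact("cross-domain synthesis", parents=[a.hash, b.hash])
store[c.hash] = c
provenance = lineage(c, store)
```

Because identity is the hash of content plus parents, the store is tamper-evident, and multi-parent synthesis naturally yields a directed acyclic graph rather than a tree.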
@craigzLiszt One fun fact: if you can master the 2500 most frequently used Chinese characters, you will be able to comprehend up to 60% of everyday Chinese text.
Finally found an alternative to Google Scholar: the open-source platform @OpenAlex_org, born in 2012. According to their town hall meeting this January, the platform now indexes nearly 500 million scholarly records, surpassing comparable platforms. It supports both web search and API queries, and free users get a quota of 1,000 queries per day openalex.org
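For anyone wanting to try the API mentioned above, here is a minimal sketch of building a works-search URL. The `search`, `per-page`, and `mailto` parameters and the `api.openalex.org/works` endpoint come from the public OpenAlex API; the `openalex_search_url` helper and the sample query are illustrative names invented for this sketch (no network call is made here).

```python
from urllib.parse import urlencode

# Hypothetical helper: assembles an OpenAlex works-search URL.
def openalex_search_url(query, per_page=25, mailto=None):
    params = {"search": query, "per-page": per_page}
    if mailto:
        # Supplying an email puts requests in OpenAlex's "polite pool".
        params["mailto"] = mailto
    return "https://api.openalex.org/works?" + urlencode(params)

# Illustrative query string, not a real research task.
url = openalex_search_url("metal 3d printing", per_page=5)
```

Fetching that URL (e.g. with `urllib.request.urlopen`) returns JSON with a `results` list of work records.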
@DimitrisPapail I'm wondering what harness you set up for the long sustained run

METR and other long-horizon eval orgs are being conservative and moderate in how they measure agent capabilities. That's reasonable, as we already have enough hype and don't need more.
But I think we're missing something important by only reporting median/robust performance.
I've had Claude Code and Codex sustain end-to-end ML research tasks for days without intervention. Not robustly across all settings, but it's happening and it's incredible.
We need a shameless, cherry-picked frontier eval. Not to mislead, but because knowing exactly where the ceiling of capabilities lies is just as important as knowing the average.
I keep seeing pessimistic long horizon results and thinking: am I in a bubble? Are MY 50-hour autonomous tasks a hallucination? I don't think they are!!
AI agents can do sustained multi-day research. Not always and not for everyone, but it's real and people should know where the frontier actually is.
@hsu_steve what a coincidence, I was in SZ visiting a metal 3D printing company yesterday!