
hands
@handsdiff
founder @precursorlabs @combinatortrade





The AI safety community constructed a memeplex in which “taking AGI seriously” was a prerequisite for being a serious and good person. When inside this memeplex (as many at Anthropic, some at OpenAI, and a few at DeepMind are) your vision narrows until the world feels extremely constrained. The whole future seems to flow through the “one ring” of controlling recursive self-improvement. And so even when you worry about AI itself seizing that one ring, you can’t generate better strategies than trying to control it yourself (directly via an AGI company, or indirectly via AGI governance).
I’m not saying this is a pure hyperstition. There’s a core truth underlying this perspective: AI will become extremely intelligent and capable, much more than it is today. But the current world is much more spacious and human-empowering than the future which Eliezer originally envisioned (a “brain in a box in a basement” taking over the world by surprise). And it would be even more spacious if this memeplex weren’t active.
For example, Satya and Mark and Sundar only started taking AGI seriously because OpenAI forced them to, and even now they don’t really believe in superintelligence, and even if they did they couldn’t get most of their employees on board. Imagine how chill a “race” between Microsoft and Meta and Google would have been, compared with what we have today: Dario and Sam deep in the “one ring” memeplex while also personally loathing each other.
So the one ring memeplex has an escalating life-cycle. It infects people by letting them harness the narrative that they’re good people for taking AGI seriously, and that making other people take AGI seriously is a boon for the world (despite how terribly that’s gone so far). Then it shuts off their imagination: any sparks of creativity or plans that don’t steer towards the one ring are quickly shut down. Instead they make ChatGPT or the METR graph or other recruiting tools for the memeplex.
And yes, they’ll acknowledge that previous versions of the memeplex were too extreme, and led to overly constricted action. But we don’t have time to worry about that, they’ll say, because AGI is coming by 2027/2028, and that’s the end of history. Somehow, though, almost everyone with that view has only a vibes-based definition of AGI. They don’t believe in Dyson spheres by 2028, or self-replicating nanotech by 2028, or brain emulations by 2028. They mostly can’t make concrete predictions, except that it’ll be enough AI that it puts all their plans on a deadline. (Shout-out to @DKokotajlo and @paulfchristiano though, who do make concrete predictions about things going crazy soon.)
It seems very hard to break out of this memeplex without just giving up. David Holz is maybe the world champion of that: the only person who was in a position to race for AGI and consciously turned away. Various agent foundations researchers have carved out space to think real thoughts, not the kind of panicky stabbing in the dark that usually passes for safety research. A few others (e.g. Salamon, Hoffman, Vassar, Andre, Sahil, Davidad) are pursuing more unusual paths. And of the people who burned out, I expect some will reorient to doing creative thinking.
For others, the main takeaway: yes, the future of AI will be wild. But so far it’s increased peak human agency, and openness to this trend continuing over the next decade will allow you to start creating something worth creating.




have not touched perplexity/claude/chatgpt for 2 weeks now and see no reason why i would. running my fully private (almost) AI personal assistant stack that handles research, portfolio management, journalling, coding, etc.
here's the full stack:
- Hermes agent (agent harness)
- Venice (private, uncensored AI inference API)
- Honcho (behavioral analysis/long term memory)
- Obsidian (knowledge vault)
- qmd (on device search engine, both text/vector search)
- browser-harness (CDP browser automation, agent browses the web like a human)
- Tavily (search API)
- Codex (coding, powered by private coding models on Venice too)
all on a 16gb macbook pro.
there are currently only 2 touch points (AI inference & search API) that got offloaded to cloud -- AI inference due to hardware constraints, search API has no workarounds. but i'm opting for purely private/e2e model choices on Venice and queries get filtered by the agent before hitting the search API. this way, we keep data leakage to a minimum.
it's also pretty sick that locking base:0xacfe6019ed1a7dc6f7b508c02d1b04ec88cc21bf grants you daily refreshing compute credits denominated in base:0xf4d97f2da56e8c3098f3a8d538db630a2606a024 -- meaning i'm running private, anonymized SOTA inference daily for free.
of course, full privacy is definitely the end goal. i'm slowly working towards running everything completely local while offloading only high-complexity inference to Venice. looking into stronger workhorses (eyeing upcoming m5 studio ultra release/refurbed 512gb options for SOTA models) to close that last gap for a complete e2e private intelligence stack.
firmly believe owning your data in the age of AI surveillance and big tech data centralization is as important as owning your own assets. like we've said for years: "not your keys, not your coins". you don't custody all your money with a bank, why custody your thoughts and private information with OpenAI/Anthropic?
if you haven't already started, highly recommend you look into it. remember, that's why we got into crypto in the first place.
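the "queries get filtered by the agent before hitting the search API" step could look roughly like the sketch below. this is a guess at the idea, not how Hermes or Tavily actually do it: `redact_query` and the three regex patterns are hypothetical, and a real agent would more likely use a model-based filter or its own policy.

```python
import re

# Hypothetical redaction rules (assumptions for illustration only).
# Each pair is (pattern to find, placeholder to send instead).
REDACTION_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[email]"),   # email addresses
    (re.compile(r"\b0x[0-9a-fA-F]{40}\b"), "[address]"),        # EVM-style addresses
    (re.compile(r"\b(?:\d[ -]?){9,14}\d\b"), "[number]"),       # long digit runs (phones, accounts)
]


def redact_query(query: str) -> str:
    """Replace obviously sensitive tokens before the query leaves the machine."""
    for pattern, placeholder in REDACTION_PATTERNS:
        query = pattern.sub(placeholder, query)
    return query


if __name__ == "__main__":
    # Only the redacted string would be passed on to the cloud search API.
    print(redact_query("news about alice@example.com and 555-123-4567"))
```

regexes only catch well-formed identifiers, so this is a floor for leakage control rather than a guarantee. anything contextual (names, locations, holdings) would still need the agent itself to rewrite the query.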

New w/ @AISecurityInst & @UniofOxford: Frontier AI can now out-persuade expert humans in conversation - incl. world-champ debaters and professional canvassers. This held even when humans chose their topics, prepared in advance, and competed for £1,000 prizes 🧵




Here is the technical report on SubQ 1.1 Small. subq.ai/subq-1-1-small…
This is the second iteration on our Subquadratic Sparse Attention (SSA) model, and the first to be deployed with design partners in the coming weeks. The results are compelling and verified by @AppenResearch.
- Near-perfect long-context retrieval up to 12M tokens on the needle-in-a-haystack test, with up to nearly 1,000x attention compute reduction.
- A balance of long-context optimization and general reasoning ability, with strong performance retained across knowledge, coding, and non-coding enterprise agent benchmarks.
- At 1M tokens, SubQ 1.1 Small requires 64.5x less compute than dense attention and runs 56x faster than FlashAttention-2.
These results highlight a significant scaling advantage thanks to the efficiency gains from the SSA architecture. We included some details and learnings from the development process which may be helpful to the community. Comment with questions, I’ll try to respond!
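For intuition on why a sparse scheme's advantage grows with context length, here is a toy cost model. It is a generic illustration and not the SSA mechanism from the report: it assumes each query attends to a fixed budget of `keys_per_query` keys (a hypothetical parameter), ignores the cost of selecting those keys, and uses the standard estimate of about 4·n²·d FLOPs per head for dense attention.

```python
def dense_attention_flops(n_tokens: int, head_dim: int) -> int:
    """QK^T and the attention-weighted sum over V each cost ~2*n*n*d FLOPs."""
    return 4 * n_tokens * n_tokens * head_dim


def sparse_attention_flops(n_tokens: int, keys_per_query: int, head_dim: int) -> int:
    """Same two products, but each query only touches keys_per_query keys."""
    k = min(keys_per_query, n_tokens)  # cannot attend to more keys than exist
    return 4 * n_tokens * k * head_dim


def compute_reduction(n_tokens: int, keys_per_query: int, head_dim: int = 128) -> float:
    """Ratio of dense to sparse cost; equals n_tokens / keys_per_query in this model."""
    dense = dense_attention_flops(n_tokens, head_dim)
    sparse = sparse_attention_flops(n_tokens, keys_per_query, head_dim)
    return dense / sparse


if __name__ == "__main__":
    # With a fixed key budget, the ratio grows linearly with context length.
    for n in (100_000, 1_000_000, 10_000_000):
        print(n, compute_reduction(n, keys_per_query=10_000))
```

In this simplified model the reduction is just n divided by the key budget, so a fixed budget yields a ratio that scales linearly with context. The figures quoted in the post come from the report itself and may reflect selection overhead and other costs this sketch leaves out.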

been beating this drum since early 2025, seems like people are starting to see why it's so important :) RL works -> "train or get trained on" -> open models + post-training infra are the path to institutional flywheels + democratization of AI progress

subagents, teams of agents etc. will be first class citizens soon (if not already)
two things here:
1) you want to maximize token efficiency even more
2) training/serving on your own harness gives you an even bigger boost than before
benchmarks in the opus 4.8 model card show that for now it's a latency vs cost tradeoff, but imo this will likely shift to intelligence/autonomy vs cost (think dynamic workflows or agent swarms). and for cost not to blow up too much, you need to maximize token efficiency even more
we'll also likely see huge gaps on more complex/autonomous benchmarks depending on whether they use these features or not, a bit like when tool use was introduced. on those i'd expect third party harnesses to struggle to keep up with closed source models/harnesses
this is also a case for open source models (and maybe open harnesses like codex?). if you want deep control over this, doing your own RL to train the model in the environment you want it to operate in feels more important than ever




Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision. The longer and more complex the task, the larger Fable 5’s lead over our other models.






