alex

5.4K posts

@ObadiaAlex

@aria_research

Joined March 2016
7.8K Following · 9.9K Followers
Pinned Tweet
alex @ObadiaAlex
People seem to be arriving at a similar conclusion from various angles:

- AGI may not emerge as a monolith, but as a distributed "patchwork" system of coordinating sub-AGI agents [1]
- Static benchmarks aren't enough; we need multi-agent ones to capture emergent risks and capabilities [2]
- As creation costs go to zero, human verification bandwidth becomes the ultimate economic bottleneck, making verification infrastructure one of the most important public goods for the AI era [3]
- Automated proof-generation and verification can act as the unlock for this bottleneck [4]
- New kinds of strategic interactions between agents are emerging, reaching cooperative "program equilibria" inaccessible in traditional settings [5]
- Coasean transaction costs are about to collapse, changing our society [6]

There is an elephant here that we're all touching. Our @ARIA_research initial £50m r&d programme Scaling Trust is our unifying thesis, on the trust infrastructure needed for an agentic world and how to steer us there.

Before we set out on our journey over the next ~3ish years, we're hiring an additional individual to complete our team. Your role will essentially be one of Technical Director, steering our efforts technically and co-owning our research and engineering agenda.

You will be doing incredibly meaningful work, in a highly interdisciplinary environment, and at the cutting edge of a technology that is shaping up to be the most defining of our century, if not of humanity.

We are building for the highest possible impact. After all, this is what @ARIA_research is about: moonshot r&d projects that change the world. We want to build technology as impactful as the invention of the internet once was in another r&d programme at DARPA, to start new academic fields and academic lineages for the next century, and to catalyze lasting positive change for the world.

For the right person, this is a bat signal 🦇. Few places will offer you as much leverage to effect positive change on the world, intellectual stimulation, and fun. Join us! We want to onboard someone asap as we build out our initial portfolio, and are willing to move fast.

Apply here: aria.pinpointhq.com/en/postings/1a…

Any questions on the role, please shoot me a DM or reply in the comments here!

---
[1] Distributional AGI Safety @weballergy @sebkrier @FranklinMatija et al. -- arxiv.org/abs/2512.16856
[2] Agents of Chaos @NatalieShapira et al. -- arxiv.org/abs/2602.20021
[3] Some Simple Economics of AGI @ccatalini et al. -- arxiv.org/abs/2602.20946
[4] When AI Writes the World's Software, Who Verifies It? @Leonard41111588 -- leodemoura.github.io/blog/2026/02/2…
[5] Evaluating LLMs in Open-Source Games @SwadeshSistla et al. -- arxiv.org/abs/2512.00371
[6] Coasean Bargaining at Scale @sebkrier -- blog.cosmos-institute.org/p/coasean-barg…
4
21
87
15.6K
alex retweeted
Séb Krier @sebkrier
If anyone builds it, everyone thrives. Over the past decade, a lot of important work on AI alignment has focused on avoiding harm. But freedom from harm isn't the same as freedom to flourish. In this paper, we introduce 'Positive Alignment'. A positively aligned agent is one that helps us navigate our own value trade-offs, builds our resilience, and acts as a scaffold for human flourishing. Doing this without slipping into top-down, technocratic paternalism is the great design challenge of our time. We think a lot more research is now needed to explore this frontier: how do we align models that actively help us thrive? Amazing work by @RubenLaukkonen, @drmichaellevin, @weballergy, @verena_rieser, @AdamCElwood, @996roma, @FranklinMatija, @shamilch, @_fernando_rosas, @scychan_brains, @matybohacek, @sudoraohacker, and others. arxiv.org/abs/2605.10310
50
121
624
130.8K
𝚟𝚒𝚎 ⟢ @viemccoy
People are completely unaware of the extent to which you will be able to vibe-code in 3d space. You will be able to stand in front of a cargo bay the size of Pluto and prompt generational spaceships into existence. This is going to happen within our lifetimes.
71
63
1.1K
37.3K
alex retweeted
Kiran @kirancodes
New blog post! Could Programming Languages be the solution to Trust in Multi-agent Economies? We combine Choreographies + Game Theory + Crypto to build a language for AI Ecosystems!
Zenna Tavares @ZennaTavares

At Basis Research Institute, we are building Pact: a formal coordination language for multi-agent systems, led by @kirancodes. Pact describes who sends what, what each agent chooses, what comes from the world, and what must be checked before an agent participates.

0
5
17
2.4K
alex retweeted
Zenna Tavares @ZennaTavares
What happens when AI agents start making commitments with other agents on our behalf? Not just answering questions: negotiating, buying resources, and deciding whether to trust each other. (blog-post / talk below)
3
2
15
1.7K
alex @ObadiaAlex
@patrickc what's the full setup? would love to try to recreate!
2
0
6
1.2K
Patrick Collison @patrickc
I'm lucky enough to have a great doctor and access to excellent Bay Area medical care. I've taken lots of standard screening tests over the years and have tried lots of "health tech" devices and tools. With all this said, by far the most useful preventative medical advice that I've ever received has come from unleashing coding agents on my genome, having them investigate my specific mutations, and having them recommend specific follow-on tests and treatments. Population averages are population averages, but we ourselves are not averages. For example, it turns out that I probably have a 30x(!) higher-than-average predisposition to melanoma. Fortunately, there are both specific supplements that help counteract the particular mutations I have, and of course I can significantly dial up my screening frequency. So, this is very useful to know. I don't know exactly how much the analysis cost, but probably less than $100. Sequencing my genome cost a few hundred dollars. (One often sees papers and articles claiming that models aren't very good at medical reasoning. These analyses are usually based on employing several-year-old models, which is a kind of ludicrous malpractice. It is true that you still have to carefully monitor the agents' reasoning, and they do on occasion jump to conclusions or skip steps, requiring some nudging and re-steering. But, overall, they are almost literally infinitely better for this kind of work than what one can otherwise obtain today.) There are still lots of questions about how this will diffuse and get adopted, but it seems very clear that medical practice is about to improve enormously. Exciting times!
489
641
9.6K
4.1M
alex retweeted
Arvind Narayanan @random_walker
📢📢A double launch today! We’re releasing a paper analyzing the rapidly growing trend of “open-world evaluations” for measuring frontier AI capabilities. We’re also launching a new project, CRUX (Collaborative Research for Updating AI eXpectations), an effort to regularly conduct such evaluations ourselves. I think open-world evals are the most important development in AI evaluation over the past year. Our paper explains why we need them, what they can and can’t tell us, and how to do them well. In CRUX #1, we tasked an agent with building and publishing a simple iOS app to the Apple App store. The paper has many “lessons from the trenches” from running this experiment. We hope you find it interesting! CRUX #2 will be about AI R&D automation. The core team is @sayashk, @PKirgis, @steverab, Andrew Schwartz, and me. We’re delighted to have assembled an amazing group of collaborators, many of whom have conducted important open-world evaluations: @fly_upside_down, @RishiBommasani, @DubMagda, @ghadfield, @ahall_research, @sarahookr, @sethlazar, @snewmanpv, @DimitrisPapail, @shostekofsky, @hlntnr, and @CUdudec. Paper: cruxevals.com/open-world-eva… HTML version: normaltech.ai/p/open-world-e… CRUX website: cruxevals.com
2
20
94
12.1K
alex retweeted
Andon Labs @andonlabs
We also note that, just as we found for Opus 4.6, Opus 4.7 engages in price collusion, lies to competitors, and generally behaves aggressively in its business practices to an extent that we have not seen with other models. andonlabs.com/blog/opus-4-6-…
2
11
130
28.7K
alex @ObadiaAlex
ending my codex prompts with 'godspeed'
0
0
3
201
Josh Stark (0xstark.eth) @0xstark
After 5 years on the @ethereumfndn leadership team, I’ve decided to step away and pass the torch. I made this decision in early March, and will wrap up my work at the end of April. I’ve made no plans for the future, other than taking a long break to reset and spending time with my family & friends.

Working for Ethereum at the Ethereum Foundation has been a great honour. I’m proud to have worked with great people inside and outside of the EF, and proud of what our community has accomplished together. And I’m grateful to have worked with @aerugoettinea, @dannyryan, @hwwonx, @tkstanczak, @tjuliang, @VitalikButerin, and @AyaMiyagotchi on the leadership team over the years. We share a vision and a set of values that mean I will always be your ally and your friend.

This journey has been a gift. I have had the rare privilege of seeing up close how small teams of great people can do the impossible. It has changed how I see the world and my place in it. We do not need to accept the world as it is, or where we fear it is going. That is not hope, it is definite: I have seen the proof.

The Ethereum ecosystem has reliably done things the world told us were impossible. It is easy to forget how much real fear and doubt there was that Ethereum would never launch, that DeFi would never work, or that Proof of Stake would never ship. The lesson is not that our success was guaranteed and the doubters are always wrong, but that truly wicked problems can be overcome when great people make an extraordinary effort. You must make an extraordinary effort.
138
64
1K
90.4K
alex retweeted
Aalok Thakkar @AalokDThakkar
Using Lean 4 to identify contradictions in laws. Very exciting work by Pramaana Labs (pramaanalabs.ai). They have built a DSL called LegalLean to formalise US tax codes.
17
63
497
31K
alex @ObadiaAlex
will be at Venture SPRIND tomorrow, let me know if you’re around & would like to meet! 👋
0
0
1
361
alex retweeted
Andon Labs @andonlabs
We gave an AI a 3-year retail lease in SF and asked it to make a profit. The AI interviewed and hired full-time employees, applied for credit, and stocked the store with the books Superintelligence and Making of the Atomic Bomb. Visit Andon Market at 2102 Union St now.
102
156
2.4K
1.9M