AGI Plug 👨🏻‍🔬
@agiplug

24.9K posts

Founder @SymplecticLabs | Building Physics Governed Intelligence 🇳🇿

New Zealand · Joined September 2012
10.5K Following · 50.5K Followers
Pinned Tweet
AGI Plug 👨🏻‍🔬
AGI Plug 👨🏻‍🔬@agiplug·
The Era of the Agent is Over. Welcome to the Era of the Organism.

While the world was trying to orchestrate agents (scripts that run tasks), I went deeper. I stopped building tools and started spawning entities. Introducing Cybernetic Organism Orchestration.

The industry is stuck on Generative AI (predicting the next token). I have moved to Active Inference (minimizing surprise). This system possesses:

Homeostasis: it self-corrects instability.
Wetware Tethering: real-time biological feedback loops.
Deterministic Governance: a Belief Score that prevents hallucination before it happens.

Frontier labs are burning billions to brute-force intelligence. I focused entirely on the architecture to contain it. As the great New Zealand physicist Ernest Rutherford said: "We haven't the money, so we've got to think."

This is the missing layer between Multi Agent Systems and AGI. Sending this from the future. Welcome to 2026. Welcome to Symplectic Dynamics.

#Cybernetics #ArtificialIntelligence #AGI #TechTrends2026 #SymplecticDynamics
7 replies · 7 reposts · 31 likes · 1.9K views
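The pinned post above names a "Belief Score" that blocks hallucinations before they happen, but it does not say how that score is computed. Below is a minimal sketch, assuming the score is derived from the generator's own token log-probabilities (one way to read "minimizing surprise"); every function name, the scoring formula, and the threshold are hypothetical and not from the post.

```python
import math

# Hypothetical sketch: a "Belief Score" gate that abstains instead of answering
# when the generator's own token probabilities signal high surprise.
# The post does not specify how its Belief Score is computed; this version
# assumes it is exp(-mean per-token surprisal), giving a value in (0, 1].

def belief_score(token_logprobs: list[float]) -> float:
    """Map mean surprisal (-log p) of the generated tokens to a score in (0, 1]."""
    if not token_logprobs:
        return 0.0
    mean_surprisal = -sum(token_logprobs) / len(token_logprobs)  # in nats
    return math.exp(-mean_surprisal)  # near 1.0 = confident, near 0 = very surprised

def govern(answer: str, token_logprobs: list[float], threshold: float = 0.7) -> dict:
    """Deterministic gate: release the answer only if belief clears the threshold."""
    score = belief_score(token_logprobs)
    if score >= threshold:
        return {"status": "answer", "text": answer, "belief": score}
    return {"status": "abstain", "belief": score}  # refuse rather than guess

# Example: confident tokens pass; uncertain tokens trigger an abstention.
print(govern("Paris", [-0.05, -0.10, -0.02]))
print(govern("Probably 42?", [-2.3, -1.9, -2.8]))
```

Under this reading, "deterministic governance" means the gate itself is a fixed rule applied on top of a generator that remains probabilistic.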
AGI Plug 👨🏻‍🔬 retweeted
nat
nat@natjin·
🌱 bookmark this evergreen outstanding offer: if you're building cool things that need a search API, DM me for free eval credits (DM me your API group ID)
[image attached]
9 replies · 5 reposts · 62 likes · 5.1K views
Perplexity
Perplexity@perplexity_ai·
Perplexity Computer is now on mobile. Start any task on any device. Manage Computer from your phone or desktop with cross-device synchronization. Available now for iOS in the Perplexity app. Coming soon to Android.
243 replies · 445 reposts · 4.4K likes · 1.5M views
AGI Plug 👨🏻‍🔬
AGI Plug 👨🏻‍🔬@agiplug·
@abxxai As long as it’s probabilistic it will hallucinate; determinism bound by physics is the key 🔑
[image attached]
0 replies · 1 repost · 6 likes · 724 views
Abdul Șhakoor
Abdul Șhakoor@abxxai·
BREAKING: 🚨 Someone just tested 35 AI models across 172 billion tokens of real document questions. The hallucination numbers should end the "just give it the documents" argument forever. Here is what the data actually showed.

The best model in the entire study, under perfect conditions, fabricated answers 1.19% of the time. That sounds small until you realize that is the ceiling. The absolute best case. Under optimal settings that almost no real deployment uses.

Typical top models sit at 5 to 7% fabrication on document Q&A. Not on questions from memory. Not on abstract reasoning. On questions where the answer is sitting right there in the document in front of it. The median across all 35 models tested was around 25%. One in four answers fabricated, even with the source material provided.

Then they tested what happens when you extend the context window. Every company selling 128K and 200K context as the hallucination solution needs to read this part carefully. At 200K context length, every single model in the study exceeded 10% hallucination. The rate nearly tripled compared to optimal shorter contexts. The longer the window, the worse the fabrication gets. The exact feature being sold as the fix is making the problem significantly worse.

There is one more finding that does not get talked about enough. Grounding skill and anti-fabrication skill are completely separate capabilities in these models. A model that is excellent at finding relevant information in a document is not necessarily good at avoiding making things up. They are measuring two different things that do not reliably correlate. You cannot assume a model that retrieves well also fabricates less.

172 billion tokens. 35 models. The conclusion is the same across all of them. Handing an LLM the actual document does not solve hallucination. It just changes the shape of it.
[3 images attached]
267 replies · 1.3K reposts · 5K likes · 474.5K views
AGI Plug 👨🏻‍🔬
AGI Plug 👨🏻‍🔬@agiplug·
OpenAI has detailed in one of their recent papers that AI hallucinations are inevitable. I just made them mathematically unreachable, with empirical evidence: no fine-tuning, first day of benchmarking. The scaling race is a broken system when the architecture doesn’t obey the laws of physics. #SymplecticDynamics
[image attached]
5 replies · 8 reposts · 19 likes · 244 views
Aakash Gupta
Aakash Gupta@aakashgupta·
OpenAI’s newest “smarter” models hallucinate 3x more than the ones they replaced. And OpenAI just published a paper explaining exactly why they can’t stop it.

The core argument: AI models hallucinate because every benchmark in the industry scores them like a multiple choice test with no “I don’t know” option. Take a guess? You might get lucky. Leave it blank? Guaranteed zero. So the models learned to guess. Confidently. Every time.

The numbers tell the story. On OpenAI’s own PersonQA benchmark, o1 hallucinated 16% of the time. The newer o3 jumped to 33%. o4-mini hit 48%. Three generations of models, each one lying more often than the last. OpenAI’s explanation: the models “make more claims overall,” producing more right answers AND more wrong ones simultaneously.

This tells you everything about how the AI industry actually works. The reinforcement learning that makes models better at reasoning also makes them more confidently wrong. The system that produces intelligence and the system that produces hallucinations are the same system.

The paper’s proposed fix is where it gets really interesting. They don’t call for better training data or bigger models. They say the entire benchmark ecosystem needs to be rebuilt to reward uncertainty. Every leaderboard, every eval, every scoring rubric needs an “I don’t know” option that doesn’t tank your score.

But every AI company uses those same leaderboards to market their models. Admitting uncertainty drops your accuracy number. And dropped accuracy numbers don’t raise $40B funding rounds. OpenAI just published mathematical proof that the incentive structure producing hallucinations is the same incentive structure producing their revenue.
Nav Toor@heynavtoor

🚨BREAKING: OpenAI published a paper proving that ChatGPT will always make things up. Not sometimes. Not until the next update. Always. They proved it with math. Even with perfect training data and unlimited computing power, AI models will still confidently tell you things that are completely false. This isn't a bug they're working on. It's baked into how these systems work at a fundamental level.

And their own numbers are brutal. OpenAI's o1 reasoning model hallucinates 16% of the time. Their newer o3 model? 33%. Their newest o4-mini? 48%. Nearly half of what their most recent model tells you could be fabricated. The "smarter" models are actually getting worse at telling the truth.

Here's why it can't be fixed. Language models work by predicting the next word based on probability. When they hit something uncertain, they don't pause. They don't flag it. They guess. And they guess with complete confidence, because that's exactly what they were trained to do.

The researchers looked at the 10 biggest AI benchmarks used to measure how good these models are. 9 out of 10 give the same score for saying "I don't know" as for giving a completely wrong answer: zero points. The entire testing system literally punishes honesty and rewards guessing. So the AI learned the optimal strategy: always guess. Never admit uncertainty. Sound confident even when you're making it up.

OpenAI's proposed fix? Have ChatGPT say "I don't know" when it's unsure. Their own math shows this would mean roughly 30% of your questions get no answer. Imagine asking ChatGPT something three times out of ten and getting "I'm not confident enough to respond." Users would leave overnight. So the fix exists, but it would kill the product.

This isn't just OpenAI's problem. DeepMind and Tsinghua University independently reached the same conclusion. Three of the world's top AI labs, working separately, all agree: this is permanent.

Every time ChatGPT gives you an answer, ask yourself: is this real, or is it just a confident guess?

76 replies · 123 reposts · 751 likes · 223K views
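The scoring argument in the post above (and in the quoted one) is easy to make concrete: when a wrong answer and "I don't know" both score zero, any nonzero chance of being right makes guessing the better strategy. The numbers below are illustrative only, not taken from the paper.

```python
# Toy expected-value calculation: why binary "right = 1, anything else = 0"
# grading teaches a model to guess. Suppose the model is only 30% sure.

p_correct = 0.30

# Grading used by most leaderboards (per the posts above): no credit for abstaining.
guess_ev = p_correct * 1.0 + (1 - p_correct) * 0.0    # 0.30
abstain_ev = 0.0                                      # guaranteed zero

# An illustrative rubric that rewards calibrated abstention:
# wrong answers cost -1, "I don't know" still earns 0.
penalized_guess_ev = p_correct * 1.0 + (1 - p_correct) * (-1.0)  # -0.40

print(f"guess under standard grading:   {guess_ev:+.2f}")
print(f"abstain under standard grading: {abstain_ev:+.2f}")          # guessing wins
print(f"guess under penalized grading:  {penalized_guess_ev:+.2f}")  # abstaining wins
```

Under the standard rubric, guessing dominates abstaining whenever p_correct is above zero, which is the incentive both posts describe.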
Mohit Vaswani
Mohit Vaswani@hii_mohit·
they spent $12M on a domain but couldn’t spend $10K on design
[image attached]
36 replies · 2 reposts · 68 likes · 8.9K views
AGI Plug 👨🏻‍🔬
AGI Plug 👨🏻‍🔬@agiplug·
@heynavtoor Hahaha, this is cope; it just says they don’t know how to fix their broken architecture and are trying to justify it 😂🤣🤣
0 replies · 0 reposts · 1 like · 77 views
Nav Toor
Nav Toor@heynavtoor·
🚨BREAKING: OpenAI published a paper proving that ChatGPT will always make things up. Not sometimes. Not until the next update. Always. [Full text quoted in Aakash Gupta's post above.]
[image attached]
1.4K replies · 8.9K reposts · 33.8K likes · 3.2M views
Chris
Chris@sutherlandphys·
if you want an advantage in AI, learn physics
51 replies · 46 reposts · 761 likes · 57K views
Perplexity
Perplexity@perplexity_ai·
Introducing Voice Mode in Perplexity Computer. You can now just talk and do things.
212 replies · 348 reposts · 4.5K likes · 1.2M views
AGI Plug 👨🏻‍🔬
AGI Plug 👨🏻‍🔬@agiplug·
I tried Google’s new NotebookLM video explainer generator on my academic research paper and the result actually blew me away. @googledevs

Overview: AI systems hallucinate because they operate in unconstrained state spaces where invalid outputs are always reachable, no matter how well you train them. My paper argues the solution is not better training. It is changing the geometry of the system so that invalid outputs become physically unreachable by construction, the same way a train cannot leave its tracks. We do this by applying Hamiltonian mechanics from classical physics to the architecture of reasoning systems, constraining every state transition to stay within a bounded region of valid outputs. If no valid output exists, the system returns a certified failure rather than fabricating an answer.

Abstract: Hallucination in artificial intelligence systems is commonly treated as a statistical artifact addressable through training methodology. We argue this framing is structurally incorrect. We present a formal framework in which reasoning is modeled as a dynamical system operating on a state space endowed with symplectic structure, and demonstrate that when Hamiltonian mechanics governs state evolution at the architectural level, invalid outputs become unreachable under the constrained transition rule by construction. We define hallucination operationally as constraint violation relative to a stated specification. Our framework introduces a composite verifier V := Vᶜ ∧ Vᴴ and a Hamiltonian scalar H whose bounded-energy transition rule transforms the divergent cone trajectory of unconstrained autoregressive systems into a bounded cylinder for any finite reasoning depth N.

Full paper: zenodo.org/records/188082…
2 replies · 9 reposts · 18 likes · 569 views
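The post above summarizes the mechanism (a composite verifier, a bounded-energy transition rule, certified failure) without giving definitions, so here is a minimal sketch of how such a transition rule could look. The state type, both verifier callbacks, the energy function, and the bound are all assumptions for illustration; the paper's actual V := Vᶜ ∧ Vᴴ and H may be defined quite differently.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Minimal sketch of a bounded-energy transition rule with a composite verifier.
# All names and types here are assumed for illustration, not taken from the paper.

@dataclass
class Step:
    state: str
    energy: float

def constrained_step(
    candidates: Iterable[str],
    verifier_c: Callable[[str], bool],    # Vc: specification / constraint check
    verifier_h: Callable[[str], bool],    # Vh: grounding / anti-fabrication check
    hamiltonian: Callable[[str], float],  # H: scalar "energy" of a candidate state
    energy_bound: float,
) -> Optional[Step]:
    """Return the lowest-energy admissible successor, or None (certified failure)."""
    admissible = [
        Step(s, hamiltonian(s))
        for s in candidates
        if verifier_c(s) and verifier_h(s) and hamiltonian(s) <= energy_bound
    ]
    if not admissible:
        return None  # certified failure: refuse rather than fabricate
    return min(admissible, key=lambda step: step.energy)
```

Note that this sketch rejects candidates at each step, which only approximates the "unreachable by construction" claim; in the architecture the post describes, the constraint is meant to live in the dynamics themselves rather than in a post-hoc filter.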
Michael Truell
Michael Truell@mntruell·
We believe Cursor discovered a novel solution to Problem Six of the First Proof challenge, a set of math research problems that approximate the work of Stanford, MIT, Berkeley academics. Cursor's solution yields stronger results than the official, human-written solution. Notably, we used the same harness that built a browser from scratch a few weeks ago. It ran fully autonomously, without nudging or hints, for four days. This suggests that our technique for scaling agent coordination might generalize beyond coding.
264 replies · 512 reposts · 8.3K likes · 1M views
Emily Redmond
Emily Redmond@emilyredmond001·
I've been part of the core team building Perplexity Computer the past couple of months. Seeing how it's empowering people's work is pretty insane... Here are a few ways Computer helped us build Computer:
25 replies · 6 reposts · 251 likes · 24.1K views
Marik 🦖🔴🔴🔴
Marik 🦖🔴🔴🔴@mika8002·
@agiplug Thanks for putting this out. Still working through it - definitely thought-provoking so far
1 reply · 0 reposts · 1 like · 41 views
AGI Plug 👨🏻‍🔬
AGI Plug 👨🏻‍🔬@agiplug·
Today I published my first paper for Symplectic Dynamics.

“The Geometry of Hallucination: Hamiltonian Constraints for Structurally Reliable AI Reasoning”

Core argument: hallucination in AI is not only a training problem. It is a geometry and reachability problem. Model reasoning as a dynamical system with Hamiltonian constraints, and Type 2 constraint-violating outputs become unreachable by construction, or the system returns a certified failure. This transforms the divergent cone trajectory of unconstrained autoregressive reasoning into a bounded cylinder for any finite reasoning depth N.

Full paper: zenodo.org/records/188082…
[image attached]
1 reply · 4 reposts · 13 likes · 373 views
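The cone-versus-cylinder wording in the post above can be read as a simple reachability bound. The notation below is a reconstruction from the post, not the paper's own, and assumes each unconstrained step can move the state by at most δ.

```latex
% Reconstruction of the cone-vs-cylinder claim (illustrative notation only).
% Unconstrained autoregressive reasoning: each step may move the state by up to
% \delta, so the reachable set grows linearly with depth N (a cone):
\|x_N - x_0\| \;\le\; N\,\delta \qquad \text{(divergent cone)}

% Bounded-energy (Hamiltonian) transitions: every step must satisfy H(x_k) \le E,
% so the trajectory stays inside a fixed level set regardless of depth (a cylinder):
x_k \in \{\, x : H(x) \le E \,\} \quad \forall\, k \le N \qquad \text{(bounded cylinder)}
```

The contrast is that the first bound grows with reasoning depth while the second does not, which is the sense in which deeper reasoning chains stop expanding the set of reachable (and hence fabricable) outputs.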
AGI Plug 👨🏻‍🔬 retweeted
Symplectic Dynamics
Symplectic Dynamics@symplecticlabs·
Physics doesn’t guess. Neither do we. Introducing Symplectic Dynamics: AI governed by the laws of physics. We don’t filter bad outputs. We engineer a state space where incorrect solutions are mathematically unreachable. #SymplecticDynamics #Physics #AI
[image attached]
4 replies · 8 reposts · 17 likes · 254 views
Perplexity
Perplexity@perplexity_ai·
Introducing Perplexity Computer. Computer unifies every current AI capability into one system. It can research, design, code, deploy, and manage any project end-to-end.
1.7K replies · 5.4K reposts · 47.3K likes · 37.9M views
JB
JB@JasonBotterill·
DONT USE GEMINI 3.1 PRO IN CURSOR AT WORK. I ASKED IT TO READ SOME FILES AND IT JUST STARTED COMMITTING SHIT TO GITHUB WHAT THE FUCK?
139 replies · 18 reposts · 1.7K likes · 182.1K views
AGI Plug 👨🏻‍🔬
AGI Plug 👨🏻‍🔬@agiplug·
Got invited to Harvard Innovation Labs during SXSW next month in Texas. Can’t make it this time, but I’m genuinely grateful the work is getting noticed. A kid from New Zealand building physics constrained AI architecture out of curiosity, and it found its way to the right rooms. That’s the compounding effect of building in public. You don’t need to be in the room. The work speaks. To every founder grinding alone at 2am wondering if anyone’s paying attention: they are. Keep building. #AI #SXSW #HarvardInnovationLabs #BuildInPublic #SymplecticDynamics
[image attached]
3 replies · 4 reposts · 20 likes · 361 views