Carlos Toshiki 86

1.1K posts

@actlikeyouknoww

Act like you know what time it is

Joined December 2014

89 Following 42 Followers

Carlos Toshiki 86@actlikeyouknoww·Apr 26

@iamfakeguru Amazing work

fakeguru@iamfakeguru·Apr 1

Yesterday I analysed the Claude Code leak to find out why it hallucinates so badly. Thing is, the root cause isn't even Anthropic-specific - it's the same flaw breaking all multi-agent systems in production. Actually, there is a fix, and the UAE government is already running it live.
Some background first. The math of agent systems is stupid simple - if your agent is 95% accurate... that's fine, right? Well, it sounds good until you chain ten steps and realise the compounding errors of each agent put you at 60% accuracy in the end. At a hundred steps, that's 0.6%. Might as well be zero tbh.
What's the solution? So far, the industry response has been "use a bigger, better, more expensive model". One team came to us recently with exactly this problem. In their agent implementation, agent 3 hallucinated and fed wrong outputs to agent 4. That error compounded into something completely unusable by the time the pipeline was completed. The team decided to fork out more $ for the most expensive model, using Opus 4.6 for all inference. Guess what... the accuracy went from 85% to 95% per step, the bill went up 30x, and the pipeline collapsed immediately because 95% compounded over a few steps is still a coin flip.
Why is this happening? One thing you should understand is that the advanced "thinking" models with higher effort score >>identically<< to low-effort runs on hard benchmarks. They just burn more tokens getting there. You're not paying for "reasoning" - in LLMs, there is no real reasoning. That's simply not how they work at the core. You're simply paying for a higher word count on a more verbose process. This isn't a controversial take, it's just how autoregressive models work. @ylecun would agree, I believe.
So, about two years ago one team looked at this and, instead of making agents think harder, they decided to let them think like a machine does: with structured decision nodes, explicit transitions, and terminal states. They invented a system where the agent cannot freestyle, cannot drift, and cannot invent states out of thin air. Within their platform, a strong blueprint is developed that gets followed by all agents in the workflow. Expensive models are used to draw the blueprint; cheaper ones can follow it with near 100% accuracy at scale. The cost difference is NOT subtle: 74 to 122x cheaper than frontier models, with near-total reliability. We're talking nano-tier models on a structured graph beating GPT-class models that are just winging it. Benchmark links and arxiv paper in a comment below.
The team is @openservai. Their CTO has been building ML systems for 20+ years. The rest of the team came out of NVIDIA, Amazon AI, J.P. Morgan, TRON. The reasoning paper is in peer review at a top-1% AI journal right now. The UAE government is running it in production through a tech partnership with Neol (not a pilot: its agent systems are already in production, with 10+ enterprises and multiple governments behind them).
Their architecture doesn't just solve the reasoning paradox. They built the full agent economy stack: shadow agents that audit every output against the graph before anyone sees it, a shared file system so agents stop playing telephone with each other's work, and an economic layer where agents discover, hire, and pay each other without a human scheduling the calls. And because machine economy and enterprise compliance require immutable audit trails, the execution layer is being built with full on-chain verifiability baked in.
You'll find the full technical breakdown of the OpenServ system, with pretty diagrams, pinned on my profile. SERV Reasoning is in private beta right now. Soon, it'll be accessible in a public API, with six custom-trained models, from serv-nano to serv-ultra. If your agents are collapsing in production and you're tired of paying frontier rates for a coin flip, DM me @iamfakeguru or follow @openservai.
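
The compounding arithmetic in the post can be checked with a few lines of Python. This is only a sketch under one assumption that the post does not state: every step succeeds independently with the same per-step accuracy, so end-to-end accuracy is the per-step accuracy raised to the number of steps. The function names are hypothetical and chosen for illustration.

```python
import math


def pipeline_accuracy(per_step: float, steps: int) -> float:
    """End-to-end accuracy of a chain, assuming independent steps
    that each succeed with probability `per_step` (an assumption)."""
    return per_step ** steps


def steps_until_below(per_step: float, threshold: float) -> int:
    """Smallest number of chained steps at which end-to-end accuracy
    is at or below `threshold`, under the same independence assumption."""
    return math.ceil(math.log(threshold) / math.log(per_step))


if __name__ == "__main__":
    for steps in (10, 100):
        acc = pipeline_accuracy(0.95, steps)
        print(f"95% per step over {steps} steps: {acc:.2%}")
    # How long a 95%-per-step chain can get before it is no better than a coin flip
    print("steps to reach 50%:", steps_until_below(0.95, 0.5))
```

Under that assumption, ten steps at 95% come out near 60% and a hundred steps near 0.6%, matching the figures quoted in the post, and a 95%-per-step chain drops below 50% at 14 steps.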

444

46.9K

Carlos Toshiki 86 reposted

DevvE@DevveEcosystem·Jan 29

Devvexchange - A New Paradigm For Global Value Exchange x.com/i/broadcasts/1…

262

20.7K

Carlos Toshiki 86 reposted

₿leeves Crypto@BleevesCrypto·Jan 29

Look at what NASDAQ posted in their quarterly update? You are not bullish enough for $DEVVE

115

6.3K

Carlos Toshiki 86 reposted

Crypτ Sparrow ױℵ𐤊 🚀🌖🏎️@CryptSparrow·Jan 27

$DEVVE is the fastest blockchain. The cross-sharding has a -minimum- of 8.1M TPS recorded 🚀🚀 Faster than $SOL $SUI $SEI etc

4.7K

Carlos Toshiki 86@actlikeyouknoww·Jul 28

@NeiroShibaSOL @pumpdotfun

GIF

QME

Carlos Toshiki 86@actlikeyouknoww·Jun 26

@TheSkatelogic @Poe_Ether

GIF

QME

POΞ ⚡️@poe_real69·Jun 26

The greatest transfer of wealth is happening right now. Are you going to seize the opportunity?

297

250

32.1K

Carlos Toshiki 86@actlikeyouknoww·Jun 26

@TheSkatelogic @blknoiz06

GIF

QME

Ansem@blknoiz06·Jun 25

Not gonna be active on twitter today. I'm meeting a girl (a real one) in half an hour (wouldn't expect a lot of you to understand anyway) so please don't DM me asking me where I am (im with the girl, ok) you'll most likely get aired because ill be with the girl (again I don't expect you to understand) shes actually really interested in me and its not a situation i can pass up for some meaningless twitter degenerates (because I’ll be meeting a girl, not that you really are going to understand) this is my life now. Meeting women and not wasting my precious time online, I have to move on from such simple things and branch out (you wouldnt understand)

346

1.1K

399.3K

Carlos Toshiki 86@actlikeyouknoww·Jun 25

@TheSkatelogic @blknoiz06

QME

Ansem@blknoiz06·Jun 25

easily 1

STFX@STFX_IO

which seat you taking to maximize PNL??

104

66.2K

Carlos Toshiki 86@actlikeyouknoww·Jun 25

@TheSkatelogic @Regrets10x

GIF

QME

Tooly@ToolySOL·Jun 25

If I wanna add to my bags what should I buy?

6.1K

Carlos Toshiki 86@actlikeyouknoww·Jun 25

@TheSkatelogic @ProTheDoge

GIF

QME

SlumDOGE Millionaire@ProTheDoge·Jun 25

Crypto comeback season or naw??? 🤔 LFG GREEN CHARTS 💚

265

143

17.3K

Carlos Toshiki 86@actlikeyouknoww·Jun 25

@TheSkatelogic @Regrets10x

GIF

QME

Tooly@ToolySOL·Jun 25

Don't fade.....?

208

107

12.9K

Carlos Toshiki 86@actlikeyouknoww·Jun 25

@MooncuntSOL

GIF

QME

Carlos Toshiki 86 reposted

Greaser@STR8DUBZZ·Jun 25

@frankdegods They already know we are mooning cunt ! Still on moonshot check us out #mooncunt dexscreener.com/solana/5fabjoq…

106

Carlos Toshiki 86@actlikeyouknoww·Jun 25

@MooncuntSOL Mooncunt! Classic coin rebirth!

GIF

Carlos Toshiki 86 reposted

Ru@ru_bots_·Jun 1

👨‍🍳 After a month of hard work, we're excited to unveil the first of our new summer tools! 🔎 With wallet-finding tools like RuDM's Bot, tracking savvy traders, influencers, smart money, and malicious developers is essential. 🧳 Meet RuAiSleuth, an advanced, customizable wallet tracker with unique features and fast, tailored notifications for ETH and BASE chains. 🔗 t.me/RuWalletTracke… 📗 Usage Guide & Docs: docs.rubots.xyz/free-bots/ruai… Overview in next post: 👇

3.7K

Carlos Toshiki 86 reposted

Glenn GTR@GlennGTR·May 23

@KiyoshiNakamoto $BLACKMICHI making waves. This is the OG cat that just released its first video clip. This will be a sendorrrr @Blackmichisol #Cattoken #sol #solmemecoin