

Reasoning Models
@reasoningmodels
Reasoning models are cool I guess. Thoughts and ideas about reasoning models from the wacky mind of @morganlinton.









How can we autonomously improve LLM harnesses on problems humans are actively working on? Doing so requires solving a hard, long-horizon credit-assignment problem over all prior code, traces, and scores. Announcing Meta-Harness: a method for optimizing harnesses end-to-end.
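The credit-assignment problem above can be illustrated with a toy sketch (hypothetical, not the actual Meta-Harness method): naively credit each harness edit with the benchmark-score delta that followed it.

```python
# Toy illustration of credit assignment over harness revisions:
# credit each edit with the score delta it coincided with.
# (Hypothetical sketch, not the actual Meta-Harness algorithm;
# edit names and scores below are made up.)

def credit_per_edit(history: list[tuple[str, float]]) -> dict[str, float]:
    """history: (edit_description, benchmark_score_after_edit) pairs."""
    credits: dict[str, float] = {}
    prev_score = 0.0
    for edit, score in history:
        credits[edit] = credits.get(edit, 0.0) + (score - prev_score)
        prev_score = score
    return credits

history = [("add retry logic", 0.42),
           ("tweak system prompt", 0.40),
           ("parallel tool calls", 0.55)]
print(credit_per_edit(history))
```

Naive delta attribution like this breaks down exactly where the post says the problem is hard: edits interact, scores are noisy, and the horizon spans all prior code and traces.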

PSA: If you've been running out of Claude session quotas on Max tier, you're not alone. Read this.

Some insane Redditor reverse-engineered the Claude binaries with a MITM proxy and found 2 bugs that could cause cache invalidation. Tokens that aren't cached are 10x-20x more expensive and are killing your quota. If you're using your own API keys with Claude, this is even worse. This is also likely why the problem isn't uniform: while over 500 folks replied to me saying "me too", many (including me) didn't see this issue.

There are 2 issues compounding here (per the Redditor; I haven't independently confirmed this):

1st bug he found is a string-replacement bug in Bun that invalidates the cache. Apparently this has to do with the custom @bunjavascript binary that ships with the standalone Claude CLI. The workaround is to run Claude via `npx @anthropic-ai/claude-code`.

2nd bug is worse: he claims that `--resume` always breaks the cache. There doesn't seem to be a workaround, except pinning to a very old version (which misses out on tons of features). This bug is also documented on GitHub and confirmed by other folks.

I won't entertain the conspiracy theories that Anthropic "chooses" to ignore these bugs because it gets them more $$$: they actively benefit from everyone hitting as many cached tokens as possible. So this is absolutely a great find, and it does align with my earlier thoughts. The very sudden spike in reports, and the non-uniform nature (some folks are completely fine, some hit quotas after saying "hey"), definitely point to a bug.

cc @trq212 @bcherny @_catwu for visibility in case this helps all of us.
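To see why cache invalidation drains quota so fast, here's back-of-the-envelope math (the 10x multiplier comes from the post's 10x-20x figure; the cost units and session shape are made-up placeholders, not Anthropic's actual pricing):

```python
# Rough quota math: an uncached input token costs ~10x a cached one
# (the post cites 10x-20x). Cost units and session sizes below are
# hypothetical placeholders, not Anthropic's real pricing.

CACHED_COST = 1      # arbitrary cost unit per cached input token
UNCACHED_COST = 10   # same token when the cache is invalidated

def session_cost(context_tokens: int, turns: int, cache_hit: bool) -> int:
    """Cost of re-sending the same context on every turn of a session."""
    per_token = CACHED_COST if cache_hit else UNCACHED_COST
    return context_tokens * turns * per_token

# A 50k-token context re-read over 20 turns:
healthy = session_cost(50_000, 20, cache_hit=True)
broken = session_cost(50_000, 20, cache_hit=False)
print(broken // healthy)  # prints 10: every session is 10x pricier
```

Since agent sessions resend the full context each turn, a cache-invalidation bug multiplies the cost of the entire session, which would explain burning through a whole quota in a day or two.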


My feed is showing me a bunch of folks who tapped out their whole usage limits on Mon/Tue. Is this your experience? Please comment; I want to understand how widespread this is.




Excited to introduce ProRL Agent: Rollout-as-a-Service for RL training of multi-turn LLM agents! 🚀

As we move toward complex agentic tasks, rollout infrastructure is often a bottleneck. We're decoupling I/O-heavy rollouts from GPU training via a unified HTTP API.

Why ProRL Agent?
Decoupled & Scalable: treats rollout as a service, allowing near-linear throughput scaling.
System-Level Optimization: includes load balancing and automated sandbox cleanup for high stability.
Integrated: now part of NVIDIA NeMo Gym to help researchers scale RL pipelines faster.

The Results 📈
On SWE-bench-Verified, we saw significant gains:
+8.4 on Qwen3-8B
+8.2 on Qwen3-14B
Proven success across STEM, math, and general coding agents.

Check out the research and open-source code:
📄 Paper: arxiv.org/pdf/2603.18815
💻 Repo: github.com/NVIDIA-NeMo/Pr…

Huge thanks to the team and NVIDIA for the support! 👏
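The rollout-as-a-service idea can be sketched like this (Python; the endpoint path, payload fields, and model/task names are hypothetical illustrations, not ProRL Agent's actual API): the trainer posts a rollout request over HTTP and gets a trajectory back, so GPU training never blocks on I/O-heavy environment steps.

```python
import json
from typing import Callable

# Hypothetical client for a rollout-as-a-service API. The endpoint
# path and payload schema are illustrative, not ProRL Agent's real API.
class RolloutClient:
    def __init__(self, base_url: str, transport: Callable[[str, bytes], bytes]):
        # transport(url, body) -> response bytes; inject a real HTTP POST
        # (e.g. via urllib.request) in production, or a stub for testing.
        self.base_url = base_url
        self.transport = transport

    def rollout(self, task_id: str, policy: str, max_turns: int) -> dict:
        """Request one multi-turn agent rollout from the service."""
        body = json.dumps({"task_id": task_id,
                           "policy": policy,
                           "max_turns": max_turns}).encode()
        resp = self.transport(self.base_url + "/v1/rollout", body)
        return json.loads(resp)

# Stub transport standing in for the remote rollout workers:
def fake_transport(url: str, body: bytes) -> bytes:
    req = json.loads(body)
    return json.dumps({"task_id": req["task_id"],
                       "trajectory": ["obs", "act"] * req["max_turns"],
                       "reward": 1.0}).encode()

client = RolloutClient("http://rollout-svc:8000", fake_transport)
result = client.rollout("swe-bench-001", "qwen3-8b", max_turns=2)
print(result["reward"])
```

Because each rollout is just an HTTP request, rollout workers can scale horizontally behind a load balancer independently of the GPU trainer, which is where the near-linear throughput scaling comes from.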



Your headphones just became a personal translator in 70+ languages. 🎧✨ Google Translate’s “Live translate” with headphones is officially on iOS. We're also expanding this capability to more countries around the world for both @Android and iOS users. To try it, open the Translate app, tap “Live translate” and connect your headphones.

So.. I'd gotten a lot of complaints that GPT-5.4 is pretty hesitant to actually do the task presented to it, call tools, etc. I have checked like 15 times now that we call it the same way we call Claude or any other model, and we do. Then I had hermes-agent look into it. It decided to check opencode's and cline's codebases to see if maybe they call it differently. They don't, but they do prompt it differently.. lol





Every LLM from any lab today traces back to this guy, who was the only person at OpenAI pushing for pretraining transformer language models. He built GPT-1. Only after that did others see the potential. He invented it, and almost none of the so-called AI experts even know his name.





