Tom Leranth

2.4K posts

Tom Leranth

@xkey

Father; Applied Mathematician; Dev (C/C++/Julia/Python/Go/smidgeons of Elixir & Rust); Algorithms; Cryptography; H/w & S/w; Amat Photographer; Powerlifter

Midwest เข้าร่วม Mart 2008

489 กำลังติดตาม201 ผู้ติดตาม

ทวีตที่ปักหมุด

Tom Leranth@xkey·18 Eki

“A conversation is a kind of illusion. I don’t know what is going on in your brain. All I can know is what I’m thinking.”-Hiroshi Ishiguro

English

Tom Leranth รีทวีตแล้ว

Boost C++ | Open Source Libraries@Boost_Libraries·5d

Boost Blueprint 017: Boost.Bloom. Provides probabilistic filtering for massive datasets. O(1) lookups, configurable false positive rates, and memory footprint that scales with your tolerance for error, not your data size It’s the perfect architectural choice for high-speed caches and network filtering Level up your C++ architecture. Follow @Boost_Libraries for the #BoostBlueprint series #cpp

Boost C++ | Open Source Libraries tweet media

English

1.2K

Tom Leranth รีทวีตแล้ว

Louis-François Bouchard 🎥🤖@Whats_AI·5d

It’s got a point…

English

758

Tom Leranth รีทวีตแล้ว

Wolfram@WolframResearch·13 Mar

Laplace transforms come to life like never before with computation and visualization everywhere in the new Wolfram eTextbook "Laplace Transforms in Theory and Practice." Download it for free today: wolfram-media.com/products/lapla…

English

189

1.2K

41.4K

Tom Leranth รีทวีตแล้ว

Phoronix@phoronix·8 Mar

FFmpeg 8.1 Preparing For Release With Vulkan Improvements, JPEG-XS & More phoronix.com/news/FFmpeg-8.…

English

2.3K

Tom Leranth รีทวีตแล้ว

patrickogrady.xyz@_patrickogrady·2 Mar

👀👀👀

QME

132

9.3K

Tom Leranth รีทวีตแล้ว

Shu Lynn Liu@shulynnliu·3 Mar

AlphaEvolve is closed-source. We release 🌟SkyDiscover🌟, a flexible, modular open-source framework with two new adaptive algorithms that match or exceed AlphaEvolve on many benchmarks and outperform OpenEvolve, GEPA, and ShinkaEvolve across 200+ optimization tasks. Our new algorithms dynamically adapt their search strategy, and can even let the AI optimize its own optimization process on the fly! Results: 📊 +34% median score improvement on 172 Frontier-CS problems. 🧮 Matches/exceeds AlphaEvolve on many math benchmarks ⚙️ Discovers system optimizations beyond human-designed SOTA 🧵👇

GIF

English

108

583

138.7K

Tom Leranth รีทวีตแล้ว

Sukh Sroay@sukh_saroy·28 Şub

Zhipu AI and Tsinghua University just dropped one of the most significant open-source AI papers of 2026. It's called GLM-5. And it basically marks the moment "vibe coding" died and "agentic engineering" was born. This isn't another incremental model update. It's a 744 billion parameter system that doesn't just write code when you ask it to. It plans entire software projects, builds them, tests them, debugs them, and iterates on them. Autonomously. For hours. No human in the loop. Here's why the AI community is losing its mind: GLM-5 is the first open-weights model to score 50 on the Artificial Analysis Intelligence Index v4.0. No open model has ever hit that number before. It jumped 8 points from its predecessor in a single generation. On LMArena, the benchmark where millions of real humans judge AI models head-to-head, GLM-5 is the #1 open model. In both text and code. Sitting alongside Claude Opus 4.5 and Gemini 3 Pro. Not below them. Next to them. And here's the part most people will miss: They released it anonymously first. Under the name "Pony Alpha" on OpenRouter. No brand. No hype. Just raw capability. Within days, developers were convinced it was a secret Claude Sonnet 5 release. Others guessed DeepSeek V4. Or a leaked Grok update. 25% of users guessed it was Claude. 20% guessed DeepSeek. Only a small fraction guessed it was a Chinese model. Then Zhipu confirmed it was GLM-5. That moment matters. Because it proved that when you strip the brand away, this model competes at the absolute frontier. The community embraced it because it worked. Not because of where it came from. Now the technical part that engineers actually care about: They trained it on 28.5 trillion tokens. 744B total parameters with 40B active at any time, using a Mixture of Experts architecture. It handles 200K token context windows, meaning it can hold entire codebases in memory while working. The training pipeline is unlike anything in the open-source world right now. First, they adopted DeepSeek's Sparse Attention, which cuts attention computation by half on long sequences. 90% of attention entries in long contexts turned out to be redundant. GLM-5 just skips them. Then they built a fully asynchronous reinforcement learning system. Most RL training wastes massive GPU time waiting for the slowest agent to finish its task. GLM-5's system decouples generation from training entirely. The inference engine runs continuously generating agent trajectories while the training engine updates weights in parallel. No waiting. No idle GPUs. The RL happens in three stages: Reasoning RL first, then Agentic RL for coding and search tasks, then General RL for human-style alignment. And they use cross-stage distillation at the end so the model doesn't forget what it learned in earlier stages. But here's what actually separates this from every other model paper: They gave it a simulated vending machine business to run for an entire year. GLM-5 finished with $4,432 in its bank account. That's #1 among all open-source models. It planned inventory, managed cash flow, made purchasing decisions, and adapted to changing conditions over 365 simulated days. Autonomously. They also built over 10,000 real-world software engineering environments for it to train in. Not toy problems. Real GitHub issues across Python, Java, Go, C++, Rust, JavaScript, TypeScript, PHP, and Ruby. Real test suites. Real codebases. On SWE-bench Verified, the gold standard for measuring whether an AI can actually fix real software bugs, GLM-5 scores 77.8%. That beats Gemini 3 Pro and GPT-5.2. It's approaching Claude Opus 4.5's 80.9%. On Terminal-Bench 2.0, it matches Claude Opus 4.5 when you fix ambiguous instructions in the benchmark. On BrowseComp, the web browsing agent benchmark, GLM-5 scores 75.9% with context management. That's #1 across every model tested, open or proprietary. And perhaps most impressively, it's fully optimized to run on seven different Chinese chip platforms. Huawei Ascend, Moore Threads, Hygon, Cambricon, Kunlunxin, MetaX, and Enflame. On a single Chinese node, it matches the performance of dual-GPU international clusters while cutting deployment costs by 50%. This is the part where the geopolitical implications get uncomfortable: A Chinese lab just released an open-weights model that matches Claude and GPT-5.2 on real-world coding, runs natively on Chinese hardware, and was good enough to fool the Western developer community into thinking it was made by Anthropic. The AI race isn't coming. It's here. And the gap between open and closed, between East and West, is narrowing faster than anyone projected. GLM-5 is open-weights. Code and models available now.

English

342

22.1K

Tom Leranth รีทวีตแล้ว

Mark Gadala-Maria@markgadala·24 Şub

This story is actually insane: • dude drops $2000 on a DJI robot vacuum like a lunatic • refuses to use the normal app like a peasant • Sammy Azdoufal fires up Claude to crack the API so he can drive it with an xbox controller • Claude delivers the goods • pulls an auth token from their servers, connects successfully • except the system thinks he controls 7000 vacuums • checks again • yep, seven thousand • DJI built authentication with zero device ownership verification • any valid token works for any unit on the planet • Sammy now has eyes inside homes across 24 countries • live vacuum camera feeds everywhere • full floor plans from the mapping data • some guy in germany eating cereal at 3am, unaware his roomba is snitching • one API call away from being the most informed burglar in history • all he wanted was to steer his vacuum with a joystick • does the right thing and reports it • DJI fixes it in two days • back to normal life with his stupidly expensive floor cleaner • IoT companies stay undefeated at shipping garbage security

English

1.1K

9.9K

64.5K

8.6M

Tom Leranth รีทวีตแล้ว

Mathieu@miniapeur·11 Şub

ZXX

959

39K

Tom Leranth@xkey·10 Şub

@MsVictoriaVixen Nice hair

English

121

torii 🤍@MsVictoriaVixen·9 Şub

this so ain’t 2015 💙💚

English

333

15.7K

Tom Leranth รีทวีตแล้ว

ShitpostGateway@ShitpostGate·7 Şub

ZXX

555

11.2K

225.1K

Tom Leranth รีทวีตแล้ว

Razia Aliani@RaziaAliani·7 Şub

Google just dropped 145 pages documenting how researchers use Gemini to tackle scientific problems. 𝘚𝘢𝘷𝘦 & 𝘙𝘦𝘵𝘸𝘦𝘦𝘵 (𝘵𝘰 𝘩𝘦𝘭𝘱 𝘺𝘰𝘶𝘳 𝘯𝘦𝘵𝘸𝘰𝘳𝘬) A few things that stood out to me (in simple terms): - In one case, the AI was used as an adversarial reviewer and caught a serious flaw in a cryptography proof that had passed human review. That’s a very different use than “summarise this PDF.” - The model links tools from very different fields (for example, using theorems from geometry/measure theory to make progress on algorithms questions). This is where its wide reading really matters. - They don’t let the model run wild. Humans still choose the problems, check every proof, and decide what’s actually new. The model is there to suggest ideas, spot gaps, and do the heavy algebra. - Agentic loops, not just chat In some projects, they plug Gemini into a loop where it: -- proposes a mathematical expression, -- writes code to test it, -- reads the error messages, and -- fixes itself. (humans only step in when something promising appears) We are moving past the era of simple chat prompts and into a more sophisticated era of research. ⮑ If your institution is interested in hosting an AI session or a workshop, request your training here: forms.gle/dbRtc7j2W4zZyL…

English

435

171.8K

Tom Leranth รีทวีตแล้ว

Lance Fortnow@fortnow·3 Şub

Thomas Watson has a new computational complexity textbook about to be published by Cambridge University Press. There's a free version online for personal use. complexityincs.com

English

418

32.4K

Tom Leranth รีทวีตแล้ว

Sakana AI@SakanaAILabs·23 Oca

We are thrilled to announce a strategic partnership with Google! Google is also making a financial investment in Sakana AI to strengthen this collaboration. This underscores their recognition of our technical depth and our mission to advance AI in Japan. We are combining Google’s world-class products with our agile R&D to tackle complex challenges. By leveraging models like Gemini and Gemma, we will accelerate our breakthroughs in automated scientific discovery. Our work on The AI Scientist and ALE-Agent has already demonstrated the power of these models. Now we are going further. We are scaling our deployment of reliable AI in mission-critical sectors. We are working with financial institutions and government organizations to deliver solutions that meet the highest standards of security and data sovereignty. We are excited to drive the widespread adoption of reliable AI and advance Japan’s AI ecosystem together!

GIF

English

805

259.5K

Tom Leranth รีทวีตแล้ว

Jia Li@JiaLi52524397·20 Oca

Announcing Numina-Lean-Agent: an open-source framework achieving SOTA in formal theorem proving using generic models. Two major breakthroughs: 🏆 12/12 on Putnam 2025 📜 Formalized the "Effective Brascamp-Lieb inequalities" research paper (>90% AI-generated) Code: github.com/project-numina… Demo: demo.projectnumina.ai Paper: github.com/project-numina…

English

183

20.6K

Tom Leranth รีทวีตแล้ว

0.005 Seconds (3/694)@seconds_0·15 Ara

There's an entire parallel scientific corpus most western researches never see. Today i'm launching chinarxiv.org, a fully automated translation pipeline of all Chinese preprints, including the figures, to make that available.

English

199

1.1K

7.4K

838.1K

Tom Leranth รีทวีตแล้ว

Micah Goldblum@micahgoldblum·11 Ara

For a long time, Yann LeCun and others believed in gradient-based planning, but it didn’t work very well … until now. Here’s how we did it using incredibly simple techniques. But first, an introduction to gradient-based planning: 🧵1/11

English

173

1.4K

158.5K

Tom Leranth รีทวีตแล้ว

Chelsea Finn@chelseabfinn·9 Ara

All of my Deep RL course lecture videos from Spring 2025 are now online! 🥳 Youtube playlist: youtube.com/watch?v=EvHRQh…

YouTube

English

392

3.4K

233.6K

Tom Leranth@xkey·6 Ara

@VessOnSecurity ...and here I thought the US Healthcare/insurance system was crappily expensive (it is!). Everyone likes to tout the Canadian system or European. Your current situation sucks - keep hanging on and go enjoy more of your favorite things in life!

English

Vess@VessOnSecurity·5 Ara

bontchev.nlcv.bas.bg/bye.html Archived link: archive.is/htZOu If you can't be assed to read it (it's rather long), the TL;DR version is that I'm dying.

English

620

Vess@VessOnSecurity·5 Ara

Hello folks, Today's my birthday (0x41 years old, yikes!) and since it's very likely going to be my last one, I've decided to post this. I can't be assed to chop it into parts and format it for the various social networks I'm on. I've put it on a web page and am posting a link.

English

2.4K

Tom Leranth รีทวีตแล้ว

Docker@Docker·5 Ara

Semantic search without sending your data to the cloud? Yes, please! In this issue of Docker’s AI Newsletter, see how to generate embeddings locally with Docker Model Runner - no API fees, no third-party access, just full control. Try it → bit.ly/48xpg9a #Docker #AI #SemanticSearch

English

4.9K

ค้นพบ

@Boost_Libraries @MsVictoriaVixen @VessOnSecurity @elonmusk @BarackObama @taylorswift13 @cristiano @BillGates