Nimrod Berman

7 posts

@NimrodBe

CS PhD | DL Researcher

Joined September 2017
114 Following · 24 Followers
Nimrod Berman retweeted
Amnon Shashua @AmnonShashua
DoubleAI’s AI system just beat a decade of expert GPU engineering.

WarpSpeed just beat a decade of expert-engineered GPU kernels — every single one of them. cuGraph is one of the most widely used GPU-accelerated libraries in the world. It spans dozens of graph algorithms, each written and continuously refined by some of the world’s top performance engineers. @_doubleAI_'s WarpSpeed autonomously rewrote and re-optimized these kernels across three GPU architectures (A100, L4, A10G). Today, we released the hyper-optimized version on GitHub — install it with no change to your code.

The numbers:
- 3.6x average speedup over human experts
- 100% of kernels benefit from speedup
- 55% see more than 2x improvement

But hasn’t AI already achieved expert-level status — winning gold medals at IMO, outperforming top programmers on CodeForces? Not quite. Those wins share three hidden crutches: abundant training data, trivial validation, and short reasoning chains. Where all three hold, today’s AI shines. Remove any one of them and it falls apart (as Shai Shalev-Shwartz wrote in his post).

GPU performance engineering breaks all three. Data is scarce. Correctness is hard to validate. And performance comes from a long chain of interacting choices — memory layout, warp behavior, caching, scheduling, graph structure. Even state-of-the-art agents like Claude Code, Codex, and Gemini CLI fail dramatically here, often producing incorrect implementations even when handed cuGraph’s own test suite.

Scaling alone can’t break this barrier. It took new algorithmic ideas — our Diligent framework for learning from extremely small datasets, our PAC-reasoning methodology for verification when ground truth isn’t available, and novel agentic search structures for navigating deep decision chains.
This is the beginning of Artificial Expert Intelligence (AEI) — not AGI, but something the world needs more: systems that reliably surpass human experts in the domains where expertise is rarest, slowest, and most valuable. If AI can surpass the world’s best GPU engineers, which domain falls next?

For the full blog: doubleai.com/research/doubl…
cuGraph: docs.rapids.ai/api/cugraph/st…
Winning Gold at IMO 2025: arxiv.org/abs/2507.15855
Codeforces benchmarks: rdworldonline.com/openai-release…
@shai_s_shwartz post: x.com/shai_s_shwartz…
From Reasoning to Super-Intelligence: A Search-Theoretic Perspective: arxiv.org/abs/2507.15865
Artificial Expert Intelligence through PAC-reasoning: arxiv.org/abs/2412.02441
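The summary statistics claimed above (average speedup, fraction improved, fraction above 2x) are simple aggregates over per-kernel speedups. A minimal sketch of that arithmetic, using made-up illustrative numbers rather than the released benchmark data:

```python
# Hypothetical per-kernel speedups (illustrative only, NOT the released data)
speedups = [1.4, 2.1, 3.0, 5.2, 6.3, 1.8, 2.9, 4.1]

avg = sum(speedups) / len(speedups)                        # average speedup
frac_any = sum(s > 1.0 for s in speedups) / len(speedups)  # kernels that improved at all
frac_2x = sum(s > 2.0 for s in speedups) / len(speedups)   # kernels with >2x gains

print(f"avg {avg:.2f}x, improved {frac_any:.0%}, >2x {frac_2x:.0%}")
# → avg 3.35x, improved 100%, >2x 75%
```

With the real per-kernel numbers these aggregates would reproduce the 3.6x / 100% / 55% figures quoted in the tweet.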
[image attached]
17 replies · 34 reposts · 193 likes · 66.1K views
Nimrod Berman retweeted
Uri Eliabayev @urieli17
A truly fascinating and interesting piece of work from AAI, a bit different from the landscape we are used to. AAI presented WarpSpeed, a system that managed to autonomously rewrite and optimize cuGraph's compute kernels. For those unfamiliar, cuGraph is NVIDIA's well-known graph-acceleration library.
Quoting Amnon Shashua @AmnonShashua (tweet quoted above)
3 replies · 3 reposts · 75 likes · 14.8K views
Nimrod Berman retweeted
Shai Shalev-Shwartz @shai_s_shwartz
1/ Software was eating the world - and now AI is eating software.

AI already beats humans at math/coding (IMO, CodeForces). Right?

So let's test the strongest coding agents on a real domain: optimizing cuGraph (GPU graph analytics kernels). Spoiler:
* The strongest coding agents crash.
* And @_doubleAI_ built WarpSpeed - an AI that beat a decade of expert-engineered GPU kernels. 🧵
[image attached]
10 replies · 18 reposts · 126 likes · 57.3K views
Nimrod Berman retweeted
Assaf Shocher @AssafShocher
They tell you neural nets are non-linear. What does "linear" even mean?! Linearity is only defined given two vector spaces, X → Y. What if we could find a different pair of spaces where NNs ARE linear? 🤯 We do it and use it for many apps, such as one-step diffusion! 🧵
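One toy way to see what "linear with respect to a different pair of spaces" can mean (my own illustration, not the paper's construction): pick an invertible map g, transport vector addition and scalar multiplication through it, and then any map of the form f = g⁻¹ ∘ A ∘ g with A linear is exactly linear under those transported operations, even though it is nonlinear in the usual sense.

```python
import numpy as np

# Toy illustration (NOT the paper's method): an invertible elementwise map g
# defines "transported" vector-space operations on the sample side.
g, g_inv = np.sinh, np.arcsinh

def add(x, y):            # "addition" in the transported space
    return g_inv(g(x) + g(y))

def scale(c, x):          # "scalar multiplication" in the transported space
    return g_inv(c * g(x))

A = np.array([[2.0, 0.5],
              [0.0, 1.5]])  # an ordinary linear map in the latent space

def f(x):                 # nonlinear in the standard sense...
    return g_inv(A @ g(x))

x = np.array([0.3, -1.2])
y = np.array([1.1, 0.4])

# ...but exactly linear w.r.t. the transported operations:
print(np.allclose(f(add(x, y)), add(f(x), f(y))))   # True (additivity)
print(np.allclose(f(scale(3.0, x)), scale(3.0, f(x))))  # True (homogeneity)
```

The point of the toy: "linearity" is a property of the chosen vector-space structure, not of the function's graph, so a network can be non-linear over (Rⁿ, +) while being linear over a differently-structured pair of spaces.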
[image attached]
22 replies · 60 reposts · 562 likes · 45K views
Nimrod Berman retweeted
Ivan Skorokhodov @isskoro
I think this paper [arxiv.org/abs/2510.08570] wins the "strangest" (in a good sense) 1-step diffusion award of this year.

They parametrize the model as an invertible network that maps from the sample space to a representation space assumed to be linear: i.e., estimating the (unconditional) velocity from a noised sample is a linear operation (which is learned as well) in that "latent" space.

This can be seen as a latent diffusion model where the encoder is a (large) invertible neural network, the decoder is its inverse, and the diffusion model is a linear function in the latent space. Quite a curious construction, to be honest.

Why do we need invertibility instead of training a separate "decoder" to map the "latents" back (this starts to smell like vanilla LDMs)? I find it to be just a convenient design constraint that keeps the theory clean (otherwise, you need to bother with upper/lower bounds and dozens of weaker assumptions everywhere).

(Note: the paper presents a general idea of mapping data to a linear space, but I was mainly reading through this nice diffusion example.)
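Read that way, the pipeline is just encode → linear latent update → exact decode. A minimal numerical sketch of the structure (my own toy, with an analytically invertible map standing in for the learned invertible network):

```python
import numpy as np

# Toy stand-in for the construction: the "encoder" is invertible by design,
# so the "decoder" is its exact inverse rather than a separately trained net.
encode, decode = np.sinh, np.arcsinh

# The "diffusion model" is just a linear map in the latent space
# (hypothetical weights; the paper learns this linear operation).
W = np.array([[0.9, -0.2],
              [0.1,  0.8]])

def one_step(x):
    # latent-diffusion view: encode, apply the linear latent model, decode
    return decode(W @ encode(x))

x = np.array([0.5, -0.3])

# Invertibility makes round-trips lossless (no reconstruction error term):
print(np.allclose(decode(encode(x)), x))   # True
```

This is why invertibility buys clean theory: the sample-space map is exactly conjugate to a linear map, with no decoder approximation error to bound.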
9 replies · 47 reposts · 386 likes · 33.6K views
Nimrod Berman retweeted
fly51fly @fly51fly
[LG] Who Said Neural Networks Aren't Linear? N Berman, A Hallak, A Shocher [Ben-Gurion University & NVIDIA & Technion] (2025) arxiv.org/abs/2510.08570
[4 images attached]
2 replies · 26 reposts · 185 likes · 14.2K views