Aviv Bick

133 posts

@avivbick

CS PhD @ CMU https://t.co/tCKYiUbOdr | https://t.co/zdDNelFVJO

Joined January 2024

22 Following · 636 Followers

Pinned Tweet

Aviv Bick@avivbick·7 May

SSMs fail on recall tasks they have the capacity to solve. The two dominant approaches today, SSMs and sliding-window attention, both lack persistence: memory either decays over time or gets evicted. We built Raven to fix this, surpassing all prior linear models even at 16× their training sequence length. 🧵🐦‍⬛
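To make the two failure modes concrete, here is a toy sketch. It is not Raven or any published model: a scalar recurrence h_t = decay * h_{t-1} + x_t stands in for an SSM state, and a fixed-length deque stands in for a sliding window. The function names, the decay of 0.99, and the window of 256 are illustrative assumptions.

```python
from collections import deque


def run_decay_memory(inputs, decay):
    """Scalar recurrence h_t = decay * h_{t-1} + x_t; returns the final state."""
    h = 0.0
    for x in inputs:
        h = decay * h + x
    return h


def run_sliding_window(tokens, window):
    """Keep only the most recent `window` tokens; older ones are evicted."""
    buf = deque(maxlen=window)
    for tok in tokens:
        buf.append(tok)
    return list(buf)


if __name__ == "__main__":
    # One "needle" written at step 0, followed by 1000 filler steps.
    signal = [1.0] + [0.0] * 1000
    print(run_decay_memory(signal, decay=0.99))  # tiny: the needle has faded

    tokens = ["needle"] + ["filler"] * 1000
    print("needle" in run_sliding_window(tokens, window=256))  # False: evicted
```

In the first case the information is still nominally in the state but has been scaled down to almost nothing; in the second it is gone outright once it leaves the window.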

Aviv Bick@avivbick·35m

The attention/SSM model landscape is getting wild. The Attention Zoo will help you navigate it 🎪

Arshia Afzal@rshia_afz

Blog post release: Attention ZOO! 🎉 Well, I finally managed to finish a blog I’ve been working on for quite some time! If you’re working on SSMs and transformer architectures, it can be hard to keep up with the many models out there and understand their exact differences and similarities. To address that, I built Attention ZOO 🎪, which covers many different softmax and linear models in an interactive way. You can simply find your model based on the readout and decay type, as simple as that, and explore different models🚀. Hope you enjoy discovering many different models, from Linear Attention to Mamba-3, and Raven. Lastly, shoutout to @avivbick, my buddy, who always gives amazing feedback on the blog!

Aviv Bick retweeted

Eric Xing@ericxing·16 Jun

With the rise of LLM systems marketed as "coding agents", "AI co-scientists", etc. that promise to drive up productivity, and the simultaneous outcry over "existential" concerns about AI escaping human control with destructive power under a speculative "machine agency" against humans, there has been lots of confusion about "What is an agent?" and "What constitutes agency?" It has become essential to clarify where automation ends and agency begins. Recent developments in world models and action models are also trending toward mixing future prediction/simulation and action/plan generation within a single architecture such as a VLM, conflating reward-driven action selection with fidelity-driven next-state prediction and undermining the reliability of both planning and simulation. In this paper we analyze agent architectures along the axes of goal, identity, decision-making, self-regulation, and learning, and argue that genuine agency requires these structures to be internalized within the system itself rather than assembled through external scaffolding. We propose a "Goal-Identity-Configurator" (GIC) architecture for a general-purpose agent model, combining hierarchical goal decomposition, identity evolution, simulative reasoning grounded in a separately trained world model, learned self-regulation, and self-directed learning from both real and simulated experience. Auditability, controllability, and safety of systems that possess greater autonomy and "agency" but remain under human oversight can be better built with the GIC architecture, which offers transparency, modularity, and checkpoints. @mdeng34, @jinyuhou0 openreview.net/forum?id=6fDZY…

Aviv Bick retweeted

Anthropic@AnthropicAI·13 Jun

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…

Aviv Bick retweeted

Elon Litman@elon_lit·9 Jun

Gradient descent on neural networks frequently drives the sharpest Hessian eigenvalue to exactly 2/learning_rate. This is the Edge of Stability. For five years, ML theory has failed to explain why this happens globally from any initialization. Until now. 🧵
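For context on where the 2/learning_rate threshold comes from, the classical one-dimensional case can be checked directly. This is only the textbook stability bound for gradient descent on a quadratic, not the result claimed in the thread; the function name and constants are illustrative.

```python
def gd_on_quadratic(sharpness, lr, steps=50, x0=1.0):
    """Run gradient descent on f(x) = 0.5 * sharpness * x**2; return |x| at the end."""
    x = x0
    for _ in range(steps):
        x -= lr * sharpness * x  # the gradient of f is sharpness * x
    return abs(x)


if __name__ == "__main__":
    lr = 0.1  # stability threshold for the curvature is 2 / lr = 20
    print(gd_on_quadratic(sharpness=19.0, lr=lr))  # just below threshold: shrinks
    print(gd_on_quadratic(sharpness=21.0, lr=lr))  # just above threshold: grows
```

Each step multiplies x by (1 - lr * sharpness), so the iterates contract only while that factor has magnitude below 1, which is exactly the condition sharpness < 2/lr.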

Aviv Bick retweeted

Noam Brown@polynoamial·28 May

After AlphaGo, the skill of human Go players noticeably improved. I suspect we will see a similar pattern in math.

Timothy Gowers @wtgowers@wtgowers

Another major problem, this time in additive combinatorics, has fallen, this time to humans rather than AI, but using methods related to the AI solution to the unit distance conjecture.

Aviv Bick retweeted

Albert Gu@_albertgu·22 May

Extremely proud of the team @cartesia for launching Sonic 3.5, which sets a new state of the art for TTS. I personally led the technical direction of this model; we built it from the ground up from first principles, and it contains multiple non-trivial ideas that differ substantially from anything we’ve seen in the literature. It’s been very gratifying to see research bets play out and the strong research team at Cartesia continue to grow!

Artificial Analysis@ArtificialAnlys

Cartesia’s Sonic-3.5 takes the #1 spot on the Artificial Analysis Speech Arena Leaderboard, surpassing Inworld Realtime TTS 1.5 Max and Google’s Gemini 3.1 Flash TTS.

Sonic-3.5 is the latest TTS model from @cartesia. It supports 42 languages, including 9 Indian languages, with 500+ voices available out of the box. The model has been highly preferred among voters in the TTS Arena, with its demonstrated naturalness and accurate transcript following.

Key takeaways:
➤ Quality: Sonic-3.5 has an Elo score of 1,218 (+16/-16) based on 1,144 arena appearances, placing it ahead of Inworld Realtime TTS 1.5 Max at 1,194 and Gemini 3.1 Flash TTS at 1,209
➤ Pricing: Sonic-3.5 is priced at $39/1M characters, a premium compared to Gemini 3.1 Flash TTS at $18.3/1M characters, and Inworld Realtime TTS 1.5 Max at $35/1M characters
➤ Speed: 105.5 characters per second, compared to 205 characters per second for Inworld Realtime TTS 1.5 Max and 26.3 characters per second for Gemini 3.1 Flash TTS

See more details and listen to samples below 🧵
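As a rough reading of these Elo gaps, the sketch below applies the standard logistic Elo formula with a 400-point scale. That formula is an assumption here: the post does not say how Artificial Analysis computes or scales its arena scores, and the function name is illustrative.

```python
def elo_expected_score(rating_a, rating_b, scale=400.0):
    """Expected head-to-head score of A against B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    sonic, gemini, inworld = 1218, 1209, 1194  # ratings quoted in the post
    print(round(elo_expected_score(sonic, gemini), 3))
    print(round(elo_expected_score(sonic, inworld), 3))
```

Under that assumption, a 9-point lead corresponds to roughly a 51% expected preference rate, which is a small edge relative to the quoted +16/-16 interval.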

Aviv Bick retweeted

Mingkai Deng@mdeng34·22 May

Frontier LLMs are converging on efficient, adaptive reasoning. Opus 4.7 lets the model decide how deeply to reason. GPT-5.5 achieves strong results with fewer reasoning tokens. We study a related but more structural question: what 𝗸𝗶𝗻𝗱 𝗼𝗳 𝗿𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴 should we adapt? Last year in SiRA (upper figure), we showed that simulative reasoning (System II), which uses a 𝘄𝗼𝗿𝗹𝗱 𝗺𝗼𝗱𝗲𝗹 to evaluate consequences of actions, yields up to 124% improvement over reactive baselines (System I), and that strong reasoning models (o1, o3-mini) fail as planners without this structure. In our new paper SR²AM (lower figure), we add a learned 𝗰𝗼𝗻𝗳𝗶𝗴𝘂𝗿𝗮𝘁𝗼𝗿 (System III) that self-regulates when to simulate, how far ahead, and when to skip planning entirely. Efficient reasoning is not just shorter reasoning: it is better allocation of simulation.

Aviv Bick retweeted

Han Guo@HanGuo97·22 May

LLM training is built on fast MatMuls. But many surrounding ops still run as memory-bound kernels. CODA reparameterizes them to hide in the matmul’s shadow, fused into its epilogue before results leave the chip. Bonus: LLMs can write fast CODA kernels too (approaching SoLs).

Aviv Bick retweeted

Arshia Afzal@rshia_afz·17 May

Raven is now also available in fla! Enjoy playing with it 🐦‍⬛. Special thanks to the amazing fla team 🎉! github.com/fla-org/flash-…

Aviv Bick retweeted

Arshia Afzal@rshia_afz·12 May

I wanted to brag about this earlier, but here we go. I’m super excited to say I got a spot in the legendary @ycombinator Startup School 2026 🎉! I might need a bit of help getting a U.S. visa because of my nationality. Is there anyone who could help? @extraordinary maybe 🥺?! Also, I can’t believe I got a visa support letter for the Startup School from the legend @garrytan! I’d really appreciate it if anyone could share this post 🙏 so I can hopefully find a way to join this amazing event!

Aviv Bick retweeted

Arshia Afzal@rshia_afz·9 May

Raven🐦‍⬛ vs other linear models when it comes to recall…

Arshia Afzal@rshia_afz

1/ SSMs struggle on recall benchmarks due to their fixed-size state. But are current models actually storing context “wisely”? Introducing Raven 🐦‍⬛, the first SSM with selective memory allocation! Raven achieves SOTA performance on recall-heavy tasks with the highest length generalization, extending up to 16× beyond its training sequence length. Raven is a strict upgrade over SWA in the way it stores past context! This is the most elegant model I’ve been involved in designing so far. Shoutout to @avivbick and @_albertgu for their trust and amazing work! Check out how Raven bridges between SWA and SSM 👇

Aviv Bick@avivbick·7 May

Yes, in principle delta = 0 could preserve memory indefinitely. But in practice, exact zero is hard to learn, since the model has to coordinate two things: keep decay at 1 and make the new update 0. Even then, it doesn’t solve write allocation: where does new information go without interfering with what’s already stored? Raven makes this structural: unselected slots are frozen by routing, while other slots remain available for new writes.
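A toy sketch of the distinction drawn here, under loud assumptions: the thread does not give Raven's update rule, so the slot layout, the round-robin router, and both step functions below are hypothetical stand-ins. The only point illustrated is that a slot which is never selected stays bit-for-bit unchanged, whereas a dense update with decay slightly below 1 both fades and overwrites it.

```python
def dense_step(slots, decay, write):
    """Hypothetical dense update: every slot decays and absorbs the new write."""
    return [decay * s + write for s in slots]


def routed_step(slots, selected, decay, write):
    """Hypothetical routed update: only `selected` changes; other slots are copied unchanged."""
    return [decay * s + write if i == selected else s
            for i, s in enumerate(slots)]


def simulate(steps=1000, decay=0.99, write=0.1):
    """Store 5.0 in slot 0, then stream `steps` writes through both schemes."""
    dense = [5.0, 0.0, 0.0, 0.0]
    routed = [5.0, 0.0, 0.0, 0.0]
    for t in range(steps):
        dense = dense_step(dense, decay, write)
        # Assumed router: cycles over slots 1..3 and never selects slot 0.
        routed = routed_step(routed, 1 + t % 3, decay, write)
    return dense, routed


if __name__ == "__main__":
    dense, routed = simulate()
    print(dense[0])   # drifts far from 5.0: decay plus interference from new writes
    print(routed[0])  # 5.0: the slot was never selected, so it never changed
```

The dense case needs decay to be exactly 1 and the write to be exactly 0 to match this; the routed case gets it from the selection alone, and the remaining slots stay free for new writes.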

Floatingtrees@floatingtrees·7 May

@avivbick @rshia_afz @CevherLIONS @ericxing @_albertgu Do you think this advantage over SSMs will scale up well? It seems like a sufficiently trained SSM will be able to predict a delta t vector that approaches 0 on the information it wants to preserve.

Aviv Bick@avivbick·7 May

9/ We’ve had enough memory capacity all along: smarter allocation beats more capacity. 🐦‍⬛ Selective persistence still has plenty left to uncover, but we’re excited by where this primitive could go. For more insights, see @rshia_afz’s take: x.com/rshia_afz/stat…

Arshia Afzal@rshia_afz

Aviv Bick@avivbick·7 May

8/ Hybrids benefit too. Mixing Raven with NoPE attention gets 95.4% on NIAH-2 at 16K and 80.8% at 64K -- while comparable Mamba-2/GDN hybrids stay below 5%! Selective persistent slots + precise attention = powerful combo
