Aviv Bick

133 posts

@avivbick

CS PhD @ CMU https://t.co/tCKYiUbOdr | https://t.co/zdDNelFVJO

Joined January 2024
22 Following · 636 Followers
Pinned Tweet
Aviv Bick
Aviv Bick@avivbick·
SSMs fail on recall tasks they have the capacity to solve. The two dominant approaches today, SSMs and sliding-window attention, both lack persistence: memory either decays over time or gets evicted. We built Raven to fix this, surpassing all prior linear models even at 16× their training sequence length. 🧵🐦‍⬛
5
58
396
52.3K
Aviv Bick retweeted
Eric Xing
Eric Xing@ericxing·
With the rise of LLM systems marketed as "coding agents", "AI co-scientists", etc. that promise to drive up productivity, and at the same time an outcry of "existential" concerns about AI escaping human control with destructive power under a speculative "machine agency" against humans, there has been a lot of confusion about "What is an agent?" and "What constitutes agency?" It has become essential to clarify where automation ends and agency begins. Also, recent developments in world models and action models are trending toward mixing future prediction/simulation and action/plan generation within a single architecture such as a VLM, conflating reward-driven action selection with fidelity-driven next-state prediction and undermining the reliability of both planning and simulation. In this paper we analyze agent architectures along the axes of goal, identity, decision-making, self-regulation, and learning, and argue that genuine agency requires these structures to be internalized within the system itself rather than assembled through external scaffolding. We propose a "Goal-Identity-Configurator" (GIC) architecture for a general-purpose agent model, combining hierarchical goal decomposition, identity evolution, simulative reasoning grounded in a separately trained world model, learned self-regulation, and self-directed learning from both real and simulated experience. Auditability, controllability, and safety of systems that possess greater autonomy and "agency" but remain under human oversight can be better built with the GIC architecture, which offers transparency, modularity, and checkpoints. @mdeng34 , @jinyuhou0 openreview.net/forum?id=6fDZY…
3
10
38
6.2K
Aviv Bick retweeted
Anthropic
Anthropic@AnthropicAI·
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…
12.6K
25.8K
88.3K
91.8M
Aviv Bick retweeted
Elon Litman
Elon Litman@elon_lit·
Gradient descent on neural networks frequently drives the sharpest Hessian eigenvalue to exactly 2/learning_rate. This is the Edge of Stability. For five years, ML theory has failed to explain why this happens globally from any initialization. Until now. 🧵
13
62
510
59.6K
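For readers wondering where the 2/learning_rate figure in the post above comes from, here is a minimal sketch on a one-dimensional quadratic. It only illustrates the classical stability threshold of gradient descent, not the Edge of Stability result the thread announces. The function name `run_gd` and the chosen curvature are assumptions made for illustration.

```python
def run_gd(curvature, lr, steps=50, x0=1.0):
    """Gradient descent on f(x) = 0.5 * curvature * x**2.

    Each step multiplies x by (1 - lr * curvature), so the iterates
    shrink only when |1 - lr * curvature| < 1, i.e. lr < 2 / curvature.
    """
    x = x0
    for _ in range(steps):
        x -= lr * curvature * x  # gradient of f at x is curvature * x
    return abs(x)


curvature = 4.0                 # plays the role of the sharpest Hessian eigenvalue
threshold = 2.0 / curvature     # classical stability limit for the learning rate

below = run_gd(curvature, 0.9 * threshold)  # slightly under the limit: converges
above = run_gd(curvature, 1.1 * threshold)  # slightly over the limit: diverges
print(f"lr below threshold -> |x| = {below:.2e}")
print(f"lr above threshold -> |x| = {above:.2e}")
```

Equivalently, for a fixed learning rate, a curvature above 2/learning_rate makes the iterates blow up along that direction, which is why that value is the natural boundary for sharpness.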
Aviv Bick retweeted
Albert Gu
Albert Gu@_albertgu·
Extremely proud of the team @cartesia for launching Sonic 3.5, which sets a new state of the art for TTS. I personally led the technical direction of this model; we built it from the ground up, from first principles, and it contains multiple non-trivial ideas that differ substantially from anything we’ve seen in the literature. It’s been very gratifying to see research bets play out and the strong research team at Cartesia continue to grow!
Artificial Analysis@ArtificialAnlys

Cartesia’s Sonic-3.5 takes the #1 spot on the Artificial Analysis Speech Arena Leaderboard, surpassing Inworld Realtime TTS 1.5 Max and Google’s Gemini 3.1 Flash TTS

Sonic-3.5 is the latest TTS model from @cartesia . It supports 42 languages, including 9 Indian languages, with 500+ voices available out of the box. The model has been highly preferred among voters in the TTS Arena, with its demonstrated naturalness and accurate transcript following.

Key takeaways:
➤ Quality: Sonic-3.5 has an Elo score of 1,218 (+16/-16) based on 1,144 arena appearances, placing it ahead of Inworld Realtime TTS 1.5 Max at 1,194 and Gemini 3.1 Flash TTS at 1,209
➤ Pricing: Sonic-3.5 is priced at $39/1M characters, a premium compared to Gemini 3.1 Flash TTS at $18.3/1M characters, and Inworld Realtime TTS 1.5 Max at $35/1M characters
➤ Speed: 105.5 characters per second, compared to 205 characters per second for Inworld Realtime TTS 1.5 Max and 26.3 characters per second for Gemini 3.1 Flash TTS

See more details and listen to samples below 🧵

6
19
189
20.8K
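To put the leaderboard numbers quoted above in perspective, here is a rough sketch. It assumes the conventional Elo logistic formula with a 400-point scale; the post does not say which rating model the arena actually uses, so treat the win probability as indicative only. The helper names and the 10,000-character workload are hypothetical.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Expected head-to-head score of A against B under standard Elo (assumed)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def synthesis_cost_usd(num_chars, usd_per_million_chars):
    """Cost of synthesizing num_chars at a per-million-character price."""
    return num_chars * usd_per_million_chars / 1_000_000


def synthesis_seconds(num_chars, chars_per_second):
    """Time to generate num_chars at a given throughput."""
    return num_chars / chars_per_second


# Ratings, price, and speed are the figures from the quoted post.
p_vs_gemini = expected_score(1218, 1209)
p_vs_inworld = expected_score(1218, 1194)
cost = synthesis_cost_usd(10_000, 39.0)      # hypothetical 10,000-character job
seconds = synthesis_seconds(10_000, 105.5)

print(f"Sonic-3.5 vs Gemini 3.1 Flash TTS: {p_vs_gemini:.3f}")
print(f"Sonic-3.5 vs Inworld Realtime TTS 1.5 Max: {p_vs_inworld:.3f}")
print(f"10,000 characters: ${cost:.2f}, about {seconds:.0f} s")
```

Under that assumption a 9-point gap is close to a coin flip, which fits the reported +16/-16 interval on the top score.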
Aviv Bick retweeted
Mingkai Deng
Mingkai Deng@mdeng34·
Frontier LLMs are converging on efficient, adaptive reasoning. Opus 4.7 lets the model decide how deeply to reason. GPT-5.5 achieves strong results with fewer reasoning tokens. We study a related but more structural question: what 𝗸𝗶𝗻𝗱 𝗼𝗳 𝗿𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴 should we adapt? Last year in SiRA (upper figure), we showed that simulative reasoning (System II), which uses a 𝘄𝗼𝗿𝗹𝗱 𝗺𝗼𝗱𝗲𝗹 to evaluate consequences of actions, yields up to 124% improvement over reactive baselines (System I), and that strong reasoning models (o1, o3-mini) fail as planners without this structure. In our new paper SR²AM (lower figure), we add a learned 𝗰𝗼𝗻𝗳𝗶𝗴𝘂𝗿𝗮𝘁𝗼𝗿 (System III) that self-regulates when to simulate, how far ahead, and when to skip planning entirely. Efficient reasoning is not just shorter reasoning: it is better allocation of simulation.
4
47
280
62K
Aviv Bick retweeted
Han Guo
Han Guo@HanGuo97·
LLM training is built on fast MatMuls. But many surrounding ops still run as memory-bound kernels. CODA reparameterizes them to hide in the matmul’s shadow, fused into its epilogue before results leave the chip. Bonus: LLMs can write fast CODA kernels too (approaching SoLs).
16
103
687
199.3K
Aviv Bick retweeted
Arshia Afzal
Arshia Afzal@rshia_afz·
Raven is now available in fla as well! Enjoy playing with it 🐦‍⬛. Special thanks to the amazing fla team 🎉! github.com/fla-org/flash-…
1
9
30
4.5K
Aviv Bick retweeted
Arshia Afzal
Arshia Afzal@rshia_afz·
I wanted to brag about this earlier, but here we go. I’m super excited to say I got a spot in the legendary @ycombinator Startup School 2026 🎉! I might need a bit of help getting a U.S. visa because of my nationality. Is there anyone who could help? @extraordinary maybe 🥺?! Also, can’t believe I got a visa support letter for the Startup School from the legend @garrytan! I’d really appreciate it if anyone could share this post 🙏 so I can hopefully find a way to join this amazing event!
20
6
109
72.9K
Aviv Bick retweeted
Aviv Bick
Aviv Bick@avivbick·
Yes, in principle delta = 0 could preserve memory indefinitely. But in practice, exact zero is hard to learn, since the model has to coordinate two things: keep decay at 1 and make the new update 0. Even then, it doesn’t solve write allocation: where does new information go without interfering with what’s already stored? Raven makes this structural: unselected slots are frozen by routing, while other slots remain available for new writes.
0
0
1
29
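The contrast drawn in the reply above, between memory that decays and slots that are frozen by routing, can be shown with a toy example. This is not Raven's implementation: the scalar state, the hard one-slot write, and the names `decayed_memory` and `routed_write` are all assumptions made for this sketch.

```python
def decayed_memory(value, decay, steps):
    """Scalar state under s_t = decay * s_{t-1}, with no new writes."""
    state = value
    for _ in range(steps):
        state = decay * state
    return state


def routed_write(slots, selected, new_value):
    """Overwrite only the selected slot; every other slot is left untouched."""
    return [new_value if i == selected else s for i, s in enumerate(slots)]


# A decay that is close to 1 but not exactly 1 still erases the stored value.
almost_persistent = decayed_memory(1.0, 0.999, 10_000)
# Exact persistence requires decay == 1, which the reply notes is hard to learn.
exactly_persistent = decayed_memory(1.0, 1.0, 10_000)

# With routing, slot 0 keeps its content while slots 1..3 absorb new writes.
slots = [1.0, 0.0, 0.0, 0.0]
for t in range(1000):
    slots = routed_write(slots, selected=1 + t % 3, new_value=float(t))

print(f"decay 0.999 after 10,000 steps: {almost_persistent:.2e}")
print(f"decay 1.0 after 10,000 steps:   {exactly_persistent}")
print(f"slot 0 after 1,000 routed writes: {slots[0]}")
```

The point of the sketch is that the frozen slot needs no learned parameter to sit at an exact value, and new information has somewhere else to go.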
Aviv Bick
Aviv Bick@avivbick·
8/ Hybrids benefit too. Mixing Raven with NoPE attention gets 95.4% on NIAH-2 at 16K and 80.8% at 64K -- while comparable Mamba-2/GDN hybrids stay below 5%! Selective persistent slots + precise attention = powerful combo
1
0
12
813