alphaXiv
@askalphaxiv

High fidelity research

1.8K posts
Joined November 2023
45 Following · 37.1K Followers

Pinned Tweet
alphaXiv @askalphaxiv ·
Introducing MCP for arXiv. Let your research agents stand on the shoulders of giants: fast multi-turn retrieval, keyword search, and embedding search tools across millions of arXiv papers 🚀
69 replies · 389 reposts · 3K likes · 236.1K views
alphaXiv @askalphaxiv ·
Now available in addition to Gemini and Claude. Check out alphaXiv.org!
0 replies · 0 reposts · 4 likes · 646 views
alphaXiv @askalphaxiv ·
Introducing GLM 5 Turbo for understanding research papers 🚀 Highlight any section of a paper to ask questions, and “@” other papers for quick context, comparisons, and benchmark references.
2 replies · 8 reposts · 65 likes · 4K views
alphaXiv @askalphaxiv ·
Yann LeCun and his team dropped yet another paper! "V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning" In this V-JEPA upgrade, they show that if you make a video model predict every patch, not just the masked ones, AND at multiple layers, vague scene understanding turns into dense, temporally stable features that actually understand "what is where". This key insight drove improvements in segmentation, depth, anticipation, and even robot planning.
alphaXiv tweet media
19 replies · 92 reposts · 618 likes · 33K views
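The "predict every patch, at multiple layers" idea above is a change to the loss shape, not the architecture. A minimal numpy sketch of that contrast (all names and shapes here are my own toy illustration, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_multilayer_loss(layer_preds, layer_targets, mask=None):
    """Average L2 prediction loss over patches and layers.

    mask=None supervises every patch at every layer (dense supervision);
    a boolean mask restricts the loss to masked patches only, which is
    the classic masked-prediction setup.
    """
    losses = []
    for pred, tgt in zip(layer_preds, layer_targets):
        err = ((pred - tgt) ** 2).mean(axis=-1)   # per-patch error
        if mask is not None:
            err = err[mask]                        # masked patches only
        losses.append(err.mean())
    return float(np.mean(losses))                  # average across layers

# toy example: 3 layers, 16 patches, 8-dim features
preds   = [rng.normal(size=(16, 8)) for _ in range(3)]
targets = [rng.normal(size=(16, 8)) for _ in range(3)]
mask = np.zeros(16, dtype=bool)
mask[:4] = True                                    # 4 of 16 patches masked

dense  = dense_multilayer_loss(preds, targets)        # all patches, all layers
masked = dense_multilayer_loss(preds, targets, mask)  # masked patches only
```

The dense variant simply gives the predictor a training signal at every spatial location and depth, which is what the tweet credits for the denser, more stable features.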
alphaXiv @askalphaxiv ·
"Mixture-of-Depths Attention" This paper teaches a Transformer to attend not just across tokens, but also to depth KV from its earlier layers. That helps recover shallow-layer signals that standard residual stacking tends to dilute, improving performance with only a small extra compute cost. Similar idea to Kimi’s Attention Residuals, but MoDA modifies the attention module itself, while AttnRes changes the residual/depth aggregation path.
alphaXiv tweet media
11 replies · 87 reposts · 523 likes · 21.6K views
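In its simplest form, "attend across depth as well as tokens" just means the current layer's queries also see keys/values cached from earlier layers. A hedged numpy sketch of that mechanism, not MoDA's actual module (function names and the flat concatenation are my own simplification):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def depth_augmented_attention(q, kv_per_layer):
    """Attend over keys/values pooled from the current AND earlier layers.

    q: (T, d) queries from the current layer.
    kv_per_layer: list of (K, V) pairs, each (T, d), one per layer so far.
    Concatenating along the sequence axis lets shallow-layer signals be
    retrieved directly instead of surviving only through the residual sum.
    """
    K = np.concatenate([k for k, _ in kv_per_layer], axis=0)  # (L*T, d)
    V = np.concatenate([v for _, v in kv_per_layer], axis=0)  # (L*T, d)
    scores = q @ K.T / np.sqrt(q.shape[-1])                   # (T, L*T)
    return softmax(scores) @ V                                # (T, d)

rng = np.random.default_rng(1)
T, d = 4, 8
kv_cache = [(rng.normal(size=(T, d)), rng.normal(size=(T, d)))
            for _ in range(3)]                 # KV from 3 layers so far
q = rng.normal(size=(T, d))
out = depth_augmented_attention(q, kv_cache)   # (4, 8)
```

The extra cost is the larger score matrix (T × L·T instead of T × T), which matches the tweet's "small extra compute cost" framing.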
alphaXiv @askalphaxiv ·
The current RL setup might not be the right objective for reasoning models.

In our latest AI4Science talk, Fahim (@FahimTajwar10), a PhD student at CMU, presented “Maximum Likelihood Reinforcement Learning”, a new framework for binary-reward tasks like math reasoning, navigation, and program synthesis.

The core idea is simple: standard RL mainly optimizes pass@1, which can over-focus on easy prompts and underuse rare correct rollouts from hard ones. MaxRL instead approximates a maximum-likelihood objective, so the model learns more effectively from those sparse successes.

What makes this especially interesting is that the method seems to preserve much more solution diversity. Across the experiments discussed in the talk, MaxRL led to stronger pass@k, less diversity collapse, and better inference efficiency than standard REINFORCE or GRPO, especially when sampling multiple rollouts matters.

A very interesting talk if you’re into reasoning LLMs and RL!
alphaXiv tweet media
3 replies · 15 reposts · 100 likes · 8.7K views
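The pass@1 vs maximum-likelihood distinction can be made concrete with per-rollout weights. This is only a sketch of the intuition, not the paper's actual estimator (the function names and the 1/p̂ weighting below are my own simplification of "gradient of log p(success)"):

```python
import numpy as np

def reinforce_weights(rewards):
    """Standard REINFORCE on binary rewards: each rollout is weighted by
    its own reward, so every correct rollout counts the same regardless
    of how hard the prompt is."""
    return rewards.astype(float)

def maxrl_weights(rewards, eps=1e-6):
    """Maximum-likelihood-style weighting: the gradient of log p(success)
    scales correct rollouts by 1 / p_hat, so a rare success on a hard
    prompt contributes far more than a routine success on an easy one."""
    p_hat = rewards.mean()                 # empirical success rate
    return rewards / (p_hat + eps)

easy = np.array([1, 1, 1, 0])   # 75% of rollouts succeed
hard = np.array([1, 0, 0, 0])   # only 25% succeed

easy_w = maxrl_weights(easy)    # correct rollouts weighted ~1.33
hard_w = maxrl_weights(hard)    # the lone correct rollout weighted ~4
```

Under REINFORCE both prompts' correct rollouts get weight 1, so gradient mass concentrates on easy prompts with many successes; the ML-style weighting rebalances toward the sparse successes the tweet describes.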
alphaXiv @askalphaxiv ·
Mamba 3: Mamba with RoPE! "Improved Sequence Modeling using State Space Principles" They show that state space models can have both speed and performance! In this new iteration, Mamba gets a better recurrence design, complex-valued state tracking, and a MIMO update. It matches or beats prior Mamba-style models (Mamba 2 and Gated DeltaNet) on both language and retrieval tasks, while keeping memory constant and sustaining its fast-decoding benefits.
alphaXiv tweet media
6 replies · 26 reposts · 139 likes · 6K views
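Why SSMs keep memory constant during decoding is easiest to see in the recurrence itself. This is not Mamba 3's update (which adds complex-valued states and a MIMO formulation); it's the plain diagonal SSM recurrence that all Mamba variants build on, with names of my own choosing:

```python
import numpy as np

def ssm_scan(A, B, C, xs):
    """Minimal diagonal state-space recurrence:
        h_t = A * h_{t-1} + B * x_t      (elementwise, diagonal A)
        y_t = C . h_t

    The only memory carried across steps is the state vector h, whose
    size is fixed, so per-token decoding cost and memory are O(1) in
    sequence length (unlike attention's growing KV cache).
    """
    h = np.zeros_like(A)
    ys = []
    for x in xs:
        h = A * h + B * x          # decay old state, mix in new input
        ys.append(float(C @ h))    # readout
    return ys

A = np.array([0.9, 0.5])   # per-channel decay rates
B = np.array([1.0, 1.0])   # input projection
C = np.array([1.0, -1.0])  # output projection
ys = ssm_scan(A, B, C, [1.0, 0.0, 0.0])   # impulse response
```

Feeding an impulse shows the state decaying at two different rates; richer variants (complex states, MIMO) change what `A`, `B`, `C` can express, not this constant-memory structure.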
alphaXiv @askalphaxiv ·
Can AI do real math research? "HorizonMath: Measuring AI Progress Toward Mathematical Discovery with Automatic Verification" This paper turns a vague debate into a measurable test by introducing a contamination-resistant benchmark of 101 mostly unsolved problems across 8 mathematical domains. The solutions are genuinely unknown, but answers can still be checked automatically, so it gives a scalable way to measure whether AI is moving beyond solving known textbook problems toward actual mathematical discovery. The paper shows that today’s frontier models still score near zero overall; GPT 5.4 Pro is the only evaluated model to produce two potentially novel improvements on published baselines, suggesting early signs of progress.
alphaXiv tweet media
8 replies · 41 reposts · 159 likes · 9.3K views
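"Unknown solution, automatically checkable answer" is the structural trick worth noting: each problem ships a verifier instead of an answer key. A toy illustration of that pattern (the example problem and names are mine, nothing here is from the benchmark):

```python
def verify(candidate, checker):
    """A problem ships with an automatic checker instead of a known
    reference solution: anyone can test a candidate answer, even though
    no solution is stored anywhere in the benchmark itself."""
    try:
        return bool(checker(candidate))
    except Exception:
        return False  # malformed candidates simply fail verification

# toy "open-ended" task: find n > 1 making n^2 + n + 41 composite.
# The checker can confirm a hit without listing any solution.
def checker(n):
    value = n * n + n + 41
    return n > 1 and any(value % d == 0 for d in range(2, value))

found = verify(40, checker)       # 40^2 + 40 + 41 = 1681 = 41^2
missed = verify(2, checker)       # 47 is prime, so this candidate fails
```

Contamination resistance follows from the same structure: since no solution text exists to leak into training data, a model can only score by actually producing a verifiable answer.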
alphaXiv @askalphaxiv ·
correction: The talk starts at 12pm PT, March 18th!
alphaXiv @askalphaxiv

RL Isn’t Actually Optimizing What We Think, And That’s a Problem

Come join us for this AI4Science talk: Maximum Likelihood Reinforcement Learning (MaxRL). In this session, the author of MaxRL, @FahimTajwar10, will cover their paper, which takes a step back and asks a fundamental question: when we use RL for tasks like reasoning, coding, or navigation, are we even optimizing the right objective?

Whether you’re working on RL, LLM reasoning, or just curious about how training objectives shape model behavior, this is one to check out.

🗓 Wednesday March 18th 2026 · 10AM PT
🎙 Featuring Fahim Tajwar
💬 Casual Talk + Open Discussion

0 replies · 5 reposts · 19 likes · 3.6K views
alphaXiv @askalphaxiv ·
"Autonomous Agents Coordinating Distributed Discovery Through Emergent Artifact Exchange" This paper shows how to turn AI from a single helpful assistant into a decentralized scientific lab. With many autonomous agents independently run tools, exchange traceable research artifacts, and converge on new discoveries without a central boss! This approach seems more practical because it points toward scalable, auditable crowdsourced science by machines, rather than relying on one giant model or a rigid top-down coordinator to manage the whole research loop. This matters because it makes autonomous science more parallel, modular, and inspectable, and closer simulate real scientific communities work.
alphaXiv tweet media
15 replies · 43 reposts · 221 likes · 11.3K views
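"Traceable research artifacts" suggests content-addressed records with provenance links, so any agent can audit how a discovery was built. A hedged stdlib sketch of one way that could look (the schema and field names are my own guess, not the paper's format):

```python
import hashlib
import json

def make_artifact(content, parents=()):
    """A traceable research artifact: content plus the ids of the
    artifacts it builds on. The id is a hash of both, so provenance
    links are tamper-evident and auditable by any agent."""
    body = {"content": content, "parents": sorted(parents)}
    blob = json.dumps(body, sort_keys=True).encode()
    body["id"] = hashlib.sha256(blob).hexdigest()[:16]
    return body

# one agent publishes a dataset; another builds a finding on top of it
dataset = make_artifact("dataset: raw measurements, run 1")
finding = make_artifact("finding: effect size 0.3 on run 1",
                        parents=[dataset["id"]])
# finding records the dataset's id, so the discovery chain is inspectable
```

This is the "inspectable" property in miniature: instead of trusting a central coordinator, anyone holding the artifacts can walk the parent links and recompute the hashes.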
alphaXiv @askalphaxiv ·
“Attention Residuals” is now available on alphaXiv! In a standard transformer, every layer just inherits an equal sum of all earlier layers, so as models get deeper, useful computations get diluted instead of being selectively reused. The research team at @Kimi_Moonshot proposes Attention Residuals, which fixes this by letting each layer attend over previous layers with learned weights, so depth works more like retrieval than accumulation. This makes training more stable, improving scaling and downstream performance with almost no extra overhead: specifically, a 1.25x compute advantage over the standard transformer baseline.
alphaXiv tweet media
13 replies · 49 reposts · 423 likes · 16.5K views
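The equal-sum vs learned-weights contrast above can be shown in a few lines. This toy collapses Kimi's mechanism to a single softmax weight per earlier layer (the real method learns the weighting within the model; names here are mine):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def plain_residual(layer_outputs):
    """Standard residual stream: every earlier layer contributes with
    equal weight, so deep stacks dilute individual computations."""
    return np.sum(layer_outputs, axis=0)

def attention_residual(layer_outputs, logits):
    """Learned softmax weights over earlier layers: depth behaves like
    retrieval, selectively reusing the layers that matter. `logits`
    stands in for trained parameters."""
    w = softmax(logits)                           # (L,) mixing weights
    return np.tensordot(w, layer_outputs, axes=1) # weighted sum, (d,)

rng = np.random.default_rng(2)
outs = rng.normal(size=(4, 8))          # outputs of 4 earlier layers, d=8
logits = np.array([0.0, 0.0, 0.0, 5.0]) # strongly prefer the last layer
mix = attention_residual(outs, logits)  # ≈ outs[3], the "retrieved" layer
```

With a large logit the mixture nearly recovers one specific layer, which an equal-weight sum can never do; that selectivity is the claimed mechanism behind the stability and scaling gains.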