PapersAnon

700 posts

@papers_anon

Just a fan of acceleration. I read and post interesting papers. Let's all make it through.

SAITAMA · Joined February 2024

49 Following · 2.4K Followers

Pinned Tweet

PapersAnon@papers_anon·24 Jun

rentry.org/LocalModelsLin… Various links for ML and local models (not just LLMs), kept fairly up to date.
rentry.org/LocalModelsPap… ML papers I've read that I think are interesting. I also keep a text file of all the abstracts at the top for easy searching.

PapersAnon@papers_anon·1d

Came across what could be an interesting benchmark: an old Famicom game called Radical Bomber: Jirai-kun. It's an asymmetrical board game with 1 runner and 4 chasers. The runner can bomb certain connections and has a limited number of double turns. There are some special blocks too. youtube.com/watch?v=A8mPtw…

PapersAnon@papers_anon·3d

arxiv.org/abs/2603.13518
herimor.github.io/voxtream2/ (page not live yet)
huggingface.co/herimor (will probably be posted here)
Some interesting papers I keep updated: rentry.org/LocalModelsPap…

PapersAnon@papers_anon·3d

VoXtream2: Full-stream TTS with dynamic speaking rate control
Combines a distribution matching mechanism over duration states with CFG across conditioning signals to improve controllability and synthesis quality. Runs 4 times faster than real time on a consumer GPU.
Links below

PapersAnon@papers_anon·4 Mar

arxiv.org/abs/2603.03251
github.com/tanishqkumar/s… (repo isn't live yet)
Some interesting papers I keep updated: rentry.org/LocalModelsPap…

PapersAnon@papers_anon·4 Mar

Speculative Speculative Decoding
The draft model predicts likely verification outcomes and prepares speculations pre-emptively for them. If the actual verification outcome is in the predicted set, a speculation can be returned immediately, eliminating drafting overhead.
Links below
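
The cache-hit idea in the post can be illustrated with a toy sketch. This is not the paper's implementation: `prepare_speculations`, `next_speculation`, `toy_draft`, and the `(accepted_count, bonus_token)` encoding of a verification outcome are all hypothetical names chosen for illustration, and the preparation step runs sequentially here, whereas the point of the technique is that it would overlap with verification.

```python
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

Draft = List[int]  # a drafted sequence of token ids


def prepare_speculations(
    predicted_outcomes: Iterable[Hashable],
    draft_fn: Callable[[Hashable], Draft],
) -> Dict[Hashable, Draft]:
    """Pre-compute one draft for each predicted verification outcome."""
    return {outcome: draft_fn(outcome) for outcome in predicted_outcomes}


def next_speculation(
    actual_outcome: Hashable,
    cache: Dict[Hashable, Draft],
    draft_fn: Callable[[Hashable], Draft],
) -> Tuple[Draft, bool]:
    """Return (draft, cache_hit). On a hit, no new drafting work is done."""
    if actual_outcome in cache:
        return cache[actual_outcome], True
    # Miss: fall back to drafting after verification, as in ordinary
    # speculative decoding.
    return draft_fn(actual_outcome), False


# Toy stand-in for a draft model (assumption): it "continues" from the bonus token.
calls: List[Hashable] = []


def toy_draft(outcome: Tuple[int, int]) -> Draft:
    calls.append(outcome)
    _accepted_count, bonus_token = outcome
    return [bonus_token + 1, bonus_token + 2]


cache = prepare_speculations([(3, 10), (2, 7)], toy_draft)
hit_draft, hit = next_speculation((3, 10), cache, toy_draft)
miss_draft, miss_hit = next_speculation((1, 5), cache, toy_draft)
print(hit, hit_draft)        # True [11, 12]
print(miss_hit, miss_draft)  # False [6, 7]
```

In this toy, the hit path returns a draft without calling `toy_draft` again, which is the "eliminating drafting overhead" case; the miss path costs one extra draft call.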

PapersAnon@papers_anon·3 Mar

arxiv.org/abs/2603.02188
github.com/SongtaoLiu0823…
huggingface.co/Soughing/MLRA
Some interesting papers I keep updated: rentry.org/LocalModelsPap…

PapersAnon@papers_anon·3 Mar

Multi-Head Low-Rank Attention
Novel attention mechanism with native 4-way tensor parallelism support. At 2.9B scale, achieves SOTA performance on perplexity and zero-shot common-sense reasoning benchmarks, with a 2.8× decoding speedup over MLA.
Links below

PapersAnon@papers_anon·25 Feb

arxiv.org/abs/2602.21201
github.com/google-deepmin…
arxiv.org/abs/2602.05192 (FirstProof challenge paper)
daniellitt.com/blog/2026/2/20… (interesting article about FirstProof)
Some interesting papers I keep updated: rentry.org/LocalModelsPap…

PapersAnon@papers_anon·25 Feb

Aletheia tackles FirstProof autonomously
From DeepMind. Autonomously solved 6 of the 10 problems (2, 5, 7, 8, 9, 10) according to majority expert assessments; the authors note that Problem 8 was the only one on which the experts were not unanimous.
Links below

PapersAnon@papers_anon·20 Feb

arxiv.org/abs/2602.17080
github.com/minxin-zhg/namo
Some interesting papers I keep updated: rentry.org/LocalModelsPap…

PapersAnon@papers_anon·20 Feb

Adam Improves Muon: Adaptive Moment Estimation with Orthogonalized Momentum
Scales orthogonalized momentum using a single adaptive stepsize, preserving orthogonality while improving upon Muon at negligible additional cost.
Links below

PapersAnon@papers_anon·13 Feb

arxiv.org/abs/2602.11287
Some interesting papers I keep updated: rentry.org/LocalModelsPap…

PapersAnon@papers_anon·13 Feb

HiFloat4 Format for Language Model Inference
Packs 64 4-bit elements with 32 bits of shared scaling metadata, averaging 4.5 bits per value. Achieves higher average accuracy than the state-of-the-art NVFP4 format across multiple models and diverse downstream tasks.
Links below
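
The 4.5 bits-per-value figure follows directly from the block layout quoted in the post. A minimal check of the arithmetic, where the helper name `bits_per_value` is made up for this example and nothing about the actual bit layout of the metadata is assumed:

```python
# Figures taken from the post: 64 elements per block, 4 bits each,
# plus 32 bits of scaling metadata shared by the whole block.
ELEMENTS_PER_BLOCK = 64
BITS_PER_ELEMENT = 4
SHARED_METADATA_BITS = 32


def bits_per_value(n_elements: int, element_bits: int, metadata_bits: int) -> float:
    """Average storage per value, amortizing shared metadata over the block."""
    total_bits = n_elements * element_bits + metadata_bits
    return total_bits / n_elements


hifloat4_bpv = bits_per_value(ELEMENTS_PER_BLOCK, BITS_PER_ELEMENT, SHARED_METADATA_BITS)
print(hifloat4_bpv)  # 4.5  (256 payload bits + 32 metadata bits = 288 bits over 64 values)
```

The same helper works for any block-scaled format: the metadata overhead per value is simply `metadata_bits / n_elements`, here 32 / 64 = 0.5 bits.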

PapersAnon@papers_anon·12 Feb

arxiv.org/abs/2602.10965
github.com/Terence-Gu/MoE…
Some interesting papers I keep updated: rentry.org/LocalModelsPap…

PapersAnon@papers_anon·12 Feb

MoEEdit: Efficient and Routing-Stable Knowledge Editing for Mixture-of-Experts LLMs
Reparameterizes expert updates via per-expert null-space projections that keep router inputs invariant and thereby suppress routing shifts.
Links below

PapersAnon@papers_anon·9 Feb

arxiv.org/abs/2602.06949
dreamdojo-world.github.io (code link not live yet)
Some interesting papers I keep updated: rentry.org/LocalModelsPap…

PapersAnon@papers_anon·9 Feb

DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos
From NVIDIA. Foundation world model that learns diverse interactions and dexterous controls from 44k hours of egocentric human videos.
Links below

PapersAnon@papers_anon·3 Feb

arxiv.org/abs/2602.01212
github.com/Ocram7/SimpleG… (no code posted yet)
Some interesting papers I keep updated: rentry.org/LocalModelsPap…

PapersAnon@papers_anon·3 Feb

SimpleGPT: Improving GPT via A Simple Normalization Strategy
Uses SimpleNorm, which reduces the norm of the Hessian of the loss with respect to the activations, enabling substantially larger admissible learning rates. Achieves a training loss 0.08 lower than LLaMA2 with QKNorm.
Links below

PapersAnon@papers_anon·2 Feb

arxiv.org/abs/2601.22889
Some interesting papers I keep updated: rentry.org/LocalModelsPap…

PapersAnon@papers_anon·2 Feb

DiffuSpeech: Silent Thought, Spoken Answer via Unified Speech-Text Diffusion
Introduces a paradigm in which speech LLMs generate internal text reasoning alongside spoken responses, with thinking traces informing speech quality.
Links below
