PapersAnon

700 posts

@papers_anon

Just a fan of acceleration. I read and post interesting papers. Let's all make it through.

SAITAMA, Joined February 2024
49 Following, 2.4K Followers
Pinned Tweet
PapersAnon @papers_anon
rentry.org/LocalModelsLin… Various links for ML and local models (not just LLMs), kept fairly up to date.
rentry.org/LocalModelsPap… ML papers I've read that I think are interesting. I also keep a text file of all the abstracts at the top for easy searching.
1
17
140
25.3K
PapersAnon @papers_anon
Came across what could be an interesting benchmark: an old Famicom game called Radical Bomber: Jirai-kun. It's an asymmetrical board game with 1 runner and 4 chasers. The runner can bomb certain connections and has a limited number of double turns. There are some special blocks too. youtube.com/watch?v=A8mPtw…
0
1
2
164
PapersAnon @papers_anon
VoXtream2: Full-stream TTS with dynamic speaking rate control
Combines a distribution matching mechanism over duration states with CFG across conditioning signals to improve controllability and synthesis quality. Runs 4 times faster than real time on a consumer GPU.
Links below
1
4
10
761
PapersAnon @papers_anon
Speculative Speculative Decoding
The draft model predicts likely verification outcomes and prepares speculations for them pre-emptively. If the actual verification outcome is in the predicted set, a speculation can be returned immediately, eliminating drafting overhead.
Links below
3
5
42
2.3K
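The "prepare ahead, then look up" idea in the tweet above can be sketched as a small cache keyed by verification outcome. This is only a toy illustration, not the paper's implementation: `prepare_speculations`, `next_speculation`, `toy_draft`, and the `(accepted_count, bonus_token)` outcome encoding are all hypothetical names and assumptions made for this example.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Assumption: a verification outcome is encoded as
# (number of accepted draft tokens, bonus token chosen by the verifier).
Outcome = Tuple[int, str]
Tokens = List[str]


def prepare_speculations(
    predicted_outcomes: Iterable[Outcome],
    draft: Callable[[Outcome], Tokens],
) -> Dict[Outcome, Tokens]:
    """Draft a continuation ahead of time for each predicted outcome."""
    return {outcome: draft(outcome) for outcome in predicted_outcomes}


def next_speculation(
    cache: Dict[Outcome, Tokens],
    actual_outcome: Outcome,
    draft: Callable[[Outcome], Tokens],
) -> Tuple[Tokens, bool]:
    """Return (speculation, cache_hit). A hit skips drafting entirely."""
    if actual_outcome in cache:
        return cache[actual_outcome], True
    # Miss: fall back to drafting now, as ordinary speculative decoding would.
    return draft(actual_outcome), False


def toy_draft(outcome: Outcome) -> Tokens:
    """Hypothetical stand-in for a draft model: emits two placeholder tokens."""
    accepted_count, bonus_token = outcome
    return [f"{bonus_token}+{i}" for i in range(1, 3)]


cache = prepare_speculations([(2, "a"), (3, "b")], toy_draft)
hit_tokens, hit = next_speculation(cache, (3, "b"), toy_draft)
miss_tokens, miss_hit = next_speculation(cache, (1, "z"), toy_draft)
```

In a real system the pre-drafting would run while verification is still in flight; the sketch only shows the lookup logic.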
PapersAnon @papers_anon
Multi-Head Low-Rank Attention
Novel attention mechanism with native 4-way tensor parallelism support. At 2.9B scale achieves SOTA performance on perplexity and zero-shot common-sense reasoning benchmarks. 2.8× decoding speedup over MLA.
Links below
1
13
126
12.6K
PapersAnon @papers_anon
Aletheia tackles FirstProof autonomously
From DeepMind. Autonomously solved 6 of the 10 problems (2, 5, 7, 8, 9, 10) according to majority expert assessments. The paper notes that Problem 8 was the only one on which experts were not unanimous.
Links below
1
0
6
451
PapersAnon @papers_anon
Adam Improves Muon: Adaptive Moment Estimation with Orthogonalized Momentum
Scales orthogonalized momentum using a single adaptive stepsize, preserving orthogonality while improving upon Muon at negligible additional cost.
Links below
3
12
103
7.9K
PapersAnon @papers_anon
HiFloat4 Format for Language Model Inference
Packs 64 4-bit elements with 32 bits of shared scaling metadata, averaging 4.5 bits per value. Achieves higher average accuracy than the state-of-the-art NVFP4 format across multiple models and diverse downstream tasks.
Links below
3
6
26
1.6K
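The 4.5 bits per value figure in the HiFloat4 tweet follows from the block layout it describes: (64 × 4 + 32) / 64. A quick check, where the helper name `average_bits_per_value` is just an illustrative choice and not from the paper:

```python
def average_bits_per_value(
    elements_per_block: int, bits_per_element: int, metadata_bits: int
) -> float:
    """Average storage cost per element, amortizing shared block metadata."""
    total_bits = elements_per_block * bits_per_element + metadata_bits
    return total_bits / elements_per_block


# Numbers taken from the tweet: 64 elements, 4 bits each, 32 shared bits.
hifloat4_bits = average_bits_per_value(64, 4, 32)
print(hifloat4_bits)  # 4.5
```

The shared metadata adds 32 / 64 = 0.5 bits per element on top of the 4-bit payload.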
PapersAnon @papers_anon
MoEEdit: Efficient and Routing-Stable Knowledge Editing for Mixture-of-Experts LLMs
Reparameterizes expert updates via per-expert null-space projections that keep router inputs invariant and thereby suppress routing shifts.
Links below
1
1
4
518
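For readers unfamiliar with null-space projection, here is a generic sketch of the underlying linear algebra, not MoEEdit's actual per-expert procedure. It assumes a single protected input vector `key` (a simplification; the names `matvec` and `project_update_to_null_space` are hypothetical) and modifies a weight update so that it has no effect on that input.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def project_update_to_null_space(delta: Matrix, key: Vector) -> Matrix:
    """Return delta' = delta - (delta @ key) key^T / (key^T key).

    By construction delta' @ key == 0, so adding delta' to a weight matrix
    leaves the layer's output for `key` unchanged, while inputs orthogonal
    to `key` still see the full update.
    """
    key_norm_sq = sum(x * x for x in key)
    response = matvec(delta, key)
    return [
        [d - r * k / key_norm_sq for d, k in zip(row, key)]
        for row, r in zip(delta, response)
    ]


delta = [[1.0, 2.0], [3.0, 4.0]]   # raw update (toy values)
key = [1.0, 1.0]                   # input whose output must not change
projected = project_update_to_null_space(delta, key)
effect_on_key = matvec(projected, key)
effect_on_orthogonal = matvec(projected, [1.0, -1.0])
```

With several protected inputs the same idea uses a projector built from all of them rather than one vector.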
PapersAnon @papers_anon
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos
From Nvidia. Foundation world model that learns diverse interactions and dexterous controls from 44k hours of egocentric human videos.
Links below
1
0
6
678
PapersAnon @papers_anon
SimpleGPT: Improving GPT via A Simple Normalization Strategy
Uses SimpleNorm, which reduces the norm of the Hessian of the loss with respect to the activations, enabling substantially larger admissible learning rates. Achieves training loss 0.08 lower than LLaMA2 with QKNorm.
Links below
2
1
13
578
PapersAnon @papers_anon
DiffuSpeech: Silent Thought, Spoken Answer via Unified Speech-Text Diffusion
Introduces a paradigm where speech LLMs generate internal text reasoning alongside spoken responses, with thinking traces informing speech quality.
Links below
1
1
5
410