Razvan Dumitru

15 posts

@RazvanDuu

Joined June 2023

55 Following · 16 Followers

@rjsabouhi @maenstru56 @Vikas_NLP_UA @PanLiangming Exactly. ConciseRL formalizes that intuition: encode “be concise yet correct” in the reward, let PPO follow the gradient, and the attractors are short-but-sufficient chains. Appreciate the gradient-field lens!

RJ Sabouhi @rjsabouhi · Sep 5

@RazvanDuu @maenstru56 @Vikas_NLP_UA @PanLiangming You already knew the attractors were encoded in the field. This makes it formal. “Gradient fields” learn the subtask structure hidden in the reward. Latent constraint flow as behavior. Symbolic recursion in motion.

Razvan Dumitru @RazvanDuu · Sep 5

Thrilled to share that my first-author paper ConciseRL is accepted to EMNLP 2025 (Findings). We train models to be right and succinct with a simple LLM-judged conciseness reward. Explanations below 👇 Paper: arxiv.org/abs/2505.17250 Code: github.com/RazvanDu/Conci… (1/6)

Razvan Dumitru @RazvanDuu · Sep 5

+2.2 accuracy points using ~12.5× fewer tokens on TheoremQA. Length adapts to difficulty; traces read cleaner; drops smoothly into existing RL pipelines. (5/6)

Razvan Dumitru @RazvanDuu · Sep 5

On MATH: up to 31× fewer tokens on easier problems with ~+7 accuracy points; on the hardest tier: ~3.6× fewer tokens with +7.5 accuracy points. (4/6)

Razvan Dumitru @RazvanDuu · Sep 5

Reward semantic conciseness (not just shortness). An LLM judge scores whether a chain is dense and sufficient; combine with correctness so we trim fat without losing substance. (3/6)
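
One way to picture "combine with correctness" is the minimal sketch below. It is not the paper's reward: the multiplicative gating, the `weight` parameter, the `Rollout` type, and the assumption that the judge returns a score in [0, 1] are all illustrative choices made here.

```python
from dataclasses import dataclass


@dataclass
class Rollout:
    """Hypothetical container for one sampled reasoning chain."""
    answer: str     # final answer extracted from the chain
    reasoning: str  # the reasoning trace that the judge would score


def combined_reward(rollout: Rollout, reference_answer: str,
                    judge_score: float, weight: float = 0.5) -> float:
    """Combine correctness with a judge-assigned conciseness score.

    judge_score is assumed to lie in [0, 1], where 1 means the chain is
    dense and sufficient. How that score is produced is out of scope here.
    """
    if not 0.0 <= judge_score <= 1.0:
        raise ValueError("judge_score must be in [0, 1]")
    correct = 1.0 if rollout.answer.strip() == reference_answer.strip() else 0.0
    # Assumption: gate the conciseness bonus on correctness, so a short
    # but wrong chain earns nothing.
    return correct * (1.0 + weight * judge_score)
```

Under this gating, a correct chain always scores at least 1.0 and a wrong one scores 0.0, so the conciseness term can only rank correct chains against each other.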

Razvan Dumitru @RazvanDuu · Sep 5

Reasoning models overthink—long chains past the answer. That burns tokens, slows inference, and muddies evaluation. (2/6)

Razvan Dumitru @RazvanDuu · Sep 5

Co-authors: @Yminglai @Vikas_NLP_UA @msurd — thank you! See you in Suzhou, Nov 4–9. 🙏 #EMNLP2025 (6/6)

Razvan Dumitru @RazvanDuu · Sep 5

Big news: My first-author paper CopySpec is accepted to EMNLP 2025 (Main). If the next tokens just repeat the context, we stop re-writing and copy (then verify). I’ll break it down in the replies 👇 Paper: arxiv.org/abs/2502.08923 Code: github.com/RazvanDu/CopyS… (1/6)

Razvan Dumitru @RazvanDuu · Sep 5

Up to 3.08× faster (second-turn MT-Redundant) and +49% on top of speculative decoding; benefits grow with longer context. Details in the paper. (5/6)

Razvan Dumitru @RazvanDuu · Sep 5

Drop-in on top of speculative decoding: rolling hashes find matches → a copy-and-verify path runs alongside your drafter → target model accepts/rejects. No special memory tricks. (4/6)
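
The match, copy, verify flow can be sketched roughly as follows. This is a toy illustration, not the CopySpec implementation: a dictionary keyed on token windows stands in for the rolling hash, `target_next_token` is a hypothetical stand-in for the target model, and verification is done token by token rather than in one batched pass.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def build_index(tokens: Sequence[int], k: int) -> Dict[Tuple[int, ...], int]:
    """Map each k-token window to the position right after its latest occurrence."""
    index: Dict[Tuple[int, ...], int] = {}
    for i in range(len(tokens) - k + 1):
        index[tuple(tokens[i:i + k])] = i + k
    return index


def propose_copy(context: Sequence[int], generated: Sequence[int],
                 k: int, block_len: int) -> List[int]:
    """If the last k generated tokens occur in the context, propose what followed them."""
    if len(generated) < k:
        return []
    start = build_index(context, k).get(tuple(generated[-k:]))
    if start is None:
        return []
    return list(context[start:start + block_len])


def verify(proposal: Sequence[int], generated: Sequence[int],
           target_next_token: Callable[[List[int]], int]) -> List[int]:
    """Keep the longest prefix of the proposal that the target model agrees with."""
    accepted: List[int] = []
    for tok in proposal:
        if target_next_token(list(generated) + accepted) != tok:
            break  # first disagreement ends the copied block
        accepted.append(tok)
    return accepted
```

Because only tokens the target model would have produced anyway are accepted, the output stays the same; the saving comes from proposing several tokens at once. A production version would update the index incrementally instead of rebuilding it on every call.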

Razvan Dumitru @RazvanDuu · Sep 5

Real workflows repeat: multi-turn chat, RAG follow-ups, summaries, code edits. That redundancy is free speed if you can safely reuse instead of regenerate. (3/6)

Razvan Dumitru @RazvanDuu · Sep 5

CopySpec watches the running text, spots a chunk that already exists in the context/previous turn, proposes a “copy block,” and lets the target model check it. Same output, less generation. (2/6)

Razvan Dumitru retweeted

Vikas Yadav @Vikas_NLP_UA · Jun 11

🎉 Our work “Variable Layerwise Quantization: A Simple and Effective Approach to Quantize LLMs” is accepted at #ACLFindings2025 📎 arxiv.org/abs/2406.17415 — Keep key layers high-precision, push others lower → compact LLMs w/ ~no accuracy loss — Simple LIM & ZD scores rank layers
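
The "rank layers, keep key ones high-precision" idea might look like the sketch below. It does not compute the LIM or ZD scores from the paper: `scores` is a hypothetical list of per-layer importance values, and the bit widths, the `keep_fraction` parameter, and the convention that a higher score means a more important layer are assumptions made for illustration.

```python
from typing import List, Sequence


def assign_bits(scores: Sequence[float], high_bits: int = 8,
                low_bits: int = 4, keep_fraction: float = 0.25) -> List[int]:
    """Give the top-scoring fraction of layers high_bits and the rest low_bits."""
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in [0, 1]")
    n_high = int(len(scores) * keep_fraction)
    # Layer indices sorted from most to least important (assumed ordering).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    high = set(ranked[:n_high])
    return [high_bits if i in high else low_bits for i in range(len(scores))]


def average_bits(bits: Sequence[int]) -> float:
    """Mean bit width across layers, a rough proxy for model size."""
    return sum(bits) / len(bits)
```

Sweeping `keep_fraction` trades model size against how many layers stay at higher precision.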

Razvan Dumitru retweeted

Darius Peteleaza @maenstru56 · Jul 19

🧵Excited to attend #ICML2024 and present our paper titled "Enhancing Transformer RNNs with Multiple Temporal Perspectives" at the "Next Generation of Sequence Modeling Architectures" workshop! #AI
