Yu-Neng Chuang

24 posts

@YuNengChuang

Joined December 2022

63 Following · 24 Followers

Yu-Neng Chuang retweeted

Feng Luo @FengLuo895614 · Jun 10

🚀 Can LLMs stop overthinking when detailed reasoning isn't needed? Excited to share our latest work on LLM reasoning: AutoL2S 🧠⚡
📄 Paper: arxiv.org/abs/2505.22662
🤖 Model: huggingface.co/amandaa/AutoL2…
LLMs often overthink, generating unnecessarily long CoTs even for easy questions, which increases cost and latency. We propose Auto Long-Short Reasoning (AutoL2S): a model-agnostic framework that dynamically chooses long or short reasoning based on question complexity.
💡 Just add a token. That's all it takes to teach the model when to skip redundant steps.
🖼️ (See below 👇) How AutoL2S switches reasoning strategies using simple markers like and →
📉 Up to 57% reduction in CoT length across four reasoning tasks without performance drop.
Credits to all co-authors: @FengLuo895614*, @YuNengChuang*, @Guanchu_Gary, Hoang Anh Duy Le, @henryzhongsc, Hongyi Liu, @jiayiy, @YangSui, Vladimir Braverman, Vipin Chaudhary, @huxia

Yu-Neng Chuang retweeted

elvis @omarsar0 · Mar 21

A survey on efficient reasoning for LLMs. That was quick! I have been featuring papers on the topic of efficient reasoning and I see a few familiar papers in this survey. Good read overall!

Yu-Neng Chuang retweeted

Sumit @_reachsumit · Jan 3

MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation
Proposes a training-free RAG framework using multiple LLM agents to collaboratively filter retrieved documents, improving retrieval precision while maintaining high recall.
📝 arxiv.org/abs/2501.00332

Yu-Neng Chuang @YuNengChuang · Nov 12

📢 Excited to present our "Taylor Unswift" poster at #EMNLP24 in Miami! Join us on Nov 13 (Wed), 10:30–12:00, at Main #778. "Taylor Unswift" aims to solve the dilemma of secured weight release for LLM developers and users.
🔗 Paper: arxiv.org/pdf/2410.05331
🔗 Code: github.com/guanchuwang/Ta…
Want to know more about "Taylor Unswift"? 😉
🚨 Model developers often face a dilemma: open-source their models and lose control, or offer closed APIs but bear the costs and deter privacy-conscious users.
🚑 Introducing "Taylor Unswift": a method using Taylor Expansion Theory to protect model weights while allowing users to run models on their own data without accessing the weights. These correspond to the 'Taylor' and 'Unswift' in the title.
🌟 Developers can prevent misuse of their models, while users can run models on their own data without sharing it, unlike with services like the ChatGPT API.
More detailed insights can be found in the paper!
Kudos to all co-authors: @Guanchu_Gary*, @YuNengChuang*, @RuixiangT, @henryzhongsc, @jiayiy, @serendip410, @ziruirayliu, Vipin Chaudhary, Shuai Xu, James Caverlee, @huxia
#LLM #security #NLP #EMNLP

Yu-Neng Chuang retweeted

Yuchen Jin @Yuchenj_UW · Oct 29

After "Attention Is All You Need", AI paper titles be like:

Yu-Neng Chuang retweeted

Jiayi Yuan @jiayiy · Sep 20

🚀 Excited to share our latest #EMNLP2024 work on benchmarking long-context ability under KV cache compression across RNN-based architectures, token eviction, prompt compression, and quantization. We also provide an easy-to-use codebase (it also has my favorite WoW quote 😉). Feel free to give it a try and ⭐ it if you find it useful!
📄 Paper: arxiv.org/abs/2407.01527
💻 Code: github.com/henryzhongsc/l…
Some interesting findings/suggestions include:
1️⃣ Maintaining an uncompressed prefill process is essential for performance, especially with harder tasks.
2️⃣ Combining RNN-based models with attention significantly enhances long-context capabilities.
3️⃣ In "needle-in-a-haystack" evaluation for recent LLMs like Llama-3, we should use longer needles (like 64 digits) since these models tokenize multiple digits into one token.
More results and insights can be found in the paper!
Kudos to all collaborators: @jiayiy, Hongyi Liu, @henryzhongsc, @YuNengChuang, Songchen Li, Guanchu Wang, Duy Le, @serendip410, Vipin Chaudhary, @ZhaozhuoX, @ziruirayliu, @huxia
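
For readers who want to try the third suggestion, here is a minimal sketch of building a needle-in-a-haystack prompt with a 64-digit needle. It is not taken from the linked codebase: the helper names `make_needle` and `build_haystack`, the filler sentence, and the "secret number" wording are all assumptions made for illustration.

```python
import random


def make_needle(n_digits: int, rng: random.Random) -> str:
    """Return a random digit string of length n_digits (hypothetical helper)."""
    return "".join(rng.choice("0123456789") for _ in range(n_digits))


def build_haystack(filler: list[str], needle_sentence: str, depth: float) -> str:
    """Insert needle_sentence among the filler sentences at a relative depth in [0, 1]."""
    position = round(depth * len(filler))
    sentences = filler[:position] + [needle_sentence] + filler[position:]
    return " ".join(sentences)


rng = random.Random(0)  # fixed seed so the example is reproducible
needle = make_needle(64, rng)  # 64 digits, as the post suggests
filler = ["The grass is green."] * 200  # placeholder filler text (an assumption)
haystack = build_haystack(filler, f"The secret number is {needle}.", depth=0.5)
question = "What is the secret number?"
```

The model would then be given `haystack` followed by `question`, and scored on whether it reproduces `needle` exactly. Varying `depth` and the amount of filler gives the usual depth-by-length grid.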

Yu-Neng Chuang retweeted

Valeriy M., PhD, MBA, CQF @predict_addict · Jun 24

A new open-source time series LLM, LTSM-bundle, claims to achieve SOTA performance by recycling best practices from other LLMs. #timeseries #llm #forecasting

Yu-Neng Chuang @YuNengChuang · Jun 26

Introducing the LTSM-bundle Package!
🌟 Thrilled to launch our open-source tool
🔧 Assess various crucial designs to train Large Time Series Models (LTSMs), and identify the best training practices
🔗 Paper: arxiv.org/abs/2406.14045
🔗 GitHub: github.com/daochenzha/ltsm

Yu-Neng Chuang retweeted

HongyeJ@NeurIPS @serendip410 · Apr 24

We tested our SelfExtend (arxiv.org/pdf/2401.01325…) for Llama-3-8B/70B-Instruct on the new challenging long context benchmark Ada-Eval (arxiv.org/abs/2404.06480). The task is selecting the best answer from candidates. The results are pretty good!
🌟 Highlights:
1: Equipped with SelfExtend, Llama-3-70B beats all except GPT-4-turbo.
2: Even for Mistral-7B-Instruct-v0.2, which already has a large enough context window, SelfExtend can boost its performance, especially for long cases.
3: The Llama-3 series, at their respective scales, are impressive!
Check our repo for more details on SelfExtend: github.com/datamllab/Long…

Yu-Neng Chuang retweeted

HongyeJ@NeurIPS @serendip410 · Apr 17

🚨 Recently, we investigated the impact of different group size/neighbor window combinations on SelfExtend using the 'Needle In a Haystack' task.
🧐 Generally, SelfExtend is not overly sensitive to the two hyperparameters. We also got some intriguing findings:
1️⃣ Mistral-ins-0.1 stands out with a surprisingly narrow flexibility zone. Does this arise from their SWA during retraining? If so, how? 🤔
2️⃣ The 70b Llama-2 has a larger flexible area compared to its 7b siblings! Could it be larger models' superior noise handling or just more layers? 🧩
3️⃣ Phi-2, although much smaller, has a relatively large flexible area. Does this stem from the fact that it uses 40% of the head dimension for RoPE, or just its talent at these tasks? ✨
Dive into our repo for more details! 🔗 github.com/datamllab/Long…
#MachineLearning #LLMs

Yu-Neng Chuang retweeted

Zirui Liu @ziruirayliu · Apr 13

🚀 Deploying long-context LLMs is hindered by the huge size of the KV cache. Our new method KIVI directly addresses this problem by quantizing the KV cache to 2-bit or 4-bit values. In Mistral-v0.2 testing, KIVI demonstrates accuracy similar to the full-precision baseline with a 5.3x smaller KV cache!
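
To make "quantizing to 2 or 4 bits" concrete, here is a minimal sketch of plain asymmetric uniform quantization on a list of floats. It is a generic illustration, not the KIVI implementation: how KIVI groups cache entries and which axis it quantizes along are not described in the post and are left out here. The function names and sample values are assumptions.

```python
def quantize(values, bits):
    """Map floats to integer codes in [0, 2**bits - 1] (generic sketch, not KIVI itself)."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    # Guard against a constant input, where the range would be zero.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, zero):
    """Recover approximate floats from integer codes."""
    return [c * scale + zero for c in codes]


values = [0.12, -0.53, 0.98, 0.31, -0.07, 0.66]  # made-up sample values
codes, scale, zero = quantize(values, bits=2)
restored = dequantize(codes, scale, zero)
```

With `bits=2` each value is stored as one of four codes plus a shared `scale` and `zero`, and the round-trip error per value is at most half of `scale`. The memory saving comes from storing the small codes instead of 16-bit floats.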

Yu-Neng Chuang retweeted

Xiaotian (Max) Han @XiaotianHan1 · Apr 6

SelfExtend, without further training, upgrades Mistral-inst-v0.1 to match the performance level of its successor, v0.2, in QA tasks. So is the value of SelfExtend at least equivalent to the training cost of Mistral-inst-v0.2?

Yu-Neng Chuang retweeted

Wei-Rui Chen @WeiRuiChen01 · Apr 10

🤔 How many languages does #ChatGPT know?
🚀 Our work "Fumbling in Babel: An Investigation into ChatGPT's Language Identification Ability" is an attempt to answer this question and has been accepted to #NAACL2024 Findings.
Paper Link: arxiv.org/abs/2311.09696 (1/5)

Yu-Neng Chuang retweeted

HongyeJ@NeurIPS @serendip410 · Feb 28

Despite the mixed feelings about Google's latest Gemma model, we're big fans! @GoogleAI Why? Coz we found it pairs incredibly well with our SelfExtend 🤣🤣🤣 - like, perfectly! With SelfExtend, no fine-tuning needed, we effortlessly expanded Gemma's window from 8k to 90k+! On the 'Needle in the haystack' task, Gemma-2b-it struggled even at 8k, but with SelfExtend, Gemma-2b-it easily tackles it within the 90k range!
#AI #Gemma #SelfExtend #LLMs 🚀
Paper: arxiv.org/abs/2401.01325
GitHub: github.com/datamllab/Long…

Yu-Neng Chuang retweeted

HongyeJ@NeurIPS @serendip410 · Feb 23

🚀 Our Self-Extend method remains effective for Gemma-7b. We've successfully applied the Self-Extend patch to Gemma, showcasing its potential in passkey retrieval tasks (16k). Our exploration continues as we test it on more complex tasks and longer sequences (x4, x8?).
Have you encountered any issues? Do you have new results in your own case? We're eager to hear from you. Please email us! Your feedback is invaluable as we push the boundaries of our research. Stay tuned for more updates! 🌟
GitHub: github.com/datamllab/Long…
Paper: arxiv.org/pdf/2401.01325…
