Nick Lee

@nicholaszlee

PhD Student at @berkeley_ai

Berkeley, CA · Joined April 2017

143 Following · 57 Followers

Nick Lee retweeted

Louis Liu @LouisLiu0324 · Feb 24

🎙️ Introducing StyleStream: the first streamable zero-shot voice style conversion system.

🚀 Clone timbre, accent, and emotion simultaneously, with state-of-the-art quality, streaming locally on a single RTX 4060.

💡 The Destylizer, trained with ASR supervision + an ultra-narrow information bottleneck, strips away voice style while preserving only linguistic content. Crucially, it keeps the original duration, making clean chunk-by-chunk streaming straightforward.

🎨 The diffusion Stylizer then faithfully re-renders the content with the target style.

📰 Paper: arxiv.org/abs/2602.20113…
💻 Code: github.com/Berkeley-Speec…
🔊 Demo: berkeley-speech-group.github.io/StyleStream/
▶️ Real-time demo below ⬇️


Nick Lee retweeted

Yuezhou Hu @yuezhouhu · Feb 2

Take a look at Residual Context Diffusion (RCD): a simple idea to boost diffusion LLMs: stop wasting “remasked” tokens!!!

arxiv.org/abs/2601.22954

(Example on AIME24. RCD increases parallelism by 4x while reaching the baseline's peak accuracy.)

#DiffusionLLM #LLM #Reasoning #GenAI


Nick Lee @nicholaszlee · May 3

🚀 Excited to share that our paper on Plan-and-Act has been accepted to ICML 2025. Below is a TLDR:

🔎 Problem:
• LLM agents struggle on complex, multi-step web tasks (or API calls, for that matter).
• Why not add planning for complex tasks and decouple planning from execution?
• Planning only helps if it’s accurate, and LLMs aren’t trained for that.
• Even small plan errors can drastically degrade performance.

💡 Thoughts:
• Separate PLANNER and EXECUTOR models: web agents especially benefit, as acting on HTML needs different skills than step-by-step planning.
• Fine-tune the PLANNER and EXECUTOR models on synthetic data; no manual annotations or simulators needed.
• Plan-and-Act provides a scalable framework for creating such synthetic data for web tasks.

📦 How we generate synthetic data:
• Use a Teacher LLM to generate new user queries from seed examples.
• A second Teacher tries to solve each query, generating action trajectories.
• We verify the success of each trajectory automatically.
• Another LLM reverse-engineers a plan from the trajectory.
• Finally, we expand this dataset further with more synthetic plans using LLMs.
• We then use this data to fine-tune the base models.

⚡ Results:
🏆 New SOTA for text-only open-source models, with up to 40% improvement from our synthetic fine-tuning approach:
• 57.58% on WebArena-Lite
• 81.36% on WebVoyager
• 48.15% on WebArena

Paper: arxiv.org/abs/2503.09572
Joint work w/ @eren_lutfi78249 @sehoonkim418 @SuhongMoon @frt03_ @GopalaSpeech @KurtKeutzer @amir__gholami
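The data-generation steps listed in the post can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: every function name here (`build_datasets`, the `toy_*` stubs) is hypothetical, and the LLM calls are replaced by deterministic stand-ins so the control flow can run on its own. The shape of the expansion step (rephrasing queries while reusing plans) is also an assumption.

```python
from typing import Callable, List, Tuple

Plan = List[str]
Trajectory = List[str]
PlannerPair = Tuple[str, Plan]
ExecutorTriple = Tuple[str, Plan, Trajectory]


def build_datasets(
    seeds: List[str],
    gen_queries: Callable[[str], List[str]],
    solve: Callable[[str], Trajectory],
    verify: Callable[[str, Trajectory], bool],
    extract_plan: Callable[[str, Trajectory], Plan],
    expand: Callable[[List[PlannerPair]], List[PlannerPair]],
) -> Tuple[List[PlannerPair], List[ExecutorTriple]]:
    """Hypothetical pipeline: queries -> trajectories -> verified plans -> expansion."""
    planner_data: List[PlannerPair] = []
    executor_data: List[ExecutorTriple] = []
    for seed in seeds:
        for query in gen_queries(seed):          # teacher 1: new user queries
            trajectory = solve(query)            # teacher 2: attempt the task
            if not verify(query, trajectory):    # keep only successful attempts
                continue
            plan = extract_plan(query, trajectory)  # reverse-engineer a plan
            planner_data.append((query, plan))
            executor_data.append((query, plan, trajectory))
    # Grow the planner data with additional synthetic (query, plan) pairs.
    planner_data.extend(expand(list(planner_data)))
    return planner_data, executor_data


# Deterministic stand-ins for the LLM calls (assumptions for illustration only).
def toy_gen_queries(seed: str) -> List[str]:
    return [f"{seed} (variant {i})" for i in range(2)]


def toy_solve(query: str) -> Trajectory:
    # Pretend the second variant fails, so verification filters it out.
    if "variant 1" in query:
        return []
    return ["click search box", "type text", "press enter"]


def toy_verify(query: str, trajectory: Trajectory) -> bool:
    return len(trajectory) > 0


def toy_extract_plan(query: str, trajectory: Trajectory) -> Plan:
    return ["open the search page", "submit the query"]


def toy_expand(pairs: List[PlannerPair]) -> List[PlannerPair]:
    return [(query + " [rephrased]", plan) for query, plan in pairs]


planner_data, executor_data = build_datasets(
    ["find the cheapest flight"],
    toy_gen_queries, toy_solve, toy_verify, toy_extract_plan, toy_expand,
)
print(len(planner_data), len(executor_data))
```

In a real setup each stub would wrap a model call, and the two returned lists would feed the PLANNER and EXECUTOR fine-tuning runs respectively.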


Nick Lee @nicholaszlee · Apr 12

An important question is, how does this compare to data augmentation? Why not just apply data augmentation on the examples that the student model got wrong? We did this ablation study and found that LLM2LLM outperforms data augmentation techniques, sometimes even surpassing adding more real unseen data from the training dataset.


Nick Lee @nicholaszlee · Apr 12

Across datasets spanning different tasks and difficulty levels, LLM2LLM significantly improved performance over fine-tuning on the initial seed data alone, especially in the low-data regime, and it outperformed various baselines.


Nick Lee @nicholaszlee · Apr 12

What to do if you don’t have enough data to fine-tune an LLM?

Fine-tuning is a very promising method for specializing LLMs, but it often requires a non-trivial number of data points, and in many cases it is very hard to obtain enough data.

LLM2LLM addresses this by using a teacher LLM to iteratively enrich your initial dataset, adaptively generating new data based on where the student LLM is making mistakes. This can improve LLM performance significantly in the low-data regime, outperforming various baselines (e.g. up to 24.2% improvement on GSM8K).

Link to Paper: arxiv.org/pdf/2403.15042…
Link to Code: github.com/SqueezeAILab/L…
Joint work with @JaWattanaw19133 @sehoonkim418 @Karttikeya_m @shengs1123 @GopalaSpeech Michael Mahoney @KurtKeutzer @amir__gholami
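The iterative loop described in the post can be sketched as below. This is a toy illustration under stated assumptions, not the released LLM2LLM code: `llm2llm_loop` and the `toy_*` helpers are hypothetical names, the "student" is a simple topic counter rather than a fine-tuned model, and evaluating only on the seed examples each round is a design choice of this sketch.

```python
from collections import Counter
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (topic, text): a toy stand-in for a training example


def llm2llm_loop(
    seed: List[Example],
    train: Callable[[List[Example]], Counter],
    is_correct: Callable[[Counter, Example], bool],
    teacher_generate: Callable[[List[Example]], List[Example]],
    rounds: int,
) -> List[Example]:
    """Train, find the student's mistakes, ask the teacher for targeted data, repeat."""
    data = list(seed)
    for _ in range(rounds):
        student = train(data)
        # Assumption of this sketch: mistakes are measured on the seed examples.
        wrong = [ex for ex in seed if not is_correct(student, ex)]
        if not wrong:
            break
        data.extend(teacher_generate(wrong))
    return data


# Toy stand-ins: the student "knows" a topic once it has seen two examples of it.
def toy_train(data: List[Example]) -> Counter:
    return Counter(topic for topic, _ in data)


def toy_is_correct(student: Counter, example: Example) -> bool:
    return student[example[0]] >= 2


def toy_teacher_generate(wrong: List[Example]) -> List[Example]:
    return [(topic, text + " (synthetic)") for topic, text in wrong]


seed_data = [("algebra", "a1"), ("algebra", "a2"), ("geometry", "g1")]
augmented = llm2llm_loop(seed_data, toy_train, toy_is_correct, toy_teacher_generate, rounds=5)
print(augmented)
```

The point of the structure is that new data is generated only for examples the student currently gets wrong, so the dataset grows where it is needed instead of uniformly.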


Nick Lee retweeted

LlamaIndex 🦙 @llama_index · Dec 29

🚨 SOTA Parallel Function Calling Agents in @llama_index 🚨

The LLMCompiler project by Kim et al. (@berkeley_ai) is a state-of-the-art agent framework that enables 1) DAG-based planning, and 2) parallel function execution. This makes it much faster than sequential approaches like ReAct (and it performs better) ⚡️

We gave the paper + repo a close read and implemented a native version in @llama_index 🔥. This means that you can now apply LLMCompiler on top of your LLM + data pipelines (whether they’re RAG or other agents). Use this to answer complex/multi-part/comparison questions over your data much more quickly than ReAct.

Unlike @OpenAI function calling, this framework can be used with any chat/completion-based LLM.

We implemented the agent as a LlamaPack 📦: llamahub.ai/l/llama_packs-…
Here’s a notebook on how to use it: github.com/run-llama/llam…

Huge credits to every author of the LLMCompiler project. We took a lot of inspiration from the LLMCompiler repo/paper:
Repo: github.com/SqueezeAILab/L…
Paper: arxiv.org/pdf/2312.04511…
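To make "DAG-based planning plus parallel function execution" concrete, here is a small sketch using only the Python standard library. It is not the LLMCompiler or LlamaIndex API: `run_dag` and the task names are hypothetical, the plan is written by hand rather than produced by an LLM, and tasks are dispatched in simple waves (all tasks whose dependencies are finished run together), which is a simplification.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Tuple

# name -> (function, names of the tasks whose results it consumes, in order)
TaskGraph = Dict[str, Tuple[Callable[..., Any], List[str]]]


def run_dag(tasks: TaskGraph, max_workers: int = 4) -> Dict[str, Any]:
    """Run independent tasks in parallel, feeding results to dependents."""
    results: Dict[str, Any] = {}
    pending = dict(tasks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending:
            # A task is ready once every dependency has a result.
            ready = [
                name for name, (_, deps) in pending.items()
                if all(dep in results for dep in deps)
            ]
            if not ready:
                raise ValueError("cycle or missing dependency in task graph")
            futures = {
                name: pool.submit(pending[name][0], *[results[d] for d in pending[name][1]])
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                del pending[name]
    return results


# Hand-written plan for a comparison question: two lookups, then one comparison.
example_plan: TaskGraph = {
    "lookup_a": (lambda: 100, []),
    "lookup_b": (lambda: 250, []),
    "compare": (lambda a, b: "b" if b > a else "a", ["lookup_a", "lookup_b"]),
}

print(run_dag(example_plan)["compare"])
```

A sequential agent would perform the two lookups one after another; here they are submitted together, and only the comparison waits on both.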


Nick Lee retweeted

sehoonkim @sehoonkim418 · Dec 9

How can we make LLM agents work together efficiently on complex tasks at a large scale?

🚨 Introducing LLMCompiler 🦙🛠️, a tool that compiles an effective plan for executing multiple tasks in parallel. It helps create scalable LLM applications, identifies tasks for parallel execution, and manages dependencies.

LLMCompiler is compatible with both open-source and OpenAI models, marking a stride towards more efficient and intelligent software systems. 🧵1/n

📌 Link to Paper: arxiv.org/abs/2312.04511
📌 Link to Code: github.com/SqueezeAILab/L…
Joint work with: @snrpsnr @ryan_tabrizi @nicholaszlee Michael Mahoney @KurtKeutzer @amir__gholami

