Suhong Moon

60 posts

Suhong Moon

@SuhongMoon

PhD student at Berkeley AI Research (@berkeley_ai)

Berkeley · Joined December 2020
312 Following · 128 Followers
Pinned Tweet
Suhong Moon @SuhongMoon ·
Rich backstories enable deeper persona binding. LLMs conditioned on these backstories go beyond predicting human opinions to simulating how people perceive others and are perceived in return.
Minwoo (Josh) Kang @joshminwookang

🤔 Do LLMs exhibit in-group↔out-group perceptions like us? ❓ Can they serve as faithful virtual subjects of human political partisans? Excited to share our paper on taking LLM virtual personas to the *next level* of depth! 🔗 arxiv.org/abs/2504.11673 🧵

0 replies · 0 reposts · 3 likes · 368 views
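A minimal sketch of what the backstory-based persona conditioning above could look like in practice; `query_llm`, the interview-style prompt, and the sample questions are illustrative assumptions, not the paper's prompts or code:

```python
# Illustrative sketch only: `query_llm` is a hypothetical stand-in for any
# chat-completion client, and the interview-style prompt is an assumption.

def query_llm(prompt: str) -> str:
    # Replace with a real LLM call (API client of your choice).
    return "[model response]"

def ask_as_persona(backstory: str, question: str) -> str:
    """Condition the model on a rich backstory, then ask it a question."""
    prompt = (
        "Interviewer: Tell me about yourself.\n"
        f"Respondent: {backstory}\n"
        f"Interviewer: {question}\n"
        "Respondent:"
    )
    return query_llm(prompt)

backstory = (
    "I grew up in rural Ohio, took over my family's hardware store, and have "
    "voted in every election since 1996..."
)

# Opinion prediction: what the persona itself believes.
print(ask_as_persona(backstory, "How do you feel about raising the federal minimum wage?"))

# In-group/out-group perception: how the persona thinks the other side sees people like them.
print(ask_as_persona(backstory, "How do you think supporters of the other party view people like you?"))
```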
Suhong Moon retweeted
Joseph Jeesung Suh @JosephJSSuh ·
LLMs have dominated recent work on simulating human behaviors. But do you really need them? In discrete‑choice settings, our answer is: not necessarily. A lightweight graph neural network (GNN) can match or beat strong LLM-based methods. Paper: arxiv.org/abs/2511.02135 🧵👇
[image]
3 replies · 15 reposts · 62 likes · 36K views
Jaemin Cho @jmin__cho ·
Sharing some personal updates 🥳:
- I've completed my PhD at @unccs! 🎓
- Starting Fall 2026, I'll be joining the Computer Science dept. at Johns Hopkins University (@JHUCompSci) as an Assistant Professor 💙
- Currently exploring options + finalizing the plan for my gap year (Aug 2025 - Jul 2026), so feel free to reach out! 🔎
Endless thanks to my amazing advisor @mohitban47, the @uncnlp group, my partner @HeesooJang2, and my family. I couldn’t have done this without your constant support 🙏
Also, a heartfelt shoutout to all the collaborators I’ve worked with over the years: your ideas, encouragement, and hustle have meant the world. Excited for what’s ahead. Let’s keep building together! ❤️
[images]
65 replies · 51 reposts · 448 likes · 91K views
Suhong Moon retweeted
Nick Lee @nicholaszlee ·
🚀 Excited to share that our paper on Plan-and-Act has been accepted to ICML 2025. TL;DR below:
🔎 Problem:
• LLM agents struggle on complex, multi-step web tasks (or API calls, for that matter).
• Why not add planning for complex tasks and decouple planning from execution?
• Planning only helps if it’s accurate, and LLMs aren’t trained for that.
• Even small plan errors can drastically degrade performance.
💡 Thoughts:
• Separate PLANNER and EXECUTOR models: web agents especially benefit, since acting on HTML needs different skills than step-by-step planning (see the sketch after this tweet).
• Fine-tune the PLANNER and EXECUTOR with synthetic data; no manual annotations or simulators needed.
• Plan-and-Act provides a scalable framework for creating such synthetic data for web tasks.
📦 How we generate synthetic data:
• Use a teacher LLM to generate new user queries from seed examples.
• A second teacher tries to solve each query, generating action trajectories.
• We verify the success of each trajectory automatically.
• Another LLM reverse-engineers a plan from the trajectory.
• Finally, we expand this dataset further with more synthetic plans generated by LLMs.
• We then use this data to fine-tune the base models.
⚡ Results:
🏆 New SOTA for text-only open-source models, with up to a 40% improvement from our synthetic fine-tuning approach:
• 57.58% on WebArena-Lite
• 81.36% on WebVoyager
• 48.15% on WebArena
Paper: arxiv.org/abs/2503.09572
Joint work w/ @eren_lutfi78249 @sehoonkim418 @SuhongMoon @frt03_ @GopalaSpeech @KurtKeutzer @amir__gholami
[image]
0 replies · 5 reposts · 14 likes · 1.9K views
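A rough sketch of the planner/executor decoupling described in the tweet above; `call_planner`, `call_executor`, and the `env` interface are hypothetical placeholders for the two fine-tuned models and a web environment, not the paper's actual code:

```python
# Hedged sketch of a decoupled PLANNER / EXECUTOR agent loop.
# `call_planner`, `call_executor`, and `env` are hypothetical placeholders.

def call_planner(task: str) -> list[str]:
    """Fine-tuned PLANNER: turn a user task into a list of high-level steps."""
    raise NotImplementedError("plug in the planner model here")

def call_executor(step: str, observation: str) -> str:
    """Fine-tuned EXECUTOR: ground one high-level step into a low-level action
    (e.g. a click or form fill) given the current page observation."""
    raise NotImplementedError("plug in the executor model here")

def run_agent(task: str, env) -> bool:
    observation = env.reset(task)
    plan = call_planner(task)                       # planning happens separately from acting
    for step in plan:
        action = call_executor(step, observation)   # acting on HTML needs different skills
        observation, done = env.step(action)
        if done:
            return True
    return False
```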
Suhong Moon retweeted
Ritwik Gupta 🇺🇦 @Ritwik_G ·
Do LLMs understand probability distributions? Can they serve as effective simulators of probability? No! However, in our latest paper we show that, via in-context learning, LLMs update their broken priors in a manner akin to Bayesian updating. 📝 arxiv.org/abs/2503.04722
5 replies · 31 reposts · 159 likes · 32.3K views
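For reference, the textbook notion of Bayesian updating the tweet appeals to, in its simplest conjugate (Beta-Bernoulli) form; this is purely illustrative and not code or data from the paper:

```python
# Textbook Beta-Bernoulli updating, as a reference point for the concept.

def beta_bernoulli_update(alpha: float, beta: float, observations: list[int]):
    """Update a Beta(alpha, beta) prior over a coin's bias given 0/1 outcomes."""
    heads = sum(observations)
    tails = len(observations) - heads
    return alpha + heads, beta + tails

alpha, beta = 2.0, 8.0                        # a (possibly miscalibrated) prior, mean 0.2
alpha, beta = beta_bernoulli_update(alpha, beta, [1, 1, 0, 1, 1, 1, 0, 1])
print(alpha / (alpha + beta))                 # posterior mean shifts toward the observed rate
```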
Suhong Moon retweeted
Coleman Hooper @coleman_hooper1 ·
How can we efficiently scale up test-time compute with parallel tree search? 🚨 Introducing Efficient Tree Search (ETS): a new method for efficient and accurate test-time search on LLM reasoning tasks!
- Test-time scaling has emerged as a new axis for improving model performance by leveraging additional computation at inference time to solve more challenging problems.
- One promising approach for scaling compute at test time is search, where a model generates multiple candidates and we then filter them down to a single final response. This search can be structured as a tree, where each level corresponds to taking an additional step toward solving the problem.
- Previous tree search methods have either yielded low accuracy (due to insufficient exploration) or incurred substantial efficiency penalties from the high cost of exploring diverse trajectories. These penalties stem from increased memory requirements caused by reduced KV cache sharing during the search.
Our method, ETS, addresses this challenge by encouraging KV cache sharing during search to reduce memory consumption, while maintaining the exploration of semantically diverse trajectories that is critical for high accuracy. (A rough sketch of the search loop follows below.)
Paper: arxiv.org/abs/2502.13575
Code: github.com/SqueezeAILab/E…
Joint work with: @sehoonkim418 @SuhongMoon Kerem Dilmen @sudomonish @nicholaszlee Michael Mahoney @Sophia_Shao_ @KurtKeutzer @amir__gholami
🧵 [1/7]
2 replies · 7 reposts · 22 likes · 3.4K views
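A rough sketch of the kind of parallel tree search discussed above; the generation and scoring functions are placeholders, and the naive top-k pruning shown here is exactly the point where ETS would instead trade off score against KV-cache sharing and semantic diversity:

```python
# Hedged sketch of tree search over reasoning steps. `generate_next_steps`
# and `score` are placeholders (e.g. a sampled LLM and a process reward model).

def generate_next_steps(prefix: str, n: int) -> list[str]:
    raise NotImplementedError("sample n candidate next steps from the model")

def score(partial_solution: str) -> float:
    raise NotImplementedError("score a partial solution, e.g. with a reward model")

def tree_search(question: str, width: int = 4, branch: int = 4, depth: int = 6) -> str:
    frontier = [question]
    for _ in range(depth):
        candidates = []
        for prefix in frontier:
            # Children of the same prefix can share that prefix's KV cache;
            # which nodes survive pruning determines how much sharing remains.
            for step in generate_next_steps(prefix, branch):
                candidates.append(prefix + "\n" + step)
        # Naive pruning: keep only the top-scoring candidates. ETS instead
        # balances score, KV-cache sharing, and semantic diversity here.
        frontier = sorted(candidates, key=score, reverse=True)[:width]
    return max(frontier, key=score)
```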
Suhong Moon retweeted
Simon Guo @simonguozirui ·
LLMs for GPU kernel🌽generation have been getting Pop🍿ular since our preview last Dec; excited to announce 📢 our full paper 📃 for KernelBench! Turns out KernelBench is quite challenging 🧠 — frontier models outperform the PyTorch Eager baseline <20% of the time. More 🧵👇
[image]
9 replies · 68 reposts · 303 likes · 113.9K views
Suhong Moon @SuhongMoon ·
Surveys are key for public opinion research but are costly. Can AI help? We fine-tune LLMs on SubPOP, a newly curated large-scale dataset, reducing the human-LLM opinion gap by up to 46%. Please check our paper and code!
Joseph Jeesung Suh @JosephJSSuh

Can LLMs assist public opinion survey design by predicting responses? We fine-tune LLMs on our new large-scale survey response dataset, SubPOP, which reduces the distributional gap between human and LLM predictions by up to 46% 📊 A 🧵 on our findings: 👇

0 replies · 1 repost · 3 likes · 385 views
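One simple way to quantify a "distributional gap" between human survey responses and LLM predictions is sketched below; the choice of total variation distance is an assumption made for illustration, not necessarily the paper's metric:

```python
# Illustrative gap metric: total variation distance between the human and
# LLM distributions over a question's answer options (metric choice assumed).
import numpy as np

def total_variation(p: np.ndarray, q: np.ndarray) -> float:
    p = p / p.sum()
    q = q / q.sum()
    return 0.5 * np.abs(p - q).sum()

human = np.array([0.10, 0.25, 0.40, 0.25])   # survey-reported option shares
model = np.array([0.05, 0.15, 0.55, 0.25])   # LLM probabilities over the same options
print(total_variation(human, model))          # 0.15; smaller means a smaller gap
```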
Suhong Moon retweeted
Patrick Wu @tsunghan_wu ·
🚨 Launching The Visual Haystacks (VHs) Benchmark: the first "visual-centric" Needle-In-A-Haystack (NIAH) benchmark to assess LMMs' capability in long-context visual retrieval and reasoning. Check out the 🧵 to see our findings on models like #GPT4o, #Gemini, and more! (1/7)
[image]
1 reply · 7 reposts · 22 likes · 14.3K views
Suhong Moon retweeted
Alex Pan @aypan_17 ·
LLMs have behaviors, beliefs, and reasoning hidden in their activations. What if we could decode them into natural language? We introduce LatentQA: a new way to interact with the inner workings of AI systems. 🧵
[image]
7 replies · 28 reposts · 172 likes · 34.2K views
Woosuk Kwon @woosuk_k ·
I’ll be at #NeurIPS 2024 this week and happy to chat about LLM inference. Feel free to reach out!
3 replies · 0 reposts · 17 likes · 1.2K views
Suhong Moon retweeted
David Chan @_dmchan ·
🚨 Call for Papers! 🚨 Join us at the Workshop on Human Alignment in AI Decision-Making Systems (IEEE CAI 2025) to explore challenges & opportunities in aligning AI with human values & societal norms. 📅 Papers due 1/15/25. Details ➡️ sites.google.com/view/ieee-cai-… #AIAlignment #AIethics
0 replies · 1 repost · 3 likes · 193 views
Suhong Moon retweeted
Heesoo Jang @HeesooJang2 ·
South Korean President Yoon Suk Yeol just declared emergency martial law to "eradicate the shameless pro-North Korean anti-state forces that are plundering the freedom and happiness of our people." This is a whole other level of authoritarian move we're seeing. #SouthKorea
[image]
1 reply · 3 reposts · 18 likes · 5.6K views
Suhong Moon retweeted
Coleman Hooper @coleman_hooper1 ·
🚨 Introducing Squeezed Attention! A new fast attention method for long-context inference providing up to 4x speedup 🚄
Many LLM applications require processing long input prompts for tasks such as document analysis and code generation; however, inference cost increases linearly with sequence length, making long-context inference prohibitively expensive and slow.
One key trait of many LLM applications is that a large portion of the input prompt is fixed across successive user queries, for example when a user asks multiple questions against a document, knowledge source, or codebase. This means we can perform offline optimizations that accelerate attention over the fixed context once user inputs arrive. Our hierarchical method can reduce the complexity of attention from linear to logarithmic with respect to the fixed context length.
Squeezed Attention accelerates attention from the user input to the fixed context by identifying which keys are important for a given query, then computing attention only with those important keys. This identification is performed by comparing the query with clusters of keys, then progressively refining the comparison using finer-grained clusters to pin down the important keys. (A rough sketch of the clustering idea follows below.)
Paper: arxiv.org/abs/2411.09688
Code: github.com/SqueezeAILab/S…
Joint work with: @sehoonkim418 @hiva_moh @sudomonish June Paik from @FuriosaAI Michael Mahoney @KurtKeutzer @amir__gholami
🧵 [1/7]
[image]
2 replies · 12 reposts · 28 likes · 2.5K views
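A rough, non-hierarchical sketch of the clustering idea described above: cluster the fixed context's keys offline, then at query time attend only over the keys in the clusters whose centroids best match the query. The flat k-means, dot-product assignment, and top-k cluster rule are simplifications for illustration, not the paper's implementation:

```python
# Hedged sketch of query-aware key selection via offline key clustering.
import numpy as np

def cluster_keys(keys: np.ndarray, n_clusters: int, iters: int = 10):
    """Offline step: group the fixed context's keys into clusters (simplified k-means)."""
    rng = np.random.default_rng(0)
    centroids = keys[rng.choice(len(keys), n_clusters, replace=False)]
    for _ in range(iters):
        assign = np.argmax(keys @ centroids.T, axis=1)       # dot-product assignment (simplification)
        for c in range(n_clusters):
            if np.any(assign == c):
                centroids[c] = keys[assign == c].mean(axis=0)
    assign = np.argmax(keys @ centroids.T, axis=1)           # final assignment for the final centroids
    return centroids, assign

def squeezed_attention(query, keys, values, centroids, assign, top_clusters=2):
    """Online step: attend only over keys in the clusters most relevant to the query."""
    important = np.argsort(query @ centroids.T)[-top_clusters:]
    mask = np.isin(assign, important)
    k, v = keys[mask], values[mask]                          # skip keys deemed unimportant
    scores = (query @ k.T) / np.sqrt(keys.shape[1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ v

# Offline once per fixed context, then reuse for every user query.
rng = np.random.default_rng(1)
keys, values = rng.normal(size=(1024, 64)), rng.normal(size=(1024, 64))
centroids, assign = cluster_keys(keys, n_clusters=16)
out = squeezed_attention(rng.normal(size=64), keys, values, centroids, assign)
```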