Robert Vacareanu
@robert_nlp
124 posts

fighting entropy | PhD from @UofArizona | Working on #nlproc | Past: Applied Scientist Intern @AWS (2022, 2023)

Joined April 2022
1.8K Following · 242 Followers
Robert Vacareanu retweeted
Francesco Orabona @bremen79
As promised, we put on arXiv the proof we did with Gemini: arxiv.org/pdf/2505.20219. It shows that the Polyak stepsize, when used without knowledge of f*, not only fails to reach the optimum but can cycle. Gemini failed when prompted directly ("Find an example where the best and average iterate do not converge"), but it worked when I gave more specific instructions ("Find a function and an initial point where it generates a cycle of length 3 and none of the iterates nor their average converge to the minimum"). As you can see, the proof is not difficult, but it is very creative: rewriting the update with trigonometric functions and using their doubling formulas to show the cycle is not something I would have thought of!
Francesco Orabona@bremen79

This is a turning point: I just proved a complex math result useful for my research using an LLM. I am not sure if I should be happy or scared...

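The failure mode described above is easy to reproduce in miniature. The toy below is an illustrative sketch, not the paper's length-3 trigonometric construction: running the Polyak stepsize on f(x) = |x| (true optimum f* = 0) with a deliberately wrong estimate f_hat = -1 traps the iterates in a cycle of length 2.

```python
# Polyak stepsize on f(x) = |x| (minimum f* = 0 at x = 0), run with a
# wrong estimate f_hat = -1 of the optimal value. Toy illustration only:
# here the iterates fall into a length-2 cycle and never reach x = 0.

def polyak_step(x, f_hat):
    f = abs(x)                    # f(x) = |x|
    g = 1.0 if x >= 0 else -1.0   # subgradient of |x|
    gamma = (f - f_hat) / g**2    # Polyak stepsize with estimated f*
    return x - gamma * g

x = 1.0
iterates = []
for _ in range(6):
    x = polyak_step(x, f_hat=-1.0)
    iterates.append(x)

print(iterates)  # [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0]
```

With the correct f_hat = 0 the same update converges in one step from any point; the wrong lower bound makes the step systematically too long, which is the mechanism the paper exploits.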
Robert Vacareanu retweeted
MohammadHossein Rezaei @mhrezaeics
If you’re at NAACL today, I’ll be presenting this poster in Hall 3 from 2:00 – 3:30 PM. Paper link: aclanthology.org/2025.naacl-lon…
MohammadHossein Rezaei@mhrezaeics

1/🚨 Thrilled to share that our paper (w/ @eduardo_nlp), "Making Language Models Robust Against Negation," has been accepted to the #NAACL2025 main conference! 🎉 #Negation has always been a challenge for language models. Here's our self-supervised method to tackle this issue:

Robert Vacareanu retweeted
Francesco Orabona @bremen79
This is a turning point: I just proved a complex math result useful for my research using an LLM. I am not sure if I should be happy or scared...
Robert Vacareanu retweeted
Zifan (Sail) Wang @_zifan_wang
Excited that @scale_AI is sponsoring the Agent Workshop at CMU in April. Students and researchers who work on agents, feel free to visit CMU to present your work! I will also be traveling to Pittsburgh to share my recent focus on agents, both capability and safety.
Faria Huq | 🦋: fariahuqoaishi @FariaHuqOaishi

📢 Join us at the CMU Agent Workshop 2025, April 10-11! Don't miss our esteemed invited speakers: - Qingyun Wu (PSU) - Diyi Yang (Stanford) - Aviral Kumar (CMU) - Graham Neubig (CMU) ...and many more to come! To register, visit: cmu-agent-workshop.github.io

Robert Vacareanu retweeted
Amanda Bertsch @abertsch72
coming to a NAACL 2025 near you! 🌞 Looking forward to discussing with folks in Albuquerque :) The camera-ready is on arXiv now, with more models, more tasks, and more compared settings, including results comparing ICL to full finetuning! arxiv.org/abs/2405.00200
Amanda Bertsch@abertsch72

In-context learning provides an LLM with a few examples to improve accuracy. But with long-context LLMs, we can now use *thousands* of examples in-context. We find that this long-context ICL paradigm is surprisingly effective– and differs in behavior from short-context ICL! 🧵

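Mechanically, the long-context ICL paradigm in this thread is just prompt assembly at a much larger scale: the same demonstration template that holds a handful of shots can hold thousands when the model's context window allows it. A minimal sketch (the `build_icl_prompt` helper, template, and demo data are made up for illustration, not the paper's code):

```python
# Build a many-shot in-context-learning prompt by concatenating labeled
# demonstrations before the query. With a long-context LLM, the same
# template scales from a few shots to thousands of shots.

def build_icl_prompt(examples, query):
    """examples: list of (text, label) pairs; query: unlabeled input."""
    shots = "\n\n".join(f"Input: {t}\nLabel: {l}" for t, l in examples)
    return f"{shots}\n\nInput: {query}\nLabel:"

demos = [
    ("great movie, would watch again", "positive"),
    ("utter waste of two hours", "negative"),
]
prompt = build_icl_prompt(demos, "surprisingly fun")
print(prompt)
```

The prompt ends at `Label:` so the model's next tokens are read off as the prediction; scaling to thousands of shots only changes the length of `examples`.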
Robert Vacareanu retweeted
Prateek Yadav @prateeky2806
Excited to share our work on RSQ: enhancing quantization by focusing on the most impactful tokens.
- Rotate, Scale, Quantize: delivering strong performance
- Dynamic, attention-based token importance drives better efficiency
- Results across LLaMA3, Mistral, Qwen-2.5, and more
Yi Lin Sung@yilin_sung

🚀 New Paper: RSQ: Learning from Important Tokens Leads to Better Quantized LLMs We show that not all tokens should be treated equally during quantization. By prioritizing important tokens through a three-step process—Rotate, Scale, and Quantize—we achieve better-quantized models on LLaMA3, Mistral, and Qwen2.5. 🧵👇

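The core idea, that calibration should weight important tokens more, can be shown in a toy form. The sketch below is not the authors' implementation (the real RSQ also rotates the weights and derives importance from attention): it just picks a symmetric int4 scale by grid search under a token-importance-weighted reconstruction error, which by construction is never worse on that objective than plain absmax scaling.

```python
import numpy as np

# Toy importance-weighted quantization calibration: choose an int4 scale
# for W minimizing a token-weighted error of X @ W (illustrative only).

def quantize(W, scale):
    q = np.clip(np.round(W / scale), -8, 7)  # symmetric int4 grid
    return q * scale

def weighted_error(X, W, W_hat, w):
    # per-token squared output error, weighted by token importance w
    err = ((X @ W - X @ W_hat) ** 2).sum(axis=1)
    return float((w * err).sum())

def calibrate_scale(X, W, w, n_candidates=16):
    absmax_scale = np.abs(W).max() / 7
    # candidate grid includes absmax itself (factor 1.0 endpoint)
    candidates = absmax_scale * np.linspace(0.3, 1.0, n_candidates)
    return min(candidates,
               key=lambda s: weighted_error(X, W, quantize(W, s), w))

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 16))   # calibration token activations
W = rng.normal(size=(16, 8))    # weight matrix to quantize
w = rng.random(32)              # per-token importance weights
s_best = calibrate_scale(X, W, w)
s_absmax = np.abs(W).max() / 7
```

Changing `w` shifts which tokens' reconstruction errors dominate the search, which is the "not all tokens are equal" point of the tweet.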
Robert Vacareanu retweeted
Diyi Yang @Diyi_Yang
Check out 🔥 EgoNormia: a benchmark for physical social norm understanding egonormia.org Can we really trust VLMs to make decisions that align with human norms? 👩‍⚖️ With EgoNormia, an 1,800-video egocentric 🥽 QA benchmark, we show that this is surprisingly challenging 🤖 🌐 arxiv.org/abs/2502.20490 Our amazing team: MohammadHossein Rezaei* (U of A), Yicheng Fu*, Phil Cuvin* (U of T), @cjziems, @StevenyzZhang, @_Hao_Zhu
Robert Vacareanu retweeted
MohammadHossein Rezaei @mhrezaeics
🔥 Excited to share EgoNormia! A benchmark for physical social norm understanding. Can we really trust VLMs to make decisions that align with human norms? 🌐 Check out our website for the answer: egonormia.org Proud to be part of this amazing team! 🚀
Diyi Yang@Diyi_Yang

Check out 🔥 EgoNormia: a benchmark for physical social norm understanding egonormia.org Can we really trust VLMs to make decisions that align with human norms? 👩‍⚖️ With EgoNormia, an 1,800-video egocentric 🥽 QA benchmark, we show that this is surprisingly challenging 🤖 🌐 arxiv.org/abs/2502.20490 Our amazing team: MohammadHossein Rezaei* (U of A), Yicheng Fu*, Phil Cuvin* (U of T), @cjziems, @StevenyzZhang, @_Hao_Zhu

Robert Vacareanu retweeted
Tanmoy Chakraborty @Tanmoy_Chak
**Kindly consider sharing this post.** We are seeking opinions about the current quality of reviewing at *CL conferences. We (@emnlpmeeting PCs, along with @ReviewAcl EiCs) are committed to improving review quality and are bringing a series of changes to the review process. Kindly consider filling out the form: forms.office.com/r/P68uvwXYqf @VioletNPeng
Christos Christodoulopoulos@c_christodoulop

Do you have opinions about the current state of reviewing at *CL conferences? Do you want to help? We (@emnlpmeeting PCs) want to hear from you: forms.office.com/r/P68uvwXYqf

Robert Vacareanu retweeted
Mihai Surdeanu @msurd
Our new paper in Findings of NAACL 2025, with Vlad Negru, @robert_nlp, @CameliaLemnaru, and Rodica Potolea, proposes a new, softer take on Natural Logic, where alignment is generated through text morphing. This yields robust cross-domain performance. arxiv.org/abs/2502.09567
Robert Vacareanu retweeted
Zifan (Sail) Wang @_zifan_wang
🧵 1/N) Excited to share our recent work at @scale_AI, "Jailbreaking to Jailbreak (J2)". 😈 We present a novel LLM-as-red-teamer approach in which a human jailbreaks a refusal-trained LLM to make it willing to jailbreak itself or other LLMs. We refer to this process as constructing a J2 attacker. They are so good at attacking 🫨. Our approach is straightforward, without complex wiring between LLMs, and serves as an improved baseline in this domain. We see that current LLMs have gained non-trivial capability at jailbreaking. 🔗 Demo & Paper: scale.com/research/j2
Robert Vacareanu retweeted
Jacob Andreas @jacobandreas
Is your CS dept worried about what academic research should be in the age of LLMs? Hire one of my lab members! Leshem Choshen (@LChoshen), Pratyusha Sharma (@pratyusha_PS) and Ekin Akyürek (@akyurekekin) are all on the job market with unique perspectives on the future of NLP: 🧵
Robert Vacareanu retweeted
Summer Yue @summeryue0
🚀 Big update: 4 new SEAL multilingual leaderboards are LIVE!
🌍 Arabic: Gemini 1.5 Pro (gemini-exp-1121) leads the pack
🏮 Chinese: Gemini 1.5 Pro (gemini-1.5-pro-exp-0827) holds the crown
💫 Japanese & Korean: o1-preview dominates
📊 See how your models stack up: scale.com/leaderboard
Robert Vacareanu retweeted
Summer Yue @summeryue0
SEAL Visual-Understanding Leaderboard Launch 🏆 Today, we’re introducing VISTA—a new rubric-based visual task assessment benchmark that pushes beyond simple Q&A. The leading models achieve under 40% on this eval, compared to a human baseline of ~55.4%. This highlights that multimodal reasoning remains challenging for current LLMs.
Robert Vacareanu retweeted
maharshi @maharshii
What an amazing read: converting JSON to regex, then regex to finite state machines, and then optimising it is brilliant!
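That pipeline, best known from structured LLM decoding, can be sketched in miniature. The regex, the fixed JSON shape, and the hand-rolled DFA below are illustrative assumptions, not taken from the post: step 1 turns a JSON shape into a regex, and step 2 turns one piece of that regex into a finite state machine a decoder could walk token by token to mask invalid continuations.

```python
import re

# Step 1: a regex that accepts only a fixed JSON shape
# (object with a string "name" and a non-leading-zero integer "age").
JSON_RE = re.compile(r'\{"name":\s*"[^"]*",\s*"age":\s*(0|[1-9][0-9]*)\}')

def accepts_int(s):
    # Step 2: hand-rolled DFA equivalent to the (0|[1-9][0-9]*) part.
    # States: 0 = start, 1 = saw a lone '0' (accepting),
    #         2 = saw a 1-9 digit, more digits allowed (accepting).
    state = 0
    for ch in s:
        if state == 0:
            state = 1 if ch == "0" else (2 if ch.isdigit() else -1)
        elif state == 2 and ch.isdigit():
            state = 2
        else:
            return False  # '0'-prefixed number, non-digit, or dead state
        if state == -1:
            return False
    return state in (1, 2)

print(bool(JSON_RE.fullmatch('{"name": "Ada", "age": 36}')))  # True
print(accepts_int("36"), accepts_int("007"))                  # True False
```

A constrained decoder generalizes this: compile the whole regex to a DFA, track the current state during generation, and zero out the probability of any next token that has no outgoing transition.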
Robert Vacareanu retweeted
Prateek Yadav @prateeky2806
I'm on the job market! Please reach out if you are looking to hire someone to work on:
- RLHF
- Efficiency
- MoE/Modular models
- Synthetic Data
- Test-time compute
- other phases of pre/post-training
If you are not hiring, then I would appreciate a retweet! More details 👇
Robert Vacareanu retweeted
Roberta Raileanu @robertarail
I’m looking for a PhD intern for next year to work at the intersection of LLM-based agents and open-ended learning, as part of the Llama Research Team in London. If interested, please send me an email with a short paragraph outlining some research ideas, and apply at the link below.