Nir Mazor

51 posts

@NirMMazor

Joined March 2025
65 Following, 21 Followers
Pinned Tweet
Nir Mazor (@NirMMazor)
New preprint 💥 Can a general-purpose model achieve results comparable to medically pre-trained models? 🤔 We show that lightweight fine-tuning of a general-purpose LVLM and an LVLM-aware retriever can. 🚀 🔗 GitHub: github.com/Nirmaz/CLARE 📄 Paper: arxiv.org/pdf/2508.17394
Nir Mazor retweeted
Asaf Yehudai (@AsafYehudai)
New preprint, evaluation framework & leaderboard!🚨 General-purpose AI agents are everywhere. 🤖 From ReAct to @claudeai Code and @OpenAI SDK. But how do we actually evaluate them — as general agents? Currently, benchmarks are deeply tied to domain-specific setups, making it impossible to evaluate true cross-domain agents. We’re changing that! We’re introducing Exgentic and the Open General Agent Leaderboard. 🧵👇
Nir Mazor (@NirMMazor)
Our model also achieves superior results over general-purpose RAG baseline models 🚀📈
Nir Mazor retweeted
Avishai Elmakies (@AvishaiElm37946)
🚀 Excited to share that my paper from my internship at @IBMResearch has been accepted to #ICASSP2026! We train Speech-Aware LLMs (SALLMs) with Group Relative Policy Optimization (GRPO) on open-ended tasks (Spoken QA & Speech Translation). We find that GRPO beats SFT!
Nir Mazor retweeted
Noam Dahan (@Dahan_Noam)
1) PromptSuite (EMNLP 2025 demo) enables robust multi-prompt evaluation by automatically generating controlled prompt variations over existing datasets (e.g. Hugging Face). Try it with a Python API and web UI: eliyahabba.github.io/PromptSuite/ with @EliyaHabba and @GiliLior
Nir Mazor retweeted
Noam Dahan (@Dahan_Noam)
FOMO for missing the great community of @iscol_meeting! Thankfully my collaborators @nlphuji (and advisor @GabiStanovsky🙏) presented our recent work: 1) on prompt sensitivity, and 2) on using digitized newspapers as data for low-resource languages Links in thread:
Nir Mazor retweeted
Gili Lior (@GiliLior)
Honored to have been part of the ISCOL 2025 panel with such great professors! @yoavgo @melhadad It was an interesting discussion on AI’s role in academia and research, with diverse opinions and challenging perspectives. Thanks @iscol_meeting and @ella_rabinovich for having me!
Nir Mazor retweeted
Shahaf Bassan (@shahaf_bassan)
✈️ Copenhagen 🇩🇰 → San Diego 🇺🇸 Had a great time presenting two papers at #NeurIPS2025 and giving an invited talk at the Theory of Explainable ML workshop (Elis Unconference, #EurIPS2025). #ExplainableAI #Interpretability #XAI
Nir Mazor retweeted
Kevin Lu (@kevinlu4588)
Excited to share our paper “When Are Concepts Erased from Diffusion Models?” at @NeurIPSConf! We introduce two conceptual models for erasure mechanisms in diffusion models, and a suite of probes to recover supposedly forgotten concepts. Project website: unerasing.baulab.info
Nir Mazor retweeted
Daria Lioubashevski (@DariaLioub)
🚨 New preprint! One idea, many ways to say it – does your brain track those options before you speak? Using LLMs, we put this to the test: biorxiv.org/content/10.110… We show for the 1st time that the brain represents many alternatives simultaneously in both listening & speaking 🧵
Nir Mazor retweeted
HUJI NLP (@nlphuji)
Our group closing out #EMNLP2025 in Suzhou. Until next time!
Nir Mazor retweeted
Noy Sternlicht (@NoySternlicht)
New benchmarking task for LLM judges: Evaluating debate speeches! 🗣️ We find that judges align with human orderings but remain miscalibrated. Presenting tomorrow at @emnlpmeeting, 14:00-15:30, Hall C. Come by to chat, and even debate (in the spirit of the poster) 🤝 #EMNLP2025