berk atıl (@berkatilgs) - Twitter Profile

berk atıl @berkatilgs · 6 Apr

Thanks a lot to all the collaborators!

berk atıl @berkatilgs · 6 Apr

Two papers are accepted to ACL2026 (@aclmeeting)!
- Comprehensive dataset and benchmark for toxicity detection: arxiv.org/abs/2506.02326
- Hate-speech detection with moral rationales: arxiv.org/abs/2601.03481
Looking forward to talking about these and safety in San Diego #ACL2026!

berk atıl @berkatilgs · 21 Nov

@vipul_1011 @ruizhang_nlp @Wenpeng_Yin @ShomirWilson @RPassonneau Thank you!

Vipul Gupta @vipul_1011 · 21 Nov

@berkatilgs @ruizhang_nlp @Wenpeng_Yin @ShomirWilson @RPassonneau Wow, congrats!!

berk atıl @berkatilgs · 21 Nov

I am pleased to announce that I have passed my Comprehensive Exam! Thank you very much to the committee members, @ruizhang_nlp, @Wenpeng_Yin, and @ShomirWilson, for their valuable comments and feedback, and to my advisor @RPassonneau for her support and advice!

berk atıl @berkatilgs · 14 Nov

@vipul_1011 @scale_AI Congrats! 🎉

Vipul Gupta @vipul_1011 · 14 Nov

My first paper with @scale_AI is out! We introduce a new dataset to test the limits of AI models in real-world professional domains: finance and legal. My bet: it will take at least a year before models saturate this dataset.

Afra Feyza Akyürek @afeyzaakyurek

New @Scale_AI paper! We’re introducing Professional Reasoning Bench (PRBench), the largest open-source, rubric-based evaluation for real-world professional reasoning in Law and Finance. PRBench contains a total of 1,100 tasks and 19,000+ rubric criteria authored by 182 experts!

berk atıl retweeted

Vipul Gupta @vipul_1011 · 25 Aug

Job update: Last week I started working as a Research Scientist at Scale AI @scale_AI. A lot has changed at the company in the last 2 months, but I am very excited to see what’s in store for me. I will be based in NYC! Here’s to attending more AI events in NYC🚀

Vipul Gupta @vipul_1011

Life update: Completed my PhD. From never wanting to get a master’s degree to here, it’s been a memorable journey. Time flies; 4 years went by quite fast.

berk atıl @berkatilgs · 5 Jun

We also observed that few-shot learning has mixed effects on these tasks depending on the model. Plus, reasoning does not help much, indicating a need for improvement in social reasoning! Thanks a lot to my collaborators Namrata Sureddy and @RPassonneau! 4/4

berk atıl @berkatilgs · 5 Jun

The toxicity detection performance of models changes significantly from one social group to another! 3/4

berk atıl @berkatilgs · 5 Jun

⚠️ Paper Alert! We merge existing datasets on toxicity detection, social target group identification, and toxic span recognition to form a unified and reliable benchmark. TLDR: LLMs still have biases and are worse than fine-tuned PLMs on these tasks. arxiv.org/pdf/2506.02326 1/4

berk atıl retweeted

Rui Zhang @ruizhang_nlp · 23 May

Excited to share our latest work! The development of process reward models (PRMs) is limited by manual labeling of step-level reasoning correctness. In this new paper led by @RyoKamoi, we use formal verification tools (formal logic and theorem proving) to automatically synthesize high-quality examples for training LLM-based reasoning verifiers. We found this improves LLM-based PRMs, with generalization across a wide range of reasoning tasks!

Ryo Kamoi @RyoKamoi

📢 New paper! FoVer enhances PRMs for step-level verification of LLM reasoning w/o human annotation 🚀 We synthesize training data using formal verification tools and improve LLMs at step-level verification of LLM responses on MATH, AIME, MMLU, BBH, etc. arxiv.org/abs/2505.15960

berk atıl @berkatilgs · 16 May

@vipul_1011 So sorry to hear that :(

Vipul Gupta @vipul_1011 · 16 May

Just got to know that someone I interacted with 1-2 times (a former MS student at my college) got stabbed and lost his life. It hurts, despite knowing him so little. Life is so unpredictable; we don’t appreciate what we have. Moving forward: stop complaining, appreciate more.

berk atıl @berkatilgs · 8 May

@RyoKamoi Congrats! Well deserved! 🎉

Ryo Kamoi @RyoKamoi · 8 May

I passed my comprehensive exam and am now a PhD candidate! Thank you to my advisor and collaborators for their continued support!

berk atıl retweeted

Rui Zhang @ruizhang_nlp · 10 Apr

Excited to share SiReRAG! Our #ICLR2025 paper on improving RAG indexing for multihop reasoning.
🔍 SiReRAG combines similarity (semantic closeness) and relatedness (entity-based connections) to better organize and retrieve information from large corpora for comprehensive knowledge synthesis.
📈 SiReRAG delivers consistent gains over state-of-the-art methods with +1.9% avg F1 on multihop QA benchmarks including MuSiQue, 2WikiMultiHopQA, and HotpotQA.
📄 Paper: arxiv.org/abs/2412.06206
🛠️ Code: github.com/SalesforceAIRe…
This work is led by my excellent PhD student @NanZhangNLP during his internship at Salesforce, in collaboration with Prafulla Kumar Choubey, @alexfabbri4, Gabriel Bernadett-Shapiro, Prasenjit Mitra, @CaimingXiong, and @jasonwu0731. Nan is currently on the job market, please consider hiring him!

Nan Zhang @NanZhangNLP

📢 Happy to introduce SiReRAG: our #ICLR2025 paper on RAG indexing! Facilitating comprehensive knowledge synthesis on multihop reasoning, SiReRAG models both similarity and relatedness signals of a corpus. Code: github.com/SalesforceAIRe… Paper: arxiv.org/abs/2412.06206 (1/N)🧵

berk atıl retweeted

Rui Zhang @ruizhang_nlp · 9 Apr

Our #NAACL2025 paper bridges disparity analysis to quantifying LLMs’ fairness in summarization! Led by @HaoyuanLi9 and @snigdhac25.

Snigdha Chaturvedi @snigdhac25

Check out this recent work from @uncnlp on evaluating fairness in summarization. We propose two new metrics to quantify fairness. This was in collaboration with @ruizhang_nlp, and @HaoyuanLi9 will be presenting it at #NAACL25

berk atıl retweeted

Rui Zhang @ruizhang_nlp · 9 Apr

This work is led by my stellar PhD student Sarkar @sarkarssdas, who is graduating soon and looking for industry positions. Sarkar has top engineering capabilities, and he’s absolutely a pleasure to talk with as a friend. Please consider hiring him!

Rui Zhang @ruizhang_nlp

🚀If you're looking for inference-time techniques to max out the reasoning ability of your local LLMs, check out our #ICLR2025 paper GReaTer for gradient-based prompt optimization! We generate fluent & strategic prompts outperforming APO/APE/PE2/TextGrad on multiple reasoning benchmarks! 📊🔥
🔑Three key takeaways
1️⃣Gradients can guide prompt refinement better than text-based feedback, unlocking stronger reasoning in small, local LLMs without relying on expensive proprietary models.
2️⃣Incorporating CoT reasoning, not just final answers, leads to smarter, more context-aware prompts.
3️⃣Using perplexity to constrain the search space generates fluent and readable prompts.
We make it extremely easy to use GReaTer as a Python library with GUI interfaces:
📝Paper: arxiv.org/abs/2412.09722
🔗Python Library: github.com/psunlpgroup/Gr…

berk atıl retweeted

Rui Zhang @ruizhang_nlp · 9 Apr

🚀If you're looking for inference-time techniques to max out the reasoning ability of your local LLMs, check out our #ICLR2025 paper GReaTer for gradient-based prompt optimization! We generate fluent & strategic prompts outperforming APO/APE/PE2/TextGrad on multiple reasoning benchmarks across BBH, GSM8k, and FOLIO! 📊🔥
🔑Three key takeaways
1️⃣Gradients can guide prompt refinement better than text-based feedback, unlocking stronger reasoning in small, local LLMs without relying on expensive proprietary models.
2️⃣Incorporating CoT reasoning, not just final answers, leads to smarter, more context-aware prompts.
3️⃣Using perplexity to constrain the search space generates fluent and readable prompts.
We make it extremely easy to use GReaTer as a Python library with GUI interfaces:
📝Paper: arxiv.org/abs/2412.09722
🔗Python Library: github.com/psunlpgroup/Gr…
Led by @sarkarssdas. Collaboration with @RyoKamoi @bo_pang0 @YusenZhangNLP @CaimingXiong

Sarkar Snigdha Sarathi Das @sarkarssdas

🚨 New #ICLR2025 Paper & Library Alert! Tired of hand-crafting prompts or relying on massive LLMs for optimization? Meet GReaTer — a gradient-based method that leverages gradients over reasoning chains to optimize prompts and help small models reason better. Even better? Check out GReaTerPrompt, a fully open-source library that wraps GReaTer and other prompt optimization techniques into a unified API — plus an easy-to-use Web UI for non-experts. 📝 GReaTer (ICLR’25): arxiv.org/abs/2412.09722 @RyoKamoi @bo_pang0 @YusenZhangNLP @CaimingXiong @ruizhang_nlp 🛠️ GReaTerPrompt: arxiv.org/abs/2504.03975 Wenliang Zheng, @YusenZhangNLP , @ruizhang_nlp 📦 PyPI: pypi.org/project/greate… 🔗 GitHub: github.com/psunlpgroup/Gr…, github.com/psunlpgroup/Gr… (1/N)
