berk atıl
@berkatilgs

220 posts

PhD student at Penn State University. NLP Researcher

Joined July 2010
504 Following · 110 Followers
berk atıl @berkatilgs ·
Thanks a lot to all the collaborators!
0 replies · 0 reposts · 0 likes · 52 views
berk atıl @berkatilgs ·
I am pleased to announce that I have passed my Comprehensive Exam! Thank you very much to the committee members @ruizhang_nlp, @Wenpeng_Yin, and @ShomirWilson for their valuable comments and feedback, and to my advisor @RPassonneau for her support and advice!
1 reply · 0 reposts · 4 likes · 241 views
Vipul Gupta @vipul_1011 ·
My first paper with @scale_AI is out! We introduce a new dataset to test the limits of AI models in real-world professional domains: finance and legal. My bet: it will take at least a year before models saturate this dataset.
Afra Feyza Akyürek@afeyzaakyurek

New @Scale_AI paper! We’re introducing Professional Reasoning Bench (PRBench), the largest open-source, rubric-based evaluation for real-world professional reasoning in Law and Finance. PRBench contains a total of 1,100 tasks and 19,000+ rubric criteria authored by 182 experts!

3 replies · 1 repost · 31 likes · 6K views
berk atıl reposted
Vipul Gupta @vipul_1011 ·
Job update: Last week I started working as a Research Scientist at Scale AI @scale_AI. A lot has changed at the company in the last 2 months, but I'm very excited to see what's in store for me. I will be based in NYC! Here's to attending more AI events in NYC 🚀
Vipul Gupta@vipul_1011

Life update: Completed my PhD. From never wanting to get a master's degree to here, it's been a memorable journey. Time flies; 4 years went by quite fast.

8 replies · 3 reposts · 128 likes · 17.7K views
berk atıl @berkatilgs ·
We also observed that few-shot learning has mixed effects on these tasks depending on the model. Plus, reasoning does not help much, indicating a need for improvement in social reasoning! Thanks a lot to my collaborators Namrata Sureddy and @RPassonneau! 4/4
0 replies · 0 reposts · 0 likes · 33 views
berk atıl @berkatilgs ·
Models' toxicity detection performance changes significantly from one social group to another! 3/4
1 reply · 0 reposts · 0 likes · 47 views
berk atıl @berkatilgs ·
⚠️ Paper Alert! We merge existing datasets on toxicity detection, social target group identification, and toxic span recognition to form a unified and reliable benchmark. TLDR: LLMs still have biases and perform worse than fine-tuned PLMs on these tasks. arxiv.org/pdf/2506.02326 1/4
1 reply · 0 reposts · 0 likes · 64 views
berk atıl reposted
Rui Zhang @ruizhang_nlp ·
Excited to share our latest work! The development of process reward models (PRMs) is limited by manual labeling of step-level reasoning correctness. In this new paper led by @RyoKamoi, we use formal verification tools (formal logic and theorem proving) to automatically synthesize high-quality examples for training LLM-based reasoning verifiers. We found this improves LLM-based PRMs, with generalization across a wide range of reasoning tasks!
Ryo Kamoi@RyoKamoi

📢 New paper! FoVer enhances PRMs for step-level verification of LLM reasoning w/o human annotation 🚀 We synthesize training data using formal verification tools and improve LLMs at step-level verification of LLM responses on MATH, AIME, MMLU, BBH, etc. arxiv.org/abs/2505.15960

0 replies · 8 reposts · 30 likes · 3.6K views
berk atıl reposted
Ryo Kamoi @RyoKamoi ·
📢 New paper! FoVer enhances PRMs for step-level verification of LLM reasoning w/o human annotation 🚀 We synthesize training data using formal verification tools and improve LLMs at step-level verification of LLM responses on MATH, AIME, MMLU, BBH, etc. arxiv.org/abs/2505.15960
4 replies · 25 reposts · 126 likes · 36.6K views
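The FoVer idea described above, sketched in miniature: instead of human annotation, an automatic checker labels each reasoning step, yielding step-level training data for a verifier. Here a toy arithmetic checker stands in for the formal verification tools (formal logic and theorem proving) used in the paper; all function names are illustrative, not from the FoVer codebase.

```python
# Toy stand-in for FoVer-style data synthesis: an automatic checker,
# not a human, labels each step of a reasoning chain.
import re

def check_step(step: str) -> bool:
    """Label one 'a <op> b = c' arithmetic step as correct or not."""
    m = re.fullmatch(r"\s*(\d+)\s*([+\-*])\s*(\d+)\s*=\s*(-?\d+)\s*", step)
    if not m:
        return False  # unparseable steps are treated as incorrect
    a, op, b, c = int(m[1]), m[2], int(m[3]), int(m[4])
    result = {"+": a + b, "-": a - b, "*": a * b}[op]
    return result == c

def synthesize_labels(chain):
    """Turn a reasoning chain into (step, label) training pairs."""
    return [(step, check_step(step)) for step in chain]

chain = ["3 + 4 = 7", "7 * 2 = 14", "14 - 5 = 8"]  # last step is wrong
labels = synthesize_labels(chain)
```

A real formal verifier plays the role of `check_step` here: anything it can decide mechanically becomes free step-level supervision for training the PRM.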
Vipul Gupta @vipul_1011 ·
Just got to know that someone I interacted with once or twice (a former MS student at my college) got stabbed and lost his life. It hurts, despite knowing him so little. Life is so unpredictable; we don't appreciate what we have. Moving forward: stop complaining, appreciate more.
3 replies · 0 reposts · 28 likes · 2.5K views
Ryo Kamoi @RyoKamoi ·
I passed my comprehensive exam and am now a PhD candidate! Thank you to my advisor and collaborators for their continued support!
9 replies · 1 repost · 150 likes · 11.7K views
berk atıl reposted
Rui Zhang @ruizhang_nlp ·
Excited to share SiReRAG! Our #ICLR2025 paper on improving RAG indexing for multihop reasoning. 🔍 SiReRAG combines similarity (semantic closeness) and relatedness (entity-based connections) to better organize and retrieve information from large corpora for comprehensive knowledge synthesis. 📈 SiReRAG delivers consistent gains over state-of-the-art methods, with +1.9% avg F1 on multihop QA benchmarks including MuSiQue, 2WikiMultiHopQA, and HotpotQA. 📄 Paper: arxiv.org/abs/2412.06206 🛠️ Code: github.com/SalesforceAIRe… This work is led by my excellent PhD student @NanZhangNLP during his internship at Salesforce, in collaboration with Prafulla Kumar Choubey, @alexfabbri4, Gabriel Bernadett-Shapiro, Prasenjit Mitra, @CaimingXiong, and @jasonwu0731. Nan is currently on the job market; please consider hiring him!
Nan Zhang@NanZhangNLP

📢 Happy to introduce SiReRAG: our #ICLR2025 paper on RAG indexing! Facilitating comprehensive knowledge synthesis on multihop reasoning, SiReRAG models both similarity and relatedness signals of a corpus. Code: github.com/SalesforceAIRe… Paper: arxiv.org/abs/2412.06206 (1/N)🧵

0 replies · 6 reposts · 11 likes · 1.3K views
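A minimal sketch of the similarity-plus-relatedness idea the SiReRAG tweets describe: score each passage by semantic closeness to the query AND by shared entities, then rank by the combined score. Bag-of-words cosine and capitalized-word "entities" are crude illustrative stand-ins for the real embeddings and entity extraction the paper uses; `alpha` is a hypothetical mixing weight.

```python
# Toy retrieval ranking combining similarity and relatedness signals.
from collections import Counter
from math import sqrt

def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (stand-in for embedding similarity)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = sqrt(sum(v * v for v in va.values()))
    nb = sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def entities(text: str) -> set:
    """Capitalized words as a crude proxy for named entities."""
    return {w for w in text.split() if w[:1].isupper()}

def relatedness(a: str, b: str) -> float:
    """Jaccard overlap of shared entities (stand-in for entity links)."""
    ea, eb = entities(a), entities(b)
    return len(ea & eb) / len(ea | eb) if ea | eb else 0.0

def rank(query, passages, alpha=0.5):
    """Order passages by a weighted mix of both signals."""
    score = lambda p: alpha * cosine(query, p) + (1 - alpha) * relatedness(query, p)
    return sorted(passages, key=score, reverse=True)
```

For multihop questions, the relatedness term is what lets a passage that shares an entity with the query stay highly ranked even when its surface wording differs.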
berk atıl reposted
Rui Zhang @ruizhang_nlp ·
This work is led by my stellar PhD student Sarkar @sarkarssdas, who is graduating soon and looking for industrial positions. Sarkar has top engineering capabilities, and it's absolutely a pleasure to talk with him as a friend. Please consider hiring him!
Rui Zhang@ruizhang_nlp

🚀 If you're looking for inference-time techniques to max out the reasoning ability of your local LLMs, check out our #ICLR2025 paper GreaTer for gradient-based prompt optimization! We generate fluent, strategic prompts outperforming APO/APE/PE2/TextGrad on multiple reasoning benchmarks! 📊🔥 🔑 Three key takeaways: 1️⃣ Gradients can guide prompt refinement better than text-based feedback, unlocking stronger reasoning in small, local LLMs without relying on expensive proprietary models. 2️⃣ Incorporating CoT reasoning, not just final answers, leads to smarter, more context-aware prompts. 3️⃣ Using perplexity to constrain the search space generates fluent and readable prompts. We make it extremely easy to use GreaTer as a Python library with GUI interfaces: 📝 Paper: arxiv.org/abs/2412.09722 🔗 Python Library: github.com/psunlpgroup/Gr…

0 replies · 3 reposts · 8 likes · 2.1K views
berk atıl reposted
Rui Zhang @ruizhang_nlp ·
🚀 If you're looking for inference-time techniques to max out the reasoning ability of your local LLMs, check out our #ICLR2025 paper GreaTer for gradient-based prompt optimization! We generate fluent, strategic prompts outperforming APO/APE/PE2/TextGrad on multiple reasoning benchmarks across BBH, GSM8k, and FOLIO! 📊🔥 🔑 Three key takeaways: 1️⃣ Gradients can guide prompt refinement better than text-based feedback, unlocking stronger reasoning in small, local LLMs without relying on expensive proprietary models. 2️⃣ Incorporating CoT reasoning, not just final answers, leads to smarter, more context-aware prompts. 3️⃣ Using perplexity to constrain the search space generates fluent and readable prompts. We make it extremely easy to use GreaTer as a Python library with GUI interfaces: 📝 Paper: arxiv.org/abs/2412.09722 🔗 Python Library: github.com/psunlpgroup/Gr… Led by @sarkarssdas. Collaboration with @RyoKamoi @bo_pang0 @YusenZhangNLP @CaimingXiong
Sarkar Snigdha Sarathi Das@sarkarssdas

🚨 New #ICLR2025 Paper & Library Alert! Tired of hand-crafting prompts or relying on massive LLMs for optimization? Meet GReaTer — a gradient-based method that leverages gradients over reasoning chains to optimize prompts and help small models reason better. Even better? Check out GReaTerPrompt, a fully open-source library that wraps GReaTer and other prompt optimization techniques into a unified API — plus an easy-to-use Web UI for non-experts. 📝 GReaTer (ICLR’25): arxiv.org/abs/2412.09722 @RyoKamoi @bo_pang0 @YusenZhangNLP @CaimingXiong @ruizhang_nlp 🛠️ GReaTerPrompt: arxiv.org/abs/2504.03975 Wenliang Zheng, @YusenZhangNLP , @ruizhang_nlp 📦 PyPI: pypi.org/project/greate… 🔗 GitHub: github.com/psunlpgroup/Gr…, github.com/psunlpgroup/Gr… (1/N)

0 replies · 9 reposts · 16 likes · 2.2K views
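An illustrative sketch of takeaway 3 from the GreaTer tweets above: during prompt search, candidates are filtered by perplexity so only fluent prompts survive, and the best-scoring fluent candidate is kept. The unigram "language model", the candidate scores, and the threshold are all toy stand-ins; GreaTer itself uses real LLM gradients and perplexities, and these names are not from its codebase.

```python
# Toy perplexity-constrained prompt selection.
from math import log, exp

# Toy unigram probabilities standing in for an LLM's token probabilities.
UNIGRAM = {"think": 0.05, "step": 0.04, "by": 0.06, "answer": 0.05}

def perplexity(prompt: str) -> float:
    """Average negative log-likelihood under the toy unigram model, exponentiated."""
    toks = prompt.split()
    nll = -sum(log(UNIGRAM.get(t, 1e-6)) for t in toks) / len(toks)
    return exp(nll)

def select_prompt(candidates, task_score, max_ppl):
    """Keep only fluent candidates, then pick the best-scoring one."""
    fluent = [c for c in candidates if perplexity(c) <= max_ppl]
    return max(fluent, key=task_score) if fluent else None

candidates = ["think step by step", "zxqv zxqv answer"]
scores = {"think step by step": 0.71, "zxqv zxqv answer": 0.74}
best = select_prompt(candidates, scores.get, max_ppl=100.0)
```

The gibberish candidate is rejected despite its higher raw task score, which is exactly the fluency/readability trade-off the perplexity constraint enforces.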
berk atıl reposted
Sarkar Snigdha Sarathi Das @sarkarssdas ·
🚨 New #ICLR2025 Paper & Library Alert! Tired of hand-crafting prompts or relying on massive LLMs for optimization? Meet GReaTer — a gradient-based method that leverages gradients over reasoning chains to optimize prompts and help small models reason better. Even better? Check out GReaTerPrompt, a fully open-source library that wraps GReaTer and other prompt optimization techniques into a unified API — plus an easy-to-use Web UI for non-experts. 📝 GReaTer (ICLR’25): arxiv.org/abs/2412.09722 @RyoKamoi @bo_pang0 @YusenZhangNLP @CaimingXiong @ruizhang_nlp 🛠️ GReaTerPrompt: arxiv.org/abs/2504.03975 Wenliang Zheng, @YusenZhangNLP , @ruizhang_nlp 📦 PyPI: pypi.org/project/greate… 🔗 GitHub: github.com/psunlpgroup/Gr…, github.com/psunlpgroup/Gr… (1/N)
3 replies · 6 reposts · 15 likes · 5.1K views