Jinu Lee

83 posts

@jinulee_v

PhD Student @UIUC_NLP. Interested in *semantics of reasoning*, from neuro-symbolic methods to reasoning evaluation/improvement in LLMs. Ex-Intern @MSFTResearch

Joined November 2023
334 Following · 519 Followers
Pinned Tweet
Jinu Lee@jinulee_v·
🧑‍⚖️Rubrics prove powerful in legal NLP, too! Introducing LEGIT (LEGal Issue Trees): a novel legal reasoning dataset with rubrics for evaluating reasoning traces. arxiv.org/pdf/2512.01020 (1/7)
Jinu Lee retweeted
Open Life Science AI@OpenlifesciAI·
🚨 Medical AI Research Alert! 🚨

How do we ensure AI patient simulators in mental health training truly reflect diverse, realistic human behavior? @UofIllinois presents 𝗣𝗦𝗜-𝗕𝗲𝗻𝗰𝗵, an interpretable, clinically grounded framework for evaluating depression patient simulators.

By Nguyen Khoi Hoang (@hknguyen20), Shuhaib Mehri (@ShuhaibMehri), Tse-An Hsu, Yi-Jyun Sun (@jyun_sun), Quynh Xuan Nguyen Truong, Khoa D Doan (@khoaddoan), and Dilek Hakkani-Tür (@dilekhakkanitur).

Now you can watch and listen to the latest Medical AI papers daily on our YouTube and Spotify channels!
YouTube Deep Dive: youtu.be/WzfCw8QeH6M
YouTube Shorts: youtube.com/shorts/2A7_RE4…
Spotify: open.spotify.com/show/4edRuSTOu…

Here's why it's exciting: 👇🧵 1/n
#MedicalAI #Healthcare #MentalHealthAI #SimulatorEvaluation [1/9]
Jinu Lee
Jinu Lee@jinulee_v·
@Bollegala @emnlpmeeting @sunipa_dev The case you mentioned is already impossible due to the double-submission policy. Per my understanding, what you *can't* do is opt in for AACL at submission time and then, after getting better scores than expected, end up committing to EMNLP instead.
EMNLP 2026@emnlpmeeting·
📢 Please note a critical update to the original #EMNLP2026 Call for Papers: since EMNLP and AACL share the ARR May cycle, authors will need to explicitly select a target conference at submission time. This choice will be binding! For more details see: 2026.emnlp.org/calls/main_con…
EMNLP 2026@emnlpmeeting

📢 The First Call for Papers for EMNLP 2026 is officially out! 📝 We welcome long & short papers featuring original research on empirical methods for NLP. 🗓️ ARR Submission Deadline: May 25, 2026 🔗 Read the full CFP here: 2026.emnlp.org/calls/main_con… #EMNLP2026

Jinu Lee@jinulee_v·
@dela3499 Nice puzzle! Some more:

S _ _ _ _ _ _ _ _
word 1: happiness
word 2: horror

_ _ _ _ _ O
word 1: organizing things
word 2: organized moves

B _ _ _ _ _
word 1: found at mountains
word 2: found at rivers

_ _ _ _ _ E
word 1: throw out
word 2: scratch out
Jinu Lee retweeted
Emmy Liu@_emliu·
wrote a guide on getting compute grants as a student, something I wish I did more at the beginning of my PhD. It's honestly one of the highest ROI things you can do as a student (we've gotten 100k+ gpu hrs for roughly 2 weeks of work writing). nightingal3.github.io/blog/2026/04/1…
Jinu Lee@jinulee_v·
@RyoKamoi ahh that's unfortunate... hope to see you soon
Ryo Kamoi@RyoKamoi·
@jinulee_v Thank you! But I will not be able to attend the conference 😭
Ryo Kamoi@RyoKamoi·
🎉 Our paper has been accepted to ACL 2026 Findings! We propose FoVer, an efficient method for creating accurate PRM training data using formal verification tools 🚀 Our tool-annotated formal synthetic training data improves PRMs on math, NLI, and BBH 😳 arxiv.org/abs/2505.15960
Jinu Lee@jinulee_v·
Finally, *Scaling evaluation-time compute* got accepted to ACL Findings! See you all at SD 🏖️ Since this paper was released, many works have pursued scaling evaluation-time compute via "reasoning" PRMs and LLM-as-a-judge + rubrics. Still more to explore in 2026!
Jinu Lee@jinulee_v

📢New paper alert! Scaling compute for reasoning evaluation is powerful; if you have compute for Best-of-N decoding, scaling evaluation compute is better than generating more CoTs! More in thread! (1/6) arxiv.org/abs/2503.19877

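The thread's core idea, spending compute on the evaluator instead of generating more CoTs, can be sketched as a Best-of-N loop where each candidate's score is the average of several independent judge samples. The sketch below is a toy simulation, not the paper's method: `true_quality`, the Gaussian judge noise, and all function names are illustrative stand-ins for real LLM generation and LLM-as-a-judge calls.

```python
import hashlib
import random

def true_quality(candidate: str) -> float:
    # Hypothetical latent correctness of a candidate, in [0, 1).
    h = int(hashlib.md5(candidate.encode()).hexdigest(), 16)
    return (h % 1000) / 1000.0

def judge_once(candidate: str, rng: random.Random) -> float:
    # One simulated LLM-as-a-judge call: latent quality plus noise.
    return true_quality(candidate) + rng.gauss(0, 0.2)

def evaluate(candidate: str, k: int, rng: random.Random) -> float:
    # Evaluation-time scaling: average k independent judge samples
    # to cut the variance of the score estimate.
    return sum(judge_once(candidate, rng) for _ in range(k)) / k

def best_of_n(candidates: list[str], judge_samples: int, rng: random.Random) -> str:
    # Best-of-N: pick the candidate ranked highest by the scaled evaluator.
    return max(candidates, key=lambda c: evaluate(c, judge_samples, rng))
```

Under this simulation, raising `judge_samples` buys a less noisy ranking of a fixed candidate pool, which is the trade-off the paper compares against simply generating more candidates.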
Jinu Lee@jinulee_v·
@teortaxesTex I'd say it is a minor revision of existing papers, e.g., R1-thoughtology, Thought anchors, etc. Section 5.2 seems new in the context of reasoning model distillation, but the fact that many of their analyses are trivial will not change.
Jinu Lee@jinulee_v·
@DrDatta_AIIMS Visuals are indeed cool, which bothers me even more 🤣
Jinu Lee@jinulee_v·
Thank you for shouting out my preprint on rubrics for legal reasoning! Evaluating reasoning traces with rubrics is a promising direction, but constructing a good rubric for expert-level tasks remains a significant challenge.
Cameron R. Wolfe, Ph.D.@cwolferesearch

Some more really good papers on rubric rewards that I've been reading:
- arxiv.org/abs/2602.01511
- arxiv.org/abs/2411.01111
- arxiv.org/abs/2410.21545
- arxiv.org/abs/2511.01758
- arxiv.org/abs/2512.01020

TL;DR: Rubric rewards are really cool. There is a lot of great recent progress that surpassed my expectations. There's also a lot more to be done, and making progress on truly subjective tasks seems to be noticeably more difficult.

My favorite paper so far is the first in this list, which proposes an alternating RL framework for jointly training a rubric generator and rubric-based reward model. There is still a lot to figure out w.r.t. making rubrics work well, but this paper shows a really clear benefit from rubric-based RL and has an interesting setup to make joint training (of the rubric generator and generative reward model) more stable.

There are still many areas for improvement for rubrics. For example, it seems rubrics still work best for constraints that are more objective, whereas very open-ended tasks (e.g., properly styled creative writing) are still going to be quite tough. The benefit of rubrics is not uniform across domains, and it's not immediately clear for which domains rubrics will work best; e.g., instruction following tends to benefit a lot from rubrics, while the benefit is less clear for things like science and medicine.

Interestingly, a lot of papers tackling very open-ended tasks with rubrics also formulate evaluation as a pairwise problem. Given two completions, they ask the rubric generator to produce a rubric that will properly distinguish / rank the chosen and rejected completions in the pair. This probably makes very subjective evaluations easier, but the application to online RL is also less straightforward: we can't just compute the reward for a single completion; we have to somehow create a pairwise comparison to compute the reward.

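The pairwise formulation discussed in the quoted thread can be made concrete with a small sketch. Everything here is hypothetical: `generate_pairwise_rubric` and `satisfies` are stubs for what would really be two LLM calls (a rubric generator conditioned on the pair, and a rubric judge), and the keyword check only stands in for instance-level criterion grading.

```python
def generate_pairwise_rubric(prompt: str, completion_a: str, completion_b: str) -> list[str]:
    # Stand-in for the rubric-generator LLM: given a (chosen, rejected) pair,
    # it should emit criteria that distinguish the two. These criteria are
    # purely illustrative.
    return [
        "cites the relevant rule",
        "covers all sub-issues",
        "reaches a conclusion",
    ]

def satisfies(completion: str, criterion: str) -> bool:
    # Stand-in for a rubric judge: a real system would ask an LLM whether the
    # completion meets the criterion; here we fake it with keyword matching.
    keyword = criterion.split()[-1]  # "rule", "sub-issues", "conclusion"
    return keyword in completion

def rubric_score(completion: str, rubric: list[str]) -> int:
    # Number of rubric criteria the completion satisfies.
    return sum(satisfies(completion, c) for c in rubric)

def pairwise_preference(prompt: str, chosen: str, rejected: str) -> bool:
    # True if the pair-conditioned rubric ranks `chosen` above `rejected`.
    rubric = generate_pairwise_rubric(prompt, chosen, rejected)
    return rubric_score(chosen, rubric) > rubric_score(rejected, rubric)
```

This also illustrates the online-RL complication the thread mentions: the reward is defined only relative to a comparison partner, not for a single rollout in isolation.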
Jinu Lee@jinulee_v·
@cwolferesearch I believe designing rubrics for expert domains (e.g., law) still requires domain-specific approaches to achieve *useful* improvement. My project emphasizes designing and validating legal rubrics from the experts' perspective for RaR: arxiv.org/abs/2512.01020
Cameron R. Wolfe, Ph.D.@cwolferesearch·
I've been reading a lot about rubrics-as-rewards (RaR) for RL. Some of my favorite papers (so far):
1. arxiv.org/abs/2507.17746
2. arxiv.org/abs/2508.12790
3. arxiv.org/abs/2510.07743
4. arxiv.org/abs/2511.19399
5. arxiv.org/abs/2507.18624

Most of the added technical complexity of RaR is less related to RL and more related to reward modeling. If we can get a reliable reward signal, RaR works well, but teaching a model to perform granular / instance-level evaluation is tough. Generalizing these evaluation capabilities across arbitrary domains is even tougher (especially those that are highly subjective). Our reward model also needs to avoid hacking in large-scale RL runs.

In my opinion, new developments in this space are likely to come from advancing the frontier of (generative) reward models rather than RL. So much to be done.
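At its core, rubrics-as-rewards turns a checklist into a scalar RL reward, e.g. a weighted fraction of satisfied criteria. The sketch below is a minimal illustration under that assumption: the `Criterion` type, the weights, and the string-matching checks are all hypothetical; in the papers above, criterion grading is done by a generative reward model, not keyword matching.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Criterion:
    description: str
    weight: float
    check: Callable[[str], bool]  # maps a completion to pass/fail

def rubric_reward(completion: str, rubric: list[Criterion]) -> float:
    # Scalar reward in [0, 1]: weighted fraction of satisfied criteria.
    total = sum(c.weight for c in rubric)
    if total == 0:
        return 0.0
    earned = sum(c.weight for c in rubric if c.check(completion))
    return earned / total
```

The weighting is where reward hacking concerns enter: a policy can learn to farm easy high-weight criteria, which is one reason reliable instance-level checking matters more than the RL algorithm itself.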
Jinu Lee retweeted
Manling Li@ManlingLi_·
[Long Tweet Ahead] Faculty Interview Tips & Common Questions:

🧘‍♀️ 0. First, do not be nervous
- Almost everything can be prepared in advance :)
- Be grateful for everyone's time.
- Think of it as an opportunity to share your research with others -- exciting, right?
- Technical issues WILL happen -- no worries.
- Try meditation! (Seriously, it helps me tremendously with the interview marathon.)

🚀 1. The MOST crucial part: Research Vision
This is what keeps me up at night (literally!), trying to distill my entire research agenda into one powerful sentence. It is like crafting your research tagline/punchline/slogan. What is your unique contribution? What makes you stand out?

Here is the thing: it is fundamentally why the university wants to hire you. They want to see you as a rising star for the next few years, someone who can make the university's name become associated with impactful research. Think about it: when people want to learn about a specific topic, they immediately think "Oh, I should check out X's work because they are THE person for this." The university is not just hiring a researcher; it is investing in a vision for the future of the field. The key is to come up with a punchline that captures your research identity and repeat it throughout the talks and onsites.

Ask yourself:
- What will your name be associated with in the next decades?
- Where is your field heading? (Is RoboGPT the future? Is the Transformer really the final architecture?)
- What are the REAL unsolved challenges? (Not just throwing more data at problems.)

Get ready to discuss:
- Are large models really the future? Can we achieve true intelligence just by scaling up?
- What about data bottlenecks? Is synthetic data reliable? What are effective ways to collect data?
- Do models really reason? Do we need symbols/structures?
- Is the Transformer the final answer? What is its bottleneck?
- What are the new tasks we really need to focus on?
- What do you think of the current research trend of creating evaluation benchmarks?
- What is still fundamentally missing in current research?

🤓 2. The BIG question: "Why Academia?"
This is actually what you should confirm multiple times with yourself. It is really about your passion and motivation:
- What are your happiest moments? (I talked about those late-night breakthrough moments, haha.)
- Where do you see yourself in 50 years? (Dream big! Talk about the research institute you want to build, the problems you want to solve, the leader you want to become in your field.)
- What is your ideal group size, and what resources will you need? (Be concrete!)
- Are you also looking at industry jobs?

Here is the real talk: we are in the age of large AI models requiring infinite GPUs. So you need solid answers about:
- Why choose academia NOW?
- How will you position yourself in this large-model era?
- The practical stuff, like how you will handle GPU needs. (I would concretely mention XX research directions that don't need massive compute; XX research requires GPUs, but I have XX potential funding sources and collaboration opportunities with XX.)

💼 3. Logistics: Application Materials
3.1 Application materials:
- Use figures. They are always the first thing people check when reading long documents.
- DO NOT miss deadlines! You can usually update materials after submission.

3.2 Personal websites:
- Your CV and website are very important (I personally feel they are even more important than research statements, or at least equal).
- Two must-haves for your website: (1) your CV (fresh and updated!), research statement, teaching statement, and diversity statement (people may not be able to find your package quickly during onsites; the statements can always be updated if you develop a better storyline); (2) your email address: make it OBVIOUS.

⏰ 4. Logistics: Timeline
4.1 My actual timeline:
- First phone interview: Dec 14
- First onsite: Jan 11
- Last onsite: Apr 7

4.2 If you are asked to choose an interview slot:
- Most importantly, figure out whether the process is rolling or not.
- Rolling admissions? Interview earlier!
- Non-rolling? Later interviews = more practice = better performance.

4.3 Timing matters:
- Schedule your dream schools for mid-Feb to early-Mar. (Most universities I interviewed with after mid-March did not extend an offer, while almost all my January and February interviews led to offers, even though the universities were of similar rank.)
- Health tip: protect yourself from COVID during the Jan-Feb interview season. (I had to reschedule several interviews; learned this one from experience! 😷)

🎙️ 5. Logistics: Phone Interviews
Let us talk about something that makes everyone nervous: the interview process. I have learned that preparation is KEY. Let us go through it step by step.

5.1 The "Why THIS school?" question (some universities even ask this first, so I started preparing more for this part):
- First, think about what makes the university special. (Is it known for something unique? What research centers does it have?)
- Name-drop (respectfully!) potential collaborators in the department.
- Track their recent wins. (I always check department news before interviews.)
- Think about location benefits (research collaborations, funding opportunities, industry connections).
Pro tip: keep a cheat sheet with specific details for each school. Trust me, it helps when you're on your 5th interview and the details start blurring!

5.2 Research Vision 2.0 (school edition)
This is where you customize your research vision for THEIR context:
- Show how you fill a unique gap in their department.
- Paint an exciting picture of future collaborations. (For example, when listing your future directions, you can say: "I am excited about future direction XX, and XX university is perfect for me since I can collaborate with faculty XX and research centers XX.")

5.3 Teaching plans:
- Specific course numbers (both undergrad and grad level) that you could teach.
- Your dream course ideas. (I actually created a full syllabus for a Multimodal Machine Learning course and put it on my website. Having concrete materials ready shows you are serious about teaching.)

🎤 6. Logistics: Onsite (The Big Show)
Alright, let us talk about the main event, the onsite interview! This is where the real magic happens, and as always, a lot can be prepared.

6.1 The job talk: your moment to shine ✨
Let's be real: this is THE most important factor in whether you get the offer (everything else is minor). Still, your research vision matters most:
- Boil your idea down to ONE powerful sentence (and repeat it strategically!).
- The first 10-15 minutes are GOLD. Some department chairs only stay for this part. Be sure to show your research impact.
- The goal is to EXCITE people about your research. I always start with a walkthrough example (this works way better than diving straight into theory).
- Guide viewer attention for EVERY. SINGLE. SENTENCE. (Use animations and strategic dimming; highlight what matters.)
- Time management is crucial: aim for 40 minutes plus Q&A. (Be prepared for talks to start late! Factor in technical issues, waiting for people, and other delays.)
- I add a progress bar to help people track my talk.

6.2 Handling Q&A like a pro 🎯
- Drop mini Q&A slides after your first and second sections. (If you want to increase interaction with the audience, this works!)
- Golden rule: be concise + logical.
- Common questions to prep for: "How do you handle bias/safety issues in model learning? What about adversarial attacks?" "How do you create data in model learning?" "Would you say your work leans more towards ML theory or applications?" "I do not think that is the right way to make it work; what about XX?"
- Much of the audience will be outside your area, and when they try to connect your work to their own direction, there will be far-out questions you never thought about, or challenges from people who do not believe in black boxes, or symbols, or something else. It is totally okay! Do not panic! Always be confident about your direction. No need to get irritated or defensive, and no need to back down or bluntly disagree and turn it into a debate. Just treat it as a research discussion, something like: "I think that is an interesting angle. At the current research stage, I believe my way is the most reliable and practical way of handling XX; later, I would be happy to explore XX further, and it would be great if we could even collaborate on this."

6.3 Surviving the marathon: one-on-ones 👥
I am super introverted and not good at small talk, so this is more a guideline for introverted people, haha. These are not really casual chats; each one is a mini-presentation opportunity.
- I have heard people say one-on-ones just check whether you are nice. I do not agree. People won't hire you just because you are nice, but because of the unique, exciting insights you can bring, which earn you high voting scores.
- Do your homework on EACH professor:
• Check their recent papers (Google Scholar, sorted by time).
• Know what made them famous (sort by citations).
• Look up their grants and awards.
• Find personal connections (alma mater? city connections?)

6.4 Lunches & dinners 🍽️
Again, since I often worry about being too introverted, I like to prepare talking points in advance. I usually focus on my strengths, such as research, mentoring philosophy, and funding applications. If you happen to know something interesting about the city or the food, that is a great conversation starter and a bonus!

(Job search season is here again! I have been receiving DMs about faculty interview advice, so I thought I'd share a few key insights that personally helped me navigate the process. If you have already seen the slides I shared earlier, this is essentially the same content. Just a heads-up to save your time!)
Jinu Lee@jinulee_v·
We hope to expand LEGIT across diverse jurisdiction systems! If you are interested, feel free to contact me to push the frontiers of legal reasoning together. Huge thanks to my collaborators for their feedback and guidance: Kyoungwoon, @HanSineng, @armancohan, and Julia. (7/7)
Jinu Lee@jinulee_v·
Finally, we compare the effects of RAG and RL with rubrics-as-rewards in the LEGIT dataset. RAG improves both correctness and coverage. However, RL pushes models to focus on “safe” issues while discarding uncertain ones, boosting correctness but at the cost of coverage. (6/7)