Lei Li @NeurIPS2025

793 posts

@lileics

Generative AI for language and science. MT, LLM, GenAI Safety, Drug Discovery

Joined April 2010
450 Following · 6.4K Followers
Manling Li (@ManlingLi_)
I found ICML’s experiment on AI reviewing quite interesting: Policy B (with AI) scores seem to be generally higher than Policy A (pure human). Not sure whether this is just the datapoints I know of...
Lei Li @NeurIPS2025 (@lileics)
5/ Check out the full paper and our live leaderboard here: 🔗 Project Page: leililab.github.io/susvibes-leade… 📄 Paper: arxiv.org/abs/2512.03262 #VibeCoding #CyberSecurity #LLM #SoftwareEngineering #AIAgent
Lei Li @NeurIPS2025 (@lileics)

4/ Key Leaderboard Highlights: 🏆 Security Leader: @OpenHands + GLM4.7 🏆 Functionality Leader: SWE-agent + Claude 4 Sonnet If we are moving toward an agent-led dev cycle, we need to talk about security now, not later.

Lei Li @NeurIPS2025 (@lileics)
2/ We tested the world’s leading coding agents, and the results are a wake-up call for the industry: Functionality ≠ Security. For example, while SWE-agent with Claude 4 Sonnet solved 61% of tasks correctly, only 10.5% of those solutions were actually secure.
Lei Li @NeurIPS2025 (@lileics)

🚀 Is "Vibe Coding" actually safe for production? We’ve all seen the demos: give an LLM agent a prompt, watch it work its magic, and boom—you have a feature. But there’s a massive hidden risk. In our latest paper, we introduce SUSVIBES, a benchmark of 200 real-world SE tasks.

Lei Li @NeurIPS2025 (@lileics)
🚀 Is "Vibe Coding" actually safe for production? We’ve all seen the demos: give an LLM agent a prompt, watch it work its magic, and boom—you have a feature. But there’s a massive hidden risk. In our latest paper, we introduce SUSVIBES, a benchmark of 200 real-world SE tasks.
Guilherme Favaron (@guifav)

Your vibe-coded app works. But is it secure? New benchmark SusVibes from Songwen Zhao, Danqing Wang, Kexun Zhang, Jiaxuan Luo, Zhuo Li, and @lileics at @CarnegieMellon, @Columbia, and @JohnsHopkins tested 200 real-world feature requests on coding agents. The results are sobering: SWE-agent with Claude 4 Sonnet produced functionally correct code 61% of the time, but only 10.5% of solutions were actually secure. Even adding security hints to prompts did not fix the problem. The gap between 'it works' and 'it is safe to deploy' is massive. 77 different CWE vulnerability types showed up across the benchmark. Worth thinking about next time someone says AI will replace software engineers. The harder question was never about writing code that runs. It was always about writing code that does not break under adversarial conditions. Source: arxiv.org/abs/2512.03262

Siqi Ouyang (@siqi_ouyang)
Excited to co-chair next year’s IWSLT Simultaneous Translation Track! We’re collecting community feedback to shape the 2026 task. If you work on SimulST, please fill out our short survey: forms.gle/8EvERoSGsuDtLR…
Lei Li @NeurIPS2025 (@lileics)
I am at #NeurIPS2025 this week and happy to meet and chat about coding/reasoning agents, LLM security, privacy/copyright of GenAI, and AI for drug/protein design. Also happy to meet prospective PhD applicants to CMU and applicants to the CMU GenAI/LLM certificate program.
Huan Sun (@hhsun1)
🚀 Worried about faculty openings? Ohio State @OhioState is to hire 100 new faculty with AI expertise over the next five years! 🤖🎓
The new hires will join one of three AI Faculty Cohorts:
🧠 Foundational AI: Elevating the theoretical, mathematical, and algorithmic underpinnings of AI.
🧩 Applied AI: Harnessing AI to revolutionize the translation of ideas into real-world solutions for Ohio and beyond.
🛡️ Responsible AI & Cybersecurity: Ensuring ethical innovation and safeguarding digital landscapes for a secure future.
📅 Initial faculty searches are underway; the first group of hires will join in Autumn 2026! ✨
Moreover, two tenure-track (open-rank) positions are already open in CSE @OhioStateCSE, starting in Fall 2026!
🔹 Timashev Professor (Assistant/Associate/Full, Tenure-Track): in Programming Languages & Software Engineering (PLSE), starting as early as Fall 2026. Affiliated with the new Center for Software Innovation (CSI), endowed by a historic $110M Timashev Family Foundation gift, the largest in OSU’s history! 💰
🔹 Open-Rank Faculty in AI + Healthcare: seeking scholars at the intersection of AI/ML and healthcare innovation.
The faculty will join the AI(X) Hub at Ohio State, established to drive innovation, provide resources, and foster the development of foundational and applied AI. 🧬💡
Lei Li @NeurIPS2025 (@lileics)
Meet LLaMAX2: a strong multilingual LLM that excels at translation and reasoning across 17 languages! (It is actually based on Qwen3, but since there is a prior LLaMAX model, we reuse the naming convention.) As always, feedback is welcome.
FeYuan (@t_feyuan)

You are welcome to use our models. More details: 🎉 Paper: LLaMAX2: Your Translation-Enhanced Model also Performs Well in Reasoning (huggingface.co/papers/2510.09…) 🎉 Code: github.com/CONE-MT/LLaMAX… 🎉 Model: huggingface.co/collections/LL…

Lei Li @NeurIPS2025 (@lileics)
Excited to congratulate my colleague Maarten Sap on winning the prestigious Packard Fellowship for Science and Engineering! #CMU #LTI
Maarten Sap (he/him) (@MaartenSap)

I’m ✨ super excited and grateful ✨ to announce that I'm part of the 2025 class of #PackardFellows (packard.org/2025fellows). The Packard Foundation and this fellowship will allow me to explore exciting research directions towards culturally responsible and safe AI 🌍🌈

Lei Li @NeurIPS2025 (@lileics)
Come join us on 9/12 at the CMU AI for Science workshop to present and discuss how modern generative AI and foundation models accelerate scientific discoveries. We have an outstanding lineup of speakers and various poster/panel/lab/social activities. cmu-ai-for-science-workshop.github.io
Jiayi Geng (@JiayiiGeng)

📢 We're thrilled to announce the CMU AI for Science Workshop on Sept 12 at CUC-MPW! Featuring an amazing lineup of speakers:
- Akari Asai (AI2/CMU)
- Gabe Gomes (CMU)
- Chenglei Si (Stanford)
- Keyon Vafa (Harvard)
Join us on campus, submit your poster & register here: cmu-ai-for-science-workshop.github.io
Questions? Feel free to email: cmu-ai-for-science-workshop@andrew.cmu.edu
We look forward to seeing you there! 🤗

Lei Li @NeurIPS2025 (@lileics)
Wonderful results on benchmarking LLMs on MCP use from @michaelqshieh 👍
Michael Qizhe Shieh (@michaelqshieh)

Introducing MCPMark, a collaboration with @EvalSysOrg and @lobehub! We created a challenging benchmark to stress-test MCP use in comprehensive contexts.
- 127 high-quality data samples created by experts.
- GPT-5 takes the current lead and achieves a Pass@1 of 46.96% while the other models fall in the range of 10-30%.
- Diverse test cases on Notion, GitHub, Filesystem, Playwright (browser), and Postgres.
9🧵s ahead

Shinji Watanabe (@shinjiw_at_cmu)
Our work on OWSM v4 received the Best Student Paper Award at #Interspeech2025! 🏆🎉 Huge congratulations to the team! 🚀👏 I’m especially happy to see our open science efforts for speech foundation models recognized by the community. 🙌 🔗 isca-archive.org/interspeech_20…
Lei Li @NeurIPS2025 (@lileics)
Congratulations to AI2 @allen_ai on getting major support from @NSF and @nvidia to advance AI for scientific discovery, a major area where modern generative AI and foundation models can accelerate progress!
Ai2 (@allen_ai)

With fresh support of $75M from @NSF and $77M from @NVIDIA, we’re set to scale our open model ecosystem, bolster the infrastructure behind it, and fast‑track reproducible AI research to unlock the next wave of scientific discovery. 💡

Lei Li @NeurIPS2025 (@lileics)
The show is on. Welcome to 2025 Generative AI for Biology workshop. 7 invited talks + a panel with 5 panelists + 14 spotlight talks + 121 poster presentations! Huge thanks to the workshop sponsors: Genesis Therapeutics, Genbio AI, and Tencent! genbio-workshop.github.io/2025/
Thomas G. Dietterich (@tdietterich)
At #ICML2025, several of the posters I wanted to visit had no one to present them. The authors are Chinese, so this is probably due to visa issues. These researchers want to come to Canada and share their insights, but our governments are blocking this. It is our loss!