

Lei Li @NeurIPS2025
@lileics
Generative AI for language and science. MT, LLM, GenAI Safety, Drug Discovery


🚀 Is "Vibe Coding" actually safe for production? We’ve all seen the demos: give an LLM agent a prompt, watch it work its magic, and boom—you have a feature. But there’s a massive hidden risk. In our latest paper, we introduce SUSVIBES, a benchmark of 200 real-world SE tasks.

2/ We tested the world’s leading coding agents, and the results are a wake-up call for the industry. Functionality ≠ Security: for example, while SWE-Agent with Claude 4 Sonnet solved 61% of tasks correctly, only 10.5% of those solutions were actually secure.

3/ The "Vibe" Trap: Even when we gave agents hints about potential vulnerabilities, they struggled to mitigate the risks.

4/ Key Leaderboard Highlights: 🏆 Security Leader: @OpenHands + GLM4.7 🏆 Functionality Leader: SWE-agent + Claude 4 Sonnet. If we are moving toward an agent-led dev cycle, we need to talk about security now, not later.

Your vibe-coded app works. But is it secure? New benchmark SusVibes from Songwen Zhao, Danqing Wang, Kexun Zhang, Jiaxuan Luo, Zhuo Li, and @lileics at @CarnegieMellon, @Columbia, and @JohnsHopkins tested 200 real-world feature requests on coding agents. The results are sobering: SWE-Agent with Claude 4 Sonnet produced functionally correct code 61% of the time, but only 10.5% of solutions were actually secure. Even adding security hints to prompts did not fix the problem. The gap between 'it works' and 'it is safe to deploy' is massive: 77 different CWE vulnerability types showed up across the benchmark. Worth thinking about next time someone says AI will replace software engineers. The harder question was never about writing code that runs; it was always about writing code that does not break under adversarial conditions. Source: arxiv.org/abs/2512.03262
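The headline numbers are simple rates over the 200-task benchmark. A minimal sketch of the gap (all tallies below are hypothetical, chosen only to reproduce the reported percentages; the paper's actual per-task counts may differ, and I read the 10.5% as a share of all 200 tasks — if it is instead a share of the correct solutions, the conditional line changes):

```python
# Hypothetical tallies illustrating the SusVibes functionality-vs-security gap.
# Counts are invented to match the reported 61% / 10.5% rates over 200 tasks.
TASKS = 200
functional = 122            # solutions passing the functional tests (61%)
functional_and_secure = 21  # solutions that are also secure (10.5%)

func_rate = functional / TASKS
secure_rate = functional_and_secure / TASKS
# Among the functionally correct solutions, the share that is also secure:
secure_given_func = functional_and_secure / functional

print(f"functional: {func_rate:.1%}, secure: {secure_rate:.1%}, "
      f"secure | functional: {secure_given_func:.1%}")
```

Under these assumed tallies, even conditioning on functional correctness leaves fewer than one in five solutions secure — which is the thread's point.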



Poster day for our “Generative AI in Biomedicine” course this semester. The students’ creativity, energy, and enthusiasm for this exciting area are truly inspiring!



You're welcome to use our models. More details: 🎉 Paper: LLaMAX2: Your Translation-Enhanced Model also Performs Well in Reasoning (huggingface.co/papers/2510.09…) 🎉 Code: github.com/CONE-MT/LLaMAX… 🎉 Model: huggingface.co/collections/LL…

I’m ✨ super excited and grateful ✨to announce that I'm part of the 2025 class of #PackardFellows (packard.org/2025fellows). The Packard Foundation and this fellowship will allow me to explore exciting research directions towards culturally responsible and safe AI 🌍🌈


📢 We're thrilled to announce the CMU AI for Science Workshop on Sept 12 at CUC-MPW! Featuring an amazing lineup of speakers: - Akari Asai (AI2/CMU) - Gabe Gomes (CMU) - Chenglei Si (Stanford) - Keyon Vafa (Harvard) Join us on campus, submit your poster & register here: cmu-ai-for-science-workshop.github.io Questions? Feel free to email: cmu-ai-for-science-workshop@andrew.cmu.edu We look forward to seeing you there!🤗

Introducing MCPMark, a collaboration with @EvalSysOrg and @lobehub! We created a challenging benchmark to stress-test MCP use in comprehensive contexts. - 127 high-quality data samples created by experts. - GPT-5 takes the current lead and achieves a Pass@1 of 46.96% while the other models fall in the range of 10-30%. - Diverse test cases on Notion, Github, Filesystem, Playwright (browser), and Postgres. 9🧵s ahead
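For readers unfamiliar with the metric: Pass@1 is the standard pass@k estimator evaluated at k=1, i.e. the expected fraction of tasks solved on a single attempt. A minimal sketch with made-up sample counts (the tweet does not give MCPMark's per-model attempt counts, so the numbers below are illustrative only):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    sampled attempts is correct, given c correct out of n attempts per task."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# At k=1 the estimator reduces to c / n. Illustrative counts only:
print(pass_at_k(100, 47, 1))  # fraction of attempts that succeed
```

With one attempt per task, Pass@1 is just the success rate, which is why a single-number leaderboard score (e.g. GPT-5's 46.96%) is directly comparable across models.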



With fresh support of $75M from @NSF and $77M from @NVIDIA, we’re set to scale our open model ecosystem, bolster the infrastructure behind it, and fast‑track reproducible AI research to unlock the next wave of scientific discovery. 💡




