Mingmeng GENG

100 posts

@GengMingmeng

Postdoc @ENS_ULM & @CNRS, PhD @SISSAschool, X2017 @Polytechnique, alumnus @SUSTechSZ, ML/LLM/CSS, Survey methodology 🦋 https://t.co/X4OC2YQ034

Paris, France · Joined September 2014
164 Following · 90 Followers
Pinned Tweet
Mingmeng GENG@GengMingmeng·
Is ChatGPT Transforming Academics' Writing Style? From anecdotal evidence to quantitative estimate 👇👇👇 arxiv.org/abs/2404.08627
Mingmeng GENG@GengMingmeng·
@ziv_ravid @TmlrOrg has already put part of your idea into practice: the name of the Action Editor is visible to all
Ravid Shwartz Ziv@ziv_ravid·
Unpopular opinion: The reviewers' and ACs' names should not be anonymous. People will put much more effort into creating better reviews if they know that everyone will know their name
Mingmeng GENG@GengMingmeng·
@max_spero_ This is a general concern: x.com/GengMingmeng/s… I can't make too many judgments based just on your results. But if translation is not taken into account, the terms "AI-edited" and "fully human-written" may not be appropriate as names...
Mingmeng GENG@GengMingmeng

LLM-generated text detectors are often misunderstood and misused, especially since "LLM-generated text" lacks a unified and precise definition. For more details, check out our "recent" preprint: arxiv.org/abs/2510.20810 (joint work with @tpoibeau )

Max Spero@max_spero_·
@GengMingmeng I should also clarify that Pangram is tuned to detect generative LLM outputs starting with GPT-3.5 turbo through GPT-5. So even if someone was using GPT-2 or BERT or something, that would likely go undetected by our model.
Max Spero@max_spero_·
We were curious about our false positive rate, so we ran all ICLR 2022 reviews (pre-ChatGPT) as a baseline.
Lightly AI-edited FPR: 1 in 1,000
Moderately AI-edited FPR: 1 in 5,000
Heavily AI-edited FPR: 1 in 10,000
Fully AI-generated: No false positives
Graham Neubig@gneubig

ICLR authors, want to check if your reviews are likely AI generated? ICLR reviewers, want to check if your paper is likely AI generated? Here are AI detection results for every ICLR paper and review from @pangramlabs! It seems that ~21% of reviews may be AI?

Mingmeng GENG@GengMingmeng·
LLM-generated text detectors are often misunderstood and misused, especially since "LLM-generated text" lacks a unified and precise definition. For more details, check out our "recent" preprint: arxiv.org/abs/2510.20810 (joint work with @tpoibeau )
Mingmeng GENG@GengMingmeng·
@max_spero_ Some people used other tools (such as DeepL) to write or translate reviews before 2023... I find it difficult to understand how the false positive rate could be so low...
Mingmeng GENG@GengMingmeng·
@emollick We have also done some research on how LLMs might affect Wikipedia and their other indirect effects. Wikipedia in the Era of LLMs: Evolution and Risks arxiv.org/abs/2503.02879
Ethan Mollick@emollick·
For all its flaws, Wikipedia is the crowning achievement of the human web An AI-assisted fork of Wikipedia could be interesting, but only if it embraced transparency, had access to scholarly work & if the AI was used to help humans in making information better, not override them
Ethan Mollick@emollick

At the center of everything is Wikipedia.
🔎 Wikipedia articles appear in 67%-84% of all search engine results & most info boxes
🔎 Wikipedia generates 43M clicks to external websites a month
🔎 Wikipedia is a major component of AI training data, including The Pile training set

Florian Tramèr@florian_tramer·
The whole experience with the @NeurIPSConf position paper track has just been one big 😂 Missed every deadline, only to now announce (a week after the original notification deadline) that they'll only accept ~6% of submissions. Should have just submitted to main track...
Mingmeng GENG@GengMingmeng·
@_vztu "to provide focused attention to authors"
Zhengzhong Tu@_vztu·
I'm pleased to share that all my submissions to the NeurIPS Position track have been rejected. Fun Fact: just 40 of the almost 700 papers were accepted (<6%), given that this is an experimental year 😇
Mingmeng GENG@GengMingmeng·
Analysis based on the same set of Wikipedia pages, but using versions from different years! Joint work with @Dongping0612 and others
Mingmeng GENG@GengMingmeng·
Have LLMs already impacted Wikipedia, and if so, how might they influence the broader NLP community? [New preprint] "Wikipedia in the Era of LLMs: Evolution and Risks" arxiv.org/abs/2503.02879 TL;DR: page views ✅ article content ✅ → machine translation ⏳ RAG ⏳ Additionally,
WikiResearch@WikiResearch

"Wikipedia in the Era of LLMs: Evolution and Risks" arxiv.org/abs/2503.02879… "Our findings and simulation results reveal that Wikipedia articles have been influenced by LLMs, with an impact of approximately 1%-2% in certain categories."

Mingmeng GENG reposted
𝚐𝔪𝟾𝚡𝚡𝟾
LLM as a Broken Telephone: Iterative Generation Distorts Information LLM-generated content degrades over repeated processing, similar to the “broken telephone” effect in human communication. This study shows that distortion accumulates based on language and chain complexity but can be reduced with strategic prompting.
Mingmeng GENG reposted
Rohan Paul@rohanpaul_ai·
LLMs might distort information when they process their own generated content iteratively, similar to the "broken telephone" game. This paper investigates if and how LLMs distort information through repeated translation, and explores mitigation strategies.
📌 Iterative translation using similar languages preserves factuality better, showing linguistic influence on distortion.
📌 Complex translation chains, with more languages or models, amplify information distortion significantly.
📌 Lower temperature and constrained prompts are effective methods to reduce information distortion in iterative LLM workflows.
----------
Methods Explored in this Paper 🔧:
→ The paper simulates the "broken telephone" effect using iterative machine translation tasks.
→ It translates English documents into other languages and back to English repeatedly using LLMs.
→ The study uses languages with varying similarity to English, like French, German, Thai, and Chinese.
→ It employs metrics such as BLEU, ROUGE, and FActScore to measure text relevance and factuality across iterations.
→ Experiments involve bilingual self-loops, bilingual two-player model chains, and multilingual chains to test distortion levels.
→ The research also examines the impact of temperature settings and prompt constraints on information distortion during iterative generation.
Mingmeng GENG reposted
Ethan Mollick@emollick·
In April 2024, a few words like "delve" had a viral moment, when researchers pointed out that GPT-4 used it often, and the spread of the word showed AI use was common in scientific papers. Did that result in a decrease in AI use? No, but people started avoiding the word "delve!"
Mingmeng GENG@GengMingmeng

@emollick But anyway, many people are actually avoiding the term "delve" 😂 arxiv.org/abs/2502.09606

Ethan Mollick@emollick·
Forget “tapestry” or “delve” these are the actual unique giveaway words for each model, relative to each other. arxiv.org/pdf/2502.12150
Mingmeng GENG@GengMingmeng·
@lateinteraction Yes, so I want to measure the overall impact of LLMs on academic writing, not just to determine whether some papers are written by LLMs 😄 though MGT detection still matters sometimes