Mingmeng GENG

100 posts

@GengMingmeng

Postdoc @ENS_ULM & @CNRS, PhD @SISSAschool, X2017 @Polytechnique, alumnus @SUSTechSZ, ML/LLM/CSS, Survey methodology 🦋 https://t.co/X4OC2YQ034

Paris, France · Joined September 2014
164 Following · 90 Followers
Pinned Tweet
Mingmeng GENG@GengMingmeng·
Is ChatGPT Transforming Academics' Writing Style? From anecdotal evidence to quantitative estimate 👇👇👇 arxiv.org/abs/2404.08627
Mingmeng GENG@GengMingmeng·
@ziv_ravid @TmlrOrg has already put part of your idea into practice: the name of the Action Editor is visible to all
Ravid Shwartz Ziv@ziv_ravid·
Unpopular opinion: The reviewers' and ACs' names should not be anonymous. People will put much more effort into creating better reviews if they know that everyone will know their name
Mingmeng GENG@GengMingmeng·
@max_spero_ This is a general concern: x.com/GengMingmeng/s… I can't make too many judgments based just on your results. But if translation is not taken into account, the terms "AI-edited" and "fully human-written" may not be appropriate as names...
Mingmeng GENG@GengMingmeng

LLM-generated text detectors are often misunderstood and misused, especially since "LLM-generated text" lacks a unified and precise definition. For more details, check out our "recent" preprint: arxiv.org/abs/2510.20810 (joint work with @tpoibeau )

Max Spero@max_spero_·
@GengMingmeng I should also clarify that Pangram is tuned to detect generative LLM outputs starting with GPT-3.5 turbo through GPT-5. So even if someone was using GPT-2 or BERT or something, that would likely go undetected by our model.
Max Spero@max_spero_·
We were curious about our false positive rate, so we ran all ICLR 2022 reviews (pre-ChatGPT) as a baseline.
Lightly AI-edited FPR: 1 in 1,000
Moderately AI-edited FPR: 1 in 5,000
Heavily AI-edited FPR: 1 in 10,000
Fully AI-generated: No false positives
Graham Neubig@gneubig

ICLR authors, want to check if your reviews are likely AI generated? ICLR reviewers, want to check if your paper is likely AI generated? Here are AI detection results for every ICLR paper and review from @pangramlabs! It seems that ~21% of reviews may be AI?

Mingmeng GENG@GengMingmeng·
LLM-generated text detectors are often misunderstood and misused, especially since "LLM-generated text" lacks a unified and precise definition. For more details, check out our "recent" preprint: arxiv.org/abs/2510.20810 (joint work with @tpoibeau )
Mingmeng GENG@GengMingmeng·
@max_spero_ Some people used other tools (such as DeepL) to write or translate reviews before 2023... I find it difficult to understand how the false positive rate could be so low...
Mingmeng GENG@GengMingmeng·
@emollick We have also done some research on how LLMs might affect Wikipedia and their other indirect effects. Wikipedia in the Era of LLMs: Evolution and Risks arxiv.org/abs/2503.02879
Ethan Mollick@emollick·
For all its flaws, Wikipedia is the crowning achievement of the human web An AI-assisted fork of Wikipedia could be interesting, but only if it embraced transparency, had access to scholarly work & if the AI was used to help humans in making information better, not override them
Ethan Mollick@emollick

At the center of everything is Wikipedia.
🔎 Wikipedia articles appear in 67%-84% of all search engine results & most info boxes
🔎 Wikipedia generates 43M clicks to external websites a month
🔎 Wikipedia is a major component of AI training data, including The Pile training set

Florian Tramèr@florian_tramer·
The whole experience with the @NeurIPSConf position paper track has just been one big 😂 Missed every deadline, only to now announce (a week after the original notification deadline) that they'll only accept ~6% of submissions. Should have just submitted to main track...
Mingmeng GENG@GengMingmeng·
@_vztu "to provide focused attention to authors"
Zhengzhong Tu@_vztu·
I'm pleased to share that all my submissions to the NeurIPS Position track have been rejected. Fun Fact: just 40 of the almost 700 papers were accepted (<6%), given that this is an experimental year 😇
Mingmeng GENG@GengMingmeng·
Analysis based on the same set of Wikipedia pages, but using versions from different years! Joint work with @Dongping0612 and others
Mingmeng GENG@GengMingmeng·
Have LLMs already impacted Wikipedia, and if so, how might they influence the broader NLP community? [New preprint] "Wikipedia in the Era of LLMs: Evolution and Risks" arxiv.org/abs/2503.02879 TL;DR: page views ✅ article content ✅ → machine translation ⏳ RAG ⏳ Additionally,
WikiResearch@WikiResearch

"Wikipedia in the Era of LLMs: Evolution and Risks" arxiv.org/abs/2503.02879… "Our findings and simulation results reveal that Wikipedia articles have been influenced by LLMs, with an impact of approximately 1%-2% in certain categories."

Mingmeng GENG reposted
𝚐𝔪𝟾𝚡𝚡𝟾
LLM as a Broken Telephone: Iterative Generation Distorts Information LLM-generated content degrades over repeated processing, similar to the “broken telephone” effect in human communication. This study shows that distortion accumulates based on language and chain complexity but can be reduced with strategic prompting.
Mingmeng GENG reposted
Rohan Paul@rohanpaul_ai·
LLMs might distort information when they process their own generated content iteratively, similar to the "broken telephone" game. This paper investigates if and how LLMs distort information through repeated translation, and explores mitigation strategies.
📌 Iterative translation using similar languages preserves factuality better, showing linguistic influence on distortion.
📌 Complex translation chains, with more languages or models, amplify information distortion significantly.
📌 Lower temperature and constrained prompts are effective methods to reduce information distortion in iterative LLM workflows.
----------
Methods Explored in this Paper 🔧:
→ The paper simulates the "broken telephone" effect using iterative machine translation tasks.
→ It translates English documents into other languages and back to English repeatedly using LLMs.
→ The study uses languages with varying similarity to English, like French, German, Thai, and Chinese.
→ It employs metrics such as BLEU, ROUGE, and FActScore to measure text relevance and factuality across iterations.
→ Experiments involve bilingual self-loops, bilingual two-player model chains, and multilingual chains to test distortion levels.
→ The research also examines the impact of temperature settings and prompt constraints on information distortion during iterative generation.
Mingmeng GENG reposted
Ethan Mollick@emollick·
In April 2024, a few words like "delve" had a viral moment, when researchers pointed out that GPT-4 used it often, and the spread of the word showed AI use was common in scientific papers. Did that result in a decrease in AI use? No, but people started avoiding the word "delve!"
Mingmeng GENG@GengMingmeng

@emollick But anyway, many people are actually avoiding the term "delve" 😂 arxiv.org/abs/2502.09606

Ethan Mollick@emollick·
Forget “tapestry” or “delve” these are the actual unique giveaway words for each model, relative to each other. arxiv.org/pdf/2502.12150
Mingmeng GENG@GengMingmeng·
@lateinteraction Yes, so I want to measure the overall impact of LLMs on academic writing, not just to determine whether some papers are written by LLMs 😄 though MGT detection still matters sometimes