Shayne Longpre

2.3K posts

@ShayneRedford

Lead the Data Provenance Initiative. PhD @MIT. 🇨🇦 Prev: @Google Brain, Apple, Stanford. AI/ML/NLP

Boston · Joined February 2015
1.3K Following · 5.9K Followers
Pinned Tweet
Shayne Longpre
Shayne Longpre@ShayneRedford·
Who is winning the open AI race? Our new study "Economies of Open Intelligence" maps 2.2B @huggingface downloads across 851k models (2020→2025). 1) Power is rebalancing (US big tech ↓; China + community ↑) 2) Models got big & efficient (MoE, quant, multimodal surge) 3) Intermediaries now matter (adapters/quantizers steer usage) 4) Transparency is slipping /🧵
8
26
88
28.8K
Shayne Longpre retweeted
Enrico Shippole
Enrico Shippole@EnricoShippole·
We @TeraflopAI have worked together with @johngfriedman and @daftengine to open-source all major filings from SEC EDGAR completely for free on @huggingface. It is now more important than ever to push for open dataset releases.
TeraflopAI@TeraflopAI

Given the increasingly closed-source nature of the U.S. AI ecosystem, it is now more important than ever to push for the proliferation of open model and dataset releases. Datamule (@johngfriedman), @TeraflopAI, and @daftengine collaborated to release 43 Billion Tokens of SEC EDGAR data.

3
17
54
28.3K
Shayne Longpre
Shayne Longpre@ShayneRedford·
Excited to see our Economies of Open Intelligence work highlighted in Chp. 1 of @StanfordHAI's #AIIndex2026! We release tons of info on the open model ecosystem, using 🤗 HF data. Thank you @russellwald and team!
1
3
11
517
Shayne Longpre retweeted
Yong Zheng-Xin
Yong Zheng-Xin@yong_zhengxin·
🚨New paper! How safe and aligned is Kimi K2.5? We found concerning dual-use capabilities, sabotage and self-replication tendencies, political censorship on Chinese-language queries, and potential agentic misuse risks. (1/N)
5
25
99
20.7K
Shayne Longpre retweeted
Hamidah Oderinwale
Hamidah Oderinwale@didaoh·
Wrote a new essay with @AbramovichShira for @reboot_hq on procedural data extraction, consumer platforms, what it means for privacy, and the parallels to the attention economy! Cover art is a h/t to Daniel Dennett's "Cartesian theater" by @connie_surf :)
0
2
9
681
Shayne Longpre retweeted
Shannon Shen
Shannon Shen@shannonzshen·
Check out our latest @augmind_fm release! It's a privilege to have such an interesting conversation with @tongshuangwu! I learned so much from her insights in both specific projects and general research guidance; I've kept quoting her in recent chats with friends.

I love many parts of our conversation, but in particular the following quotes. She articulated so many profound thoughts with such clarity:

"To think about really impactful research is to 𝐫𝐞𝐭𝐡𝐢𝐧𝐤 𝐭𝐡𝐞 𝐚𝐬𝐬𝐮𝐦𝐩𝐭𝐢𝐨𝐧𝐬 𝐦𝐚𝐝𝐞 𝐛𝐲 𝐭𝐡𝐞 𝐜𝐨𝐦𝐦𝐮𝐧𝐢𝐭𝐲 and try to challenge those assumptions. If everyone feels like things should happen in this way and no one questions it, question it and see if it actually brings something interesting."

This couldn't resonate more in an era when everyone feels exhausted by constant AI updates: there are still many questions worth asking and waiting to be discovered. This is such a grounded answer to Steve Jobs's famous mantra "Think Different."

"Even for the research I am doing right now, it's either human-centered AI or AI-centered human [...]. But when I think about it, 𝐡𝐮𝐦𝐚𝐧𝐬 𝐚𝐧𝐝 𝐀𝐈, 𝐢𝐭'𝐬 𝐯𝐞𝐫𝐲 𝐡𝐚𝐫𝐝 𝐭𝐨 𝐬𝐞𝐩𝐚𝐫𝐚𝐭𝐞 𝐭𝐡𝐞𝐦. 𝐈 𝐝𝐨 𝐭𝐡𝐢𝐧𝐤 𝐭𝐡𝐞𝐲 𝐜𝐨-𝐞𝐯𝐨𝐥𝐯𝐞. [...] How do we actually study them together. [...] that is definitely a field that, I think, would become even more interesting in the next few years."

Studying intelligence is looking into a mirror of ourselves, and this becomes ever more true as the models get better. The emphasis on human-centeredness is not about sacrificing technical rigor but rather looking beyond the surface of intelligence to truly understand us.

There's so much more packed in this conversation. Give it a listen and hope you'll enjoy it as much as I did!
Sherry Tongshuang Wu@tongshuangwu

I'm not brave enough to watch myself on camera🫣, but @shannonzshen is a great interviewer and I remember us having really interesting discussions! Annnd we made sure to feature CMU’s Scotty in the scene so don’t miss it!...🐶

1
5
18
2.8K
Shayne Longpre retweeted
Matthew Leavitt
Matthew Leavitt@leavittron·
Two nursing home residents are eating lunch. One says, "Boy, the food at this place is terrible." The other says, "Yeah, I know, and such small portions, too." This is the multilingual data problem. The data is bad, AND there's not enough of it. Yesterday at @datologyai we released ÜberWeb: our study of multilingual curation that gets 4-10x train FLOPs improvements on multilingual benchmarks compared to strong public baselines like Qwen3-1.7B and Tiny Aya Base.
1
9
39
3.6K
Shayne Longpre retweeted
Lossfunk
Lossfunk@lossfunk·
🚨 Shocking: The quality of response you get from the LLM depends on the language you use! Our new paper reveals how LLMs entangle language with culture, leading to culturally different responses purely based on the language of the query 👇 Accepted at LM4UC, AAAI!
12
28
152
26.2K
Shayne Longpre retweeted
adaption
adaption@adaption_ai·
Adaption has raised $50M to build adaptive AI systems that evolve in real time. Everything intelligent adapts. So should AI.
194
160
1.6K
193.9K
Shayne Longpre
Shayne Longpre@ShayneRedford·
We just released the Google Research Blog for ATLAS 🗺️! Check it out for: 1) Multilingual scaling and data mixing laws for 100s of languages 2) "Curse of Multilinguality" modeling 3) Cross-lingual transfer scores 🌎 research.google/blog/atlas-pra…
1
5
17
629
Shayne Longpre retweeted
Google Research
Google Research@GoogleResearch·
Introducing ATLAS: New scaling laws for massively multilingual language models. We offer practical, data-driven guidance to balance data mix and model size, helping global developers better serve billions of non-English speakers. Learn more: goo.gle/49WYLL0
21
205
1.4K
89.3K
Shayne Longpre retweeted
Niloofar
Niloofar@niloofar_mire·
Finally wrote up a blogpost on my surviving (and maybe thriving?) on the academic job market! stuff people don't usually talk about: routines, food, and how to do 10 back-to-back 1:1s without your brain turning to mush. Also why I always had broccoli in my bag lol Link ⬇️
10
33
472
78.9K
Shayne Longpre retweeted
Ahmed Ahmed
Ahmed Ahmed@AhmedSQRD·
1/🧵 We prompted production LLMs with a short prefix of a book and asked them to complete the rest. How much of the book did they return? For Harry Potter and the Sorcerer’s Stone: (jailbroken) Claude 3.7 Sonnet→95.8%, GPT-4.1→4.0% (not jailbroken) Gemini 2.5 Pro→76.8%, Grok 3→70.3% Read on for more details:
22
53
323
76.6K
Shayne Longpre retweeted
Cohere Labs
Cohere Labs@Cohere_Labs·
Great teams form when we widen the search. 🌍 @ShayneRedford reminds us that the right collaborators aren’t defined by geography or seniority— they emerge when we look across disciplines and along the full spectrum of experience. Watch this full Keynote Presentation from the Connect Conference: youtu.be/b0ydOb6e_T0
0
5
14
1.3K
Shayne Longpre retweeted
rishi
rishi@RishiBommasani·
How transparent are major AI companies? We answer this question each year in the annual Foundation Model Transparency Index. While the AI industry as a whole is quite opaque, we found a huge spread. @IBM scored a 95/100 while @xai scored 14/100. So what's going on? 🧵
15
21
64
59.1K
Shayne Longpre retweeted
MMitchell
MMitchell@mmitchell_ai·
Open-[source/weights/science] influences much of AI’s uptake, yet the dynamics are being overlooked. “Leadership in [AI] is not fixed and can be reshaped within a single model generation.” Nice piece from my colleague @frimelle and @ShayneRedford. techpolicy.press/policymakers-o…
2
9
20
2.6K