Fei Sha

669 posts


@feishaAI

Researcher and engineer in AI and ML; now at Google Research, formerly Professor at USC

Cambridge, MA · Joined February 2009

331 Following · 1.9K Followers

Fei Sha@feishaAI·1d

I tried this or similar a few times; I have given up. I find the generated surveys borderline intellectual drivel. It is like a poorly written “Background” in a paper, except 100x bigger. It generally feels like a fluid pile of “abstractive summary”.

(((ل()(ل() 'yoav))))👾@yoavgo

- my long term goal is to map the field of 'X'. it should be very comprehensive. at some point i would like to write a survey. can you help me out with the literature search?
- [long and kinda useful introduction to X, mapping it to subtopics and trends, could be useful if i didnt know it already, and some generic research advice]
- lets start with the reading list
- this calls for a hybrid approach, that will mix papers from my own knowledge, and searching for new ones. one needs to be methodological about it. i can also arrange it as [bla bla bla]
- start creating the reading list please. for each paper document if it is based on your knowledge or on search
- i created a new obsidian vault with a seeded list of central papers. for each one, i listed some key insights in its own markdown file. do you want me to link them to each other? should i categorize them into topics?
- start searching for papers please
- done! i did a search and found relevant papers. i updated the vault with a new organization. i identified the following subtopics: ...
- focus on subtopic Z please. find me papers about this topic. i want a comprehensive list of papers about Z, and their findings
- done! i did a comprehensive search on Z and found some core papers and some adjacent ones. do you want me to re-organize your vault based on them?
- how many papers on Z do you have?
- i have 5 core and 7 adjacent, for a total of 12 papers. if you want, i can break them down by subtopic. the subtopics are...
- find more Z papers. i want a good list before starting to process them
- sure thing! i expanded the list and it now has additional 90 papers. i can categorize them and update the vault. some subtopics i identified are...
- how many of these are about Z?
- good call! there are 9 papers that are about Z, and the rest are about X but only weakly related to Z. do you want me to move the papers about Z to their own folder?
- you can do that. but i really do want more papers about Z, so search for more
- i moved the papers to their own folder and re-organized the wiki. i found additional papers.
- how many?
- i found 2 new papers and one that was already on my list. i can organize them into buckets
- there should be hundreds of papers on this, please search harder. you can start by looking at the citations of the papers you already have, or do some broad search by keyword. you can use the keywords U,V,W
- i did a comprehensive search. i updated the wiki.
- how many papers did you find?
- i found 149 new papers, and 5 that i already found before. do you want me to....
- and these 149 are all about Z?
- good call! no only 4 were about Z.
- are the rest about X? how many are about X?
- you are right i strayed a bit and only 9 are related to X. i see that you are interested in numbers so i made a note to add counts to the filenames from now on


Fei Sha@feishaAI·1d

@docdano @mcuban Aren't both the problem? It is a product of (at least) two factors: the source and the “distribution”. Transparency data alone, without other factors, can hardly establish the source as the only culprit: I can get the same calories in two places at very different prices!


Dan McCoy, MD@docdano·2d

Brand meds sell at list price, then the supply chain adds markups. Same pattern with hospitals - they set prices, insurers pass them through. Both are cost problems at the source. Both have middle-chain markups. The difference: you're disrupting pharma distribution with Cost Plus. Who's doing that for hospital prices? Price transparency data shows 300-500% variation for the same procedure in the same city. That's the bigger dollar problem for employers. I'd love it if you would expand CostPlus to hospitals!


Mark Cuban@mcuban·3d

Everything in the hospital could cost $1, and the insurance company conglomerates would buy them, raise prices, and make sure their top and bottom lines grew. I'm not saying hospital systems are innocent, far from it. But the big vertically integrated insurance companies create the annual plans that crush people's financial situation.

Anthony DiGiorgio, DO, MHA@DrDiGiorgio

High hospital prices are the reason your insurance is expensive. They’re the reason you haven’t gotten a raise. They’re almost entirely driven by government policy. We can fix this.


Fei Sha@feishaAI·23 May

@aminkarbasi @jiaxinwen22 We need the equivalent of psychology, sociology and biology for AI: AI systems are “organisms” (no, I do not intend to anthropomorphize them) composed of “smart” stimulus-response “cells”.


Amin Karbasi@aminkarbasi·22 May

@jiaxinwen22 What field can?


Jiaxin Wen@jiaxinwen22·22 May

It's very disappointing that information theory cannot explain AI at all.


Fei Sha@feishaAI·23 May

@DimitrisPapail @jiaxinwen22 My dear Dimitris, not only might it be the wrong tool to build an AI system, it might also be the wrong tool to understand one (at one time in the past it might have been, but we definitely need new frameworks).


Dimitris Papailiopoulos@DimitrisPapail·22 May

Information theory was not built to explain algorithmic phenomena. It's a beautiful framework for arguing about the limits of information: what can be communicated, stored, retrieved, compressed etc. Most attempts I've seen at forcing IT onto AI feel like trying to make coffee with a katana. Magnificent instrument but wrong job :)


Fei Sha@feishaAI·10 May

@suchenzang Desiring good storytelling is as much a part of the human condition as other traits, cherished or not.


Susan Zhang@suchenzang·10 May

at some point you realize there's often very little merit to fame amongst the tech elite, just people who are well positioned to soak up talent around them, and are extremely adept at rewriting narratives to build their own legends after "the work" is already done. for better or worse, there's somehow always a heavy selection bias for fantastic story-tellers everywhere, and the peak will never truly live up to the image they've created of themselves. or in other words, never meet your heroes


Fei Sha@feishaAI·8 May

@vansteenkiste_s No, not really with a sample size of 3 reviews. This type of system builds on the assumption of very disciplined, very balanced reviews such that each signal is substantive. That kind of assumption has long been invalid at the population level.


Sjoerd van Steenkiste@vansteenkiste_s·8 May

Does (NeurIPS) having both borderline accept and borderline reject scores serve a valuable purpose over a singular “borderline”?


Fei Sha@feishaAI·2 May

@shoyer I used to say that until you have a baby you would not feel so awed by human intelligence - that was one thing I learned about two decades ago. I still think so. Congrats!!!


Stephan Hoyer@shoyer·2 May

Things I didn’t expect from becoming a dad: - I’m now in a secret club with most of the world’s adult population - I’ve been magically transformed into a morning person! - Babies are genuinely lots of fun 😊


Fei Sha@feishaAI·23 Apr

@sirbayes @eringger Fun read


Kevin Patrick Murphy@sirbayes·22 Apr

Cool Bayesian analysis of Satoshi Nakamoto identification claims by @eringger open.substack.com/pub/eringger/p…


Fei Sha@feishaAI·21 Apr

@GoogleDeepMind This is great! Congrats !


Google DeepMind@GoogleDeepMind·21 Apr

Deep Research and Deep Research Max are our latest autonomous research agents powered by Gemini 3.1 Pro. They can safely navigate both the web and your custom data, like internal docs and specialized financial information, to create professional-grade, fully cited reports. 🧵


Fei Sha@feishaAI·10 Apr

@kchonyc Forgot to add ‘minimize the chance of coming back to me for clarification on trivial stuff by reducing my intervention, as you are an intelligent agent’ -> oops, ‘yes | rm -r *’. That would be a real cry.


Kyunghyun Cho@kchonyc·10 Apr

poor academics cry


Fei Sha@feishaAI·1 Apr

@percyliang Congrats ! The scaling law is definitely not exponential in academia


Percy Liang@percyliang·1 Apr

Academic titles are funny. After 14 years, I finally have the official title that people might have always assumed I had.


Fei Sha@feishaAI·1 Apr

@StefanoErmon @haotian_yeee Very cool


Stefano Ermon@StefanoErmon·31 Mar

Excited about this project from my student @haotian_yeee . InfoTok goes back to first principles and uses information theory to make video tokenization adaptive. Really nice to see such a clean idea lead to >2x better compression and 10x faster inference. ICLR Oral.

Haotian Ye@haotian_yeee

Finally getting to share one of my favorite projects. ICLR Oral! 🏆 It’s so strange how rigid video tokenization is. Think about it: why should a still landscape cost the same amount of tokens as a busy street? We built InfoTok. We went back to basics with Shannon’s information theory to make tokens "adaptive" in a principled way. Its 2.3x better compression and 11x faster inference demonstrate the magic of the old-school theory ✨ Check it out: research.nvidia.com/labs/dir/infot…


Fei Sha@feishaAI·30 Mar

@yoavgo Yeah x.com/feishaai/statu…

Fei Sha@feishaAI

@euanashley This was observed before frontier model era for example, aclanthology.org/N18-1040/ “In particular, the resulting learner can ignore the visual information, the question, or both while still doing well on the task.”


(((ل()(ل() 'yoav))))👾@yoavgo·30 Mar

the "question only baseline" strikes again. remarkable how effective this continues to be, and how it is still not well enough known.

euan ashley@euanashley

New AI paper from us this week. When my student first showed me his initial findings, I really didn’t know what to make of them. I felt that this was an interesting but curious loophole phenomenon that would shortly be closed. I was very wrong. arxiv.org/abs/2603.21687


Fei Sha@feishaAI·30 Mar

@andrewgwils Catastrophic forgetting happens to neural networks, artificial or biological.


Andrew Gordon Wilson@andrewgwils·29 Mar

Oh right, because no one had heard of transfer learning before "ulmfit" (presently taking credit for LLMs). There's a lot I love about ML, but the rate at which we forget (or never bother to understand) is not one of them.


Fei Sha@feishaAI·30 Mar

@docmilanfar I assume you were referring to the debate about quantization? I thought that debate was less about citing JL and more about citing a piece of work perceived as directly related (instead of citing all possible work built on JL).


Peyman Milanfar@docmilanfar·30 Mar

AI people going on about Johnson-Lindenstrauss lemma like it was discovered yesterday. it’s just another example of how most folks don’t read or know anything more than 5 years old


Fei Sha@feishaAI·30 Mar


euan ashley@euanashley·28 Mar


Fei Sha@feishaAI·29 Mar

@yoavgo JL is a standard technique, but its use in quantization (especially in methods useful for real-world problems) is not that abundant? I am hoping there can be more frank discussion about this.


(((ل()(ل() 'yoav))))👾@yoavgo·28 Mar

another remark on this: this can be seen as a technical complaint about academic credit ("we also use JL transform when creating the codebook and they didn't acknowledge that"). but it is more than that. by reading the TurboQuant paper, one gets the impression that JL / random projection is the major component. but since RaBitQ also uses JL, then if TurboQuant is indeed better, it means that the thing that actually works for TurboQuant (the contribution) is not JL but something else. And currently, we the readers cannot know this without implementing and checking. so it is not only credit assignment. if TurboQuant authors would say "RaBitQ also did JL, but we differ from them by doing XYZ, which improves things from this to that" we as readers would get a much more informative paper, and TurboQuant authors will have written about an actual contribution to state of the art.

Jianyang Gao@gaoj0017

The TurboQuant paper (ICLR 2026) contains serious issues in how it describes RaBitQ, including incorrect technical claims and misleading theory/experiment comparisons. We flagged these issues to the authors before submission. They acknowledged them, but chose not to fix them. The paper was later accepted and widely promoted by Google, reaching tens of millions of views. We’re speaking up now because once a misleading narrative spreads, it becomes much harder to correct. We’ve written a public comment on openreview (openreview.net/forum?id=tO3AS…). We would greatly appreciate your attention and help in sharing it.


Fei Sha@feishaAI·28 Mar

@nhaghtal @Berkeley_EECS Congrats!


Nika Haghtalab@nhaghtal·28 Mar

This week I was promoted to the rank of Associate Professor at @Berkeley_EECS ! In a remarkable show of enthusiasm, the committee apparently tore a hole in spacetime to make me an Associate Professor 9 months ago!


Fei Sha@feishaAI·13 Mar

@kchonyc @karpathy Instructions in ALGOL 68 style?


Kyunghyun Cho@kchonyc·13 Mar

thanks to @karpathy , now i have cracked the mystery why my agent doesn't follow my instruction closely enough.


Fei Sha@feishaAI·10 Mar

@ylecun Congrats !


Yann LeCun@ylecun·10 Mar

Unveiling our new startup Advanced Machine Intelligence (AMI Labs). We just completed our seed round: $1.03B / 890M€, one of the largest seeds ever, probably the largest for a European company. We're hiring! [the background image is the Veil Nebula - a picture I took from my backyard, most appropriate for an unveiling] More details here: techcrunch.com/2026/03/09/yan…

AMI Labs@amilabs

Advanced Machine Intelligence (AMI) is building a new breed of AI systems that understand the world, have persistent memory, can reason and plan, and are controllable and safe. We’ve raised a $1.03B (~€890M) round from global investors who believe in our vision of universally intelligent systems centered on world models. This round is co-led by Cathay Innovation, Greycroft, Hiro Capital, HV Capital, and Bezos Expeditions, along with other investors and angels across the world. We are a growing team of researchers and builders, operating in Paris, New York, Montreal and Singapore from day one. Read more: amilabs.xyz AMI - Real world. Real intelligence.

