Dan Roy
@roydanroy
@Google DeepMind. On leave, Canada CIFAR AI Chair and Former Research Director, @VectorInst. Professor, @UofT (Statistics/CS). Views are my own.



In an AI deal with the Pentagon, Google agreed to terms that Anthropic rejected. A conversation with an employee who is ashamed of his company. faz.net/aktuell/wirtsc…

“Resource Rationality” is a nice idea, but fundamentally flawed. Assume, for contradiction, that there is a rational way to allocate cognitive resources. Then some cognitive process must decide how much effort to spend on a task. But that process itself spends effort. Thus, a rational allocator must also decide how much effort to spend on allocation itself. That requires a further allocation decision, producing a regress. If the regress continues, no allocation is ever completed. If it stops, the stopping point is arbitrary rather than rational. Therefore, no fully rational way to allocate cognitive resources exists. Any non-contradictory theory of bounded rationality must therefore contain an irrational stopping point: some allocation of cognitive resources that is not itself rationally allocated. Formally, this is equivalent to a decision maker who cannot fully know the decision problem in advance, because the beliefs that define the problem are themselves produced by prior, unchosen cognitive allocations.
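
The regress is easy to make concrete. In this toy sketch (every name and number is illustrative, not part of any formal theory), deciding how much effort to spend is itself a task handed back to the allocator, so the recursion terminates only because of an externally imposed max_depth: exactly the arbitrary stopping point the argument predicts.

```python
def allocate(task: str, depth: int = 0, max_depth: int = 3) -> float:
    """Return an effort level for `task`.

    A fully rational allocator must first decide how much effort to
    spend on the allocation decision itself. That decision is another
    task, so the call recurses. Without `max_depth` it never returns;
    with it, the base case is a stopping rule imposed from outside,
    not one the allocator chose rationally.
    """
    if depth >= max_depth:
        return 1.0  # unexamined default effort: the arbitrary stopping point
    meta_effort = allocate(f"allocating for {task!r}", depth + 1, max_depth)
    # Deliberation effort shapes, but cannot fully ground, the final choice.
    return 1.0 + 0.5 * meta_effort

print(allocate("prove the theorem"))  # terminates only because max_depth exists
```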

This is what AI looked like when I was doing my PhD in 2008. Tree search. Alpha-beta pruning. Branches and heuristics. I worked on the machine learning side, which was a separate field then. And neural networks, even inside ML, were treated as pseudoscience. No theorems. No bounds. I sat in seminars and smirked when someone presented results on them. I told friends not to waste a PhD on that stuff. The people I smirked at run the labs I cannot get into. One of them told me over coffee, in 2009, that he was switching to neural networks. I told him he was being unserious. I genuinely thought I was helping. He runs one of those labs. I got lucky. I went to NYU after, and the smirk left my face in six months. I am grateful I got in when I did. Two or three years earlier and I would not be writing this post. I think about that coffee a lot. What are you smirking at right now?

The penalty is a 1-year ban from arXiv followed by the requirement that subsequent arXiv submissions must first be accepted at a reputable peer-reviewed venue. 4/


I disagree. Google has ~$400 billion in annual revenue. Anthropic is on track to end the year at ~$100B ARR, growing at >10x year over year. Even if that growth rate slows significantly, Anthropic will surpass Google in revenue soon (maybe even in 2027). Anthropic's gross margin is an open question, but Semianalysis puts it at ~70%.
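
As a sanity check on the crossover timing, here is a toy projection. The 2.5x and 10% growth rates below are my illustrative assumptions for "growth slows down significantly", not figures from the post.

```python
# Toy revenue projection: when does Anthropic pass Google?
# Starting points are the post's figures; growth rates are assumptions.
anthropic, google = 100.0, 400.0   # $B annual revenue, end of 2025
year = 2025
while anthropic <= google:
    year += 1
    anthropic *= 2.5               # assumed slowed growth (down from >10x)
    google *= 1.10                 # assumed ~10% annual growth for Google
    print(year, round(anthropic), round(google))
# Prints 2026: 250 vs 440, then 2027: 625 vs 484, so crossover in 2027.
```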




Thread with a few notes on this. It’s a disappointing finding, of course. The best we can do is fix it up and learn lessons for future work.



Spending billions to train the "best" base model? You might be optimizing the wrong thing! 🎯 We show that controlling sharpness during mid-training leads to over 35% less forgetting after fine-tuning / quantization... even when the base model itself gets worse. 🧵

Takeaways for pretraining:
- Use SAM (Sharpness-Aware Minimization) in the final steps (~10%)
- Try much higher learning rates (yes, even ~10× larger)

1/9
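
Since the thread recommends SAM without showing it, here is a minimal PyTorch sketch of the standard two-step Sharpness-Aware Minimization update (Foret et al., 2021). This is not the authors' code; `sam_step`, the `rho` radius, and the `loss_fn(model, batch)` signature are illustrative assumptions.

```python
import torch

def sam_step(model, loss_fn, batch, base_optimizer, rho=0.05):
    """One Sharpness-Aware Minimization step (Foret et al., 2021).

    Ascend to the worst-case weights within an L2 ball of radius `rho`,
    then take the base optimizer step using the gradient measured there,
    which biases training toward flat (low-sharpness) minima.
    """
    # 1) Gradient at the current weights.
    loss = loss_fn(model, batch)
    loss.backward()

    # 2) Epsilon-ascent: perturb each weight along the normalized gradient.
    params = [p for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([p.grad.norm(2) for p in params]), 2)
    eps = []
    with torch.no_grad():
        for p in params:
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)
    base_optimizer.zero_grad()

    # 3) The gradient at the perturbed point is the sharpness-aware gradient.
    loss_fn(model, batch).backward()

    # 4) Undo the perturbation, then step with the perturbed gradient.
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)
    base_optimizer.step()
    base_optimizer.zero_grad()
    return loss.item()
```

Per the takeaway above, you would swap this in for the plain optimizer step only during the final ~10% of mid-training; note each SAM step costs two forward/backward passes.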