Lester Mackey

563 posts

@LesterMackey

Machine learning researcher @MSFTResearch (@MSRNE); adjunct professor @Stanford

Joined November 2010
279 Following · 2.9K Followers
Pinned Tweet
Lester Mackey @LesterMackey
@tobias_schrdr and I are excited to share WildCat: Near-Linear Attention in Theory and Practice (arxiv.org/abs/2602.10056). By attending over a spectrally-accurate optimally-weighted coreset, WildCat approximates exact attention with super-polynomial error decay in near-linear time.
Lester Mackey @LesterMackey
Qiang Liu, Chris Oates, and I are writing a monograph on Probabilistic Inference and Learning with Stein’s Method, and we’d love to get your feedback on the first draft.
Lester Mackey retweeted
Judah Cohen @judah47
According to ChatGPT, the two best long range #winter forecasters are yours truly & ECMWF. I think in winter 2026 I made a strong showing for why I'm number one. Better winter seasonal forecast, our Microduet team won the DJF subseasonal AI contest decisively, & of course the PV blog!
Lester Mackey retweeted
Statistics Papers @StatsPapers
Probabilistic Inference and Learning with Stein's Method. Qiang Liu, Lester Mackey, Chris Oates. arxiv.org/abs/2603.07467 [stat.ML cs.LG math.PR math.ST stat.ME stat.TH]
Lester Mackey retweeted
Francesco Orabona @bremen79
New blog post: Better optimistic bounds and delays as bad hints. I present improved (not the usual ones!) guarantees for optimistic algorithms. Then, I use them to show the optimal regret bounds for OLO with delays by treating delays as bad hints. Feedback is welcome.
Francesco Orabona @bremen79
It is now official: My lecture notes on online learning will be published by Cambridge University Press. The final version is due by the end of May. So, if there is anything I missed/anything unclear/some refs I missed/anything else you don't like, please send me a message!
Lester Mackey retweeted
Microsoft Research @MSFTResearch
We are pleased to announce that Doug Burger (@dcburger), Technical Fellow and Corporate Vice President at Microsoft Research, has been elected to the National Academy of Engineering Class of 2026 for advancing cloud-scale computing and networking with field-programmable systems. msft.it/6011Quo5H
Lester Mackey retweeted
fly51fly @fly51fly
[LG] WildCat: Near-Linear Attention in Theory and Practice. T Schröder, L Mackey [Imperial College London & Microsoft Research] (2026). arxiv.org/abs/2602.10056
Lester Mackey retweeted
Tobias Schröder @tobias_schrdr
Softmax attention giving you O(n²) headaches? Maybe try Weighted Iterative Low-rank Decomposed Coreset Attentions (or WildCat for short), an approximate attention module that runs fast in theory and practice. Super excited to share this rewarding collaboration with @LesterMackey.
Quoting Lester Mackey @LesterMackey:
@tobias_schrdr and I are excited to share WildCat: Near-Linear Attention in Theory and Practice (arxiv.org/abs/2602.10056). By attending over a spectrally-accurate optimally-weighted coreset, WildCat approximates exact attention with super-polynomial error decay in near-linear time.
