Keshi Dai

243 posts


@daikeshi

Building ML Tools @ Spotify, 🏔️✈️☕⚽🏃⛷️🦙

Joined May 2009
476 Following · 207 Followers
Keshi Dai retweeted
Yangqing Jia @jiayq
I probably have some credibility as a person who has worked on both @TensorFlow and @PyTorch (also Caffe / @onnxai / DistBelief / a few others that never saw the light), so here are my two cents:

(1) Speed doesn't really matter today as long as it is not particularly bad. Under the hood, many frameworks are just calling optimized CUDA code anyway.

(2) For ultimate speed, especially in the LLM field, many teams are writing their own runtime engines rather than using a vanilla framework. For example, vLLM, by @zhuohan123 @woosuk_k et al., is a great open-source solution. (Lepton builds its own too, and it is probably the fastest not-a-framework LLM engine right now.)

(3) TensorFlow and PyTorch are successful on their own terms. For me, TF taught me how to think like a system architect, and PyTorch taught me how to prioritize users' needs over optimization.

(4) Simply listing a performance table is relatively useless, because every use case comes with its own needs.

(4.1) For example, if you are running ads and feed recommendation systems, then 1% performance matters, and you pretty much need to write even your own FusedGemmAndReluAndMyWeirdOpsAfterThatOp to make things fast.

(4.2) For research, though, flexibility is of ultimate importance, and losing 20-30%, sometimes even 100%, of speed is fine - you gain it back by reducing people hours.

(4.3) I want to thank @soumithchintala and the FAIR team, because we had long-lasting arguments back at Meta: I was serving all ads/feed needs, so 4.1 was my only priority, and those arguments were a window into the 4.2 land. In retrospect, those were some of the most rewarding experiences of my career.

(5) People hours are important for a simple reason: salaries increase every year, and NVIDIA is reducing the dollar cost per TFLOP very quickly.

(6) A side note: the Keras PyTorch wrapper exhibits a universal 2x overhead over native PyTorch. This is... not good. A wrapper abstraction shouldn't bring such a big overhead.

(7) Ah, the good old framework wars. I feel nostalgic to have been part of them, and I am excited to move beyond frameworks and build a truly AI-native cloud at @LeptonAI.
Keshi Dai retweeted
Keshi Dai @daikeshi
@omnific9 Oh nice! I’ll be there too! Haven’t seen you for years!
Keshi Dai retweeted
Robert Nishihara @robertnishihara
We've built a ton of #LLM applications recently. Reasoning about performance & feasibility is painful without reference points. Here are the reference points we use to anchor our intuition (inspired by @JeffDean's "Numbers every engineer should know"). github.com/ray-project/ll…
Keshi Dai @daikeshi
Llama, Alpaca, Vicuña, or Guanaco
Keshi Dai @daikeshi
If Brighton is a “no-frills” ski resort, Deer Valley is its complete opposite, with all the amazing services. The vibes are so different, but with the Ikon pass, you can access both!
Keshi Dai retweeted
NASA Webb Telescope @NASAWebb
Stars: always making a dramatic exit! 🌟 Webb’s powerful infrared eye has captured never-before-seen detail of Cassiopeia A (Cas A). 11,000 light-years away, it is the remnant of a massive star that exploded about 340 years ago: go.nasa.gov/3ZJnk72
Keshi Dai @daikeshi
The 4DX movie experience was too much. Why did I get pushed in the back when two guys were fighting on the screen? I was just watching them.
Keshi Dai retweeted
Jay Hack @mathemagic1an
My thoughts on Toolformer - IMO the most important paper of the past few weeks: arxiv.org/abs/2302.04761. It teaches an LLM to use tools, like a calculator or a search engine, in a *self-supervised manner* - an interesting hack that resolves many blind spots of current LLMs. Here's how 👇
Keshi Dai retweeted
François Chollet @fchollet
Trust is your most important asset. It's easy to destroy, hard to rebuild. In periods of fast change and extreme uncertainty, focus on preserving trust.
Keshi Dai retweeted
Vala Afshar @ValaAfshar
An octopus changing colors in her sleep may be an indication she is dreaming
Keshi Dai @daikeshi
I had to explicitly ask VS Code to go down 5 levels to fix my imports: "python.analysis.packageIndexDepths": [["", 5]]
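For context, that setting belongs to the Pylance extension and goes in VS Code's settings.json (which accepts comments). A minimal sketch, assuming the tuple form quoted in the tweet; newer Pylance releases document an equivalent object form with "name" and "depth" fields:

```jsonc
{
  // Index packages 5 levels deep so auto-import and import suggestions
  // can see deeply nested modules. The empty string "" applies the depth
  // to all packages; use a package name to scope it instead.
  "python.analysis.packageIndexDepths": [["", 5]]
}
```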
Keshi Dai retweeted
Josh Baer @jbx____
We're a few years into our ML Platform journey at Spotify, building infrastructure powering the Machine Learning that makes Spotify "magic" for many of our users. And I'm thrilled to have a Product Manager opening on one of our cornerstone teams focusing o… lnkd.in/eauKcuem
Keshi Dai @daikeshi
Food hall in a church? Brilliant idea!