
Sergey Serebryakov
@megaserg
This post became popular, so a few more thoughts / pointers on the topic for the interested reader.

Example of the complexity involved: @cHHillee has a great post, "Making Deep Learning Go Brrrr From First Principles" horace.io/brrr_intro.html. I was always struck by this diagram from that post. Left to right is time. Look at all these functions stacked up vertically, dispatched one into another, until ~30 layers deep you finally reach the actual computation (addition, in this example). All of this is PyTorch function-call overhead. In practical settings this overhead becomes small in comparison to the actual computation because the arrays we're adding are so large, but still. What is all this stuff? We're just trying to add numbers.

Second: startup latency. Open up a Python interpreter and try to import the PyTorch library (`import torch`). On my computer this takes about 1.3 seconds. This is just the library import, before you even do anything. In a typical training run you'll end up importing a lot more libraries, so even just starting your training script can often add up to tens of seconds of you just waiting around. A production-grade distributed training run can even take minutes. I always found this very frustrating. Computers are *fast*: even a single CPU core (of the up-to-~dozens on your machine) does billions of operations in one second. What is happening? In llm.c, all this startup latency is ~gone. Right after allocating memory, your computer dives directly into useful computation. I love the feeling of hitting Enter to launch your program, and it just goes. Direct to useful computation on your problem. No waiting.

Third thought: LLM as a compiler. It feels likely to me that as LLMs get much better at coding, a lot more code might be written by them, targeting whatever narrow application and deployment environment you care about.
In a world where very custom programs are "free", LLMs might end up being a kind of compiler that translates your high-level program into an extremely optimized, direct, low-level implementation. Hence my earlier LLM Agent challenge of "take the GPT-2 PyTorch training script, and output llm.c", as one concrete example.

Lastly, I also wanted to mention that I don't mean to attack PyTorch at all. I love the library and I have used it for many years, and I've worked in Python for much longer. These are much more general problems and tradeoffs that are really fun to think through: between flexibility, generality, hackability, security, abstraction overhead, code complexity, speed (latency / throughput), etc. The fun and magic of Pareto-optimal infrastructure, and of programming computers.
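
To make the dispatch-overhead point concrete, here is a minimal toy sketch, not actual PyTorch internals: each wrapper function stands in for one dispatch layer between the user-facing call and the real addition, and timing the two versions shows the cost of the call stack alone. Only the standard library is assumed.

```python
import timeit

def make_dispatch_chain(depth):
    """Wrap a plain addition in `depth` layers of pass-through calls,
    mimicking (very loosely) a deep dispatch stack."""
    def add(a, b):
        return a + b
    fn = add
    for _ in range(depth):
        # Each layer just forwards to the one below it.
        fn = (lambda inner: lambda a, b: inner(a, b))(fn)
    return fn

direct = make_dispatch_chain(0)     # the bare addition
layered = make_dispatch_chain(30)   # ~30 layers, like the diagram

t_direct = timeit.timeit(lambda: direct(1, 2), number=100_000)
t_layered = timeit.timeit(lambda: layered(1, 2), number=100_000)
print(f"direct: {t_direct:.4f}s, 30 layers deep: {t_layered:.4f}s")
```

Both compute the same `1 + 2`; the layered version only pays for the extra frames. With large arrays the payload dwarfs this overhead, which is the point made above about practical settings.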
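
The import-latency measurement above is easy to reproduce. A small sketch, using a stdlib module so it runs anywhere; substitute `"torch"` (if installed) to see the ~1.3 s figure from the post:

```python
import importlib
import time

def timed_import(module_name):
    """Measure wall-clock time for importing a module.

    Note: imports are cached in sys.modules, so only the first
    import of a module in a process pays the full cost."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start

# Stdlib module used here for portability; try "torch" instead.
elapsed = timed_import("json")
print(f"import took {elapsed * 1000:.2f} ms")
```

For a per-module breakdown of where import time actually goes, CPython also ships a built-in profiler: `python -X importtime -c "import torch"` prints a tree of cumulative import times to stderr.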