Ivan Chelombiev

482 posts

@savelichic

Research Eng @IsomorphicLabs. Curious about learning. Technology, sustainability and architecture enthusiast.

Joined February 2013
539 Following · 208 Followers
Pinned Tweet
Ivan Chelombiev @savelichic
1/n While everybody’s been busy packing for #NeurIPS2023, our team at @graphcoreai has been busy with this beauty. Let me introduce: ✨SparQ Attention✨ TL;DR This is a plug-and-play inference Attention block for pre-trained LLMs, which evaporates the KV cache bandwidth 🧵
1
10
10
3.1K
Ivan Chelombiev retweeted
Max Jaderberg @maxjaderberg
The Iso team has cooked something incredible: our new technical report unveils the latest results from our drug design engine, the IsoDDE, progressing far beyond AlphaFold 3. This breaks new ground compared to AF and other similar methods by a significant degree across all key benchmarks. 1/7
34
118
692
168K
Ivan Chelombiev retweeted
Isomorphic Labs @IsomorphicLabs
Today we share a technical report demonstrating how our drug design engine achieves a step-change in accuracy for predicting biomolecular structures, more than doubling the performance of AlphaFold 3 on key benchmarks and unlocking rational drug design even for examples it has never seen before. Head to the comments to read our blog.
67
522
3K
1.3M
Ivan Chelombiev retweeted
Simon Evans @DrSimEvans
+++NEW ANALYSIS+++ UK electricity was the cleanest ever in 2024, with emissions per unit falling by more than two-thirds in a decade. Highlights: 🏭 end of coal power after 142yrs; 🔥 fossil fuels at record-low 29% share; 🌄 renewables at record-high 45%. carbonbrief.org/analysis-uks-e… 1/9
415
289
1.1K
132.4K
Ivan Chelombiev retweeted
Gabriele Berton @gabriberton
Libraries and tools that every deep learning project should use: loguru, tqdm, torchmetrics, einops, python 3.11, black. Optional: prettytable. Good for debugging: lovely_tensors. Any other ones I've missed? Below a few words on each of them:
41
116
1.2K
133.1K
Ivan Chelombiev @savelichic
@EdConwaySky the “evil twin” chart is just a mirage! if you use a zero-aligned y-axis, you can see coal consumption has been stagnant in china for the last 10 years, despite a big rise in power usage. considering that solar displaces coal, there’s good reason to hope coal use will go down
0
0
0
15
Ed Conway @EdConwaySky
It's just more complex and nuanced than you might have thought from The Most Hopeful Chart in the World. Part of the reason solar is cheap & plentiful is because China is using the cheapest & most plentiful source of firm power to make them - coal. We need to remember this.
23
37
308
31K
Ed Conway @EdConwaySky
The Most Hopeful Chart in the World shows how each year the @IEA predicted that the amount of solar output around the world would plateau or rise v slowly in the following years. But instead solar output defied all expectations, rising exponentially. That's great news.
10
11
213
40.4K
Ivan Chelombiev @savelichic
@EdConwaySky @IEA the “evil twin” chart is just a mirage! if you use a zero-aligned y-axis, you can see coal consumption has been somewhat stagnant in china for the last 10 years, despite a big rise in power usage. considering that solar displaces coal, there’s good reason to hope coal use will go down
0
0
0
21
Ed Conway @EdConwaySky
Every year the @IEA forecasts global and Chinese COAL demand. And every year it predicts it will essentially plateau. And every year it is proved wrong. As China produces ever more products, inc solar panels, with ever more coal, that red line keeps climbing 👇
16
137
612
73.4K
Ivan Chelombiev retweeted
Nicolas Fulghum @nicolasfulghum
☀️ Solar power has scaled up faster than any other source of electricity in history ↗️ It took just 8 years to go from 100 TWh to 1000 TWh. It will only take 3 years to go from 1000 TWh to 2000 TWh!
25
141
369
51.7K
Ivan Chelombiev @savelichic
@Haonan_Wang_ Did you try observing the effect of float16 vs bfloat16 by any chance? I wonder if the increased precision of float16 can compensate sufficiently to not require float32 in training
0
0
0
54
Haonan Wang @Haonan_Wang_
🔍 Digging deeper: 1. We discovered that the first token in a sequence contributes most significantly to numerical errors under BFloat16. 2. As sequence length increases, RoPE deviates more from its intended relative positional encoding, amplifying the issue. [1/n]
1
0
18
3.1K
Haonan Wang @Haonan_Wang_
🚀 New Paper📜 When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training 🤯 RoPE is Broken because of... BFloat16! > Even if RoPE is computed in Float32 (like in Llama 3 and the current Transformers library), casting tensors to BFloat16 in FlashAttention2 causes RoPE to deviate from its intended relative positional encoding properties. 📷 Key Highlights: 🛠️ Critical Issue Identified: BFloat16 introduces numerical errors in RoPE, compromising its relative encoding. 🔍 Main Culprit Found: The first token significantly contributes to deviations as context length increases. ⚡ Introducing AnchorAttention: A plug-and-play method that improves long-context performance, reduces training time by over 50%, and preserves the model's general capabilities. But that's not all! 💻 Our code supports FlashAttention and FlexAttention, enabling efficient and scalable computations. 🛠️ Easily customize the attention mechanism to suit your specific needs. 🌐 Versatile Applications: Apply our method to various domains like video understanding and generation. 📄 Paper: arxiv.org/abs/2411.13476 💻 Github: github.com/haonan3/Anchor… Details in the thread 🧵 #AI #MachineLearning #NLP #DeepLearning #LLMs
7
57
331
63.4K
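A note on the float16-versus-bfloat16 question raised in this thread: the sketch below is not taken from the paper or its repository. It is a minimal, stdlib-only illustration, assuming round-to-nearest-even conversion, of how many fraction bits each 16-bit format keeps. The helper names `to_float16` and `to_bfloat16` are hypothetical.

```python
import struct

def to_float16(x: float) -> float:
    """Round x to IEEE 754 half precision (10 fraction bits) and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_bfloat16(x: float) -> float:
    """Round x to bfloat16 (7 fraction bits) via its float32 bit pattern.

    Illustrative helper: handles ordinary finite values only (no NaN/inf care).
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest, ties to even, then drop the low 16 bits.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

x = 1.0 + 2.0 ** -10    # needs 10 fraction bits to be exact
print(to_float16(x))     # 1.0009765625 (kept exactly)
print(to_bfloat16(x))    # 1.0 (the small increment is rounded away)

big = 70000.0            # beyond float16's largest finite value, 65504
print(to_bfloat16(big))  # still finite, but coarsely rounded
```

The trade-off runs the other way for range: float16 overflows above 65504, whereas bfloat16 shares float32's exponent range. Whether float16's extra fraction bits are enough to avoid float32 in training is exactly the open question in the reply above; this sketch only shows the size of the precision gap.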
Ivan Chelombiev retweeted
Simon Evans @DrSimEvans
Today is the day the UK becomes the first G7 country to completely phase out coal power. Since opening "Jumbo", the world's first coal power plant, in 1882, the UK's coal plants have burned through a whopping 4.6bn tonnes of coal, emitting 10.4GtCO2 – which is more than most countries have ever released. Here's the story of how that all came to an end: interactive.carbonbrief.org/coal-phaseout-…
243
597
1.8K
180.7K
Ivan Chelombiev retweeted
Roman Gaditskii @gadirom_
SGD vs Adam💡 I punish x10 for distance to mouse in the loss function, but Adam's gradient normalisation eliminates the effect. #MLX #SwiftUI #MachineLearning
238
174
1.4K
319.4K
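The effect described in the tweet above can be illustrated with a minimal sketch. It is not the author's MLX/Swift code: it assumes a single scalar parameter whose whole gradient is multiplied by the x10 penalty, uses the standard Adam update with its usual default hyperparameters, and the function names are hypothetical.

```python
import math

def sgd_step(grad: float, lr: float = 0.1) -> float:
    """Plain SGD update: proportional to the gradient."""
    return -lr * grad

def adam_first_step(grad, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """First Adam update, starting from zero-initialised moment estimates."""
    m = (1 - beta1) * grad          # first-moment estimate
    v = (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1)         # bias correction at step t = 1
    v_hat = v / (1 - beta2)
    # Dividing by sqrt(v_hat) cancels the gradient's scale.
    return -lr * m_hat / (math.sqrt(v_hat) + eps)

grad = 0.5            # gradient without the penalty (arbitrary example value)
scaled = 10 * grad    # gradient with the x10 penalty applied

sgd_ratio = sgd_step(scaled) / sgd_step(grad)
adam_ratio = adam_first_step(scaled) / adam_first_step(grad)
print(f"SGD step grows by a factor of about {sgd_ratio:.3f}")
print(f"Adam step changes by a factor of about {adam_ratio:.3f}")
```

Under these assumptions the SGD step scales with the penalty while the Adam step stays essentially the same size. When only one term of a multi-term loss is up-weighted, the cancellation is partial rather than exact, because the normalisation is applied per parameter to the summed gradient.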
Ivan Chelombiev retweeted
Tom Harwood @tomhfh
Why oh why is growth at zero?
222
535
4K
1M
Ivan Chelombiev retweeted
zed @zmkzmkz
it's 1:28AM and I just finished this abomination. fully illustrated toy calculation of 1 transformer layer. why would I make this? idk ask my thesis advisor, "not everyone knows how a transformer works, you have to give an example"
118
309
4.2K
405.3K
Ivan Chelombiev retweeted
Mark Cummins @mark_cummins
It’s no secret that LLM training data is running out. How close are we to the limit? To answer that, here's an estimate of the total amount of text in the world from every major source:
75
333
2K
737.6K
Ivan Chelombiev retweeted
Daniel Kaiser @spectate_or
llama3-120b randomly creates a new word "prefaceate" that has 0 search results on google
97
98
2K
334.2K
Ivan Chelombiev retweeted
Ziming Liu @ZimingLiu11
MLPs are so foundational, but are there alternatives? MLPs place activation functions on neurons, but can we instead place (learnable) activation functions on weights? Yes, we KAN! We propose Kolmogorov-Arnold Networks (KAN), which are more accurate and interpretable than MLPs.🧵
117
1K
5K
1M
Ivan Chelombiev retweeted
The Economist @TheEconomist
Europe’s carbon price is its biggest climate achievement. It is part of the reason that the continent’s emissions fell by a steep 15.5% in 2023 econ.st/3UjIuYV 👇
14
16
37
54.2K
Samswara @samswoora
LinkedIn is cosmic horror - every time I log on I feel this sense of dread at the people stuck there competing in a game where there is no prize
125
1.1K
10.4K
703.9K
Ivan Chelombiev retweeted
Santiago Ortiz @moebio
Take a look into the mind of the machine! visit my new project here: moebio.com/mind/ I repeated the same completion prompt "Intelligence is " hundreds of times and used this to peer into the statistical and semantic behavior of chatgpt
35
298
1.2K
95.5K