Ivan Chelombiev

482 posts

@savelichic

Research Eng @IsomorphicLabs. Curious about learning. Technology, sustainability and architecture enthusiast.

Joined February 2013

539 Following · 208 Followers

Pinned Tweet

Ivan Chelombiev @savelichic · 11 Dec

1/n While everybody’s been busy packing for #NeurIPS2023, our team at @graphcoreai has been busy with this beauty. Let me introduce: ✨SparQ Attention✨ TL;DR This is a plug-and-play inference Attention block for pre-trained LLMs, which evaporates the KV cache bandwidth 🧵

Ivan Chelombiev retweeted

Max Jaderberg @maxjaderberg · 10 Feb

The Iso team has cooked something incredible: our new technical report unveils the latest results from our drug design engine, the IsoDDE, progressing far beyond AlphaFold 3. This breaks new ground compared to AF and other similar methods by a significant degree across all key benchmarks. 1/7

Ivan Chelombiev retweeted

Isomorphic Labs @IsomorphicLabs · 10 Feb

Today we share a technical report demonstrating how our drug design engine achieves a step-change in accuracy for predicting biomolecular structures, more than doubling the performance of AlphaFold 3 on key benchmarks and unlocking rational drug design even for examples it has never seen before. Head to the comments to read our blog.

Ivan Chelombiev retweeted

Simon Evans @DrSimEvans · 2 Jan

+++NEW ANALYSIS+++

UK electricity was the cleanest ever in 2024, with emissions per unit falling by more than two-thirds in a decade

Highlights:
🏭end of coal power after 142yrs
🔥fossil fuels at record-low 29% share
🌄renewables at record-high 45%

carbonbrief.org/analysis-uks-e… 1/9

Ivan Chelombiev retweeted

Gabriele Berton @gabriberton · 19 Dec

Libraries and tools that every deep learning project should use: loguru, tqdm, torchmetrics, einops, python 3.11, black. Optional: prettytable. Good for debugging: lovely_tensors. Any other ones I've missed? Below a few words on each of them:

Ivan Chelombiev @savelichic · 30 Dec

@EdConwaySky the “evil twin” chart is just a mirage! if you use a zero-aligned y-axis, you can see coal consumption has been stagnant in china for the last 10 years, despite a big rise in power usage. considering that solar displaces coal, there’s good reason to hope coal use will go down

Ed Conway @EdConwaySky · 19 Dec

It's just more complex and nuanced than you might have thought from The Most Hopeful Chart in the World. Part of the reason solar is cheap & plentiful is because China is using the cheapest & most plentiful source of firm power to make them - coal. We need to remember this.

Ed Conway @EdConwaySky · 19 Dec

The Most Hopeful Chart in the World shows how each year the @IEA predicted that the amount of solar output around the world would plateau or rise v slowly in the following years. But instead solar output defied all expectations, rising exponentially. That's great news.

Ivan Chelombiev @savelichic · 30 Dec

@EdConwaySky @IEA the “evil twin” chart is just a mirage! if you use a zero-aligned y-axis, you can see coal consumption has been somewhat stagnant in china for the last 10 years, despite a big rise in power usage. considering that solar displaces coal, there’s good reason to hope coal use will go down

Ed Conway @EdConwaySky · 19 Dec

Every year the @IEA forecasts global and Chinese COAL demand. And every year it predicts it will essentially plateau. And every year it is proved wrong. As China produces ever more products, inc solar panels, with ever more coal, that red line keeps climbing 👇

Ivan Chelombiev retweeted

Nicolas Fulghum @nicolasfulghum · 4 Dec

☀️ Solar power has scaled up faster than any other source of electricity in history ↗️ It took just 8 years to go from 100 TWh to 1000 TWh. It will only take 3 years to go from 1000 TWh to 2000 TWh!

Ivan Chelombiev @savelichic · 22 Nov

@Haonan_Wang_ Did you try observing the effect of float16 vs bfloat16 by any chance? I wonder if the increased precision of float16 can compensate sufficiently to not require float32 in training

Haonan Wang @Haonan_Wang_ · 21 Nov

🔍 Digging deeper: 1. We discovered that the first token in a sequence contributes most significantly to numerical errors under BFloat16. 2. As sequence length increases, RoPE deviates more from its intended relative positional encoding, amplifying the issue. [1/n]

Haonan Wang @Haonan_Wang_ · 21 Nov

🚀 New Paper📜 When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training

🤯 RoPE is Broken because of... BFloat16!
> Even if RoPE is computed in Float32 (like in Llama 3 and the current Transformers library), casting tensors to BFloat16 in FlashAttention2 causes RoPE to deviate from its intended relative positional encoding properties.

📷 Key Highlights:
🛠️ Critical Issue Identified: BFloat16 introduces numerical errors in RoPE, compromising its relative encoding.
🔍 Main Culprit Found: The first token significantly contributes to deviations as context length increases.
⚡ Introducing AnchorAttention: A plug-and-play method that improves long-context performance, reduces training time by over 50%, and preserves the model's general capabilities.

But that's not all!
💻 Our code supports FlashAttention and FlexAttention, enabling efficient and scalable computations.
🛠️ Easily customize the attention mechanism to suit your specific needs.
🌐 Versatile Applications: Apply our method to various domains like video understanding and generation.

📄 Paper: arxiv.org/abs/2411.13476
💻 Github: github.com/haonan3/Anchor…

Details in the thread 🧵 #AI #MachineLearning #NLP #DeepLearning #LLMs
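The precision gap behind the float16 vs bfloat16 question in this thread can be made concrete with a small sketch. This is not taken from the paper and does not reproduce its RoPE analysis; it only shows how integer-valued positions round in each format. The helper names `round_to_bfloat16` and `round_to_float16` are hypothetical, and bfloat16 is emulated here by rounding a float32 bit pattern to its top 16 bits (NaN and infinity handling is left out).

```python
import struct


def round_to_bfloat16(x: float) -> float:
    """Emulate bfloat16: keep the top 16 bits of the float32 pattern,
    rounding to nearest with ties to even. Sketch only, no NaN/inf care."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # rounding bias, ties go to even
    bits &= 0xFFFF0000                   # sign + 8 exponent + 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def round_to_float16(x: float) -> float:
    """Round through IEEE half precision (5 exponent + 10 mantissa bits)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Integer positions of increasing size, as a stand-in for token indices.
for position in (255, 257, 1025, 2049):
    p = float(position)
    print(position, round_to_bfloat16(p), round_to_float16(p))
```

Under these assumptions, bfloat16 stops representing every integer exactly above 256, while float16 holds out until 2048, at the cost of a much smaller dynamic range. Whether that extra precision is enough to avoid float32 in training is the open question raised in the reply above.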

Ivan Chelombiev retweeted

Simon Evans @DrSimEvans · 30 Sep

Today is the day the UK becomes the first G7 country to completely phase out coal power

Since opening "Jumbo", the world's first coal power plant in 1882, the UK's coal plants have burned through a whopping 4.6bn tonnes of coal, emitting 10.4GtCO2 – which is more than most countries have ever released

Here's the story of how that all came to an end: interactive.carbonbrief.org/coal-phaseout-…

Ivan Chelombiev retweeted

Roman Gaditskii @gadirom_ · 15 Jun

SGD vs Adam💡 I punish x10 for distance to mouse in the loss function, but Adam's gradient normalisation eliminates the effect. #MLX #SwiftUI #MachineLearning
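The effect described in this tweet can be sketched for a single scalar parameter. This is an illustration, not the code from the demo: it assumes the x10 penalty simply multiplies that parameter's gradient by 10, and the function names `sgd_step` and `adam_first_step` are hypothetical. Only the first Adam step is shown, where the bias-corrected update reduces to roughly `-lr * sign(grad)`.

```python
import math


def sgd_step(grad: float, lr: float = 0.1) -> float:
    """Plain SGD update: proportional to the gradient."""
    return -lr * grad


def adam_first_step(grad: float, lr: float = 0.1, beta1: float = 0.9,
                    beta2: float = 0.999, eps: float = 1e-8) -> float:
    """First Adam update (t = 1), starting from zero moment estimates."""
    m = (1 - beta1) * grad            # first moment estimate
    v = (1 - beta2) * grad * grad     # second moment estimate
    m_hat = m / (1 - beta1)           # bias correction for t = 1
    v_hat = v / (1 - beta2)
    return -lr * m_hat / (math.sqrt(v_hat) + eps)


base_grad = 0.5
for scale in (1.0, 10.0):  # 10.0 stands in for the x10 loss penalty
    g = scale * base_grad
    print(scale, sgd_step(g), adam_first_step(g))
```

In this sketch the SGD update grows tenfold with the penalty, while the Adam update stays essentially the same size, because the gradient is divided by the square root of its own second moment.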

Ivan Chelombiev retweeted

Tom Harwood @tomhfh · 12 Jun

Why oh why is growth at zero?

Ivan Chelombiev retweeted

zed @zmkzmkz · 5 Jun

it's 1:28AM and I just finished this abomination. fully illustrated toy calculation of 1 transformer layer. why would I make this? idk ask my thesis advisor, "not everyone knows how a transformer works, you have to give an example"

Ivan Chelombiev retweeted

Mark Cummins @mark_cummins · 10 May

It’s no secret that LLM training data is running out. How close are we to the limit? To answer that, here's an estimate of the total amount of text in the world from every major source:

Ivan Chelombiev retweeted

Daniel Kaiser @spectate_or · 8 May

llama3-120b randomly creates a new word "prefaceate" that has 0 search results on google

Ivan Chelombiev retweeted

Ziming Liu @ZimingLiu11 · 1 May

MLPs are so foundational, but are there alternatives? MLPs place activation functions on neurons, but can we instead place (learnable) activation functions on weights? Yes, we KAN! We propose Kolmogorov-Arnold Networks (KAN), which are more accurate and interpretable than MLPs.🧵

Ivan Chelombiev retweeted

The Economist @TheEconomist · 27 Apr

Europe’s carbon price is its biggest climate achievement. It is part of the reason that the continent’s emissions fell by a steep 15.5% in 2023 econ.st/3UjIuYV 👇

Ivan Chelombiev @savelichic · 24 Mar

@samswoora @tophinity SF life sounds fun

Samswara @samswoora · 24 Mar

@tophinity Twitter has IRL clout at stake

Samswara @samswoora · 24 Mar

LinkedIn is cosmic horror - every time I log on I feel this sense of dread at the people stuck there competing in a game where there is no prize

Ivan Chelombiev retweeted

Graphcore Research @GCResearchTeam · 22 Mar

At 2pm today Graphcore researchers @luka_ribar & @savelichic will be presenting at @letsunifyai's popular reading group. We'll be covering our recent SparQ paper - a method for increasing LLM inference throughput by sparsifying attention. Live stream: youtube.com/watch?v=xq_8dg…

Ivan Chelombiev retweeted

Santiago Ortiz @moebio · 21 Mar

Take a look into the mind of the machine! visit my new project here: moebio.com/mind/ I repeated the same completion prompt "Intelligence is " hundreds of times and used this to peer into the statistical and semantic behavior of chatgpt
