Lucio Dery Jnr Mwinm

273 posts

@derylucio

Pittsburgh, PA · Joined January 2015
993 Following · 620 Followers
Pinned Tweet
Lucio Dery Jnr Mwinm (@derylucio)
New paper alert: "Latent Space Communication via K-V Cache Alignment" arxiv.org/abs/2601.06123 We propose a method for Large Language Models to communicate directly via their internal states, bypassing the need for discrete text generation 🧵
Lucio Dery Jnr Mwinm retweeted
Arthur Douillard (@Ar_Douillard)
The DiLoCo team at Google DeepMind and Google Research is proud to release Decoupled DiLoCo, the next frontier for resilient AI pre-training. Decoupled DiLoCo enables training with datacenters across the world, using heterogeneous hardware, and never halting the system despite hardware failures.
Lucio Dery Jnr Mwinm retweeted
Asher Trockman (@ashertrockman)
The distillation debate is looking pretty one-sided around here! I've been thinking about this for quite a while (along with Yash Savani), so let me add some variety. Regardless of where you stand on the IP issues: Distillation attacks make the AI ecosystem LESS original, less safe, and -- believe it or not -- also less open. We expand on these points in a blog post linked in the replies. Our favorite realization is that distillation attacks lead to AI monoculture, and this monoculture could create unanticipated, systemic security risks for both individuals and companies.
Quoted tweet from Anthropic (@AnthropicAI):

We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax. These labs created over 24,000 fraudulent accounts and generated over 16 million exchanges with Claude, extracting its capabilities to train and improve their own models.

Lucio Dery Jnr Mwinm retweeted
Lucio Dery Jnr Mwinm (@derylucio)
In summary, K-V Cache Alignment offers a robust protocol for dense, latent-space communication in multi-agent systems. Read the full paper here: arxiv.org/abs/2601.06123
Lucio Dery Jnr Mwinm (@derylucio)
We hope this work opens new avenues for modular AI systems. By decoupling the communication method from text generation, we enable the construction of pools of specialized models that can collaborate efficiently, without the latency of de-tokenization and with higher bandwidth.
Paul Azunre (@pazunre)
@ven1925143 @ghnewssummary @derylucio @KhayaAI Errm @KhayaAI is a Ghanaian limited liability company. As in it is BASED IN GHANA. CAPE COAST to be specific. I was born in Ghana with a single passport - a Ghanaian one. You are mad that I earned a National Interest Waiver in the US through hard work? Be clear
Ghana News Summary (@ghnewssummary)
If they still don't believe you, tell them Dery Lucio (@derylucio), a Ghanaian, works at Google DeepMind as a full-time employee. Remind them that @pazunre (MIT PhD), another Ghanaian, is building @KhayaAI, AI for African languages. Indeed, great things can come from small places!
Lucio Dery Jnr Mwinm retweeted
Pratyush Maini (@pratyushmaini)
I reverse engineered a phase change in GPT's training data... with the seahorse emoji 🌊🐴 My forensic investigation reveals why non-thinking models have started "thinking out loud" & what it reveals about how frontier labs train their latest models pratyushmaini.substack.com/p/reverse-engi…🧵
Lucio Dery Jnr Mwinm retweeted
Sara Hooker (@sarahookr)
I'm starting a new project. Working on what I consider to be the most important problem: building thinking machines that adapt and continuously learn. We have an incredibly talent-dense founding team + are hiring for engineering, ops, design. Join us: adaptionlabs.ai
Lucio Dery Jnr Mwinm retweeted
Hamidah Oderinwale (@didaoh)
1/ With @BenDLaufer and Jon Kleinberg, we constructed the largest dataset of its kind to date: 1.86M Hugging Face models. In a new paper, we mapped how the open-source AI ecosystem evolves by tracing fine-tunes, merges, and more. Here's what we found 🧵
Lucio Dery Jnr Mwinm retweeted
Arthur Douillard (@Ar_Douillard)
30+ accepted papers, 6 oral papers, 6 guest speakers: join us at @iclr_conf on the 27th, Hall 4 #3, for a full-day workshop on Modularity for Collaborative, Decentralized, and Continual Learning. sites.google.com/corp/view/mcdc… @derylucio, Fengyuan Liu, and I will be organizing that day in person.
Quoted tweet from Arthur Douillard (@Ar_Douillard):

Workshop alert 🚨 We'll host a workshop at ICLR 2025 on modularity, encompassing collaborative + decentralized + continual learning. Those topics are on the critical path to building better AIs. Interested? Submit a paper and join us in Singapore! sites.google.com/corp/view/mcdc…
