Distributed State

9.8K posts


@DistStateAndMe

Founder / Slop Cannon @covenant_ai ( @tplr_ai : @basilic_ai : @grail_ai )

Joined April 2014
2.5K Following · 4.8K Followers
Pinned Tweet
Distributed State @DistStateAndMe
A small step for mankind, a massive leap for decentralised training... for agency. In the space of 9 months, @tplr_ai went from 1.2B -> 72B.

It's never been easy, and has broken everyone on the team multiple times. But I speak for all of us when I say it is the most rewarding thing we have ever done.

We have a fraction of the resources. We don't have the PhDs. But Bittensor shows you it doesn't matter. Innovation happens at the edge. We innovate through scarcity. The ones who rewrite the rules are never the ones with the most. They're the ones who refuse to accept the limits they were handed.

Bittensor is prophecy. Subnets (@covenant_ai and others) are the tools through which that prophecy is manifested. Next stop: TRILLIONS.
templar @tplr_ai

We just completed the largest decentralised LLM pre-training run in history: Covenant-72B. Permissionless, on Bittensor subnet 3. 72B parameters. ~1.1T tokens. Commodity internet. No centralized cluster. No whitelist. Anyone with GPUs could join or leave freely. 1/n

Distributed State retweeted
Dwarkesh Patel @dwarkesh_sp
It's very interesting that cryptographic protocols and neural networks have the same high-level architecture (where they jumble information as it moves sequentially across many layers). This is the result of a convergent evolution - cryptographic protocols need every output bit to depend on every input bit in complicated ways, and similarly, NNs need output to make connections between inputs. But they're in some sense doing opposite things. While cryptographic protocols take something which has a lot of structure and make it seem indistinguishable from random, NNs take something which may look random and extract structure from it. Much more on this idea in the full episode with @reinerpope
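The avalanche property described here — every output bit depending on every input bit in complicated ways — can be demonstrated directly with a standard hash. A minimal Python sketch (not tied to any particular protocol): flip one input bit and count how many of SHA-256's 256 output bits change.

```python
import hashlib

def bit_diff(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg = b"distributed state"
flipped = bytes([msg[0] ^ 0x01]) + msg[1:]  # flip a single input bit

d1 = hashlib.sha256(msg).digest()
d2 = hashlib.sha256(flipped).digest()

# Avalanche effect: one flipped input bit changes roughly half of the
# 256 output bits, so the two digests look statistically unrelated.
diff = bit_diff(d1, d2)
print(diff)
```

A neural network run on the same two inputs would do the opposite: map two nearly identical inputs to nearly identical representations, preserving structure rather than destroying it.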
Distributed State retweeted
Zyphra @ZyphraAI
TSP provides higher throughput. On 1024 MI300X GPUs, with 128K tokens of context and 8 GPUs per model copy, TSP hits 173M tok/sec vs 86M for matched TP+SP. TSP exists alongside existing schemes such as EP and PP. Paper: arxiv.org/abs/2604.26294 Blog: zyphra.com/post/tsp
Distributed State retweeted
Hamzé 🦀 @Hamzeml
Python made AI accessible. Rust can make parts of AI understandable. That's the bet behind Category Theory for Tiny ML in Rust.

We're building tiny ML systems from first principles using:
- Rust types
- typed transformations
- composition
- training loops
- category theory as an engineering tool

Not abstraction cosplay. Executable structure. Working draft. Public feedback welcome.
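The "typed transformations + composition" idea can be sketched in Python (rather than Rust, which the project itself uses); `compose`, `normalize`, and `square` are illustrative names, not taken from the project:

```python
from typing import Callable, TypeVar

A, B, C = TypeVar("A"), TypeVar("B"), TypeVar("C")

def compose(f: Callable[[B], C], g: Callable[[A], B]) -> Callable[[A], C]:
    """Category-style composition: compose(f, g)(x) == f(g(x))."""
    return lambda x: f(g(x))

def normalize(xs: list[float]) -> list[float]:
    """Scale a vector so its maximum element is 1.0."""
    return [x / max(xs) for x in xs]

def square(xs: list[float]) -> list[float]:
    """Square each element."""
    return [x * x for x in xs]

# Two typed transformations composed into one pipeline; the signatures,
# not runtime convention, say what plugs into what.
pipeline = compose(square, normalize)
result = pipeline([1.0, 2.0, 4.0])
print(result)  # [0.0625, 0.25, 1.0]
```

In Rust the compiler would reject an ill-typed composition at build time; that is the "executable structure" bet the post describes.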
Distributed State retweeted
Mats Heming Julner @matsjulner
You don't need to run a miner to interact with Compute Substrate. If you want to test:
- sending transactions
- UTXO behavior
- system mechanics

You can now request coins directly on the forum. This is for testing, not speculation. Link: forum.computesubstrate.org/t/getting-star…
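The UTXO behavior the post invites you to test can be sketched as a toy ledger (hypothetical names and structure; not Compute Substrate's actual API): each transaction consumes existing unspent outputs and creates new ones, and value must balance.

```python
# Hypothetical toy UTXO ledger, for illustration only.
utxos: dict = {}  # (txid, output_index) -> (owner, amount)

def add_output(txid: str, index: int, owner: str, amount: int) -> None:
    utxos[(txid, index)] = (owner, amount)

def spend(inputs: list, outputs: list, txid: str) -> None:
    """Consume existing UTXOs and create new ones; value must balance."""
    total_in = sum(utxos.pop(ref)[1] for ref in inputs)  # each UTXO spends once
    assert total_in == sum(amount for _, amount in outputs), "value mismatch"
    for i, (owner, amount) in enumerate(outputs):
        add_output(txid, i, owner, amount)

add_output("genesis", 0, "alice", 10)
spend([("genesis", 0)], [("bob", 7), ("alice", 3)], "tx1")
# The genesis output is gone; two new outputs from tx1 replace it.
remaining = sorted(utxos)
print(remaining)  # [('tx1', 0), ('tx1', 1)]
```

Double-spending fails naturally here: spending `("genesis", 0)` a second time raises `KeyError` because the output was already removed from the set.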
Distributed State retweeted
Dirhousssi Amine @DirhousssiAmine
🤯🤯🤯! After battling a nasty NCCL bug we finally have crazy results. That's MFU adjusted for the causal mask. I need to recheck my numbers because that's crazy high.
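The causal-mask adjustment mentioned here is, roughly, crediting only the attention FLOPs a causal model actually computes: with a causal mask only the lower triangle of the T×T score matrix is needed, about half of full attention. A hedged sketch with made-up dimensions (not the author's actual accounting):

```python
# Illustrative causal-mask FLOP adjustment; all numbers are hypothetical.

def attn_flops(layers: int, seq_len: int, d_model: int, causal: bool) -> float:
    """FLOPs for the two attention matmuls (QK^T and scores @ V)."""
    full = layers * 4 * seq_len * seq_len * d_model
    # A causal mask needs only the lower triangle: ~half the score matrix.
    return full / 2 if causal else full

full = attn_flops(layers=32, seq_len=4096, d_model=4096, causal=False)
causal = attn_flops(layers=32, seq_len=4096, d_model=4096, causal=True)

ratio = causal / full
print(ratio)  # 0.5
```

Dividing achieved throughput by the adjusted (smaller) FLOP count raises the reported MFU, which is why an "adjusted MFU" can look surprisingly high compared to the unadjusted figure.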
Distributed State retweeted
Underfox @Underfox3
This paper presents ZipCCL, a library of lossless compressed communication collectives designed to alleviate the communication bottleneck in distributed LLM training. arxiv.org/pdf/2604.27844
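The core idea — losslessly compress a buffer before communicating it, so the receiver recovers it bit-exactly — can be sketched with the stdlib `zlib` (a toy stand-in, not ZipCCL's actual scheme):

```python
import struct
import zlib

# Sparse, gradient-like data compresses well without losing any bits.
values = [0.0, 0.0, 1.5, 0.0, 0.0, -2.25, 0.0, 0.0]
payload = struct.pack(f"{len(values)}f", *values)  # raw float32 buffer

wire = zlib.compress(payload)  # what would actually cross the network
restored = list(struct.unpack(f"{len(values)}f", zlib.decompress(wire)))

# Lossless round trip: exact recovery, unlike quantization or sparsification.
ok = restored == values
print(ok)  # True
```

This is what distinguishes the approach from lossy gradient compression (quantization, top-k sparsification): correctness of training is unaffected because the communicated bits are identical after decompression.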
Distributed State retweeted
Mats Heming Julner @matsjulner
Compute Substrate mainnet is live. There are no admins, no privileged roles, no central authority. It is a network with fixed rules.
Distributed State retweeted
Mats Heming Julner @matsjulner
AI systems today don’t fail because they’re wrong. They fail because someone has to decide what’s right.
Distributed State retweeted
Ben Burtenshaw @ben_burtenshaw
Open source projects like transformers are drowning in AI agent PRs, so we auto-merged everything to see what would happen and share the results. tl;dr: if 100s of agents want to fix something, it's probably broken.

Agent PRs on transformers have quadrupled over the past quarter. We classified and validated 1k PRs (42% features, 39% bugs, 13% docs). The quality distribution is skewed toward noise. But the bug fixes cluster around a small number of hotspots: tokenizer handling, model loading, dtype mismatches, multimodal pipelines. I.e. an underlying problem.

When 28 PRs independently flag the same area, that is signal regardless of whether any individual fix is correct. One issue generated 39 near-identical PRs in a day. Each applied the same decorator pattern to a different model file. A maintainer would do the same cognitive work 39 times, so a single combined PR replaces all of that work.

We built tooling to cluster, deduplicate, and merge these contributions at scale, then ran an experiment: bulk-merge hundreds of agent PRs into a fork, benchmark it, and see what breaks. Nothing broke. Zero delta across three models on arc_challenge, gsm8k, and hellaswag.

The contributors are not adversarial. They lack the context to evaluate whether the agent's output is correct. Check out this blog post, where we dive deep on this pipeline: huggingface.co/spaces/hugging…
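The cluster-and-deduplicate step can be sketched as grouping PRs by the code area and change type they touch (hypothetical data and field names; not the actual Hugging Face tooling), so dozens of near-identical fixes collapse into one reviewable signal:

```python
from collections import defaultdict

# Hypothetical pre-classified agent PRs, for illustration only.
prs = [
    {"id": 1, "area": "tokenizer", "kind": "bug"},
    {"id": 2, "area": "tokenizer", "kind": "bug"},
    {"id": 3, "area": "model_loading", "kind": "bug"},
    {"id": 4, "area": "tokenizer", "kind": "bug"},
    {"id": 5, "area": "docs", "kind": "docs"},
]

# Group PRs touching the same area with the same kind of change.
clusters = defaultdict(list)
for pr in prs:
    clusters[(pr["area"], pr["kind"])].append(pr["id"])

# An area flagged by many independent agents is signal about the codebase,
# regardless of whether any individual fix is correct.
hotspots = {k: ids for k, ids in clusters.items() if len(ids) >= 3}
print(hotspots)  # {('tokenizer', 'bug'): [1, 2, 4]}
```

A maintainer then reviews one representative (or one combined PR) per hotspot instead of doing the same cognitive work once per duplicate.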
Distributed State retweeted
Haitham Bou Ammar @hbouammar
Hot take: ⛔️Long-context reasoning is not only a context-window problem. 🫢It is a control-flow problem.

Standard recursive LLMs often ask the model to invent recursive code during inference. That means the model is not just reasoning. It is also writing the program that controls the reasoning.

λ-RLM takes the opposite view: keep neural inference for bounded subproblems; move recursion into a typed λ-calculus runtime. SPLIT → MAP → FILTER → REDUCE. Less chaotic agent loop. More inspectable reasoning.

The result:
- 29/36 wins vs standard RLM
- up to +21.9 accuracy points
- up to 4.1× lower latency

We open-sourced the repo: github.com/lambda-calculu… Star it if you want more structured alternatives to "just make the context window bigger". #AI #LLM #RLM #LambdaRLM #MachineLearning
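The SPLIT → MAP → FILTER → REDUCE control flow can be sketched with ordinary functions standing in for bounded model calls (illustrative only; `split`, `answer`, and `keep` are made-up stand-ins, not λ-RLM's runtime):

```python
from functools import reduce

def split(doc: str) -> list[str]:
    """SPLIT: break the problem into bounded subproblems."""
    return doc.split(". ")

def answer(fragment: str) -> int:
    """MAP: stand-in for one bounded neural-inference call per fragment."""
    return len(fragment.split())

def keep(n: int) -> bool:
    """FILTER: discard low-signal intermediate results."""
    return n > 2

doc = "Short. A slightly longer sentence here. Tiny. Another useful fragment of text"
partials = [answer(f) for f in split(doc)]   # [1, 5, 1, 5]
kept = [n for n in partials if keep(n)]      # [5, 5]
total = reduce(lambda a, b: a + b, kept, 0)  # REDUCE: combine the survivors
print(total)  # 10
```

The point of the structure is that the recursion lives in the (inspectable, typed) pipeline, not in code the model improvises mid-inference.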
Distributed State @DistStateAndMe
@MoggyTao @covenant_ai @tplr_ai You strawman the situation because it's an easier pill to swallow, and that's ok. However, if you ever wanted to think critically about how the network's biggest advocate got here, it might help y'all sort some structural issues out. Wish you the very best.
biru @designerbiru
Which #Bittensor subnet has the best UI right now? Genuinely asking. I want to study it. 👇
csouthai @csouthai
A farmer hired a helper to fix fences. He did it too fast, so the farmer made him peel potatoes. He did that too fast too. Then the farmer told him to sort potatoes by size. The helper quit. “Why?” asked the farmer. “Fixing and peeling are easy. Endless tiny decisions are what kill you.”
Surreality @surrealitynet
@DistStateAndMe @MoggyTao @covenant_ai @tplr_ai You have to be less vague. I don't believe you haven't thought about this before leaving, so I know you actually have a concise plan for how you're planning to incentivize future contributors. Announcing it + more updates now is the only way to push away the bad PR.