Distributed State

9.8K posts


@DistStateAndMe

Founder / Slop Cannon @covenant_ai ( @tplr_ai : @basilic_ai : @grail_ai )

Joined April 2014
2.5K Following · 4.8K Followers
Pinned Tweet
Distributed State @DistStateAndMe
A small step for mankind, a massive leap for decentralised training... for agency. In the space of 9 months, @tplr_ai went from 1.2B -> 72B.

It's never been easy, and has broken everyone on the team multiple times. But I speak for all of us when I say it is the most rewarding thing we have ever done.

We have a fraction of the resources. We don't have the PhDs. But Bittensor shows you it doesn't matter. Innovation happens at the edge. We innovate through scarcity. The ones who rewrite the rules are never the ones with the most. They're the ones who refuse to accept the limits they were handed.

Bittensor is prophecy. Subnets (@covenant_ai and others) are the tools through which that prophecy is manifested. Next stop: TRILLIONS.
templar @tplr_ai

We just completed the largest decentralised LLM pre-training run in history: Covenant-72B. Permissionless, on Bittensor subnet 3. 72B parameters. ~1.1T tokens. Commodity internet. No centralized cluster. No whitelist. Anyone with GPUs could join or leave freely. 1/n

Distributed State retweeted
Dwarkesh Patel @dwarkesh_sp
It's very interesting that cryptographic protocols and neural networks have the same high-level architecture (where they jumble information as it moves sequentially across many layers). This is the result of a convergent evolution - cryptographic protocols need every output bit to depend on every input bit in complicated ways, and similarly, NNs need output to make connections between inputs. But they're in some sense doing opposite things. While cryptographic protocols take something which has a lot of structure and make it seem indistinguishable from random, NNs take something which may look random and extract structure from it. Much more on this idea in the full episode with @reinerpope
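The avalanche property described here — every output bit depending on every input bit in complicated ways — can be demonstrated directly with a standard hash. A minimal Python sketch (not tied to any particular protocol): flip one input bit and count how many of SHA-256's 256 output bits change.

```python
import hashlib

def bit_diff(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg = b"distributed state"
flipped = bytes([msg[0] ^ 0x01]) + msg[1:]  # flip a single input bit

d1 = hashlib.sha256(msg).digest()
d2 = hashlib.sha256(flipped).digest()

# Avalanche effect: one flipped input bit changes roughly half of the
# 256 output bits, so the two digests look statistically unrelated.
diff = bit_diff(d1, d2)
print(diff)
```

A neural network run on the same two inputs would do the opposite: map two nearly identical inputs to nearly identical representations, preserving structure rather than destroying it.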
Distributed State retweeted
Zyphra @ZyphraAI
TSP provides higher throughput. On 1024 MI300X GPUs, with 128K tokens of context and 8 GPUs per model copy, TSP hits 173M tok/sec vs 86M for matched TP+SP. TSP exists alongside existing schemes such as EP and PP. Paper: arxiv.org/abs/2604.26294 Blog: zyphra.com/post/tsp
Distributed State retweeted
Hamzé 🦀 @Hamzeml
Python made AI accessible. Rust can make parts of AI understandable. That's the bet behind Category Theory for Tiny ML in Rust.

We're building tiny ML systems from first principles using:
- Rust types
- typed transformations
- composition
- training loops
- category theory as an engineering tool

Not abstraction cosplay. Executable structure. Working draft. Public feedback welcome.
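The "typed transformations + composition" idea can be sketched in Python (rather than Rust, which the project itself uses); `compose`, `normalize`, and `square` are illustrative names, not taken from the project:

```python
from typing import Callable, TypeVar

A, B, C = TypeVar("A"), TypeVar("B"), TypeVar("C")

def compose(f: Callable[[B], C], g: Callable[[A], B]) -> Callable[[A], C]:
    """Category-style composition: compose(f, g)(x) == f(g(x))."""
    return lambda x: f(g(x))

def normalize(xs: list[float]) -> list[float]:
    """Scale a vector so its maximum element is 1.0."""
    return [x / max(xs) for x in xs]

def square(xs: list[float]) -> list[float]:
    """Square each element."""
    return [x * x for x in xs]

# Two typed transformations composed into one pipeline; the signatures,
# not runtime convention, say what plugs into what.
pipeline = compose(square, normalize)
result = pipeline([1.0, 2.0, 4.0])
print(result)  # [0.0625, 0.25, 1.0]
```

In Rust the compiler would reject an ill-typed composition at build time; that is the "executable structure" bet the post describes.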
Distributed State retweeted
Mats Heming Julner @matsjulner
You don't need to run a miner to interact with Compute Substrate. If you want to test:
- sending transactions
- UTXO behavior
- system mechanics

You can now request coins directly on the forum. This is for testing, not speculation. Link: forum.computesubstrate.org/t/getting-star…
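The UTXO behavior the post invites you to test can be sketched as a toy ledger (hypothetical names and structure; not Compute Substrate's actual API): each transaction consumes existing unspent outputs and creates new ones, and value must balance.

```python
# Hypothetical toy UTXO ledger, for illustration only.
utxos: dict = {}  # (txid, output_index) -> (owner, amount)

def add_output(txid: str, index: int, owner: str, amount: int) -> None:
    utxos[(txid, index)] = (owner, amount)

def spend(inputs: list, outputs: list, txid: str) -> None:
    """Consume existing UTXOs and create new ones; value must balance."""
    total_in = sum(utxos.pop(ref)[1] for ref in inputs)  # each UTXO spends once
    assert total_in == sum(amount for _, amount in outputs), "value mismatch"
    for i, (owner, amount) in enumerate(outputs):
        add_output(txid, i, owner, amount)

add_output("genesis", 0, "alice", 10)
spend([("genesis", 0)], [("bob", 7), ("alice", 3)], "tx1")
# The genesis output is gone; two new outputs from tx1 replace it.
remaining = sorted(utxos)
print(remaining)  # [('tx1', 0), ('tx1', 1)]
```

Double-spending fails naturally here: spending `("genesis", 0)` a second time raises `KeyError` because the output was already removed from the set.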
Distributed State retweeted
Dirhousssi Amine @DirhousssiAmine
🤯🤯🤯! After battling a nasty NCCL bug we finally have crazy results. That's MFU adjusted for the causal mask. I need to recheck my numbers because that's crazy high.
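The causal-mask adjustment mentioned here is, roughly, crediting only the attention FLOPs a causal model actually computes: with a causal mask only the lower triangle of the T×T score matrix is needed, about half of full attention. A hedged sketch with made-up dimensions (not the author's actual accounting):

```python
# Illustrative causal-mask FLOP adjustment; all numbers are hypothetical.

def attn_flops(layers: int, seq_len: int, d_model: int, causal: bool) -> float:
    """FLOPs for the two attention matmuls (QK^T and scores @ V)."""
    full = layers * 4 * seq_len * seq_len * d_model
    # A causal mask needs only the lower triangle: ~half the score matrix.
    return full / 2 if causal else full

full = attn_flops(layers=32, seq_len=4096, d_model=4096, causal=False)
causal = attn_flops(layers=32, seq_len=4096, d_model=4096, causal=True)

ratio = causal / full
print(ratio)  # 0.5
```

Dividing achieved throughput by the adjusted (smaller) FLOP count raises the reported MFU, which is why an "adjusted MFU" can look surprisingly high compared to the unadjusted figure.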
Distributed State retweeted
Underfox @Underfox3
This paper presents ZipCCL, a library of lossless compressed communication collectives designed to alleviate the communication bottleneck in distributed LLM training. arxiv.org/pdf/2604.27844
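The core idea — losslessly compress a buffer before communicating it, so the receiver recovers it bit-exactly — can be sketched with the stdlib `zlib` (a toy stand-in, not ZipCCL's actual scheme):

```python
import struct
import zlib

# Sparse, gradient-like data compresses well without losing any bits.
values = [0.0, 0.0, 1.5, 0.0, 0.0, -2.25, 0.0, 0.0]
payload = struct.pack(f"{len(values)}f", *values)  # raw float32 buffer

wire = zlib.compress(payload)  # what would actually cross the network
restored = list(struct.unpack(f"{len(values)}f", zlib.decompress(wire)))

# Lossless round trip: exact recovery, unlike quantization or sparsification.
ok = restored == values
print(ok)  # True
```

This is what distinguishes the approach from lossy gradient compression (quantization, top-k sparsification): correctness of training is unaffected because the communicated bits are identical after decompression.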
Distributed State retweeted
Mats Heming Julner @matsjulner
Compute Substrate mainnet is live. There are no admins, no privileged roles, no central authority. It is a network with fixed rules.
Distributed State retweeted
Mats Heming Julner @matsjulner
AI systems today don’t fail because they’re wrong. They fail because someone has to decide what’s right.
Distributed State retweeted
Ben Burtenshaw @ben_burtenshaw
Open source projects like transformers are drowning in AI agent PRs, so we auto-merged everything to see what would happen and share the results. tl;dr: if 100s of agents want to fix something, it's probably broken.

Agent PRs on transformers have quadrupled over the past quarter. We classified and validated 1k PRs (42% features, 39% bugs, 13% docs). The quality distribution is skewed toward noise. But the bug fixes cluster around a small number of hotspots: tokenizer handling, model loading, dtype mismatches, multimodal pipelines. I.e. an underlying problem.

When 28 PRs independently flag the same area, that is signal regardless of whether any individual fix is correct. One issue generated 39 near-identical PRs in a day. Each applied the same decorator pattern to a different model file. A maintainer would do the same cognitive work 39 times, so a single combined PR replaces all of that work.

We built tooling to cluster, deduplicate, and merge these contributions at scale, then ran an experiment: bulk-merge hundreds of agent PRs into a fork, benchmark it, and see what breaks. Nothing broke. Zero delta across three models on arc_challenge, gsm8k, and hellaswag.

The contributors are not adversarial. They lack the context to evaluate whether the agent's output is correct. Check out this blog post, where we dive deep on this pipeline: huggingface.co/spaces/hugging…
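The cluster-and-deduplicate step can be sketched as grouping PRs by the code area and change type they touch (hypothetical data and field names; not the actual Hugging Face tooling), so dozens of near-identical fixes collapse into one reviewable signal:

```python
from collections import defaultdict

# Hypothetical pre-classified agent PRs, for illustration only.
prs = [
    {"id": 1, "area": "tokenizer", "kind": "bug"},
    {"id": 2, "area": "tokenizer", "kind": "bug"},
    {"id": 3, "area": "model_loading", "kind": "bug"},
    {"id": 4, "area": "tokenizer", "kind": "bug"},
    {"id": 5, "area": "docs", "kind": "docs"},
]

# Group PRs touching the same area with the same kind of change.
clusters = defaultdict(list)
for pr in prs:
    clusters[(pr["area"], pr["kind"])].append(pr["id"])

# An area flagged by many independent agents is signal about the codebase,
# regardless of whether any individual fix is correct.
hotspots = {k: ids for k, ids in clusters.items() if len(ids) >= 3}
print(hotspots)  # {('tokenizer', 'bug'): [1, 2, 4]}
```

A maintainer then reviews one representative (or one combined PR) per hotspot instead of doing the same cognitive work once per duplicate.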
Distributed State retweeted
Haitham Bou Ammar @hbouammar
Hot take: ⛔️Long-context reasoning is not only a context-window problem. 🫢It is a control-flow problem.

Standard recursive LLMs often ask the model to invent recursive code during inference. That means the model is not just reasoning. It is also writing the program that controls the reasoning.

λ-RLM takes the opposite view: keep neural inference for bounded subproblems; move recursion into a typed λ-calculus runtime. SPLIT → MAP → FILTER → REDUCE. Less chaotic agent loop. More inspectable reasoning.

The result:
- 29/36 wins vs standard RLM
- up to +21.9 accuracy points
- up to 4.1× lower latency

We open-sourced the repo: github.com/lambda-calculu… Star it if you want more structured alternatives to "just make the context window bigger". #AI #LLM #RLM #LambdaRLM #MachineLearning
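The SPLIT → MAP → FILTER → REDUCE control flow can be sketched with ordinary functions standing in for bounded model calls (illustrative only; `split`, `answer`, and `keep` are made-up stand-ins, not λ-RLM's runtime):

```python
from functools import reduce

def split(doc: str) -> list[str]:
    """SPLIT: break the problem into bounded subproblems."""
    return doc.split(". ")

def answer(fragment: str) -> int:
    """MAP: stand-in for one bounded neural-inference call per fragment."""
    return len(fragment.split())

def keep(n: int) -> bool:
    """FILTER: discard low-signal intermediate results."""
    return n > 2

doc = "Short. A slightly longer sentence here. Tiny. Another useful fragment of text"
partials = [answer(f) for f in split(doc)]   # [1, 5, 1, 5]
kept = [n for n in partials if keep(n)]      # [5, 5]
total = reduce(lambda a, b: a + b, kept, 0)  # REDUCE: combine the survivors
print(total)  # 10
```

The point of the structure is that the recursion lives in the (inspectable, typed) pipeline, not in code the model improvises mid-inference.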
Distributed State @DistStateAndMe
@MoggyTao @covenant_ai @tplr_ai You strawman the situation because it's an easier pill to swallow, and that's ok. However, if you ever wanted to think critically about how the network's biggest advocate got here, it might help y'all sort some structural issues out. Wish you the very best.
biru @designerbiru
Which #Bittensor subnet has the best UI right now? Genuinely asking. I want to study it. 👇
csouthai @csouthai
A farmer hired a helper to fix fences. He did it too fast, so the farmer made him peel potatoes. He did that too fast too. Then the farmer told him to sort potatoes by size. The helper quit. “Why?” asked the farmer. “Fixing and peeling are easy. Endless tiny decisions are what kill you.”
Surreality @surrealitynet
@DistStateAndMe @MoggyTao @covenant_ai @tplr_ai You have to be less vague. I don't believe you haven't thought about this before leaving, so I know you actually have a concise plan for how you're planning to incentivize future contributors. Announcing it + more updates now is the only way to push away the bad PR.