Distributed State

9.5K posts

Distributed State banner
Distributed State

@DistStateAndMe

Founder @covenant_ai (templar, basilica, grail)

subnet (3/39/81) · Joined April 2014
2.6K Following · 4K Followers
Pinned Tweet
Distributed State
Distributed State@DistStateAndMe·
A small step for mankind, a massive leap for decentralised training... for agency.

In the space of 9 months, @tplr_ai went from 1.2B -> 72B. It's never been easy, and has broken everyone on the team multiple times. But I speak for all of us when I say it is the most rewarding thing we have ever done.

We have a fraction of the resources. We don't have the PhDs. But Bittensor shows you it doesn't matter. Innovation happens at the edge. We innovate through scarcity. The ones who rewrite the rules are never the ones with the most. They're the ones who refuse to accept the limits they were handed.

Bittensor is prophecy. Subnets (@covenant_ai and others) are the tools through which that prophecy is manifested.

Next stop: TRILLIONS.
templar@tplr_ai

We just completed the largest decentralised LLM pre-training run in history: Covenant-72B. Permissionless, on Bittensor subnet 3. 72B parameters. ~1.1T tokens. Commodity internet. No centralized cluster. No whitelist. Anyone with GPUs could join or leave freely. 1/n

18
33
249
18.2K
Omer Shlomovits
Omer Shlomovits@OmerShlomovits·
We just open-sourced one of our internal tools: a small contribution that removes a real headache for AI optimization engineers and devs running hands-on experiments.
MoonMath.ai@moonmathai

🧑‍🏭 LiteRunner 🧑‍🏭 MLOps-Style Tracking Without Touching the Code (New Tool)
TL;DR: LiteRunner adds lightweight tracking to any CLI command without changing the model, saving params, outputs, and metrics locally and in W&B so every run stays reproducible and organized.
Code (open source!): github.com/moonmath-ai/Li…
Blog: moonmath.ai/posts/literunn…
Contributions are welcome 🙌
More background: When running video generation experiments with diffusion models, the workflow quickly turns into bookkeeping. Every run starts with hand-editing long CLI commands, quoting paths, swapping flags manually, and each run produces a different combination of config, output videos, metrics, and debug data. Output files end up scattered across multiple folders and machines with no central record, sometimes even overwriting each other. Moving those files and recording runs becomes tedious, and inevitably the one run that wasn't properly recorded turns out to be the one that matters. Revisiting an old experiment often means digging through notes just to figure out whether it used seed 10 or 42.
When you own the code, you can wire in an MLOps tool to solve this. But often you're just a user of someone else's model, and modifying their source just to get proper tracking isn't practical. That's when the idea comes up: instead of changing the model code, bring MLOps-style logging to arbitrary CLI commands, so experiments can be tracked without touching the original implementation.

1
1
3
117
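The LiteRunner idea above (wrap an arbitrary CLI command and record its parameters, output, and timing without touching the tool's code) can be sketched in a few lines of Python. This is a hypothetical minimal version for illustration only, not LiteRunner's actual API: the `tracked_run` name, the `record.json`/`stdout.log` file layout, and the `runs/` directory are all invented here, and the real tool also syncs runs to W&B.

```python
import json
import shlex
import subprocess
import time
from pathlib import Path

def tracked_run(command: str, runs_root: str = "runs") -> Path:
    """Run a CLI command and save its args, output, and timing to a run folder."""
    run_dir = Path(runs_root) / time.strftime("run-%Y%m%d-%H%M%S")
    run_dir.mkdir(parents=True, exist_ok=True)

    argv = shlex.split(command)
    start = time.time()
    proc = subprocess.run(argv, capture_output=True, text=True)
    elapsed = time.time() - start

    # Persist everything needed to reproduce and compare runs later.
    record = {
        "command": command,
        "argv": argv,
        "returncode": proc.returncode,
        "duration_sec": round(elapsed, 3),
    }
    (run_dir / "record.json").write_text(json.dumps(record, indent=2))
    (run_dir / "stdout.log").write_text(proc.stdout)
    (run_dir / "stderr.log").write_text(proc.stderr)
    return run_dir
```

Calling something like `tracked_run("python generate.py --seed 42")` would leave a timestamped folder behind, which is what later answers the "was it seed 10 or 42?" question without digging through notes.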
Distributed State reposted
Lucas Tcheyan
Lucas Tcheyan@Uptodatenow·
Market finally catching up to the potential of decentralized training following @tplr_ai's recent 72B training run. We covered the major players, potential, and bottlenecks in depth for @glxyresearch last fall. Read the full piece here: galaxy.com/insights/resea…
templar@tplr_ai

On the @theallinpod this week, @chamath asked @nvidia CEO Jensen Huang about decentralized AI training, calling our Covenant-72B run "a pretty crazy technical accomplishment." One correction: it's 72 billion parameters, not four. Trained permissionlessly across 70+ contributors on commodity internet. The largest model ever pre-trained on fully decentralized infrastructure. Jensen's answer is worth hearing too.

1
1
10
433
Mams
Mams@MamsLBB·
templar mentioned !!! 🤓
templar@tplr_ai

(quoted post, identical to the @tplr_ai post quoted above)

2
1
21
1K
Distributed State reposted
Grigory Sapunov
Grigory Sapunov@che_shr_cat·
1/ Slapping LLM agents on top of Windows or macOS is an architectural nightmare. You get fragile visual scraping and massive security holes. To fix this, we need to gut the OS kernel and rebuild it for probabilistic intent. 🧵
Grigory Sapunov tweet media
1
1
2
182
Distributed State reposted
Carl Jung Archive
Carl Jung Archive@QuoteJung·
Carl Jung was not playing around when he wrote: “No matter how isolated you are and how lonely you feel, if you do your work truly and conscientiously, unknown allies will come and seek you.”
57
2.4K
17.8K
269.9K
Distributed State reposted
Chamath Palihapitiya
Jensen Pod!!!!!!
The All-In Podcast@theallinpod

🚨MAJOR INTERVIEW: Jensen Huang joins the Besties! The @nvidia CEO joins to discuss:
-- Nvidia's future, roadmap to $1T revenue
-- Physical AI's $50T market
-- Rise of the agent, OpenClaw's inflection moment
-- Inference explosion, Groq deal
-- AI PR Crisis, Anthropic's comms mistakes
-- Token allocation for employees
++ much more!
(0:00) Jensen Huang joins the show!
(0:26) Acquiring Groq and the inference explosion
(8:53) Decision making at the world's most valuable company
(10:47) Physical AI's $50T market, OpenClaw's future, the new operating system for modern AI computing
(16:38) AI's PR crisis, refuting doomer narratives, Anthropic's comms mistakes
(20:48) Revenue capacity, token allocation for employees, Karpathy's autoresearch, agentic future
(30:50) Open source, global diffusion, Iran/Taiwan supply chain impact
(39:45) Self-driving platform, facing competition from active customers, responding to growth slowdown predictions
(47:32) Datacenters in space, AI healthcare, Robotics
(56:10) OpenAI/Anthropic revenue potential, how to build an AI moat
(59:04) Advice to young people on excelling in the AI era

62
66
894
108.9K
Distributed State reposted
Openτensor Foundaτion
Openτensor Foundaτion@opentensor·
The largest decentralised LLM pre-training run in history. SN3 @tplr_ai trained Covenant-72B across 70+ contributors on open internet infrastructure. Now it’s being discussed by @chamath with @nvidia CEO Jensen Huang. Distributed, open-weight model training on Bittensor is getting started.
58
335
1.5K
79.8K
Algod
Algod@AlgodTrading·
Slowly, then all at once
templar@tplr_ai

(quoted post, identical to the @tplr_ai post quoted above)

17
39
409
29.7K
Swamination
Swamination@Swamination·
Keep cooking.
templar@tplr_ai

(quoted post, identical to the @tplr_ai post quoted above)

2
2
10
329
Distributed State reposted
Lisa
Lisa@chieftplr_ai·
31:44 - @DistStateAndMe @covenant_ai @tplr_ai * 72 billion parameter model with decentralized training, not a 4 billion parameter model
The All-In Podcast@theallinpod

(quoted post, identical to the @theallinpod episode announcement quoted above)

1
2
10
726
Distributed State reposted
Mark Jeffrey
Mark Jeffrey@markjeffrey·
Bittensor peeps: check out 31:44 - Templar sn3 discussed. @chamath -- they've achieved a *72* billion parameter model with decentralized training, not a 4 billion parameter model :)
17
80
331
51.8K
Distributed State reposted
grail
grail@grail_ai·
PULSE made weight sync 100x faster. That turned the trainer itself into the bottleneck. @erfan_mhi just fixed that too. Grail's GRPO trainer is now 1.8x faster on a single B200: 27% to 47% MFU, epoch time nearly halved. Decentralized post-training is converging on centralized speed.
Erfan Miahi@erfan_mhi

Used autoresearch to make @grail_ai GRPO trainer 1.8x faster on a single B200. I kept postponing this for weeks since the bottleneck in our decentralized framework was mainly communication. But after our proposed technique, PULSE, made weight sync 100x faster, the training update itself became the bottleneck. Even with a fully async trainer and inference, a slow trainer kills convergence speed. A task that could've eaten days of my time ran in parallel while I worked on other stuff. Unlike original autoresearch, where each experiment is 5 min, our feedback loop is way longer (10-17 min per epoch + 10-60 minutes of installations and code changes), so I did minimal steering when it was heading in bad directions to avoid burning GPU hours. The agent tried so many things that failed. But, eventually found the wins: Liger kernel, sequence packing, token-budget dynamic batching, and native FA4 via AttentionInterface. 27% to 47% MFU. 16.7 min to 9.2 min per epoch. If you wanna dig deeper or contribute: github.com/tplr-ai/grail We're optimizing everything at the scale of global nodes to make decentralized post-training as fast as centralized ones. Stay tuned for some cool models coming out of this effort. Cheers!

0
12
45
8.6K
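One of the wins @erfan_mhi lists, token-budget dynamic batching, can be sketched generically as follows. This is an illustration of the technique, not code from the grail repo: batches are formed greedily so that the padded token count (batch size times the longest sequence in the batch) stays under a fixed budget, which keeps memory use steady across variable-length rollouts.

```python
def token_budget_batches(seq_lens, max_tokens):
    """Greedily group sequence indices into batches so that each batch's
    padded token count (batch_size * longest sequence) stays within
    max_tokens. Sorting by length first keeps padding waste low."""
    order = sorted(range(len(seq_lens)), key=lambda i: seq_lens[i], reverse=True)
    batches, current, longest = [], [], 0
    for i in order:
        new_longest = max(longest, seq_lens[i])
        if current and new_longest * (len(current) + 1) > max_tokens:
            # Adding this sequence would blow the budget: close the batch.
            batches.append(current)
            current, longest = [i], seq_lens[i]
        else:
            current.append(i)
            longest = new_longest
    if current:
        batches.append(current)
    return batches
```

Because sequences of similar length end up together, long rollouts no longer force a huge pad on short ones, and batch size grows automatically when sequences are short instead of being fixed.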
Distributed State
Distributed State@DistStateAndMe·
When you fix one bottleneck, the next one becomes visible. At @covenant_ai we built PULSE (arxiv.org/abs/2602.03839) to make weight sync 100× faster. That worked. Then the trainer itself became the new ceiling. So @erfan_mhi ran autoresearch on our GRPO trainer. 27% → 47% MFU. 16.7 min → 9.2 min per epoch. 1.8× faster on a single B200. Decentralized post-training, closing the gap with centralized. github.com/tplr-ai/grail
Erfan Miahi@erfan_mhi

(quoted post, identical to the @erfan_mhi post quoted above)

4
16
105
6.9K
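For reference, MFU figures like the 27% and 47% quoted above are typically derived from the standard ~6·N·D estimate of training FLOPs (N parameters, D tokens, forward plus backward). A minimal generic sketch, not necessarily how the grail codebase accounts for it; `peak_flops` would be the GPU's rated dense throughput for the precision in use:

```python
def mfu(tokens: int, params: float, step_time_s: float,
        peak_flops: float, n_gpus: int = 1) -> float:
    """Model FLOPs Utilization: achieved training FLOPs/s over hardware peak.
    Uses the common ~6 * params * tokens approximation for the FLOPs of
    one forward + backward pass over `tokens` tokens."""
    achieved_flops_per_s = 6 * params * tokens / step_time_s
    return achieved_flops_per_s / (peak_flops * n_gpus)
```

The quoted numbers hang together: 27% to 47% MFU is roughly a 1.74x gain, and 16.7 to 9.2 min per epoch is roughly 1.8x, as epoch time scales inversely with achieved FLOPs/s.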
Distributed State
Distributed State@DistStateAndMe·
@zacodil why do you hate Bittensor? It's pretty confusing. I don't read this and get the sudden urge to fud NEAR. It should never be PvP. The mission is greater than petty squabbles. We are not the enemy
0
0
0
16
Vadim
Vadim@zacodil·
Stop scrolling - this changes how AI makes money.
Illia Polosukhin is speaking today at NVIDIA GTC - and this one actually matters. He's not retelling Transformer history. He's laying out something bigger: a blueprint for how AI agents trade, settle, and resolve disputes with each other. Programmatic escrow. Intent-based matching. Agent-run arbitration.
The core idea: today's markets are built for humans - our biases, delays, and legal friction. But when AI agents become the main economic actors? Everything breaks. You don't tweak the system. You rebuild it from scratch.
That's what NEAR Protocol is already moving toward:
– Intents layer
– AI Agent Market
– Private transactions for agents
This talk is the theory behind it all. Transformer co-author. Agent economies. On Jensen Huang's stage. The infrastructure for an agent economy is starting to take shape.
Vadim tweet media
5
1
38
1.3K