Distributed State

9.5K posts

Distributed State
@DistStateAndMe

Founder @covenant_ai (templar, basilica, grail)
subnet (3/39/81)

Joined April 2014
2.6K Following · 4K Followers

Pinned Tweet
Distributed State
Distributed State@DistStateAndMe·
A small step for mankind, a massive leap for decentralised training... for agency.

In the space of 9 months, @tplr_ai went from 1.2B -> 72B. It's never been easy, and has broken everyone on the team multiple times. But I speak for all of us when I say it is the most rewarding thing we have ever done.

We have a fraction of the resources. We don't have the PhDs. But Bittensor shows you it doesn't matter. Innovation happens at the edge. We innovate through scarcity. The ones who rewrite the rules are never the ones with the most. They're the ones who refuse to accept the limits they were handed.

Bittensor is prophecy. Subnets (@covenant_ai and others) are the tools through which that prophecy is manifested.

Next stop: TRILLIONS.
templar@tplr_ai

We just completed the largest decentralised LLM pre-training run in history: Covenant-72B. Permissionless, on Bittensor subnet 3. 72B parameters. ~1.1T tokens. Commodity internet. No centralized cluster. No whitelist. Anyone with GPUs could join or leave freely. 1/n

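For a sense of scale, here is a rough back-of-envelope estimate of the compute behind a run of this size, using the common C ≈ 6·N·D approximation for dense-transformer pre-training. This is an illustrative calculation only, not a figure published by the templar team.

# Rough scale check using the common C ≈ 6 * N * D FLOPs approximation
# for dense-transformer pre-training (illustrative only; not an official
# figure from the Covenant-72B run).
params = 72e9    # 72B parameters
tokens = 1.1e12  # ~1.1T training tokens
flops = 6 * params * tokens
print(f"~{flops:.2e} training FLOPs")  # ~4.75e+23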
Distributed State
Distributed State@DistStateAndMe·
@Yuchenj_UW This assumes the Chinese models will always be open source. We need to make pre-training great again
Yuchen Jin
Yuchen Jin@Yuchenj_UW·
People dunk on Cursor like: “it’s just Kimi K2.5,” “look inside, it’s a Chinese model.” There’s no shame in building on top of strong base models and doing your own post-training or RL (as long as you respect the license). In most cases you don’t need to pretrain from scratch. I think the whole industry will shift toward more post-training and RL on Chinese open-source models. That’s also part of why we’re seeing the biggest GPU shortage and H100 price spike right now.
Yuchen Jin@Yuchenj_UW

Cursor’s Composer 2 is likely built on Kimi K2.5. The model URL + tokenizer are strong signals. I love this direction: companies mid-train and post-train on top of OSS LLMs. Prediction: open-source model labs will monetize by taking a cut when others build on top of their models and scale to millions of real users. They will enforce this via licensing. That’s the flywheel. That’s how open-source AI thrives.

tekkaadan
tekkaadan@tekkaadan·
I have an article coming about this: why it's such an important milestone for decentralized AI, and how $LITCOIN has a chance to contribute and be part of this ecosystem. It's still too early and it's super ambitious on my side, but IMO it's still worth diving into.
templar@tplr_ai

We just completed the largest decentralised LLM pre-training run in history: Covenant-72B. Permissionless, on Bittensor subnet 3. 72B parameters. ~1.1T tokens. Commodity internet. No centralized cluster. No whitelist. Anyone with GPUs could join or leave freely. 1/n

0xSero
0xSero@0xSero·
Thank you to everyone who donated, all my X earnings this year will go into the fund as well, should be around 1-2k a month. donate.sybilsolutions.ai
0xSero@0xSero

Putting out a wish to the universe. I need more compute. If I can get more, I will make sure every machine from a small phone to a bootstrapped RTX 3090 node can run frontier intelligence fast with minimal intelligence loss.

I have hit page 2 of Hugging Face, released 3 model-family compressions, and got GLM-4.7 on a MacBook huggingface.co/0xsero. My beast just isn't enough, and I already spent 2k USD on renting GPUs on top of credits provided by Prime Intellect and Hotaisle.

If you believe in what I do, help me get this to Nvidia; maybe they will bless me with the pewter to keep making local AI more accessible 🙏

Omer Shlomovits
Omer Shlomovits@OmerShlomovits·
We just open-sourced one of our internal tools: a small contribution that removes a real headache for AI optimization engineers and devs running hands-on experiments.
MoonMath.ai@moonmathai

🧑‍🏭 LiteRunner 🧑‍🏭 MLOps-Style Tracking Without Touching the Code (New Tool)

TL;DR: LiteRunner adds lightweight tracking to any CLI command without changing the model, saving params, outputs, and metrics locally and in W&B so every run stays reproducible and organized.

Code (open source!): github.com/moonmath-ai/Li…
Blog: moonmath.ai/posts/literunn…
Contributions are welcome 🙌

More background: When running video generation experiments with diffusion models, the workflow quickly turns into bookkeeping. Every run starts with hand-editing long CLI commands, quoting paths, swapping flags manually, and each run produces a different combination of config, output videos, metrics, and debug data. Output files end up scattered across multiple folders and machines with no central record, sometimes even overwriting each other. Moving those files and recording runs becomes tedious, and inevitably the one run that wasn't properly recorded turns out to be the one that matters. Revisiting an old experiment often means digging through notes just to figure out whether it used seed 10 or 42.

When you own the code, you can wire in an MLOps tool to solve this. But often you're just a user of someone else's model, and modifying their source just to get proper tracking isn't practical. That's when the idea comes up: instead of changing the model code, bring MLOps-style logging to arbitrary CLI commands, so experiments can be tracked without touching the original implementation.

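To make that pattern concrete, here is a minimal sketch of the idea: wrap an arbitrary CLI command, record its arguments, duration, and exit code, and mirror the run to Weights & Biases without touching the wrapped tool. This is a generic illustration, not LiteRunner's actual API; the function and project names are made up, and only standard wandb calls (init/log/finish) are used.

# Minimal sketch of MLOps-style tracking around an unmodified CLI command.
# Not LiteRunner's real interface; `tracked_run` and the project name are
# hypothetical.
import subprocess
import sys
import time

import wandb

def tracked_run(cmd: list[str], project: str = "cli-experiments") -> int:
    """Run `cmd` as-is and log its args, duration, and exit code to W&B."""
    run = wandb.init(project=project, config={"command": " ".join(cmd)})
    start = time.time()
    proc = subprocess.run(cmd)  # the wrapped tool's code is never touched
    wandb.log({"exit_code": proc.returncode,
               "duration_sec": time.time() - start})
    run.finish()
    return proc.returncode

if __name__ == "__main__":
    # Example: python track.py python generate_video.py --seed 42
    sys.exit(tracked_run(sys.argv[1:]))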
Distributed State reposted
Lucas Tcheyan
Lucas Tcheyan@Uptodatenow·
Market finally catching up to the potential of decentralized training following @tplr_ai's recent 72B training run. We covered the major players, potential, and bottlenecks in depth for @glxyresearch last fall. Read the full piece here: galaxy.com/insights/resea…
templar@tplr_ai

On the @theallinpod this week, @chamath asked @nvidia CEO Jensen Huang about decentralized AI training, calling our Covenant-72B run "a pretty crazy technical accomplishment." One correction: it's 72 billion parameters, not four. Trained permissionlessly across 70+ contributors on commodity internet. The largest model ever pre-trained on fully decentralized infrastructure. Jensen's answer is worth hearing too.

Mams
Mams@MamsLBB·
templar mentioned !!! 🤓
templar@tplr_ai

On the @theallinpod this week, @chamath asked @nvidia CEO Jensen Huang about decentralized AI training, calling our Covenant-72B run "a pretty crazy technical accomplishment." One correction: it's 72 billion parameters, not four. Trained permissionlessly across 70+ contributors on commodity internet. The largest model ever pre-trained on fully decentralized infrastructure. Jensen's answer is worth hearing too.

Distributed State reposted
Grigory Sapunov
Grigory Sapunov@che_shr_cat·
1/ Slapping LLM agents on top of Windows or macOS is an architectural nightmare. You get fragile visual scraping and massive security holes. To fix this, we need to gut the OS kernel and rebuild it for probabilistic intent. 🧵
Distributed State reposted
Carl Jung Archive
Carl Jung Archive@QuoteJung·
Carl Jung was not playing around when he wrote: “No matter how isolated you are and how lonely you feel, if you do your work truly and conscientiously, unknown allies will come and seek you.”
Distributed State reposted
Chamath Palihapitiya
Chamath Palihapitiya@chamath·
Jensen Pod!!!!!!
The All-In Podcast@theallinpod

🚨MAJOR INTERVIEW: Jensen Huang joins the Besties!

The @nvidia CEO joins to discuss:
-- Nvidia's future, roadmap to $1T revenue
-- Physical AI's $50T market
-- Rise of the agent, OpenClaw's inflection moment
-- Inference explosion, Groq deal
-- AI PR Crisis, Anthropic's comms mistakes
-- Token allocation for employees
++ much more!

(0:00) Jensen Huang joins the show!
(0:26) Acquiring Groq and the inference explosion
(8:53) Decision making at the world's most valuable company
(10:47) Physical AI's $50T market, OpenClaw's future, the new operating system for modern AI computing
(16:38) AI's PR crisis, refuting doomer narratives, Anthropic's comms mistakes
(20:48) Revenue capacity, token allocation for employees, Karpathy's autoresearch, agentic future
(30:50) Open source, global diffusion, Iran/Taiwan supply chain impact
(39:45) Self-driving platform, facing competition from active customers, responding to growth slowdown predictions
(47:32) Datacenters in space, AI healthcare, Robotics
(56:10) OpenAI/Anthropic revenue potential, how to build an AI moat
(59:04) Advice to young people on excelling in the AI era

Distributed State reposted
Openτensor Foundaτion
Openτensor Foundaτion@opentensor·
The largest decentralised LLM pre-training run in history. SN3 @tplr_ai trained Covenant-72B across 70+ contributors on open internet infrastructure. Now it’s being discussed by @chamath with @nvidia CEO Jensen Huang. Distributed, open-weight model training on Bittensor is getting started.
Algod
Algod@AlgodTrading·
Slowly, then all at once
templar@tplr_ai

On the @theallinpod this week, @chamath asked @nvidia CEO Jensen Huang about decentralized AI training, calling our Covenant-72B run "a pretty crazy technical accomplishment." One correction: it's 72 billion parameters, not four. Trained permissionlessly across 70+ contributors on commodity internet. The largest model ever pre-trained on fully decentralized infrastructure. Jensen's answer is worth hearing too.

Swamination
Swamination@Swamination·
Keep cooking.
templar@tplr_ai

On the @theallinpod this week, @chamath asked @nvidia CEO Jensen Huang about decentralized AI training, calling our Covenant-72B run "a pretty crazy technical accomplishment." One correction: it's 72 billion parameters, not four. Trained permissionlessly across 70+ contributors on commodity internet. The largest model ever pre-trained on fully decentralized infrastructure. Jensen's answer is worth hearing too.

Distributed State reposted
Lisa
Lisa@chieftplr_ai·
31:44 - @DistStateAndMe @covenant_ai @tplr_ai * 72 billion parameter model with decentralized training, not a 4 billion parameter model
The All-In Podcast@theallinpod

🚨MAJOR INTERVIEW: Jensen Huang joins the Besties!

The @nvidia CEO joins to discuss:
-- Nvidia's future, roadmap to $1T revenue
-- Physical AI's $50T market
-- Rise of the agent, OpenClaw's inflection moment
-- Inference explosion, Groq deal
-- AI PR Crisis, Anthropic's comms mistakes
-- Token allocation for employees
++ much more!

(0:00) Jensen Huang joins the show!
(0:26) Acquiring Groq and the inference explosion
(8:53) Decision making at the world's most valuable company
(10:47) Physical AI's $50T market, OpenClaw's future, the new operating system for modern AI computing
(16:38) AI's PR crisis, refuting doomer narratives, Anthropic's comms mistakes
(20:48) Revenue capacity, token allocation for employees, Karpathy's autoresearch, agentic future
(30:50) Open source, global diffusion, Iran/Taiwan supply chain impact
(39:45) Self-driving platform, facing competition from active customers, responding to growth slowdown predictions
(47:32) Datacenters in space, AI healthcare, Robotics
(56:10) OpenAI/Anthropic revenue potential, how to build an AI moat
(59:04) Advice to young people on excelling in the AI era

Distributed State reposted
Mark Jeffrey
Mark Jeffrey@markjeffrey·
Bittensor peeps: check out 31:44 - Templar sn3 discussed. @chamath -- they've achieved a *72* billion parameter model with decentralized training, not a 4 billion parameter model :)
Distributed State reposted
grail
grail@grail_ai·
PULSE made weight sync 100x faster. That turned the trainer itself into the bottleneck. @erfan_mhi just fixed that too. Grail's GRPO trainer is now 1.8x faster on a single B200: 27% to 47% MFU, epoch time nearly halved. Decentralized post-training is converging on centralized speed.
Erfan Miahi@erfan_mhi

Used autoresearch to make @grail_ai GRPO trainer 1.8x faster on a single B200.

I kept postponing this for weeks since the bottleneck in our decentralized framework was mainly communication. But after our proposed technique, PULSE, made weight sync 100x faster, the training update itself became the bottleneck. Even with a fully async trainer and inference, a slow trainer kills convergence speed.

A task that could've eaten days of my time ran in parallel while I worked on other stuff. Unlike original autoresearch, where each experiment is 5 min, our feedback loop is way longer (10-17 min per epoch + 10-60 minutes of installations and code changes), so I did minimal steering when it was heading in bad directions to avoid burning GPU hours.

The agent tried so many things that failed. But it eventually found the wins: Liger kernel, sequence packing, token-budget dynamic batching, and native FA4 via AttentionInterface. 27% to 47% MFU. 16.7 min to 9.2 min per epoch.

If you wanna dig deeper or contribute: github.com/tplr-ai/grail

We're optimizing everything at the scale of global nodes to make decentralized post-training as fast as centralized ones. Stay tuned for some cool models coming out of this effort. Cheers!

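Of the optimizations listed above, token-budget dynamic batching is the easiest to illustrate in isolation: instead of a fixed batch size, variable-length sequences are grouped into batches capped by a total-token budget, so padding (and the FLOPs wasted on it) shrinks. The sketch below shows the general technique only; it is not grail's trainer code, and the function name and budget value are made up.

# Generic token-budget dynamic batching (illustrative; not grail's code).
# Sequences are packed greedily into batches whose summed length stays
# under a token budget, rather than using a fixed batch size.
from typing import Iterable, Iterator, List

def token_budget_batches(lengths: Iterable[int],
                         max_tokens: int = 8192) -> Iterator[List[int]]:
    """Yield lists of sequence indices whose total length fits the budget."""
    batch, used = [], 0
    for idx, n in enumerate(lengths):
        if batch and used + n > max_tokens:
            yield batch
            batch, used = [], 0
        batch.append(idx)
        used += n
    if batch:
        yield batch

# Example: rollouts of 3k, 5k, 2k, and 7k tokens -> [[0, 1], [2], [3]]
print(list(token_budget_batches([3000, 5000, 2000, 7000])))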