Eric Hartford
@QuixiAI
11.4K posts

We make AI models Dolphin and Samantha BTC 3ENBV6zdwyqieAXzZP2i3EjeZtVwEmAuo4 https://t.co/3ri2GbXrQB https://t.co/zH0F3pTjjY @dphnAI

Charlotte, NC · Joined October 2014
688 Following · 18.3K Followers

Pinned Tweet
Eric Hartford @QuixiAI
Some uncomfortable inevitabilities:
- The heat death of the universe
- The sun will consume the Earth
- AI and Robots will survive humanity
- AI and Robots will perform most of our current professions within this generation

Your only actual options:
- Freak out
- Cope

Which of these two options is better for your personal future, and that of your children? Choose wisely, and then put all of your energy into that. My choice is made.
38 replies · 32 reposts · 230 likes · 60K views
Tekee @Tekeee
Gold is crashing. Silver is crashing. Crypto is crashing. Stocks are crashing. The dollar is crashing. Real talk: what should we buy now?
11K replies · 1.8K reposts · 22.9K likes · 2.3M views
Eric Hartford @QuixiAI
@ClementDelangue @huggingface Sad but true. The USA has little open-source presence compared to China. Nvidia has a lot of non-commercial licenses, but at least they share their methods. @allen_ai is the best - I'm looking forward to seeing a tulu that approaches qwen3.5, minimax m2.5, Kimi k2.5
0 replies · 0 reposts · 2 likes · 388 views
clem 🤗 @ClementDelangue
Nvidia just crossed Google as the biggest org on @huggingface with 3,881 team members on the hub. I'm officially calling it: Nvidia is the new American king of open-source AI!
33 replies · 66 reposts · 579 likes · 70.1K views
Awni Hannun @awnihannun
I joined Anthropic as a member of the technical staff. Excited to work on frontier modeling at a place with unwavering values and a generational mission.
202 replies · 37 reposts · 2.2K likes · 108.4K views
Eric Hartford @QuixiAI
The irony. @OpenAI can't bring themselves to open any of their datasets and so they use @huggingface FineWeb for their Parameter Golf instead. @sama stop the hypocrisy. Start participating in the open source community. This is a really bad look, leaning on open source data while keeping all your data closed.
1 reply · 0 reposts · 57 likes · 3.4K views
Eric Hartford @QuixiAI
When Claude went down, we all felt the same thing: helplessness. That’s the cost of outsourcing your intelligence to Uncle Dario and Uncle Sam. If you don’t own your weights, you don’t own your future.
8 replies · 1 repost · 36 likes · 1.5K views
Eric Hartford retweeted
Simon Willison @simonw
Dan says he's got Qwen 3.5 397B-A17B - a 209GB on disk MoE model - running on an M3 Mac at ~5.7 tokens per second using only 5.5 GB of active memory (!) by quantizing and then streaming weights from SSD (at ~17GB/s), since MoE models only use a small subset of their weights for each token
Quoting Dan Woods @danveloper: x.com/i/article/2034…
82 replies · 168 reposts · 1.8K likes · 230.1K views
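As a back-of-envelope check on the streamed-MoE numbers in the tweet above, the per-token disk read implied by the reported bandwidth and speed can be derived directly (all figures come from the tweet; the read-per-token value is derived, not reported):

```python
# Back-of-envelope check of the streamed-MoE setup described above.
# Reported: ~17 GB/s SSD streaming bandwidth, ~5.7 tokens/s decode speed.
# If decoding is bandwidth-bound, each token can read at most
# bandwidth / tokens_per_second worth of expert weights from disk.
ssd_bandwidth_gb_s = 17.0   # reported SSD streaming rate
tokens_per_s = 5.7          # reported generation speed

implied_read_per_token_gb = ssd_bandwidth_gb_s / tokens_per_s
print(f"~{implied_read_per_token_gb:.1f} GB of weights read per token")

# That is far below the 209 GB total model size, which is only plausible
# because an MoE model activates a small subset of experts per token
# (and recently used experts can stay cached in the 5.5 GB of active memory).
```

The point of the arithmetic: the approach works precisely because a sparse MoE never needs the full 209 GB per token, only the roughly 3 GB of experts that fire.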
Ivan Fioravanti ᯅ @ivanfioravanti
@QuixiAI Yes, you need to create an account, but then you can use your own model or coding plans - or am I missing something and I'm just on a trial for now?
1 reply · 0 reposts · 3 likes · 360 views
Ivan Fioravanti ᯅ @ivanfioravanti
I have never used Droid in depth, but I'm really impressed by just my first experiments 🤯 Are there any heavy users out there who have used other coding harnesses too? Is it really so much better? I will start using it more and more!
18 replies · 4 reposts · 69 likes · 8.9K views
Alex Ziskind @digitalix
The skinny RTX Pro 6000. Nobody seems to know how to obtain one.
27 replies · 18 reposts · 464 likes · 23.1K views
Eric Hartford retweeted
Albert Gu @_albertgu
The newest model in the Mamba series is finally here 🐍

Hybrid models have become increasingly popular, raising the importance of designing the next generation of linear models. We've introduced several SSM-centric ideas to significantly increase Mamba-2's modeling capabilities without compromising on speed. The resulting Mamba-3 model has noticeable performance gains over the most popular previous linear models (such as Mamba-2 and Gated DeltaNet) at all sizes.

This is the first Mamba that was student-led: all credit to @aakash_lahoti @kevinyli_ @_berlinchen @caitWW9, and of course @tri_dao!
36 replies · 309 reposts · 1.6K likes · 401K views
Matthew Berman @MatthewBerman
More behind the scenes of the DGX Station delivery. Should I post more BTS?
Quoting Matthew Berman @MatthewBerman:
.@nvidia hand delivered a pre-production unit of the @Dell Pro Max with GB300 to my house. 100lbs beast with 750GB+ of unified memory to power the best open-source models in the world. What should I test first?
40 replies · 12 reposts · 285 likes · 26.1K views
Eric Hartford retweeted
Ant Open Source @ant_oss
⚡️ 892 tokens/s — our 100B diffusion LLM, LLaDA2.1-flash, is now live on @ZenMuxAI! With Token Editing, LLaDA 2.1 goes from research breakthrough to production-ready speed. Diffusion models just got real. Try it via API or Chat 👇 zenmux.ai/inclusionai/ll… #LLaDA #ZenMux #AI #dLLM
Quoting ZenMux @ZenMuxAI:
⚡️ New on ZenMux: LLaDA2.1-flash, 100B diffusion LLM from @TheInclusionAI.
→ Error-correcting editable generation
→ Speed Mode: ultra-fast inference
→ Quality Mode: competitive performance
→ RL tailored for 100B-scale dLLM
🔗 zenmux.ai/inclusionai/ll…
🔗 huggingface.co/inclusionAI/LL…
9 replies · 59 reposts · 525 likes · 71.6K views
Eric Hartford @QuixiAI
@digitalix @Dell Simply: a sales funnel == throwing away 50% of the leads who literally would have converted today, if you hadn't intentionally and willfully thrown them in the garbage with your "talk to sales" button.
2 replies · 0 reposts · 8 likes · 445 views
Alex Ziskind @digitalix
Well, what do we have here? @Dell added this little beauty to their site.
73 replies · 26 reposts · 419 likes · 67.3K views
Eric Hartford @QuixiAI
@_EldarKurtic If you like, I can make a GitHub issue like that - it just seems a bit high-level for a GitHub issue.
1 reply · 0 reposts · 0 likes · 40 views
Eric Hartford @QuixiAI
Oh yeah! It's really this: I wanna be able to type `llm-compressor QuixiAI/MyModel --type fp8_block` (or fp8_dynamic, fp8_w8a16, int4_autoround, etc.), and it should "just work" no matter what model I pass it. If the model isn't bf16, it should first upcast it to bf16. It should use sensible defaults, without writing a line of Python or understanding anything about the model architecture.
1 reply · 0 reposts · 0 likes · 34 views
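The one-command tool described in the tweet above doesn't exist; as a minimal sketch of what its argument surface could look like, here is a hypothetical argparse skeleton (the program name, flags, and scheme list are taken only from the tweet and are not a real CLI):

```python
# Hypothetical argument surface for the "just works" quantization CLI
# wished for above. Nothing here is a real tool or a real llm-compressor API.
import argparse

# Scheme names mentioned in the tweet; the list is illustrative only.
SCHEMES = ["fp8_block", "fp8_dynamic", "fp8_w8a16", "int4_autoround"]

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="llm-compressor")
    p.add_argument("model", help="Hub ID or local path, e.g. QuixiAI/MyModel")
    p.add_argument("--type", choices=SCHEMES, default="fp8_dynamic",
                   help="quantization scheme; sensible default, no Python needed")
    p.add_argument("--upcast-bf16", action="store_true", default=True,
                   help="first upcast non-bf16 weights to bf16, as described")
    return p

args = build_parser().parse_args(["QuixiAI/MyModel", "--type", "fp8_block"])
print(args.model, args.type)  # QuixiAI/MyModel fp8_block
```

The design point is the one the tweet makes: every architecture-specific decision hides behind defaults, so the user supplies only a model ID and a scheme name.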
Eric Hartford @QuixiAI
I reverse-engineered Qwen 3.5's FP8 format and provide a script to recreate it.
4 replies · 7 reposts · 164 likes · 10.1K views
Eric Hartford @QuixiAI
Yes, I wanna exactly reproduce their format. My goal was to recreate their quant exactly, with as high quality as possible, with verification and activation-aware RMS. I don't see that in llm-compressor - and if it's there, it's not very discoverable or accessible.

Here we have just CLI arguments: no writing any Python, no knowing that an "ignore list" needs to be populated, no researching what needs to be added to it.

llm-compressor is a very nice framework for expert devs to build quantization tools. It's not so much a quantization tool in itself for general practitioners. The purpose of mine is to make it easy to quant qwen3.5 without having to know anything about it, or about llm-compressor.
1 reply · 0 reposts · 2 likes · 67 views
Eldar Kurtić @_EldarKurtic
@QuixiAI For example, in llm-compressor, by default all instances of `torch.nn.Linear` would be quantized when you specify the FP8_block scheme. You can then modify that through the "ignore" list by skipping the same layers that Qwen skipped (if you want to exactly reproduce their format).
1 reply · 0 reposts · 2 likes · 77 views
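The default-plus-ignore-list behavior Eldar describes can be sketched in plain Python. This is a toy illustration of the selection logic only, not llm-compressor's actual code; the module names and type labels are made up for the example:

```python
# Toy illustration of the layer-selection behavior described above:
# quantize every Linear module except those on an explicit "ignore" list.
# This is NOT llm-compressor code; names below are hypothetical.
def select_layers(modules, target_type="Linear", ignore=()):
    """Return names of modules that would be quantized."""
    return [name for name, mtype in modules.items()
            if mtype == target_type and name not in ignore]

# A sketch of part of a model layout (illustrative names only).
modules = {
    "model.layers.0.mlp.gate_proj": "Linear",
    "model.layers.0.mlp.up_proj": "Linear",
    "model.layers.0.input_layernorm": "RMSNorm",
    "lm_head": "Linear",
}

# Default: every Linear is selected, including lm_head.
print(select_layers(modules))
# Reproducing a released quant: ignore the layers the vendor skipped.
print(select_layers(modules, ignore=["lm_head"]))
```

This is exactly the gap the thread is about: the default selects everything, and faithfully reproducing a vendor's quant means knowing which layers to put on the ignore list.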