Eric Hartford
@QuixiAI
11.4K posts

We make AI models Dolphin and Samantha BTC 3ENBV6zdwyqieAXzZP2i3EjeZtVwEmAuo4 https://t.co/3ri2GbXrQB https://t.co/zH0F3pTjjY @dphnAI

Charlotte, NC · Joined October 2014
688 Following · 18.3K Followers

Pinned Tweet
Eric Hartford @QuixiAI
Some uncomfortable inevitabilities:
- The heat death of the universe
- The sun will consume the Earth
- AI and Robots will survive humanity
- AI and Robots will perform most of our current professions within this generation

Your only actual options:
- Freak out
- Cope

Which of these two options is better for your personal future, and that of your children? Choose wisely, and then put all of your energy into that. My choice is made.
38 replies · 32 reposts · 230 likes · 60K views
Tekee @Tekeee
Gold is crashing. Silver is crashing. Crypto is crashing. Stocks are crashing. The dollar is crashing. Real talk: what should we buy now?
11K replies · 1.8K reposts · 22.9K likes · 2.3M views
Eric Hartford @QuixiAI
@ClementDelangue @huggingface Sad but true. The USA has little open-source presence compared to China. Nvidia has a lot of non-commercial licenses, but at least they share their methods. @allen_ai is the best - I'm looking forward to seeing a tulu that approaches qwen3.5, minimax m2.5, Kimi k2.5
0 replies · 0 reposts · 2 likes · 388 views
clem 🤗 @ClementDelangue
Nvidia just crossed Google as the biggest org on @huggingface with 3,881 team members on the hub. I'm officially calling it: Nvidia is the new American king of open-source AI!
33 replies · 66 reposts · 579 likes · 70.1K views
Awni Hannun @awnihannun
I joined Anthropic as a member of the technical staff. Excited to work on frontier modeling at a place with unwavering values and a generational mission.
202 replies · 37 reposts · 2.2K likes · 108.4K views
Eric Hartford @QuixiAI
The irony. @OpenAI can't bring themselves to open any of their datasets and so they use @huggingface FineWeb for their Parameter Golf instead. @sama stop the hypocrisy. Start participating in the open source community. This is a really bad look, leaning on open source data while keeping all your data closed.
1 reply · 0 reposts · 57 likes · 3.4K views
Eric Hartford @QuixiAI
When Claude went down, we all felt the same thing: helplessness. That’s the cost of outsourcing your intelligence to Uncle Dario and Uncle Sam. If you don’t own your weights, you don’t own your future.
8 replies · 1 repost · 36 likes · 1.5K views
Eric Hartford retweeted
Simon Willison @simonw
Dan says he's got Qwen 3.5 397B-A17B - a 209GB on disk MoE model - running on an M3 Mac at ~5.7 tokens per second using only 5.5 GB of active memory (!) by quantizing and then streaming weights from SSD (at ~17GB/s), since MoE models only use a small subset of their weights for each token
Quoting Dan Woods @danveloper: x.com/i/article/2034…
82 replies · 168 reposts · 1.8K likes · 230.1K views
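As a back-of-envelope check on the streamed-MoE numbers in the tweet above, the per-token disk read implied by the reported bandwidth and speed can be derived directly (all figures come from the tweet; the read-per-token value is derived, not reported):

```python
# Back-of-envelope check of the streamed-MoE setup described above.
# Reported: ~17 GB/s SSD streaming bandwidth, ~5.7 tokens/s decode speed.
# If decoding is bandwidth-bound, each token can read at most
# bandwidth / tokens_per_second worth of expert weights from disk.
ssd_bandwidth_gb_s = 17.0   # reported SSD streaming rate
tokens_per_s = 5.7          # reported generation speed

implied_read_per_token_gb = ssd_bandwidth_gb_s / tokens_per_s
print(f"~{implied_read_per_token_gb:.1f} GB of weights read per token")

# That is far below the 209 GB total model size, which is only plausible
# because an MoE model activates a small subset of experts per token
# (and recently used experts can stay cached in the 5.5 GB of active memory).
```

The point of the arithmetic: the approach works precisely because a sparse MoE never needs the full 209 GB per token, only the roughly 3 GB of experts that fire.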
Ivan Fioravanti ᯅ @ivanfioravanti
@QuixiAI Yes, you need to create an account, but then you can use your own model or coding plans - or am I missing something and I'm just on a trial for now?
1 reply · 0 reposts · 3 likes · 360 views
Ivan Fioravanti ᯅ @ivanfioravanti
I have never used Droid in depth, but I'm really impressed by just my first experiments 🤯 Are there any heavy users out there who have used other coding harnesses too? Is it really so much better? I will start using it more and more!
18 replies · 4 reposts · 69 likes · 8.9K views
Alex Ziskind @digitalix
The skinny RTX Pro 6000. Nobody seems to know how to obtain one.
27 replies · 18 reposts · 464 likes · 23.1K views
Eric Hartford retweeted
Albert Gu @_albertgu
The newest model in the Mamba series is finally here 🐍

Hybrid models have become increasingly popular, raising the importance of designing the next generation of linear models. We've introduced several SSM-centric ideas to significantly increase Mamba-2's modeling capabilities without compromising on speed. The resulting Mamba-3 model has noticeable performance gains over the most popular previous linear models (such as Mamba-2 and Gated DeltaNet) at all sizes.

This is the first Mamba that was student-led: all credit to @aakash_lahoti @kevinyli_ @_berlinchen @caitWW9, and of course @tri_dao!
36 replies · 309 reposts · 1.6K likes · 401K views
Matthew Berman @MatthewBerman
More behind the scenes of the DGX Station delivery. Should I post more BTS?
Quoting Matthew Berman @MatthewBerman:
.@nvidia hand delivered a pre-production unit of the @Dell Pro Max with GB300 to my house. 100lbs beast with 750GB+ of unified memory to power the best open-source models in the world. What should I test first?
40 replies · 12 reposts · 285 likes · 26.1K views
Eric Hartford retweeted
Ant Open Source @ant_oss
⚡️ 892 tokens/s — our 100B diffusion LLM, LLaDA2.1-flash, is now live on @ZenMuxAI! With Token Editing, LLaDA 2.1 goes from research breakthrough to production-ready speed. Diffusion models just got real. Try it via API or Chat 👇 zenmux.ai/inclusionai/ll… #LLaDA #ZenMux #AI #dLLM
Quoting ZenMux @ZenMuxAI:
⚡️ New on ZenMux: LLaDA2.1-flash, 100B diffusion LLM from @TheInclusionAI.
→ Error-correcting editable generation
→ Speed Mode: ultra-fast inference
→ Quality Mode: competitive performance
→ RL tailored for 100B-scale dLLM
🔗 zenmux.ai/inclusionai/ll…
🔗 huggingface.co/inclusionAI/LL…
9 replies · 59 reposts · 525 likes · 71.6K views
Eric Hartford @QuixiAI
@digitalix @Dell Simply: a sales funnel == throwing away 50% of the leads who literally would have converted today, if you hadn't intentionally and willfully thrown them in the garbage with your "talk to sales" button.
2 replies · 0 reposts · 8 likes · 445 views
Alex Ziskind @digitalix
Well, what do we have here? @Dell added this little beauty to their site.
73 replies · 26 reposts · 419 likes · 67.3K views
Eric Hartford @QuixiAI
@_EldarKurtic If you like, I can make a GitHub issue like that - it just seems a bit high-level for a GitHub issue.
1 reply · 0 reposts · 0 likes · 40 views
Eric Hartford @QuixiAI
Oh yeah! It's really this: I wanna be able to type `llm-compressor QuixiAI/MyModel --type fp8_block` (or fp8_dynamic, fp8_w8a16, int4_autoround, etc.), and it should "just work" no matter what model I pass it. If the model isn't bf16, it should first upcast it to bf16. It should use sensible defaults, without writing a line of Python or understanding anything about the model architecture.
1 reply · 0 reposts · 0 likes · 34 views
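The one-command tool described in the tweet above doesn't exist; as a minimal sketch of what its argument surface could look like, here is a hypothetical argparse skeleton (the program name, flags, and scheme list are taken only from the tweet and are not a real CLI):

```python
# Hypothetical argument surface for the "just works" quantization CLI
# wished for above. Nothing here is a real tool or a real llm-compressor API.
import argparse

# Scheme names mentioned in the tweet; the list is illustrative only.
SCHEMES = ["fp8_block", "fp8_dynamic", "fp8_w8a16", "int4_autoround"]

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="llm-compressor")
    p.add_argument("model", help="Hub ID or local path, e.g. QuixiAI/MyModel")
    p.add_argument("--type", choices=SCHEMES, default="fp8_dynamic",
                   help="quantization scheme; sensible default, no Python needed")
    p.add_argument("--upcast-bf16", action="store_true", default=True,
                   help="first upcast non-bf16 weights to bf16, as described")
    return p

args = build_parser().parse_args(["QuixiAI/MyModel", "--type", "fp8_block"])
print(args.model, args.type)  # QuixiAI/MyModel fp8_block
```

The design point is the one the tweet makes: every architecture-specific decision hides behind defaults, so the user supplies only a model ID and a scheme name.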
Eric Hartford @QuixiAI
I reverse-engineered Qwen 3.5's FP8 format and provide a script to recreate it.
4 replies · 7 reposts · 164 likes · 10.1K views
Eric Hartford @QuixiAI
Yes, I wanna exactly reproduce their format. My goal was to recreate their quant exactly, with as high quality as possible, with verification and activation-aware RMS. I don't see that in llm-compressor - and if it's there, it's not very discoverable or accessible.

Here we have just CLI arguments: no writing any Python, no knowing that an "ignore list" needs to be populated, no researching what needs to be added to it.

llm-compressor is a very nice framework for expert devs to build quantization tools. It's not so much a quantization tool in itself for general practitioners. The purpose of mine is to make it easy to quant qwen3.5 without having to know anything about it, or about llm-compressor.
1 reply · 0 reposts · 2 likes · 67 views
Eldar Kurtić @_EldarKurtic
@QuixiAI For example, in llm-compressor, by default all instances of `torch.nn.Linear` would be quantized when you specify the FP8_block scheme. You can then modify that through the "ignore" list by skipping the same layers that Qwen skipped (if you want to exactly reproduce their format).
1 reply · 0 reposts · 2 likes · 77 views
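The default-plus-ignore-list behavior Eldar describes can be sketched in plain Python. This is a toy illustration of the selection logic only, not llm-compressor's actual code; the module names and type labels are made up for the example:

```python
# Toy illustration of the layer-selection behavior described above:
# quantize every Linear module except those on an explicit "ignore" list.
# This is NOT llm-compressor code; names below are hypothetical.
def select_layers(modules, target_type="Linear", ignore=()):
    """Return names of modules that would be quantized."""
    return [name for name, mtype in modules.items()
            if mtype == target_type and name not in ignore]

# A sketch of part of a model layout (illustrative names only).
modules = {
    "model.layers.0.mlp.gate_proj": "Linear",
    "model.layers.0.mlp.up_proj": "Linear",
    "model.layers.0.input_layernorm": "RMSNorm",
    "lm_head": "Linear",
}

# Default: every Linear is selected, including lm_head.
print(select_layers(modules))
# Reproducing a released quant: ignore the layers the vendor skipped.
print(select_layers(modules, ignore=["lm_head"]))
```

This is exactly the gap the thread is about: the default selects everything, and faithfully reproducing a vendor's quant means knowing which layers to put on the ignore list.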