John Song

335 posts

@JJJOOOHN

Joined November 2012
898 Following · 57 Followers
John Song
John Song@JJJOOOHN·
@dongxi_nlp It depends on your workload. If you don't need much speed, it's worth buying; after all, the 128GB of memory is more than enough.
Chinese
0
0
0
87
马东锡 NLP
马东锡 NLP@dongxi_nlp·
Does anyone here know the Nvidia DGX Spark? Is its ML ecosystem support mature yet? Is it worth buying?
Chinese
8
0
3
1.9K
John Song
John Song@JJJOOOHN·
@9hills How is the quality, though? Users still care about whether the results are good.
Chinese
0
0
0
36
九原客
九原客@9hills·
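I used the 20 questions in ClawBench's Smoke Test to try out Qwen 3.6 with quantization + MTP on a 4090, all at 200k context. The quantization parameters differ so that each model fills the card, which keeps the comparison fair. deepseek-v4-flash is included for comparison. You can see that 35B-A3B is in a league of its own at 237 tps; that is remarkably fast. Note: because of an issue in llama.cpp, the input token counts for the two local models are wrong.
九原客 tweet media
Chinese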
7
3
17
2.8K
wd 🔺
wd 🔺@populartourist·
Qwen3.7 spotted Can't wait for Qwen3.7 1B to beat Opus 4.8
wd 🔺 tweet media
English
12
2
173
14.9K
QMAY
QMAY@Q_May_007·
👽 President Trump just posted these images on his social media platform!
QMAY tweet media (3 images)
Chinese
26
160
369
33.5K
Azeez
Azeez@AtlasInference·
DGX Spark just benched 200+ tok/s for Qwen3.6-35B with @AtlasInference on @spark_arena 🔥 How's that possible? Providers like Codex and Claude get ~60. Other major engines don't come close 🦥 We haven't seen speeds like this on GB10. NO ONE HAS. Atlas is shattering records 🚀
Azeez tweet media
English
26
16
132
50.4K
wd 🔺
wd 🔺@populartourist·
Qwen3.6 27B and 35B-A3B are amazing models, but nothing reaches the efficiency of GPT-OSS yet. Qwen3.6 35B-A3B is as fast as GPT-OSS-20B but nowhere near the prefill performance.
English
21
1
98
20.6K
John Song
John Song@JJJOOOHN·
@0xSero it is more expensive than last month!
English
0
0
0
128
0xSero
0xSero@0xSero·
I am buying a DGX Spark today. Rejoice, I'm going to make the Spark competitive.
0xSero tweet media
English
52
7
525
25.1K
Mike Key
Mike Key@1337hero·
Why is downloading from Hugging Face so painfully slow?
English
15
0
11
2.5K
John Song
John Song@JJJOOOHN·
@takayan660 True. With this requirement, yes, the DGX Spark is a better option.
English
0
0
1
84
たかやん
たかやん@takayan660·
@JJJOOOHN Haha, Thor is tempting too, but for my use case I wanted something more like a compact local AI workstation rather than an embedded/robotics platform.
English
1
0
1
232
たかやん
たかやん@takayan660·
I went and bought a DGX Spark
たかやん tweet media (2 images)
Japanese
17
33
644
55.5K
John Song
John Song@JJJOOOHN·
@1337hero My cases are more complicated. Tables, formulas etc
English
0
0
1
31
Mike Key
Mike Key@1337hero·
@JJJOOOHN That's pretty dope. I have a pretty solid OCR workflow - but luckily that's mostly just for being paperless.
English
1
0
1
43
Mike Key
Mike Key@1337hero·
Spent $3998.98 total to have 96GB of VRAM using AMD's AI Pro R9700 cards (brand new). Comparatively, I had spent $1520.00 on two used RX 7900 XTXs for 48GB of VRAM. If ur team RED, a single XTX is CHEAPER than an RTX 3090. Should I have bought a Mac or DGX Spark instead?
Mike Key tweet media (2 images)
English
30
2
116
15.9K
John Song
John Song@JJJOOOHN·
@1337hero I run Qwen 3.6 27B on a DGX Spark and use it as an OCR tool and AI assistant.
English
1
0
1
102
John Song
John Song@JJJOOOHN·
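@Q_May_007 @kimi1383987 Brother Seven (七哥) once said that when China-US relations suddenly improve, that is when the Communist Party is about to take a heavy blow.
Chinese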
0
0
2
101
QMAY
QMAY@Q_May_007·
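Now: "It is a great honor. Today has been a wonderful day." "I want to thank my friend President Xi for such a grand welcome." "It truly was an unparalleled welcome. And you have hosted us so graciously on this historic state visit." "Tonight is another precious chance for us, as friends, to talk over some of what we discussed today. All of it is good for the United States and for China. And it is a great honor to be with you." - President Trump
QMAY tweet media
Chinese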
11
61
91
8.3K
QMAY
QMAY@Q_May_007·
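The President has arrived at the Temple of Heaven in Beijing. Up the stairs, very steady. On this visit, all eyes are on how steady he is going up and down stairs 🙂
Chinese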
11
63
108
13.1K
John Song
John Song@JJJOOOHN·
@CuiMao Where does that leave the PhDs' pride? A real slap in the face.
Chinese
0
0
0
57
CuiMao
CuiMao@CuiMao·
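I wrote a very, very long article, then deleted it. I'm not posting it; better to get more real work done. Writing articles gets addictive, and in the end you accomplish nothing.
Chinese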
54
1
68
13.9K
John Song
John Song@JJJOOOHN·
@sudoingX Why not 35b? Similar performance but 9 times faster
English
0
0
0
28
Sudo su
Sudo su@sudoingX·
i declare qwen 3.6 27b dense q4 the king of a single rtx 3090 card. not even close. this model is absolute beast on local ai, ruthless on agentic loops, owns its own thinking. anyone can use it on single 3090, the weights are open, the stack is reproducible, the prompt is canonical, every claim below is verifiable on your own hardware. the octopus invaders one shot you are seeing is the visible test. i run these models on workloads you wouldn't think to ask for and i couldn't show you if i wanted to, and qwen 3.6 27b dense q4 quietly does the heavy lifting on a single consumer card while the rest of the field is busy explaining why it cannot. if you think a different model is king on a single 3090 right now, name it. drop your card, drop your model, drop your numbers. the throne is not crowded.
Sudo su@sudoingX

update: qwen 3.6 27b dense q4 just one shotted octopus invaders game on a single 3090. hermes agent drove the whole thing, ~41 tok/s gen 21gb vram at full 262k context, thinking mode on. one prompt in and the canonical multi-file space shooter benchmark out, the same exact prompt i ran on qwen 3.5 27b dense back in march on the same card. 3.5 needed one external scope bug fix before the game would even load on first play. 3.6 needed nothing. 11 of 11 files written, 2411 lines of code, zero steering interventions, zero external fixes, playable on first load. 16 minutes 41 seconds wall clock from prompt to playable. consumer tier king on a single 3090 is locked tonight, and the silicon underneath my desk did not change between march and now. the open source ecosystem just moved the floor. watch it ship itself, the full 16 minutes 41 seconds sped to 3 minutes 45, no human touched the keyboard between the first prompt and the final frame.

English
62
42
494
40.6K
John Song
John Song@JJJOOOHN·
@sudoingX Why not 3.6 35B? It is almost 9 times faster and provides performance similar to the 27B.
English
1
0
0
89
Sudo su
Sudo su@sudoingX·
this is what my setup looks like today. about to test qwen 3.6 27b dense q4 on a single rtx 3090 at ~41 tok/s gen, hermes agent driving. predecessor model qwen 3.5 dense q4 made it work in one iteration when i ran the same agentic build on the same card. i've been daily driving qwen 3.6 27b dense for weeks now, the model i keep coming back to. if 3.6 oneshots too, this becomes the best model that runs on a single rtx 3090. consumer tier king. firing the test now will report back soon.
Sudo su tweet media
English
25
9
270
81.3K
John Song
John Song@JJJOOOHN·
@CuiMao Exactly. If it's all speed and no quality, nobody will use it.
Chinese
0
0
0
138
CuiMao
CuiMao@CuiMao·
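Why is everyone competing on local inference speed? It's like a pointless rivalry. Dflash is nothing special either. Because output quality can't be shown off or quantified, nobody cares about it, and things keep drifting further off course. You still have to run it yourself in a real environment to know.
Chinese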
35
1
27
12.8K