Greg Sky

92 posts

@regsky_

GPU assembly and rework in the USA. Repairing NVIDIA cards since 2021

Grand Rapids, MI · Joined November 2024
13 Following · 17 Followers
Pinned Tweet
Greg Sky
Greg Sky@regsky_·
New website for GPU upgrades, repairs, and conversions is out: gpulab.net. Specializing in NVIDIA 48GB 4090 upgrades and datacenter card SXM repair / PCIe conversions
1
0
0
173
Greg Sky
Greg Sky@regsky_·
@browomo yeah, didn't we all try this on day one of DGX Spark? What changed, or are y'all running at low TPS?
0
0
0
299
Blaze
Blaze@browomo·
This Chinese developer linked two $2,999 NVIDIA DGX Sparks into one box and runs the full Qwen3-235B at home, after dropping his $1,999-a-month cloud bill to zero.
He wired 2 small boxes into a single computer, split a giant 235-billion-parameter model in half between them, and serves it across his own network at about 10 tokens a second, with no internet, no cloud, right there on the desk.
No data center, no thousand-dollar graphics cards, no monthly cloud bill. Just him, 2 gold boxes the size of a sandwich, one cable between them, and 1 power strip.
And here is the whole payoff. He used to pay the cloud $1,999 a month for the same model, and the meter ticked on every request. Now he paid $5,998 once for 2 boxes, they covered their cost in 3 months, and after that he sends as many requests as he wants for free, only electricity.
The two Sparks talk over one fast cable, each holds 128GB of memory, and together they carry the whole model, about 73GB loaded per box, with the chip inside pinned near the limit at 96%. Both boxes work as one and keep trading data over the cable, with no cloud in the loop and no single word leaking out.
The ready model sits on one local address, and any app on his network calls it as easily as ChatGPT.
And here is how he described, in plain words, what this pair of boxes does:
"this is a pair of boxes that holds the huge Qwen3-235B model and serves it to one network. the model is split in half, and each box owns its half.
parts:
// Box 1 (holds the first half of the model and starts the answer fast, the first word appears in under a second)
// Box 2 (holds the second half and writes out the rest, about 10 tokens a second)
// Cable (connects the 2 boxes and moves data between them on every step, with no lag)
// Address (one local address where any app sends its request, like to a cloud model)
// Test (a script that runs big prompts through and measures speed and delays)
// Monitor (checks temperature, power draw, and load on both boxes every 2 seconds).
the model never goes to the cloud. he only steps in when a box runs hotter than 80 degrees or the cable between them starts dropping data."
So the system knows exactly what it is, what it is for, and where its limits are. It knows it has to hold the whole huge model across 2 boxes on its own. It knows it has to answer every request locally, with no meter, no limits, and no internet. It knows the human is only needed when a box overheats or the link between them stalls.
→ The setup runs around the clock on 2 boxes, each pulling under 60 watts
→ However many requests he sends, the monthly bill is $0, only electricity
→ The first box starts the answer in under a second
→ The second writes text at about 10 tokens a second
→ One request at a time: 838 tokens in 85 seconds, first word in 0.8s
→ Two requests at once: 697 tokens in 108 seconds, first word in 0.7s
→ Both boxes sit at 96% load and warm up to 76-78 degrees
And only when a chip in a box runs hotter than 80 degrees or the cable between the 2 Sparks drops data does the system call the owner.
And when he himself is out on a run or in a coffee shop, he still reaches his own model at home from his phone: sends a big prompt to the local Qwen3-235B, gets the full answer back in under a minute and a half, with no token meter ticking and no limit to hit.
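The alert rule described in the quote (poll both boxes every 2 seconds, call the owner above 80 degrees or when the cable drops data) could be sketched roughly as below. This is only an illustration: the post does not show the developer's actual script, so `BoxReading`, its field names, the dropped-packet counter, and `needs_owner` are all hypothetical, and real readings would have to come from whatever telemetry tool the boxes expose.

```python
from dataclasses import dataclass

TEMP_LIMIT_C = 80.0    # threshold quoted in the post
POLL_INTERVAL_S = 2.0  # polling period quoted in the post (not used in this pure check)


@dataclass
class BoxReading:
    """One telemetry sample for a box (hypothetical structure)."""
    name: str
    temp_c: float
    power_w: float
    load_pct: float
    link_dropped_packets: int  # assumed counter for drops on the interconnect


def needs_owner(readings):
    """Return alert messages; an empty list means no human is needed."""
    alerts = []
    for r in readings:
        if r.temp_c > TEMP_LIMIT_C:
            alerts.append(f"{r.name}: {r.temp_c:.0f} C is above {TEMP_LIMIT_C:.0f} C")
        if r.link_dropped_packets > 0:
            alerts.append(f"{r.name}: link dropped {r.link_dropped_packets} packets")
    return alerts


# Temperatures, power, and load as reported in the post; zero drops is assumed.
sample = [
    BoxReading("Box 1", temp_c=76, power_w=56, load_pct=96, link_dropped_packets=0),
    BoxReading("Box 2", temp_c=78, power_w=56, load_pct=96, link_dropped_packets=0),
]
print(needs_owner(sample))
```

With the reported 76 and 78 degree readings the function returns no alerts; a real monitor would call it in a loop that sleeps `POLL_INTERVAL_S` between samples.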
Here is what the test shows on his screen during one of the night runs:
"one request at a time: 838 tokens in 84.9 seconds, first word in 0.8s, then 0.1s per token."
"two requests at once: 697 tokens in 107.6 seconds, first word in 0.7s, then 0.15s per token."
"Box 1: chip at 96% load, 76 degrees, 56 watts, 73GB used in memory."
"Box 2: chip at 96% load, 78 degrees, 56 watts, the Qwen3-235B model fully loaded."
And while everyone around is paying for AI by the month and bumping into limits, his top-tier model just sits on the desk and works as much as he wants: his own little power plant instead of a forever meter.
He has no server rack of his own and no cloud account behind it. Just 2 DGX Spark boxes on a desk, one model split in half between them, one local address, and a folder of prompts next to it.
Out of everything I have seen this year, this is the cleanest way to stop paying for AI: $5,998 of hardware on the desk once, $0 a month to the cloud, unlimited forever, and between them 2 gold boxes, 1 cable, and the full Qwen3-235B answering at home with no internet.
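The numbers quoted in the post can be sanity-checked with a little arithmetic. The sketch below recomputes decode speed, per-token latency, break-even time, and monthly energy use from the quoted figures only. It assumes the 697-token figure describes a single stream, a 30-day month, and a constant 56 W per box; none of that is confirmed by the post, and the break-even figure ignores electricity cost.

```python
def decode_rate(tokens, total_s, first_token_s):
    """Tokens per second once the first token has appeared."""
    return tokens / (total_s - first_token_s)


# Figures quoted in the post.
single_tps = decode_rate(838, 84.9, 0.8)
dual_tps = decode_rate(697, 107.6, 0.7)

# Per-token latency is the inverse of the decode rate.
single_latency_s = 1 / single_tps
dual_latency_s = 1 / dual_tps

# Break-even against the quoted cloud bill, ignoring electricity.
HARDWARE_USD = 2 * 2999
CLOUD_USD_PER_MONTH = 1999
break_even_months = HARDWARE_USD / CLOUD_USD_PER_MONTH

# Energy for two boxes at 56 W each over an assumed 30-day month.
kwh_per_month = 2 * 56 / 1000 * 24 * 30

print(f"single request: {single_tps:.1f} tok/s, {single_latency_s:.2f} s/token")
print(f"two requests:   {dual_tps:.1f} tok/s, {dual_latency_s:.2f} s/token")
print(f"break-even:     {break_even_months:.2f} months")
print(f"energy:         {kwh_per_month:.1f} kWh/month")
```

Under these assumptions the quoted "0.1s per token" and "0.15s per token" line up with the totals, and the hardware matches three months of the quoted cloud bill almost exactly, so "$0 a month" holds only if the roughly 80 kWh of monthly electricity is left out.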
21
45
338
82.7K
Greg Sky
Greg Sky@regsky_·
@topvint I can't wait to start doing this to PCs and consoles.
0
0
1
126
Topvint
Topvint@topvint·
Long time no see, postfix adapter...
Topvint tweet media
6
14
333
13.1K
Greg Sky
Greg Sky@regsky_·
@birdabo maybe they are wise enough to know that if you're on top, you are the main target. "Highest nail gets the hammer"
0
0
3
1.9K
sui ☄️
sui ☄️@birdabo·
google fell off. harvested 20 years of humanity’s data and still can’t crack the top 3 in the AI race. smh.
270
95
3.1K
162.6K
Greg Sky retweeted
MobstersDaily
MobstersDaily@MobstersDaily·
Tony Soprano Tries ChatGPT
69
420
4.9K
379.5K
Greg Sky
Greg Sky@regsky_·
Excellent video by @cmuratori talking about the lack of responsibility of tech giants (over the last 2-3 decades) sitting on their thrones, telling a generation of people now poisoned by the products they actively developed to just forget all that and keep the 'system' going. I think it's great to see these young people booing Eric on stage; it proves that the tech giants don't have a stranglehold like they think they do, and that perhaps the pendulum is starting to swing the other way. The amount of private equity going into datacenters proves SaaS will have a lesser value, so all your cloud drives, platforms, everything they used to 'control' you may start to unravel from the giants' control, so they will now go one level higher and gatekeep hardware rather than what they used to, the software. youtube.com/watch?v=tlQ7Eo…
0
0
0
19
Greg Sky
Greg Sky@regsky_·
Computex?
0
0
0
4
Greg Sky
Greg Sky@regsky_·
Something is happening with AI where large companies are going to work 10-100x harder to get the same value from LLMs that an individual can. It's impossible for everyone's job to get replaced with "AI" because who are those companies going to sell to? But "AI" will augment everyone's productivity. It's an additional tax on your precious time on this earth
0
0
0
15
Greg Sky
Greg Sky@regsky_·
@FanlessTech so you cooled the copper and heated the block to have them press fit together?
1
0
35
15.2K
FanlessTech
FanlessTech@FanlessTech·
Seems completely reasonable to us
FanlessTech tweet media (2 images)
187
632
14.5K
577.9K
Greg Sky
Greg Sky@regsky_·
How can X operate without mods like Reddit? Just need X marketplaces to replace Reddit hardwareswap and I'm gone for good. Scroll on /popular and all you get is mind control and the occasional cat to keep you going
0
0
0
9
Greg Sky
Greg Sky@regsky_·
@Govindtwtt Never having been in a position to compete with the big guys in the past. Now the playing field is even
0
0
0
35
Govind
Govind@Govindtwtt·
If everyone can build with AI now… what will actually make a startup succeed?
93
3
61
13K
Greg Sky
Greg Sky@regsky_·
@lauriewired We're already playing with the idea of increasing compute density in space before figuring out how to radiate the excess heat in a vacuum?
0
0
0
1.5K
LaurieWired
LaurieWired@lauriewired·
I’ve always wondered, why can’t we run CPUs hotter?
Look at any modern CPU, and the maximum junction temperature (TJMax) is around 95ish C. Leakage current explodes past that point, and reliability drops off a cliff.
But the question is… couldn’t you make a “tougher” transistor? The answer is… sorta.
The Glenn Research Center at NASA experiments with silicon carbide wafers. Venus is crazy hot (~470C!). Apparently, NASA has been somewhat successful running a medium-scale IC at 500C for 1 year.
Ozark IC also won a contract to develop a multicore RISC-V CPU intended to operate at 500C, but I haven’t seen updates in a while.
Perhaps an EE can chime in. Is ~95C TJMax just a local optimum that everyone collectively settled on? How much density would you have to give up to run things just a little bit hotter?
I wonder if a special, “low density space H100” could reliably run at say, ~150C, or if that’s completely outside the realm of what’s feasible.
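One reason running hotter matters for the space case raised in the reply above is that a vacuum leaves radiation as the only way to shed heat, and radiated power scales with the fourth power of absolute temperature (the Stefan-Boltzmann law, P = εσAT⁴). The sketch below estimates the ideal radiator area for one card at the two temperatures mentioned in the post. It is a rough illustration under stated assumptions, not a thermal design: the 700 W load and 0.9 emissivity are assumed values, the radiator is treated as sitting at the full chip temperature, and absorbed sunlight and background radiation are ignored.

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def radiator_area_m2(power_w, temp_c, emissivity=0.9):
    """Ideal radiating area needed to reject power_w at a surface temperature of temp_c."""
    temp_k = temp_c + 273.15
    return power_w / (emissivity * SIGMA * temp_k ** 4)


POWER_W = 700  # assumed heat load for one accelerator, not taken from the post

area_95 = radiator_area_m2(POWER_W, 95)    # near the ~95 C TJMax mentioned in the post
area_150 = radiator_area_m2(POWER_W, 150)  # the hypothetical 150 C part

print(f"area at 95 C:  {area_95:.2f} m^2")
print(f"area at 150 C: {area_150:.2f} m^2")
print(f"area ratio:    {area_95 / area_150:.2f}x")
```

Under these assumptions the 150 C case needs a little over half the radiator area of the 95 C case. A real radiator runs cooler than the junction, so actual areas would be larger in both cases.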
LaurieWired tweet media
198
144
2.6K
227.8K
Greg Sky
Greg Sky@regsky_·
@Polymarket Unplug your routers. No other way to prepare
0
0
1
1.6K
Polymarket
Polymarket@Polymarket·
JUST IN: Anthropic announces it will roll out Claude Mythos “in the coming weeks” despite growing fears over the model’s cyber capabilities.
333
587
8.7K
1.8M
Greg Sky
Greg Sky@regsky_·
@sudoaptupdater @RT_com can't wait. So much less corruption and useless overhead. Let's see HR as the first to go!!
0
0
0
54
sudoaptupdate
sudoaptupdate@sudoaptupdater·
@RT_com All management, C-suite, and data-intensive needs will be relegated to algorithms. Once management is turned over to the algorithms, the infrastructure is designed around them and no manual labor tasks will require humans. 8-12 years out maybe. Good luck.
1
0
0
2.3K
RT
RT@RT_com·
'Entire professions may go EXTINCT, replaced by Artificial Intelligence' — Vladimir Putin 'This process is irreversible and INEVITABLE'
139
884
3.3K
203.3K
Paul Yacoubian
Paul Yacoubian@PaulYacoubian·
bro is adding in ai women to sell his surplus heavy equipment 😂
Paul Yacoubian tweet media
323
285
11.5K
3M
Greg Sky
Greg Sky@regsky_·
Fresh import of 100g of Shin-Etsu. This stuff is great
Greg Sky tweet media (2 images)
0
0
0
24
Greg Sky
Greg Sky@regsky_·
@AlfinCodes We all tell ourselves we're prototyping until someone else sees what we're doing and thinks it's prod ready, and we just kinda go with the hype too
0
0
0
18
Alfin
Alfin@AlfinCodes·
every vibe coder is just blindly accepting code written by an AI that’s also blindly guessing
103
4
110
6.8K
FactPost
FactPost@factpostnews·
The Trump administration has moved to provide weapons-grade plutonium to private start-ups.
FactPost tweet media (2 images)
488
993
5K
3.4M
Lunix
Lunix@SolLunix·
Pavel Durov, net worth $15 billion, doesn't own a phone.
66
175
4.2K
554.7K
CR1337
CR1337@CR1337·
A Reddit user found out that Motorola phones have started hijacking the Amazon app to insert affiliate codes - on a phone that cost $1,900 - talk about greed..
CR1337 tweet media
186
1.4K
14K
591.9K