Breaking LLM inference’s autoregressive bottleneck 🛠️ We've teamed up with @haozhangml, @YimingBob, and @aaronzhfeng, among others from UCSD to achieve a massive 3.13X speedup for LLM inference on Google Cloud TPUs using Diffusion-Style Speculative Decoding (DFlash). Read the blog: goo.gle/4naZ8Yv
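
the DFlash drafting itself is diffusion-style and is described in the linked blog; as a rough illustration of what "breaking the autoregressive bottleneck" means, here is a minimal toy sketch of the generic draft-then-verify speculative decoding loop (greedy verification, stand-in models, nothing here is DFlash's actual code):

```python
# toy sketch of vanilla speculative decoding with greedy verification.
# a "model" here is any fn(token_list) -> next token id; real systems
# verify the whole drafted block in one batched target forward pass.

def speculative_decode(target, draft, prefix, n_draft=4, max_new=32):
    out = list(prefix)
    while len(out) - len(prefix) < max_new:
        # 1) the cheap draft model proposes a block of tokens
        block = [draft(out)]
        for _ in range(n_draft - 1):
            block.append(draft(out + block))
        # 2) the expensive target model checks each drafted position;
        #    the longest agreeing prefix is accepted
        accepted = 0
        for i in range(n_draft):
            if target(out + block[:i]) != block[i]:
                break
            accepted += 1
        out += block[:accepted]
        # 3) one guaranteed target token per iteration (the correction
        #    on a mismatch, or a bonus token on full acceptance)
        out.append(target(out))
    return out[:len(prefix) + max_new]

# demo: target "counts" mod 100; the draft agrees except every 7th call,
# so most iterations emit several tokens per target-model step
target = lambda seq: (seq[-1] + 1) % 100
n_calls = [0]
def draft(seq):
    n_calls[0] += 1
    return (seq[-1] + 1) % 100 if n_calls[0] % 7 else 0

print(speculative_decode(target, draft, [1, 2, 3], max_new=10))
```

the speedup comes from step 2: whenever the draft block is accepted, several tokens land per expensive target step instead of one.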


Meta ads CLI + auto researcher + GPT-2 image + $1000 ad spend would probably print

respect the take but I think this is incorrect. instead, I recommend people get a 3090:

durability: 3090s run for years with basic maintenance. swap the gddr6x thermal pads every couple of years and the card is basically immortal. just don't buy one that lived its life in a mining farm and you're fine

power: the 3090 pulls 350W. the 4090 is 450W, the 5090 is 575W, the 7900 XTX ~355W. the 3090 is the LEAST power hungry card in its bandwidth tier, so the power complaint is completely wrong!!

heat: same story. every card at this performance level runs similar temps. not a 3090 problem

price: used 3090s are $800-1000 today. launch was $1,499. that's nowhere near launch price, and it's still the best $/GB-vram and $/GB-bandwidth on the market by a wide margin. a 4090 is $2,500, a 5090 is $3,500 (see the back-of-envelope math below)

the real recommendation depends on your use case. need more than 24GB of vram? skip the 3090 and go for a Mac Studio / DGX / Strix Halo. running qwen3 27b at high throughput on a budget? absolute best card on the market, no contest! ideally? you get both
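
since the post hinges on the $/GB claim, here's a quick back-of-envelope sketch. the street prices are the ones quoted above; the vram and bandwidth figures are the published specs, so treat the exact ratios as approximate:

```python
# $/GB math for the price point above. prices are the street prices
# quoted in the post; vram and memory bandwidth are the published
# specs (approximate).
cards = {
    #  name         price($) vram(GB) bandwidth(GB/s)
    "3090 (used)": (900,     24,       936),
    "4090":        (2500,    24,      1008),
    "5090":        (3500,    32,      1792),
}

for name, (price, vram, bw) in cards.items():
    print(f"{name:12}  ${price / vram:6.1f}/GB-vram  ${price / bw:.2f}/GB-bandwidth")
```

even at the top of the used price range, the 3090 lands around a third of the $/GB-vram of a 4090 or 5090, which is the "wide margin" in the post.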

openclaw is super bloated. updating takes forever, restarting the gateway takes forever. this is not what it used to be like...
