Bill

12.4K posts

@BillQueens

Internet Guy

Queens Joined October 2022

1.8K Following 590 Followers

Bill@BillQueens·43m

@LottoLabs @PopVerseYT @0xSero @sudoingX DM me if you need anything happy to help.

Lotto@LottoLabs·44m

@PopVerseYT @BillQueens I think local llama on Reddit and @0xSero or @sudoingX have an X community. If you got a 5090 get 27b running on there and install Hermes agent. (It’s not sota but it’s pretty good)

Lotto@LottoLabs·10h

Hermes Agent + qwen 27b This project is to see if I can get a fully functioning and safe saas/business set up w/ hermes and 27b just off telegram. Site is pretty much done, auth is done, just hooking up stripe payments and manually auditing. Will probably do a sota audit with a couple different models to see if I miss anything.

Bill@BillQueens·10h

@sudoingX My wife tries this on me everyday

Sudo su@sudoingX·10h

Reinforcement learning

Bill@BillQueens·10h

@LottoLabs Ya I was so sick of vLLM horseshit lol. I just built my own frontend today to control everything.

Lotto@LottoLabs·10h

@BillQueens Fair I’m going into those weeds now 😭

Bill@BillQueens·10h

@LottoLabs I’ve just been ripping llama.cpp so GGUFs have been solid for me. Had so many issues with vLLM nightlies said fuck it and haven’t looked back.

Lotto@LottoLabs·10h

@BillQueens I think I ran that quant on a rtx6000 pro and it obviously ripped, I like the 27b overall though, mainly used the unsloth, might as well try nvfp if you have Blackwell chips

Bill@BillQueens·10h

@LottoLabs GGUF will look at trying this tomorrow - what are your thoughts so far?

Lotto@LottoLabs·10h

@BillQueens Have you been running the NVFP model? huggingface.co/Kbenkhaled/Qwe…

Bill@BillQueens·12h

@hyperprior @0xSero donate.sybilsolutions.ai

hyperprior@hyperprior·13h

@0xSero where’s the link to donate?

0xSero@0xSero·14h

Man, what the hell. That's a few years salary for most of the world donated in under 24 hours. I promise I will do everything in my power to make this worth it for all of you.

Bill@BillQueens·13h

@sudoingX 32GB

Sudo su@sudoingX·20h

how much VRAM do you have right now

Bill@BillQueens·1d

@BoomersBetz I bet Mormons -2.5

Boomer’s Bets@BoomersBetz·1d

Alright ive cooled down. Whats the lock

Bill@BillQueens·1d

@ivanfioravanti It’s a dual license ya dingus

Ivan Fioravanti ᯅ@ivanfioravanti·1d

I see in uv repo on github there are both MIT and Apache 2.0 licenses, which is the right one? 🤔 Relevant if someone wants to create an OpenUV fork...

Bill@BillQueens·1d

@leerob what’s the one more thing ?

Bill@BillQueens·1d

@BoomersBetz

GIF

Boomer’s Bets@BoomersBetz·1d

Welcome to selling insurance TCU

Bill@BillQueens·1d

@Teknium My dev box is just a server basically. Threadripper and 5090 - 128GB ECC with Ubuntu on it.

Teknium (e/λ)@Teknium·1d

Do you agree? Where do you do most of your dev work? I personally have a linux server machine in my house that I run hermes-agent on and do my dev work from. How bout you?

Ryan Carson@ryancarson

100% of dev is going to be done in sandboxes in the cloud, controlled by kanban boards. Trust me, I love my local machine and gorgeous mac apps, but all of it is just a terrible form factor for running a team of agents effectively.

Bill reposted

Nicky@nickturani·22 Mar

What an upset! My bracket is in shambles

Bill@BillQueens·2d

@TeksEdge In what world is any of this accurate

Bill@BillQueens·2d

@selfhostedmind @maria_rcks Likely the quickest best option available in her country.

kayhe@selfhostedmind·2d

@maria_rcks did theo really cheap out on a refurbished 2022 M2 air ?? wow so generous

maria@maria_rcks·2d

I can't believe im typing this on a macbook, everything feels really smooth and weird (macos) but really really thank you so much theo, this is amazing, you really didnt have to, but thank you so much ❤️

Theo - t3.gg@theo

Don’t worry, I just bought her a MacBook That said - if she ships this much on a computer less powerful than a Raspberry Pi, what’s your excuse?

Bill@BillQueens·2d

@HedgehogZone_ Mine passed away

Hedgehog Zone@HedgehogZone_·2d

Bill@BillQueens·3d

@DailyHitman I do

The Daily Hitman@DailyHitman·3d

Spent two days straight staring at my wall visualizing the tourney. Who wants the first POD? 🧘‍♂️

Bill@BillQueens·3d

@__tinygrad__ Imagine not building shipping container token machines in the year of our lord

the tiny corp@__tinygrad__·3d

People are too focused on trying to build God and not thinking about the unit economics 🤑

Milind@milindS_

Every day that I use Kimi and GLM, I realize that @__tinygrad__ is going to mint money in a couple years time The big 'labs' don't have any way to compete with cheap inference of ridiculously good models

Bill@BillQueens·3d

@JaMarc0 @andrewdfeldman @cerebras They don’t have the supply chain.

JaMarco@JaMarc0·3d

@BillQueens @andrewdfeldman @cerebras but thats just a financing issue

Andrew Feldman@andrewdfeldman·3d

NVIDIA's biggest GTC announcement was a $20 billion bet on the same problem we solved 6 years ago.
Their next-gen inference chip - not available yet - has 140x less memory bandwidth than @cerebras.
To run a single 2 trillion parameter model, you need 2,000+ Groq chips. On Cerebras, that's just over 20 wafers.
Even paired with GPUs, Groq maxes out at ~1,000 tokens per second. We run at thousands of tokens per second today. And every day. In production now.
Why? When you connect 2,000 chips together, every interconnect has latency. Every cable has overhead. It doesn't matter what your memory bandwidth is on paper if you're bottlenecked by the wiring between thousands of tiny chips.
We solved this with wafer scale. One integrated system. Little interconnect tax.
Jensen told the world that fast inference is where the value is. He’s right - it’s why the world’s leading AI companies and hyperscalers are choosing Cerebras.
