aivrar

8.1K posts

@aivrar

AI Enthusiast https://t.co/QZxeQwTVCJ I will work for VRAM

Mexico · Joined April 2011

944 Following · 378 Followers

Pinned Tweet

aivrar@aivrar·8 May

My Hermes Agent portable Windows application is getting a lot of attention. No need to use Docker or even have Python installed in your system, completely PORTABLE with a Native Windows GUI. github.com/aivrar/portabl…

616

aivrar@aivrar·1d

@vllm_project vLLM is so awesome I made a Windows build for it. No need to use Docker or have Linux/WSL. For RTX 30/40/50-series: pre-built wheel, Windows patchset, 10 KV-cache compression dtypes, OpenAI API server fixes, Rust frontend, and Rust tool parser support. github.com/aivrar/vllm-wi…

247

vLLM@vllm_project·1d

🚀 Qwen3.6-27B-NVFP4 is inference ready with vLLM on NVIDIA Blackwell GPUs. This checkpoint is optimized for Blackwell and reduces GPU memory requirements by ~2.5x for local AI with open-source models. 🧠 27B params, Hybrid Attention 📊 NVFP4 evals: 86.3 on MMLU Pro, 85.5 on GPQA Diamond 🛠️ Exclusively supported on vLLM as the runtime engine Get started from the Hugging Face checkpoint: huggingface.co/nvidia/Qwen3.6…

NVIDIA RTX Spark@NVIDIARTXSpark

Fast, efficient local AI with open-source models just got easier. Qwen3.6-27B-NVFP4 is now on @HuggingFace! It's optimized for NVIDIA Blackwell GPUs & inference ready with @vllm_project. The checkpoint reduces GPU memory requirements by approximately 2.5x for powerful 27B-parameter inference on your own hardware.

430

49.9K

aivrar@aivrar·1d

@YC1401 @FoxNews It's a great quote though, that's how I basically feel about the World.

406

🐦‍⬛♕♡♕🐦‍⬛@YC1401·1d

@aivrar @FoxNews LOL, Believe no one... and not even no one!

2.7K

Fox News@FoxNews·1d

BREAKING: Two people have climbed to the top of the Empire State Building in New York City, holding a banner from the skyscraper's antenna reading, "When the power of love beats the love of power, the world knows peace." As of now it's unclear how the pair reached the top of the building as police work to get them down from the spire, 1,454 feet above the ground.

9.3K

45.2K

349.1K

70.8M

aivrar@aivrar·1d

@YC1401 @FoxNews Thanks! I didn't know that because on FB they just spam the Jimi Hendrix one.

13.4K

🐦‍⬛♕♡♕🐦‍⬛@YC1401·1d

@aivrar @FoxNews The original phrasing comes from 19th-century British Prime Minister William E. Gladstone: "We look forward to the time when the Power of Love will replace the Love of Power. Then will our world know the blessings of peace." Not Jimi Hendrix lol

295

19.3K

aivrar@aivrar·3d

@MoundLore Wouldn't that be the dirt under our feet?

MoundLore@MoundLore·3d

What’s the oldest thing you’ve ever touched?

627

304

aivrar@aivrar·3d

@CinemaTweets1 The "Touched by an Angel" bit gets me every time.

2.5K

Cinema Tweets@CinemaTweets1·3d

One of the Funniest Performances in Cinema History

477

6.3K

425.9K

aivrar@aivrar·3d

@coinbureau That's the kind of hype that makes people start looking to open source even more.

Coin Bureau@coinbureau·4d

🚨ANTHROPIC CEO: OPEN SOURCE AI IS GETTING DANGEROUS Anthropic CEO Dario Amodei told lawmakers that open-source AI is moving down a “very dangerous path.” He warns that once powerful models are released openly, companies lose the ability to monitor misuse, revoke access, or update safety guardrails.

1.9K

494

3.7K

3.4M

aivrar@aivrar·4d

@hqmank Exact same for me, I was at 2% left and got the reset, happy Sunday indeed!

290

Kai@hqmank·4d

I hit my Codex limit yesterday, but my weekly reset was coming soon. So I waited instead of pressing the reset button. Then OpenAI reset everyone anyway. I basically saved one extra reset by doing nothing.

Tibo@thsottiaux

As we are still investigating, I have reset everyone's Codex usage limits. This is a hard reset given some users had stacked up to three banked resets already that they can apply on their own schedule. Funnily enough, this week at OpenAI is called the RESET week, which is meant for folks to relax a bit. However it will be a different kind of RESET week. Enjoy.

106

13.3K

aivrar@aivrar·4d

@ThrillaRilla369 Biff.

Thrilla the Gorilla@ThrillaRilla369·4d

I need a very specific tough sounding name for a tiny chihuahua puppy Not Rocky 🐕

131

1.6K

274.8K

aivrar@aivrar·4d

@gofishh77 That couldn't work very efficiently without getting air into the bottom of that barrel.

1.4K

Richie Rich@gofishh77·4d

Redneck ingenuity is always fun!

114

294

2.9K

798.4K

aivrar@aivrar·4d

@LibertarianG0th

🥀 🖤𝔏𝔦𝔟𝔢𝔯𝔱𝔞𝔯𝔦𝔞𝔫 𝔊𝔬𝔱𝔥🖤🥀@LibertarianG0th·4d

Playing with fire here, I know.

161

2.3K

aivrar@aivrar·5d

@compliantvc

GIF

Henrick Johansson@compliantvc·5d

As a European, I am taking the climate pledge to NOT use air conditioning or other climate-destroying cooling devices Americans hate how strong we are (they are coddled and rely on artificial cool air) Who else is taking the pledge with me? Let's save the planet!

5.4K

495.9K

aivrar@aivrar·5d

@Iberian_America I doubt they were random.

Iberian_America@Iberian_America·5d

man this is like the 10th person who has been shot to death since I moved to tepito im starting to think tepito might not be safe idk

230

aivrar@aivrar·5d

@Jackkk Not the worst life I guess.

Jack@Jackkk·6d

Mark Zuckerberg reveals he's feeding his cows beer and macadamia nuts “On the ranch, one of my projects is I'm trying to create the highest quality beef in the world” “It's very low stakes, I’m not selling it but I'm very into the genetics of the cattle. We're trying to figure out how do you make it so that you basically can deliver the highest density diet to them” “We started growing macadamia trees because that kind of nut is extremely dense and they will eat a lot so they will put on weight and become fat quicker and become delicious” “The macadamia nuts have a lot of oil so you need to actually roast that. So now we need to design this whole process to roast the nuts so that way you can give them to the cows” “You want them to eat more. So then it's like how do you get them to eat more? Well it turns out alcohol is great for that because alcohol induces appetite” “That's actually why very high-end beef, they're fed beer. But okay, what's the right balance of beer versus water? I don't know. Let's let them choose. They get either as much cold beer as they want or as much room temperature water” “So now we're brewing all this beer and we're putting it out”

822

298

6.8K

3.6M

aivrar@aivrar·5d

@DefiantLs Quinceañeras.

Defiant L’s@DefiantLs·5d

What's something that makes you think, "I'm too old for this shit"?

186

107

53.8K

aivrar@aivrar·5d

@lauriewired I'd totally game in the cloud, I need my GPUs for other AI stuff.

LaurieWired@lauriewired·5d

you’ll get mad at me for saying this…but cloud gaming is so obviously more economically efficient than physical hardware I think it’s going to be the default soon. your home console / pc is idle 90%+ of the day. meanwhile, data centers target what, 5%, maybe at worst 10% idle. every second a cloud gamer isn’t gaming, that hardware is being used for someone else, training, etc. I think there should be a new measurement, something like cost-per effective FLOP hour that takes into account the TCO + effective utilization. If a gamer spends $500 on a GPU, uses it for 3 years, but it’s only fully active ~5% of that period…the cost-per relative FLOP hour is crazy high! Meanwhile, a $50,000 datacenter GPU might have a *LOWER* cost-per FLOP hour just because the effective utilization is 90+%.
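
The proposed metric can be made concrete with a little arithmetic. What follows is a minimal sketch, not a definition from the post: the function name, the use of purchase price as a stand-in for TCO, the 3-year window applied to both cards, and the throughput argument are all assumptions made for illustration.

```python
HOURS_PER_YEAR = 24 * 365


def cost_per_effective_tflop_hour(price_usd, years, utilization, tflops):
    """Dollars per TFLOP-hour of work actually performed.

    Assumption: purchase price stands in for total cost of ownership;
    power, cooling, and staffing are ignored.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if years <= 0 or tflops <= 0:
        raise ValueError("years and tflops must be positive")
    active_hours = years * HOURS_PER_YEAR * utilization
    return price_usd / (active_hours * tflops)


# Prices and utilization come from the post. Throughput is normalized to
# 1.0 for both cards because the post gives no FLOP figures.
consumer = cost_per_effective_tflop_hour(500, 3, 0.05, 1.0)
datacenter = cost_per_effective_tflop_hour(50_000, 3, 0.90, 1.0)

# How many times faster the datacenter card must be to break even.
break_even_speedup = datacenter / consumer
```

With throughput held equal, the datacenter card costs about 5.6 times as much per active hour. Under these figures it comes out lower on cost per FLOP-hour only if it delivers more than roughly 5.6 times the consumer card's throughput.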

3.2K

169

4.1K

3.4M

aivrar@aivrar·6d

@GithubProjects Then change it, what's the big deal?

GitHub Projects Community@GithubProjects·6d

I have bad news

109

389

13K

284.6K

aivrar retweeted

Tom Turney@no_stp_on_snek·6d

TurboQuant+ updates. 4.25→4.125 bpw, faster decode, lower KLD, crash fixed. NEED TESTERS:

spent the last stretch hammering on my llama-cpp-turboquant fork and the numbers finally moved the way i wanted. this is the same-fork before vs after on qwen3.6 / 5090, so it isolates exactly what my fixes did:

KLD vs f16: down 34-35%. the culprit was a mis-scaled centroid table (σ≈0.064 instead of exact Lloyd-Max). fixing it also cut PPL degradation from +0.55% to +0.16%.

decode: +17% at short context, growing to +60% deep. a flash-attn launch-latency backport plus a fused-MMA decode path that reads the KV once per head-group instead of once per head.

the 32K crash: fixed. turned out it was never an inference crash, just an int-to-size_t overflow in the perplexity tool.

size: 4.25 down to 4.125 bpw, bit-identical quality. dropped a dead field that was just along for the ride.

prefill: roughly flat. wasn't a target, it was already competitive.

net: every quality, speed, and robustness axis moved the right way, and the block got smaller. loop's still live, chasing a clean -30% KLD bonus next.

what i actually need now: testers. if you've got a 5090, or honestly any CUDA / HIP / Metal / Vulkan box, and you run local models, pull the branch and tell me what breaks. real hardware, real workloads, the messier the better.

PR's here: github.com/TheTom/llama-c…

Mixed results on pascal. need more info. plz let me know your results
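
The size change is plain bits-per-weight arithmetic. A minimal sketch follows. The block layout used here (128 weights at 4 bits each, with 32 bits of per-block metadata of which 16 are unused) is hypothetical, chosen only because it reproduces the quoted figures. The post does not describe the fork's actual block format.

```python
def bits_per_weight(weights_per_block, bits_per_code, metadata_bits):
    """Average storage cost per weight for one quantization block."""
    total_bits = weights_per_block * bits_per_code + metadata_bits
    return total_bits / weights_per_block


# Hypothetical layout: 128 four-bit codes plus 32 bits of metadata.
before = bits_per_weight(128, 4, 32)

# Same layout after dropping an unused 16-bit field.
after = bits_per_weight(128, 4, 16)

print(before, after)
```

Because the removed field carries no information in this sketch, the stored codes are unchanged, which is consistent with the post's claim of bit-identical quality at the smaller size.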