Lean Kin Prak
@LeanKinPrazli
2.6K posts
Joined December 2024
296 Following · 68 Followers
Noemi @NoemiTitarenco
@sudoingX I'm building a superfast browser harness that should work even with dumb models, because the tool use/tool discovery is foolproof.
1 · 0 · 4 · 343
Sudo su @sudoingX
to all of you saying local models aren't there yet because some corporate salesman on an openai paycheck told you so: you're running their bloated tools and blaming the model. the model is fine. the bloated harness is the problem. i've tested literally every harness out there and i have the facts and receipts on my timeline and in my DMs. openclaw is 120K+ lines of corporate-backed typescript bloat, mining your thinking while you pay for the privilege. switch to hermes agent and watch the same model become usable.

don't take my word for it. just try. i have DMs from people who made the switch and their "broken" model started working instantly. same hardware, same model, different harness.

if you're using hermes agent and someone near you is still on openclaw, help them get away from the bloat. they're frustrated at every step, burning tokens on nothing, paying subscriptions to think on someone else's server. buy a single GPU from ebay. compile llama.cpp. install hermes. replace your openai subscriptions and think free. once you think free you start seeing the light. you deserve better cognitive tools than bloat that harvests you.
105 · 53 · 1K · 38.8K
O @ooo000ooo00ooo
@LeanKinPrazli @0xvallion @DeFi_Hanzo You could open ChatGPT and ask. It will guide you through the entire setup and advise you on the best Qwen and DeepSeek models.
1 · 0 · 0 · 18
Hanzo ㊗️ @DeFi_Hanzo
I bought a Mac Mini 2 months ago. People laughed. "Why not just use the server?" "Local models are a toy." "You'll never match GPT-4 quality on consumer hardware."

Google just released TurboQuant, an algorithm that shrinks an LLM's KV-cache memory by 6x without losing intelligence. 8x faster. Same number of GPUs. Same quality. My 16GB Mac Mini can now run models that required a $50,000 server 18 months ago.

Here is what actually changed:
> kv-cache compressed to 3 bits with zero accuracy loss
> models that needed 96GB of VRAM now fit in 16GB
> the performance gap between local and cloud just collapsed

The people who laughed at the Mac Mini are now watching Micron and Sandisk stock fall off a cliff. Because if you don't need 6x the memory to run AI, you don't need 6x the memory chips. $527 billion in combined market cap. Memory prices up 500% on AI demand.
Google Research @GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

30 · 36 · 448 · 109.7K
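For anyone sanity-checking the headline numbers above: almost all of the memory saving comes from the bit-width drop in the KV cache (16-bit entries down to 3-bit). A back-of-envelope sketch, with a model shape chosen purely for illustration (the layer and head counts below are assumptions, not figures from the post or Google's blog):

```python
# Back-of-envelope KV-cache sizing. The model shape (80 layers, 8 KV heads
# of dim 128, 128k context) is an illustrative assumption, not a figure
# from the post or the TurboQuant blog.
def kv_cache_gib(layers: int, kv_heads: int, head_dim: int,
                 seq_len: int, bits: float) -> float:
    elems = 2 * layers * kv_heads * head_dim * seq_len  # 2 = keys + values
    return elems * bits / 8 / 2**30

fp16 = kv_cache_gib(80, 8, 128, 128_000, 16)
q3 = kv_cache_gib(80, 8, 128, 128_000, 3)
print(f"fp16: {fp16:.1f} GiB, 3-bit: {q3:.1f} GiB ({fp16 / q3:.1f}x smaller)")
# fp16: 39.1 GiB, 3-bit: 7.3 GiB (5.3x smaller)
```

The raw bit ratio gives 16/3 ≈ 5.3x; Google's "at least 6x" presumably also counts format overhead or additional compression, so the claim is plausible but not exactly reproduced by this arithmetic.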
Lean Kin Prak @LeanKinPrazli
@0xSero I don't know what it means, but I'm very excited about where it looks like local LLMs are going.
0 · 0 · 0 · 115
0xSero @0xSero
I need someone who's not been one-shotted to validate these numbers. I have gotten consistent results, but I don't have enough expertise to make sure the models are honest. github.com/0xSero/turboqu…
[image attached]
12 · 1 · 85 · 6.9K
10xROE @10xROE
@NielsRogge Because anyone this good at computer science doesn’t care about their personal appearance lol
1 · 0 · 4 · 894
Niels Rogge @NielsRogge
Why are local LLM wizards always these weird, anonymous accounts?
[image attached]
58 · 9 · 250 · 22.9K
0xSero @0xSero
Qwen3.5-35B compressed by 20% with ~1% average performance drop. Now you can fit it (4-bit) with full context in 24GB of VRAM (~$700, or a single 3090) huggingface.co/0xSero/Qwen-3.…
83 · 114 · 1.8K · 102.6K
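A quick sanity check on that VRAM claim (a rough sketch only; real 4-bit formats carry extra overhead for scales and zero points, and the KV-cache budget depends on the runtime):

```python
# Rough VRAM budget for a 35B-parameter model at 4-bit weights on a 24GB card.
# Quantization-format overhead and runtime buffers are ignored here.
params = 35e9
weights_gib = params * 4 / 8 / 2**30   # 4 bits per parameter
print(f"weights: {weights_gib:.1f} GiB")                  # ~16.3 GiB
print(f"headroom on 24 GiB: {24 - weights_gib:.1f} GiB")  # ~7.7 GiB for KV cache + activations
```

So the weights alone leave roughly 7-8 GiB free, which is where the "full context" part of the claim has to fit.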
alexintosh @Alexintosh
I just bought an M3 ULTRA (4TB SSD, 512GB RAM, 32-core CPU, 80-core GPU) from @eBay at an outrageously low price. I haven't bought on eBay for years; truly hoping I do not receive a bunch of bricks. Wish me luck.
[image attached]
38 · 0 · 63 · 19.3K
Aleks @Aleks00228561
@LeanKinPrazli @ChuckYouSuck @suno They (Suno) give you all the rights to use their song, which you can make in 10 seconds. Why wouldn't you give them the rights to use your voice? Besides, no one is forcing you to use your voice.
1 · 0 · 1 · 41
Suno @suno
Meet Suno v5.5: More expressive, more you. Use your voice, your sound, and your taste to make music that's unmistakably yours, in the best and most personal Suno experience yet.
201 · 305 · 1.8K · 291.2K
am.will @LLMJunky
Two incredible innovations in the local AI space in a span of three days. I am so excited. ComfyUI just shipped "Dynamic VRAM" and it seems like a big deal for anyone running models locally.

The problem: large AI models can have many GB of weights. If your system lacks the necessary RAM, you'd normally hit memory crashes or grind to a halt on the page file.

Instead of loading the entire model into memory at once, ComfyUI now reads the model file piece by piece directly from your SSD. Only the specific parts needed for the current step get pulled into memory. Everything else stays on disk until it's actually called for.

On the GPU side, they built a smart system that loads weight data at the exact moment it's needed. If your GPU runs out of space, it doesn't crash. It uses a temporary workaround to finish the calculation, then cleans up after itself. It also keeps track of what didn't fit so it doesn't waste time trying to reload things that won't fit again.

The other big improvement is for workflows that use multiple models. Previously, swapping between models would pile everything into system memory and bog your machine down. Now when a model gets swapped out of the GPU, it just goes back to the "read from disk when needed" state instead of sitting in RAM.

The result: a 56GB model can now run on a machine with only 32GB of memory. No crashes, no slowdowns from swap. Available now for Nvidia GPUs on Windows and Linux, with AMD support on the way. No idea how fast this is, but this seems incredible. Cannot wait to get my workstation going.
ComfyUI @ComfyUI

Upgrading your RAM is now unnecessary. Introducing our new ComfyUI Dynamic VRAM optimization. Running local models is now possible on even the most memory constrained hardware. Read more here: blog.comfy.org/p/dynamic-vram…

19 · 34 · 409 · 49K
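For the curious, the "read from disk when needed" idea described above is easy to prototype outside ComfyUI. A minimal sketch (this is not ComfyUI's implementation; the LazyWeights class and the tensor name are made up) using safetensors' lazy file access plus a CPU fallback when the GPU is full:

```python
# Minimal sketch of on-demand weight streaming, in the spirit of the post.
# Not ComfyUI's actual code; LazyWeights and the tensor name are hypothetical.
import torch
from safetensors import safe_open

class LazyWeights:
    """Fetch individual tensors from disk only when a layer needs them."""
    def __init__(self, path: str, device: str = "cuda"):
        self.path = path
        self.device = device

    def load(self, name: str) -> torch.Tensor:
        # safe_open memory-maps the file; get_tensor reads just this tensor.
        with safe_open(self.path, framework="pt", device="cpu") as f:
            t = f.get_tensor(name)
        try:
            return t.to(self.device)   # move to GPU if there is room
        except torch.cuda.OutOfMemoryError:
            return t                   # fall back to CPU and keep computing

w = LazyWeights("model.safetensors")
q_proj = w.load("layers.0.attn.q_proj.weight")  # hypothetical tensor name
```

The real system presumably layers caching and OOM bookkeeping on top; this sketch only shows the core "load on demand, fall back gracefully" loop.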
Ed @Eduardopto
Great question! When you upload or record your voice for Suno’s new v5.5 Voice Model feature, yes, you grant Suno a super broad, perpetual, royalty-free, irrevocable license to use your voice data, likeness, and Voice Model forever. You still own your original recordings, but they can use them to train/improve their AI, generate songs, promote the service, etc. (They won’t sell or give your raw Voice Model to other users, only the final tracks that use it.)
1 · 0 · 1 · 63
Ed @Eduardopto
SO HYPED for @suno v5.5 dropping TODAY 🔥 Finally music that’s more expressive and more YOU than ever!
• Add your own voice to the mix (record or upload; it actually sounds like YOU)
• Create custom models trained on your personal sound
• Let “My Taste” learn your favorite genres, vibes, and style so every track feels personal
This is the most personal Suno experience yet. Game-changer for creators who want authentic, one-of-a-kind songs! Already queuing up my first voice drop 😍 Who else is jumping in right now? #Suno #SunoV55 #AIMusic
Suno @suno

Meet Suno v5.5: More expressive, more you. Use your voice, your sound, and your taste to make music that's unmistakably yours, in the best and most personal Suno experience yet.

2 · 0 · 1 · 192
0xVallion @0xvallion
@DeFi_Hanzo So which models are you running on your Mac Mini?
2 · 0 · 11 · 1.8K
Suno @suno
it's about to get personal.
78 · 24 · 363 · 29.9K
Afro Chuck @ChuckYouSuck
@suno getting stuck at the verification
[image attached]
15 · 0 · 35 · 5.3K
Lean Kin Prak @LeanKinPrazli
@jaesong I had ChatGPT read the Privacy Note and it's quite flimsy and definitely not safe. FUCK! Been looking forward to this. Now I'll keep splitting the stems in Logic and cloning with my own voice LoRA in RVC! Was hoping this would make it easier.
1 · 0 · 1 · 16
Jae Song @jaesong
@LeanKinPrazli That's a good question. So far personas have had the option to keep them private, so I'm assuming this will be the same. Haven't done a deep dive yet though.
1 · 0 · 0 · 9