munchwrap (Hypurr Holder)

11.4K posts

@munchwrap

nerd

Joined August 2021
1.1K Following, 334 Followers
Rach @rachpradhan
We made TurboAPI hit 150k req/s, in under a day. It is now 22x faster than FastAPI, thanks to the amazing contributions from the people in the comment section, which let me see what makes the hyper-optimized frameworks work the way they do. Here's what changed...
Rach@rachpradhan

I replaced FastAPI's entire HTTP core with Zig. Same decorator API. Same Pydantic models. 7× faster: 47,832 req/s vs FastAPI's 6,800, 2.09ms p50 latency. Introducing TurboAPI. Here's the story...

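The claimed speedups line up with the raw numbers. A quick sanity check, using only the figures quoted in the two tweets:

```python
# Sanity-check the TurboAPI speedup claims against the quoted throughput numbers.
fastapi_rps = 6_800      # FastAPI baseline req/s (from the original tweet)
turbo_v1_rps = 47_832    # first TurboAPI announcement
turbo_v2_rps = 150_000   # after the follow-up optimizations

v1_speedup = turbo_v1_rps / fastapi_rps
v2_speedup = turbo_v2_rps / fastapi_rps

print(f"v1: {v1_speedup:.1f}x")  # 7.0x, matching the "7x" claim
print(f"v2: {v2_speedup:.1f}x")  # 22.1x, matching the "22x" claim
```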
BowTiedIguana | DeFi & Cybersecurity Researcher
People really think this LLM slop is the future of tech. Learn to code first. If you know what you're doing, LLMs can save some typing. If not, they're a technical debt multiplier. "You're right to question this. The VARIABLE at line 557 is not in scope; it's defined in a different method."
munchwrap (Hypurr Holder)
@seconds_0 Oh shit, what's the middle ground, a MacBook M5 Pro / Mini M4 Pro? Quite annoying at this point to do a cost-benefit analysis for a setup. People talk about buying used 3090s.
0.005 Seconds (3/694) @seconds_0
There will be a whole new class of models trained whose job is to call specialist models. MoE taken to the logical extreme: compaction models, search models, memory models, and above all the planning/dispatch model.
Cody Blakeney@code_star

I have no idea what specialized model for context compaction means and I have like 5 papers and announcements to read before I can think about this. It’s crazy that for even a single model we may have a whole ecosystem of specialized models for optimization. Spec decode model, compaction. What comes after that?

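The planner/dispatch idea can be sketched as a router that classifies a request and forwards it to a specialist. This is only an illustrative toy: the specialist names and the keyword heuristic are hypothetical stand-ins for what would actually be learned dispatch models, not any real system.

```python
# Toy sketch of a planning/dispatch layer routing requests to specialist
# models. All names and the keyword heuristic are hypothetical.
SPECIALISTS = {
    "compaction": "compactor-v0",  # summarizes/compresses long context
    "search": "searcher-v0",       # retrieves external information
    "memory": "memory-v0",         # reads/writes long-term memory
    "general": "generalist-v0",    # fallback model
}

def dispatch(request: str) -> str:
    """Pick a specialist for a request (toy keyword-based planner)."""
    text = request.lower()
    if "summarize" in text or "compress" in text:
        return SPECIALISTS["compaction"]
    if "find" in text or "look up" in text:
        return SPECIALISTS["search"]
    if "remember" in text or "recall" in text:
        return SPECIALISTS["memory"]
    return SPECIALISTS["general"]

print(dispatch("summarize this 200k-token transcript"))  # compactor-v0
print(dispatch("look up the latest release notes"))      # searcher-v0
```

In a real system the router itself would be a trained model and the "specialists" separate fine-tuned checkpoints; the structure (classify, then forward) is the same.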
munchwrap (Hypurr Holder)
@seconds_0 I've been really pondering this too: hold off and get some hardware next year, or buy a MacBook Air pointing to a self-hosted LLM vs. home hardware.
0.005 Seconds (3/694) @seconds_0
@munchwrap I basically don't do any self-hosting because I don't have good local computers. It's one of the things that I've never spent nearly enough time on, like I should.
Sudo su @sudoingX
local AI hardware tiers:
$4,699 - DGX Spark (NVIDIA wants you here)
$1,989 - RTX 4090 (overkill for most)
$1,000 - RTX 3090 used (sweet spot)
$250 - RTX 3060 used (currently testing every model that fits 12GB)
$0 - CPU only (it still works)
jensen announced the top. i've been posting receipts from the bottom.
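Whether a model "fits 12GB" is mostly weight-size arithmetic: parameters times bytes per weight, plus headroom for KV cache and activations. A rough back-of-envelope sketch; the 1.2x overhead factor is a loose assumption, not a measured figure:

```python
def approx_vram_gb(params_b: float, bits_per_weight: int, overhead: float = 1.2) -> float:
    """Rough VRAM estimate: weight bytes times an overhead factor for
    KV cache / activations (the 1.2 default is a loose assumption)."""
    weight_gb = params_b * bits_per_weight / 8  # 1B params at 8 bits ~ 1 GB
    return weight_gb * overhead

# A 7B model at 4-bit quantization comfortably fits a 12GB RTX 3060;
# the same model at 16-bit does not.
print(f"7B @ 4-bit : {approx_vram_gb(7, 4):.1f} GB")   # 4.2 GB
print(f"7B @ 16-bit: {approx_vram_gb(7, 16):.1f} GB")  # 16.8 GB
```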
munchwrap (Hypurr Holder)
@the_smart_ape Tried this for evaluating residency, although my corpus was too damn big and it took 6-8 hrs with crashes. Thinking of rewriting the Zep part to be self-hosted.
munchwrap (Hypurr Holder)
@BowTiedOsprey I think people just don't notice it, or it won't have 1:1 performance with Opus 4.6. It's like when Chinese frontier labs distill a bunch of these into their open-source models and release them after a few months.
BowTiedOsprey @BowTiedOsprey
@munchwrap How does Claude allow these to remain available on HuggingFace? Or is it just a matter of time before they get taken down?
BowTiedIguana | DeFi & Cybersecurity Researcher
Grok says: "The criticism stems from March 2026 reports that US military used Claude AI in Iran strikes, resulting in over 185 civilian deaths including schoolchildren, despite Anthropic's ethical guidelines."

If you think you can pay for this stuff, use it to "get ahead", and escape any consequences, you're either not very bright or not very worldly. Like the N95 / T cell stuff this will fall on deaf ears mostly, but if one person benefits...

Look into self-hosting the open-weights models. I'm putting some time into making this less annoying, might cover it. Don't buy their products. You don't need them but they need your $$$. 3 months behind the state of the art is good enough.
BowTiedIguana | DeFi & Cybersecurity Researcher@BowTiedIguana

@bcherny Already unsubscribed, but also unfollowed for supporting the US war effort. I get that you guys don't really have a choice, but the rest of the world doesn't have to buy from you. Bye.

munchwrap (Hypurr Holder)
@BowTiedIguana I've used LibreChat with open models, but you have to hook up or pay for something like Firecrawl/serper.dev to use the internet without getting blocked.
munchwrap (Hypurr Holder)
@BowTiedIguana The IP would be your Tailscale IP, pointing to a LiteLLM instance, which can translate between Anthropic <> OpenAI-compatible calls. I think Qwen expects OpenAI-style calls, so calling directly from CC won't work. As for 2, it's a bit more annoying imo.
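The translation a proxy like LiteLLM performs in that setup can be illustrated with a toy example: an Anthropic-style request (system prompt at the top level) reshaped into the OpenAI chat format that a self-hosted Qwen server expects. This is a simplified sketch, not LiteLLM's actual implementation; the model name is a placeholder, and the real proxy handles many more parameters.

```python
# Toy illustration of Anthropic-style -> OpenAI-compatible request
# translation (the job the LiteLLM proxy does in the setup above).
def anthropic_to_openai(payload: dict) -> dict:
    """Map an Anthropic-Messages-style payload to OpenAI chat format."""
    messages = []
    if "system" in payload:  # Anthropic keeps the system prompt at top level
        messages.append({"role": "system", "content": payload["system"]})
    messages.extend(payload.get("messages", []))  # OpenAI folds it into messages
    return {
        "model": payload["model"],
        "messages": messages,
        "max_tokens": payload.get("max_tokens", 1024),
    }

req = anthropic_to_openai({
    "model": "qwen2.5-coder",  # placeholder model name
    "system": "You are a coding assistant.",
    "max_tokens": 512,
    "messages": [{"role": "user", "content": "hello"}],
})
print(req["messages"][0]["role"])  # system
```

On the wire, the client would then point at the proxy over the Tailscale IP (e.g. `http://<tailscale-ip>:<port>`); the proxy forwards the translated request to the Qwen backend.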