tim

15.8K posts

@NERDDISCO

dx @runpod ⚉ co-org @techeurope_ applied ai conf ⚉ building

Germany · Joined December 2011

719 Following · 2.2K Followers

Pinned Tweet

tim@NERDDISCO·Feb 28

new AGENTS.md

---
this document exists for non-obvious, error-prone shortcomings in the codebase, the model, or the tooling that an agent cannot figure out by reading the code alone. no architecture overviews, file trees, build commands, or standard behavior.

when you encounter something that belongs here, first consider whether a code change could eliminate it and suggest that to the user. only document it here if it can't be reasonably fixed.
---

tim@NERDDISCO

remove ~90% of toxic & costly context with this prompt:

> remove everything from CLAUDE.md/AGENTS.md that can be inferred from the codebase, including high-level architecture descriptions, file trees, cli usage, build commands, and examples of standard behavior. keep only non-obvious, failure-prone decisions and hidden constraints that are not explicit in the code but would cause mistakes if misunderstood. the final file should read like a sharp-edges and gotchas document, not a project overview

i am currently doing this in all my projects and it feels sooo good

thx for the awesome research @nielstron, @tibglo & rest of the team

tim@NERDDISCO·10h

@0xSero awesome, let’s do that!

0xSero@0xSero·10h

@NERDDISCO I definitely need help. The codebase is in a real mess so I need to slowly swap out components until it’s scalable. Maybe we can work on the vLLM piece together

0xSero@0xSero·12h

I am going all in on vllm-studio. In the past my take was that if Claude can figure it out, then people should figure it out too. I've also just been doing whatever comes to mind, but I am going to trim out most of the code and focus on a desktop Electron app. Good UX coming soon

tim@NERDDISCO·1d

@Prince_Canuma 👀👀👀

Prince Canuma@Prince_Canuma·1d

Unlocked 4x speed up with this 🚀

Prince Canuma@Prince_Canuma

Wow, added a tweak inspired by @skalskip92’s supervision annotators, and now Sam3 label annotations on MLX take 6ms 🤯 Check out supervision, guys!

tim@NERDDISCO·3d

@LukyVJ 😂

Luky - A$AP Luky@LukyVJ·3d

@NERDDISCO Lmao 🤣 at first I was like « great job » Then, I read the tweet 🐥

tim@NERDDISCO·3d

asked codex to create a clean layout without any pills or cards, i think it did a great job

Theo - t3.gg@theo

This post was so bad that it made me crash out a bit. OpenAI models are not good at frontend. The examples in this article are embarrassing and I actually can't believe they posted it. x.com/sherwinwu/stat…

tim@NERDDISCO·3d

@andrey_cheptsov have no time, need to respond

Andrey Cheptsov@andrey_cheptsov·3d

@NERDDISCO I bet you are behind the scene reading the messages and clicking buttons)

tim@NERDDISCO·3d

one of my projects is finally live

Runpod@runpod

We put a chat interface in the Runpod console. 23 tools across the full REST API. If you can do it in the dashboard, you can ask for it in chat. Find it in the console ;)

tim@NERDDISCO·3d

@crislenta @supercell @ipaananen wow, this looks super awesome

Cris Lenta@crislenta·4d

😵 5 star hotel for a private AI hackathon by @supercell

WE HAVE A PRIVATE CHEF
> incredible breakfast
> snacks, fruits, drinks
> claude code credits 😂
> the goat @ipaananen in the house
> vibe is off the charts

THE FINNS ARE SETTING A NEW STANDARD

tim@NERDDISCO·4d

@0xSero as it should be faster than llama.cpp?

0xSero@0xSero·4d

I am going all in on ExLlamaV3. This is the middle ground between fast, performant, works on consumer cards, and intelligent. vLLM and SGLang are my go-to, but they're too finicky below certain bit widths.

tim@NERDDISCO·4d

@levelsio yes

@levelsio@levelsio·5d

Okay let's see who can reply to this

tim@NERDDISCO·5d

@Prince_Canuma this is super cool

Prince Canuma@Prince_Canuma·6d

Just implemented Google’s TurboQuant in MLX and the results are wild!

Needle-in-a-haystack using Qwen3.5-35B-A3B across 8.5K, 32.7K, and 64.2K context lengths:
→ 6/6 exact match at every quant level
→ TurboQuant 2.5-bit: 4.9x smaller KV cache
→ TurboQuant 3.5-bit: 3.8x smaller KV cache

The best part: Zero accuracy loss compared to full KV cache.

Google Research@GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

tim@NERDDISCO·6d

1.0107 15.5mb ttt lr=0.0032, 12ep

tim@NERDDISCO·6d

1.0516 15.7mb ttt lr=0.0008, 8ep, 8 blocks

tim@NERDDISCO·23 Mar

1.1276 = autoresearch + 8xh100 sxm for 48h

12l int5 gptq, mlp 3.5x, leakyrelu(0.5)², full mha, gated attn, value residual, ema 0.997, ttt 3ep, xsa-all, partial rope, ln scale, ve128, bigramhash 8192, early qat, prune 2%

thx to @runpod for the compute o7

OpenAI@OpenAI

Are you up for a challenge? openai.com/parameter-golf

tim@NERDDISCO·6d

@max4c_ @bcherny nothing less o7

Max Forsey@max4c_·6d

@bcherny @NERDDISCO goals for runpod labs

Boris Cherny@bcherny·6d

Little known fact, the Anthropic Labs team (the team I joined Anthropic to be on) shipped:
- MCP
- Skills
- Claude Desktop app
- Claude Code

It was just a few of us, shipping fast, trying to keep pace with what the model was capable of. Those early Desktop computer use prototypes, back in the Sonnet 3.6 days, felt clunky and slow. But it was easy to squint and imagine all the ways people might use it once it got really good.

Fast forward to today. I am so excited to release full computer use in Cowork and Dispatch. Really excited to see what you do with it!

Claude@claudeai

You can now enable Claude to use your computer to complete tasks. It opens your apps, navigates your browser, fills in spreadsheets—anything you'd do sitting at your desk. Research preview in Claude Cowork and Claude Code, macOS only.
