hammad 🔍
@HammadTime
normal considered harmful | cto @trychroma

"Glad tidings to he who knows his own faults more than other people know it." — Ibn Hazm al-Andalusi



The run on inference capacity is coming. You have been warned.


#3 - At first, capability gets discovered outside the model in prompts, chains, routers, tools, human supervision, and harnesses. As models improve, more of that gets trained in. This is part of why we bias toward giving models filesystem tools today. These tools are already being post-trained in. What happens when models get better at generic tool use and composition? (They will.) Is your system structured in a way that can accommodate that? Can you take advantage of it when it happens? We’ve seen similar patterns before — in computer vision, and in hardware (e.g. northbridge/southbridge consolidation). Component consolidation is a fairly natural outcome in engineering systems.
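
As a purely illustrative sketch, this is roughly what "giving models filesystem tools" looks like at the interface level - a declarative tool schema in the common JSON-schema tool-calling shape, plus the harness-side handler. Names and fields here are an assumption, not any specific vendor's API:

```python
# Illustrative filesystem tool definition in the common JSON-schema
# tool-calling shape. Everything here is a sketch, not a vendor API.
read_file_tool = {
    "name": "read_file",
    "description": "Read a UTF-8 text file from the agent's workspace.",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {
                "type": "string",
                "description": "Workspace-relative path to the file.",
            }
        },
        "required": ["path"],
    },
}

def read_file(path: str) -> str:
    """Harness-side handler the schema points at. Today this logic lives
    outside the model - which is exactly the capability that gets trained in."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()
```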


Introducing FlashCompact - the first specialized model for context compaction.

33k tokens/sec
200k → 50k tokens in ~1.5s

Fast, high-quality compaction.

#2 - Most economically valuable data is siloed. And learning is often data-bound. So it’s unlikely that we get a single frontier model that you can drop into any environment and expect to perform incredibly well zero-shot. What we get instead are systems composed of specialized models, each adapted to a particular environment. We're seeing this today with many SLM companies focused on narrower domains, and the advent of industry-specific RL. If large frontier models get better at orchestrating / communicating with other models, how do you take advantage of this?
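
A hypothetical sketch of that composed-systems shape - a frontier model playing router over environment-specific specialists. Every name below is illustrative:

```python
from typing import Callable

# Illustrative only: each "specialist" stands in for a small model adapted
# to one siloed environment (its training data never left that silo).
Specialist = Callable[[str], str]

SPECIALISTS: dict[str, Specialist] = {
    "legal": lambda task: f"[legal specialist] {task}",
    "medical": lambda task: f"[medical specialist] {task}",
    "logistics": lambda task: f"[logistics specialist] {task}",
}

def frontier_route(task: str) -> str:
    """Stand-in for the frontier model's orchestration step: decide which
    specialist owns this task (here, a trivial keyword match)."""
    for domain in SPECIALISTS:
        if domain in task.lower():
            return domain
    return "logistics"  # fallback specialist for the sketch

def run(task: str) -> str:
    """Frontier model delegates; the specialist does the environment-specific work."""
    return SPECIALISTS[frontier_route(task)](task)

print(run("summarize this medical intake form"))
```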


HOLY FUK I JUST LEARNED ABOUT TLA+ AND IT'S SO GOOD FOR AGENTIC CODING ur telling ME that i can mathematically fact check every possible scenario of my design STATE to prevent bugs and crashes AND IF IT FINDS SOMETHING THE AGENTS GET INSTANT FEEDBACK AND LOOP FIXING IT TILL ALL POSSIBLE BUGS IN THE DESIGN ARE PATCHED LOL THIS IS OP
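
A minimal sketch of that check-and-fix loop, assuming TLC (the TLA+ model checker) is invokable as `tlc` on PATH, with a hypothetical `ask_agent_to_fix` hook standing in for the coding agent:

```python
import subprocess

def check_spec(spec_path: str) -> tuple[bool, str]:
    """Run TLC on a TLA+ spec; exit code 0 means no violations were found."""
    result = subprocess.run(
        ["tlc", spec_path],  # assumes a `tlc` launcher is on PATH
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stdout + result.stderr

def ask_agent_to_fix(spec_path: str, tlc_output: str) -> None:
    """Hypothetical hook: hand TLC's counterexample trace to a coding agent,
    e.g. prompt an LLM with the violated invariant and the error trace."""
    raise NotImplementedError

def fix_until_clean(spec_path: str, max_rounds: int = 5) -> bool:
    """Check, feed counterexamples back to the agent, re-check until clean."""
    for _ in range(max_rounds):
        ok, output = check_spec(spec_path)
        if ok:
            return True  # every explored state satisfies the invariants
        ask_agent_to_fix(spec_path, output)
    return False
```

One caveat on "every possible scenario": TLC exhaustively explores the spec's reachable state space, so the guarantee covers every state reachable under the spec's next-state relation - a design-level check, not a proof about the eventual implementation.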

Phil Mickelson talking about how he calculates yardages is incredible


Collection Forking on Chroma Cloud unlocks faster workflows on top of your data without the overhead of starting from scratch.

me: "can you use whatever resources you like, and python, to generate a short 'youtube poop' video and render it using ffmpeg ? can you put more of a personal spin on it? it should express what it's like to be a LLM" claude opus 4.6:








