Zia Babar

@ziababar

Toronto, Canada · Joined May 2009

1.2K Following · 276 Followers

Zia Babar @ziababar · Apr 26

5/ Fine-tuning teaches a student. RAG hands a stranger the textbook. Recursive inference lets the stranger talk to themselves while reading. All three can produce correct answers. Only one will still know anything tomorrow.

Zia Babar @ziababar · Apr 26

4/ Recursive inference is the same trap with extra steps. Chain-of-thought, agentic loops, self-reflection. The model reads its own scratchpad, refines, erases. Better answers, identical weights. The "reasoning" lives in the prompt, not the network.

Zia Babar @ziababar · Apr 26

1/ RAG isn't "giving the model knowledge." It's an open-book exam: the model never learned the material; it was just handed the relevant pages mid-answer. The "knowledge" isn't internalized in the model; it's copied into the prompt.

Zia Babar @ziababar · Apr 26

8/ CoT gets dismissed partly because it's been overhyped and partly because it looks like a hack that anthropomorphizes the model as 'thinking.' But the mechanism is real: it changes the optimization surface so that correct solutions are easier to reach.

Zia Babar @ziababar · Apr 26

7/ Those tokens aren't dead weight. In a world where inference compute is increasingly the bottleneck, spending more tokens per query to get better answers is the right strategy. You're shifting compute budget from training to inference in service of quality.

Zia Babar @ziababar · Apr 26

1/ Chain-of-thought gets dismissed as 'just printing steps,' but I've watched models use it to correct trajectories and catch errors they'd otherwise cement. It's not scaffolding for the human; it's real leverage on correctness. A thread on why this matters.

Zia Babar @ziababar · Apr 25

4/ If you're building AI products, your real bottleneck isn't model capability. It's operational efficiency at scale. Most teams are still optimizing the wrong layer of the stack.

Zia Babar @ziababar · Apr 25

3/ Quantization, distillation, attention optimization, speculative decoding, and batch serving strategies are the problems that will separate winners from the rest. The winning companies won't be the ones with the largest models. They'll be the ones that serve them most cheaply.

Zia Babar @ziababar · Apr 25

1/ Inference optimization, not model scale, will determine the next era of AI infrastructure. Most builder narratives focus on training and capacity. What they miss is that every deployed model hits hard constraints around latency budgets, cost per token, and power efficiency.

Zia Babar @ziababar · Apr 23

Most agentic systems score well on benchmarks but struggle in production. Not because the model is weak, but because the action space is poorly defined. Why do we optimize for accuracy instead of robustness in eval setups?

Zia Babar @ziababar · Apr 22

4/ We're building sophisticated execution engines, not genuine agents. When a system finally bears real consequences for its own choices (for example, when reversals are expensive or impossible), watch how differently it behaves. That's when it gets interesting.

Zia Babar @ziababar · Apr 22

3/ Real agents (humans, companies, evolved systems) make decisions that can't be undone. Hiring someone. Shipping product. Signing contracts. Acting on incomplete information. Living with the gap between what they intended and what happened.

Zia Babar @ziababar · Apr 22

1/ Most 'agentic' systems today are still just highly constrained action loops. Real agency requires tolerating irreversible decisions in uncertain environments. However, we're nowhere near that comfort level yet.
