Zia Babar

1.3K posts

@ziababar

Applied AI | Data Engineering | Data Platforms | ex-@PwC | ex-@Teradata | ex-@IBM Researcher | PhD @UofT

Toronto, Canada · Joined May 2009
1.2K Following · 276 Followers
Zia Babar@ziababar·
5/ Fine-tuning teaches a student. RAG hands a stranger the textbook. Recursive inference lets the stranger talk to themselves while reading. All three can produce correct answers. Only one will still know anything tomorrow.
Zia Babar@ziababar·
4/ Recursive inference is the same trap with extra steps. Chain-of-thought, agentic loops, self-reflection. The model reads its own scratchpad, refines, erases. Better answers, identical weights. The "reasoning" lives in the prompt, not the network.
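A toy sketch of that point, not a real LLM: the "model" here is a fixed function, the `initial_guess` weight and the Newton-step refinement are illustrative assumptions, and all progress lives in the scratchpad list that gets fed back in.

```python
def model_step(weights, question, scratchpad):
    """Toy fixed-weight 'model': read the last scratchpad entry and refine it."""
    guess = scratchpad[-1] if scratchpad else weights["initial_guess"]
    # One Newton step toward sqrt(question); the weights are only read, never written.
    return 0.5 * (guess + question / guess)


def recursive_inference(weights, question, steps):
    """Feed the model its own scratchpad for a fixed number of refinement steps."""
    scratchpad = []
    for _ in range(steps):
        scratchpad.append(model_step(weights, question, scratchpad))
    return scratchpad[-1], scratchpad


weights = {"initial_guess": 1.0}  # hypothetical stand-in for frozen parameters
one_pass, _ = recursive_inference(weights, 2.0, steps=1)
five_pass, _ = recursive_inference(weights, 2.0, steps=5)
```

More passes give a better estimate of the square root, yet `weights` is identical before and after. Throw away the scratchpad and the next query starts from the same initial guess.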
Zia Babar@ziababar·
1/ RAG isn't "giving the model knowledge." It's an open-book exam: the model never learned the material; it was just handed the relevant pages mid-answer. The "knowledge" isn't internalized in the model; it's copied into the prompt.
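A minimal sketch of the open-book framing, under loud assumptions: `FrozenModel`, `DOCS`, `retrieve`, and `rag_answer` are hypothetical names, the keyword lookup stands in for a real vector search, and the "model" can only repeat facts that appear in its prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenModel:
    """Stand-in for an LLM whose weights never change at inference time."""
    weights: tuple = (0.1, 0.2, 0.3)

    def answer(self, prompt: str) -> str:
        # Toy behaviour: the model can only repeat a fact found in the prompt.
        for line in prompt.splitlines():
            if line.startswith("FACT:"):
                return line.removeprefix("FACT:").strip()
        return "I don't know."


# Hypothetical document store; a real system would use embeddings and an index.
DOCS = {"capital of france": "Paris is the capital of France."}


def retrieve(query: str) -> list[str]:
    """Keyword lookup standing in for retrieval."""
    return [text for key, text in DOCS.items() if key in query.lower()]


def rag_answer(model: FrozenModel, query: str) -> str:
    """Copy retrieved text into the prompt, then ask the unchanged model."""
    context = "\n".join(f"FACT: {doc}" for doc in retrieve(query))
    return model.answer(f"{context}\nQUESTION: {query}")
```

With retrieval, the toy model answers the question; asked the same question with no retrieved context, it falls back to "I don't know." The weights are the same in both cases.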
Zia Babar@ziababar·
8/ CoT gets dismissed partly because it's been overhyped and partly because describing the model as 'thinking' makes it look like an anthropomorphic hack. But the mechanism is real. It changes the optimization surface so that correct solutions are easier to reach.
Zia Babar@ziababar·
7/ Those tokens aren't dead weight. In a world where inference compute is increasingly the bottleneck, spending more tokens per query to get better answers is the right strategy. You're shifting compute budget from training to inference in service of quality.
Zia Babar@ziababar·
1/ Chain-of-thought gets dismissed as 'just printing steps,' but I've watched models use it to correct trajectories and catch errors they'd otherwise cement. It's not scaffolding for the human; it's real leverage on correctness. A thread on why this matters.
Zia Babar@ziababar·
4/ If you're building AI products, your real bottleneck isn't model capability. It's operational efficiency at scale. Most teams are still optimizing the wrong layer of the stack.
Zia Babar@ziababar·
3/ Quantization, distillation, attention optimization, speculative decoding, and batch serving strategies are the problems that will separate winners from the rest. The winning companies won't be the ones with the largest models. They'll be the ones that serve them most cheaply.
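To make the first item on that list concrete, here is a rough sketch of symmetric per-tensor int8 quantization in plain Python. The function names are hypothetical and production systems use library kernels instead. The idea is to store each weight as a small integer plus one shared scale, trading a bounded rounding error for roughly 4x less memory than float32.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    # Fall back to a scale of 1.0 if every value is zero.
    scale = max(abs(v) for v in values) / 127 or 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and the shared scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]  # illustrative values, not real model weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each restored value differs from the original by at most half the scale, which is the accuracy cost paid for the cheaper serving footprint.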
Zia Babar@ziababar·
1/ Inference optimization will determine the next era of AI infrastructure, not model scale. Most builder narratives focus on training and capacity. What they miss is that every deployed model hits hard constraints around latency budgets, cost per token, and power efficiency.
Zia Babar@ziababar·
Most agentic systems score well on benchmarks but struggle in production. Not because the model is weak, but because the action space is poorly defined. Why do we optimize for accuracy instead of robustness in eval setups?
Zia Babar@ziababar·
4/ We're building sophisticated execution engines, not genuine agents. When the system finally bears real consequences for its own choices (as when reversals are expensive or impossible), watch how differently it behaves. That's when it gets interesting.
Zia Babar@ziababar·
3/ Real agents (humans, companies, evolved systems) make decisions that can't be undone. Hiring someone. Shipping product. Signing contracts. Acting on incomplete information. Living with the gap between what they intended and what happened.
Zia Babar@ziababar·
1/ Most 'agentic' systems today are still just highly constrained action loops. Real agency requires tolerating irreversible decisions in uncertain environments, and we're nowhere near that comfort level yet.