Indranil Chandra

814 posts

Indranil Chandra
@IndranilChandra

Creative Technologist | Solutions Architect | ML, Data & AI Engineering Practitioner

Mumbai, Maharashtra · Joined August 2012
1.6K Following · 357 Followers
Yash Kalwani@WHYkalwani·
Wrapped up the @cursor_ai Bengaluru meetup with @sanjeed_i. Loved the vibe, loved the venue, and the fact that some of the best AI talent in Bengaluru pulled up for the event. (Wish Mumbai had open terrace venues like this.)
Yash Kalwani tweet media
7 replies · 1 repost · 43 likes · 1.4K views
Anubhav Singh@xprilion·
Your ML research doesn't have to be restricted to the @huggingface ecosystem. As cool as HF is (much love for the 🤗 guys), I love platform independence. Inspired by their awesome ml-intern repo: OpenMLR, an end-to-end ML research AI agent, running locally: your own LLM + your own compute (machine/LAN/cloud/sandbox) for code execution! 👉 openmlr.dev It's MIT-licensed, can run in air-gapped environments, and is yours to own. ⭐️ github.com/xprilion/OpenM…
Anubhav Singh tweet media
1 reply · 2 reposts · 16 likes · 307 views
Indranil Chandra retweeted
Akshay 🚀@akshay_pachaar·
Design principles for building an agent harness.

Most agent builders get three of the seven core design decisions exactly backwards. Every production agent harness is the result of seven architectural bets: agent count, reasoning strategy, context strategy, verification, permissions, tool scoping, and harness thickness. On three of these, the obvious answer is the wrong one.

𝗠𝗼𝗿𝗲 𝘁𝗼𝗼𝗹𝘀 𝗺𝗲𝗮𝗻𝘀 𝗮 𝗺𝗼𝗿𝗲 𝗰𝗮𝗽𝗮𝗯𝗹𝗲 𝗮𝗴𝗲𝗻𝘁.

This is the first intuition that breaks. More tools feels like more capability, the same way more options on a menu feels like a better restaurant. It isn't. Every tool you expose to the model eats context, adds a decision point, and creates another chance for the model to pick the wrong function for the job.

Vercel cut 80% of the tools from v0 and the agent got better. Claude Code dynamically loads only the tools needed for the current step and cuts context by 95%. The principle is the opposite of what most teams ship with. A bloated toolkit looks like capability and behaves like cognitive load.

𝗥𝗲𝗔𝗰𝘁 𝗶𝘀 𝘁𝗵𝗲 𝗺𝗼𝗱𝗲𝗿𝗻 𝘄𝗮𝘆. 𝗣𝗹𝗮𝗻𝗻𝗶𝗻𝗴 𝗶𝘀 𝗼𝗹𝗱 𝘀𝗰𝗵𝗼𝗼𝗹.

ReAct is the default pattern in most tutorials. Think, act, observe, repeat. It feels sophisticated because the model reasons at every step. But reasoning at every step is expensive. Plan-and-execute, where the agent makes a plan once and then runs through it, hits 3.6x faster on many workloads.

The reason is simple. Most steps in a multi-step task don't need fresh reasoning. They need execution. ReAct buys adaptability. Plan-and-execute buys speed and predictability. For bounded tasks with clear structure, planning once and executing wins cleanly. The "obviously more advanced" pattern is often the worse choice.

𝗣𝗲𝗿𝗺𝗶𝘀𝘀𝗶𝘃𝗲 𝗵𝗮𝗿𝗻𝗲𝘀𝘀𝗲𝘀 𝘀𝗵𝗶𝗽 𝗳𝗮𝘀𝘁𝗲𝗿. 𝗥𝗲𝘀𝘁𝗿𝗶𝗰𝘁𝗶𝘃𝗲 𝗼𝗻𝗲𝘀 𝘀𝗹𝗼𝘄 𝘆𝗼𝘂 𝗱𝗼𝘄𝗻.

This is the one that burns teams in production. Permissive harnesses feel fast in development. The agent just works. It calls tools, mutates state, takes actions. No friction. No approval gates. Then it ships. And the first time the agent does something irreversible that it shouldn't have, the post-mortem starts.

Restrictive harnesses feel slow because they ask for confirmation on high-stakes operations. That friction is the feature. A gated tool call is a tool call you can still recover from. The teams that ship permissive harnesses to production are the ones who haven't yet had the incident that makes them switch.

𝗧𝗵𝗲 𝗽𝗮𝘁𝘁𝗲𝗿𝗻

All three mistakes share a shape. The intuitive answer optimizes for what feels good during development. More capability on display, more reasoning happening, less friction in the loop. The correct answer optimizes for how the agent actually performs under real workloads. Less context pressure, fewer wasted LLM calls, fewer irreversible mistakes.

The diagram below lays out all seven decisions. The ones above are where most teams are currently betting wrong. The article goes deep on each trade-off, with examples from how Anthropic, OpenAI, CrewAI, and LangChain have actually answered them.

I'm also building a minimal agent harness from scratch. Didactic, easy to read, no magic. Open-sourcing it soon. Stay tuned.
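The gating idea from the permissive-vs-restrictive section can be sketched in a few lines: a hypothetical harness routes high-stakes tool names through an approval callback before executing them. The `HIGH_STAKES` set, tool names, and callback signature are illustrative, not from any specific framework.

```python
# Sketch of a restrictive harness: irreversible tools are gated behind an
# approval callback, so a bad call can be blocked before anything runs.
# The HIGH_STAKES set and tool names here are illustrative.
HIGH_STAKES = {"delete_file", "send_email", "deploy"}

def run_tool(name, args, tools, approve):
    """Execute a tool, asking `approve(name, args)` first for risky ones."""
    if name in HIGH_STAKES and not approve(name, args):
        return {"status": "blocked", "tool": name}  # recoverable: nothing ran
    return {"status": "ok", "result": tools[name](**args)}
```

In development, `approve` can be a terminal prompt; in production it might queue the call for human review. Either way, a blocked call is the recoverable state the thread describes.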
Akshay 🚀 tweet media
x.com/i/article/2040…
9 replies · 72 reposts · 430 likes · 53.5K views
Indranil Chandra retweeted
Kevin Simback 🍷@KSimback·
🚀 Want FREE models you can plug into OpenClaw or Hermes?

Here are 9 resources you can use for free access to model APIs. No local setup, no credit card, just pure cloud APIs with OpenAI-compatible endpoints. You can't get free Opus quality (yet), but all of these have genuine free tiers right now (rate limits may apply) and are good enough if you don't want to spend $ to get started with agents.

1️⃣ OpenRouter Free Models (Gemma 4 31B/26B, NVIDIA Nemotron 3 Super 120B MoE, MiniMax M2.5, Qwen3 variants, Llama 4/3.3, gpt-oss-120B, Arcee Trinity, etc.)
• ~29 completely free $0/M token models
• Insane variety + top-tier open model evals (especially coding & agents)
• Best for rotating models automatically
👉 Sign up: openrouter.ai/keys

2️⃣ Google Gemini API (Gemini 2.5 Pro / Flash series)
• Strongest overall free frontier model
• Excellent multimodal, 1M+ context, native tool calling & agentic performance
• Very generous free limits (often 5–15 RPM)
👉 Sign up: aistudio.google.com/app/apikey

3️⃣ NVIDIA (Nemotron variants, Llama 3.3 70B, Qwen3 235B, Mistral Large, etc.)
• Optimized high-performance open models
• Free prototyping tier (~40 RPM)
👉 Sign up: build.nvidia.com/explore/discov…

4️⃣ Groq Cloud (Llama 4 Scout, Llama 3.3 70B, Qwen3 32B, gpt-oss models, etc.)
• Blazing-fast inference (hundreds of tokens/sec)
• Perfect for real-time agents
• Strong open-model performance with a solid free tier
👉 Sign up: console.groq.com/keys

5️⃣ Cerebras Cloud (Qwen3 235B, Llama 3.3 70B, DeepSeek variants, etc.)
• Massive models with excellent reasoning/coding evals
• Very generous daily free limits (~30 RPM, up to 1M+ tokens/day on some)
👉 Sign up: cloud.cerebras.ai

6️⃣ Mistral La Plateforme (Mistral Large 3, Small 3.1, Ministral 8B, etc.)
• Strong in coding, multilingual & agentic tasks
• Solid free tier (~1 req/s, ~1B tokens/month)
👉 Sign up: console.mistral.ai/api-keys

7️⃣ Cohere (Command A, Command R+, Aya Expanse 32B, etc.)
• Free tier: 20 RPM, 1K requests/month
👉 Sign up: dashboard.cohere.com/api-keys

8️⃣ GitHub Models (Llama 3.3 70B, DeepSeek R1, some GPT-4o previews, etc.)
• Decent mid-tier evals with easy GitHub integration
• Free tier limits (10–15 RPM)
👉 Sign up: github.com/marketplace/mo…

9️⃣ Cloudflare Workers AI (Llama 3.3 70B, Qwen QwQ 32B, etc.)
• Lightweight but solid for simple agents
• Free tier: 10K neurons/day
👉 Sign up: dash.cloudflare.com/profile/api-to…

Pro tips for agent builders:
• Most work instantly with the OpenAI SDK (just change the base URL + your key)
• Start with OpenRouter for quality/variety (they often feature new free models)
• Add Groq as a speed fallback
• Rotate providers when you hit caps

Free intelligence for your agent is just a signup away!
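The last two pro tips (Groq as a speed fallback, rotating providers when you hit caps) amount to a simple fallback loop over OpenAI-compatible endpoints. A minimal sketch of that pattern; `make_call` is a caller-supplied function (e.g. wrapping an OpenAI-SDK chat completion), and in real code you would catch the SDK's rate-limit error specifically rather than a bare `Exception`:

```python
# Sketch of the "rotate providers when you hit caps" tip: try each
# OpenAI-compatible provider in order and fall through to the next one
# on failure. `make_call(provider)` is supplied by the caller.
def complete_with_fallback(make_call, providers):
    """Return the first successful make_call(provider) result."""
    last_err = None
    for provider in providers:
        try:
            return make_call(provider)
        except Exception as err:
            last_err = err  # rate-limited or down: try the next provider
    raise RuntimeError("all providers exhausted") from last_err
```

For example, `complete_with_fallback(call, ["openrouter", "groq"])` tries OpenRouter first and falls back to Groq, matching the ordering the thread suggests.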
17 replies · 105 reposts · 773 likes · 103.4K views
Indranil Chandra retweeted
Dhruv@dhruvtwt_·
Why is no one talking about this?

@nvidia is offering around 80 AI models via hosted APIs absolutely for free. You get access to MiniMax M2.7, GLM 5.1, Kimi 2.5, DeepSeek 3.2, GPT-OSS-120B, Sarvam-M etc. This plugs straight into OpenClaude, OpenCode, Zed IDE, Hermes agent and even Cursor IDE.

Setup:
– Grab API key: build.nvidia.com/models
– base_url = "integrate.api.nvidia.com/v1"
– api_key = "$NVIDIA_API_KEY"
– select model (e.g. minimaxai/minimax-m2.7)

If you're building or experimenting, this is basically free inference. Lock in and start building today anon. Thank me later.
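The setup steps in the post translate almost directly into configuration for the OpenAI SDK. A minimal sketch; the model id is the one named in the post and may not be current, and the key is read from the environment:

```python
# Sketch of the setup above as plain configuration: the resulting dict
# feeds straight into the OpenAI SDK pointed at NVIDIA's OpenAI-compatible
# endpoint. The default model id follows the post and may change.
import os

def nvidia_config(model: str = "minimaxai/minimax-m2.7") -> dict:
    """Build client/model settings for NVIDIA's hosted API."""
    return {
        "base_url": "https://integrate.api.nvidia.com/v1",
        "api_key": os.environ.get("NVIDIA_API_KEY", ""),
        "model": model,
    }

# Usage (requires `pip install openai` and a real NVIDIA_API_KEY):
#   from openai import OpenAI
#   cfg = nvidia_config()
#   client = OpenAI(base_url=cfg["base_url"], api_key=cfg["api_key"])
#   resp = client.chat.completions.create(
#       model=cfg["model"],
#       messages=[{"role": "user", "content": "Hello!"}],
#   )
```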
Dhruv tweet media
536 replies · 1.9K reposts · 18.3K likes · 1.6M views
Indranil Chandra retweeted
ClaudeDevs@ClaudeDevs·
New blog: Building agents that reach production systems with MCP. When should agents use direct APIs vs CLIs vs MCP? Plus patterns for building MCP servers, context-efficient clients and pairing MCP with skills. claude.com/blog/building-…
91 replies · 320 reposts · 3.3K likes · 466.9K views
Jayita Bhattacharyya (JB)@jayitabhattac11·
@IndranilChandra Did a dig into the library. Turns out the backend is in Java, with some unconventional OCR tricks: X-Y-cut++ and text clustering instead of direct LLM/VLM usage. Although there is a hybrid option: it has compatibility with docling.
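For context on the technique named above: classic recursive X-Y cut segments a page by finding whitespace gaps in the projections of text bounding boxes, alternating horizontal and vertical cuts until no gap remains. A toy sketch of that idea (not the library's implementation; the X-Y-cut++ variant adds many refinements):

```python
# Toy sketch of recursive X-Y cut: group text boxes (x0, y0, x1, y1) by
# whitespace gaps in their y-projection, then x-projection, recursing
# until no gap of at least `min_gap` remains. The output approximates
# reading order. Illustrative only, not the library's implementation.
def xy_cut(boxes, min_gap=10, axis=1):
    if len(boxes) <= 1:
        return [boxes]
    lo, hi = axis, axis + 2              # min/max coordinate on this axis
    boxes = sorted(boxes, key=lambda b: b[lo])
    groups, current, reach = [], [boxes[0]], boxes[0][hi]
    for b in boxes[1:]:
        if b[lo] - reach >= min_gap:     # whitespace gap: start a new group
            groups.append(current)
            current = [b]
        else:
            current.append(b)
        reach = max(reach, b[hi])
    groups.append(current)
    if len(groups) == 1:                 # no gap on this axis
        if axis == 1:
            return xy_cut(boxes, min_gap, axis=0)  # try vertical cuts
        return [boxes]                   # no gap either way: atomic block
    out = []
    for g in groups:                     # recurse on the other axis
        out.extend(xy_cut(g, min_gap, axis=1 - axis))
    return out
```

On a two-column layout, the first cuts separate rows and columns, and each leaf group becomes one block in reading order.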
1 reply · 0 reposts · 1 like · 31 views
Indranil Chandra retweeted
Boris Cherny@bcherny·
Opus 4.7 feels more intelligent, agentic, and precise than 4.6. It took a few days for me to learn how to work with it effectively, to fully take advantage of its new capabilities. Will post a few more tips throughout the day, starting with this blog post: claude.com/blog/best-prac…
267 replies · 618 reposts · 6.8K likes · 762.9K views
Indranil Chandra retweeted
Boris Cherny@bcherny·
Dogfooding Opus 4.7 the last few weeks, I've been feeling incredibly productive. Sharing a few tips to get more out of 4.7 🧵
336 replies · 1.1K reposts · 11.8K likes · 1.6M views
Indranil Chandra retweeted
Akshay 🚀@akshay_pachaar·
What does every big company think about the agent harness?

Anthropic, OpenAI, CrewAI, LangChain. They all build agents. They all wrap their models in infrastructure to make them useful. They each call it the harness. But they agree on one thing. And disagree on everything else.

The agreement: the model is not the product. The infrastructure around the model is. The disagreement: how much of that infrastructure should exist. This is the most important architectural bet in AI right now. And each company is placing a different one.

𝗔𝗻𝘁𝗵𝗿𝗼𝗽𝗶𝗰 bets on the model. Their harness is deliberately thin. A "dumb loop" that assembles the prompt, calls the model, executes tool calls, and repeats. The model makes all the decisions. The harness just manages turns. Their bet: as models get smarter, you need less infrastructure, not more.

𝗢𝗽𝗲𝗻𝗔𝗜 takes a similar but slightly thicker approach. Their Agents SDK is "code-first," meaning workflow logic lives in native Python, not in some graph DSL. But they add more structure: strict priority stacks for instructions, multiple orchestration modes, and explicit agent handoff patterns.

𝗖𝗿𝗲𝘄𝗔𝗜 adds a deterministic backbone. Their Flows layer handles routing and validation with hard-coded logic, while their Crews handle the autonomous parts. Intelligence where it matters, control everywhere else.

𝗟𝗮𝗻𝗴𝗚𝗿𝗮𝗽𝗵 bets on explicit control. The harness encodes the logic. Every decision point is a node in a graph. Every transition is a defined edge. Planning steps, routing strategies, multi-step workflows are all spelled out in the harness, not left to the model.

Notice the spectrum. On one end: trust the model, keep the harness thin. On the other: encode the logic, make the harness thick. And here's where it gets interesting.

The scaffolding metaphor makes this concrete. Construction scaffolding is temporary infrastructure that lets workers reach floors they couldn't access otherwise. It doesn't do the building. But without it, workers can't reach the upper floors. The key word is temporary. As the building goes up, scaffolding comes down.

Manus demonstrated this perfectly. They rebuilt their agent five times in six months. Each rewrite removed complexity. Complex tool definitions became simple shell commands. "Management agents" became basic handoffs. The scaffolding did its job. So they removed it. This is also why Anthropic regularly deletes planning steps from Claude Code's harness. Every time a new model version ships that can handle something internally, the corresponding harness logic gets stripped out.

But there's a catch. Models are now trained with specific harnesses in the loop. Claude Code's model learned to use the exact scaffolding it was built with. Change the scaffolding, and performance drops. The worker trained on THIS scaffolding. Swap it out, and they stumble.

So the field is converging on a principle: build scaffolding that's designed to be removed. But remove it carefully, because the model learned to lean on it. The "future-proofing test" for any agent system: if dropping in a more powerful model improves performance without adding harness complexity, the design is sound.

Two products using the exact same model can perform completely differently based on this one decision: how thick is the harness? LangChain changed only the infrastructure (same model, same weights) and jumped from outside the top 30 to rank 5 on TerminalBench 2.0. The model didn't improve. The scaffolding around it did.

The article below is a deep dive on agent harness engineering, covering the orchestration loop, tools, memory, context management, and everything else that transforms a stateless LLM into a capable agent.
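The "dumb loop" described for the thin-harness end of the spectrum can be sketched in about a dozen lines. `call_model`, the message dicts, and the tool-call shape below are illustrative stand-ins, not any real SDK's types:

```python
# Sketch of a thin agent harness: assemble the transcript, call the
# model, execute any requested tools, append results, repeat until the
# model stops asking for tools. The message/tool-call format here is a
# made-up stand-in for a real SDK's schema.
def agent_loop(call_model, tools, user_message, max_turns=10):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        reply = call_model(messages)
        messages.append(reply)
        if not reply.get("tool_calls"):      # model is done: return its answer
            return reply["content"]
        for call in reply["tool_calls"]:     # execute each requested tool
            result = tools[call["name"]](**call["args"])
            messages.append({"role": "tool", "name": call["name"],
                             "content": str(result)})
    raise RuntimeError("max turns exceeded")
```

Everything else a thick harness adds (planning nodes, routing graphs, validation layers) is structure layered on top of, or in place of, this loop.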
Akshay 🚀 tweet media
x.com/i/article/2040…
56 replies · 177 reposts · 1.1K likes · 180.6K views