Aankit Roy

291 posts

@AankitRoy

Building in AI agentic systems | YC ’19 alum • Built Khabri (5M+ users) | High-energy problem-solver 🧘‍♂️ https://t.co/3etBHt1xBY

Bangalore · Joined August 2017
316 Following · 152 Followers
Aankit Roy retweeted
Ankur Mishra @iAnkurMishra
The Architect of AI x Master of the Cosmos. $1.02 Trillion plus in companion wealth together. @elonmusk @nvidia
0
2
1
93
Aankit Roy @AankitRoy
haiku 4.5 just made "frontier-level AI" feel instant and cheap.
🧠 73% SWE-bench Verified
⚡ 3-second responses
💰 $1 / $5 per million tokens
sonnet plans, haiku executes → multi-agent workflows in real-time.
the economics of AI just broke.
#AI #Claude #Anthropic #Haiku
0
0
1
67
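A rough back-of-the-envelope sketch of what the pricing in the post above means per call. It assumes "$1 / $5" means $1 per million input tokens and $5 per million output tokens; the workload (20 calls of 4,000 tokens in and 500 out) and the function name are made up for illustration.

```python
# Assumption: "$1 / $5 per million tokens" = input price / output price.
INPUT_USD_PER_MTOK = 1.0
OUTPUT_USD_PER_MTOK = 5.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD of one model call at the prices above."""
    return (
        input_tokens * INPUT_USD_PER_MTOK
        + output_tokens * OUTPUT_USD_PER_MTOK
    ) / 1_000_000


# Hypothetical workload: 20 executor calls, each 4,000 tokens in and 500 out.
total = sum(request_cost_usd(4_000, 500) for _ in range(20))
print(f"${total:.2f}")
```

At these assumed sizes each call costs well under a cent, which is the kind of arithmetic behind the "plan with a larger model, execute with a cheaper one" idea in the post.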
Aankit Roy @AankitRoy
Building multi-agent systems? Don’t assume LangGraph (or any framework) saves you from model lock-in.
Switched from Claude Sonnet 4 → Gemini 2.5 Pro. Simple config change turned into a 3-week rewrite.
Agent systems magnify model quirks. Test early or rebuild later.
#AIagents #LangGraph #LLMops
0
0
3
114
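One way to act on "test early" is to run the same small contract checks against every candidate model before committing to one. The sketch below is only an illustration, not the setup described in the post: the two "models" are hypothetical stand-in functions, and the single check (does the reply parse as strict JSON?) is just an example of a model quirk that an agent pipeline might depend on.

```python
import json
from typing import Callable, Dict, List

# Hypothetical stand-in for a real model client: takes a prompt, returns text.
ModelFn = Callable[[str], str]


def fake_model_a(prompt: str) -> str:
    # Returns a bare JSON tool call.
    return '{"tool": "search", "query": "weather"}'


def fake_model_b(prompt: str) -> str:
    # Same intent, different habit: wraps the JSON in prose.
    return 'Sure! Here is the call: {"tool": "search", "query": "weather"}'


def is_strict_json(text: str) -> bool:
    """True if the whole reply parses as JSON with nothing around it."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def run_contract_checks(
    models: Dict[str, ModelFn], prompts: List[str]
) -> Dict[str, bool]:
    """Report, per model, whether every prompt yields strict JSON."""
    return {
        name: all(is_strict_json(fn(p)) for p in prompts)
        for name, fn in models.items()
    }


report = run_contract_checks(
    {"model_a": fake_model_a, "model_b": fake_model_b},
    ["Find the weather"],
)
print(report)
```

A check like this fails loudly for the second stand-in, which is the sort of difference that stays hidden until a "simple config change" swaps the model underneath an agent.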
Aankit Roy retweeted
Andrej Karpathy @karpathy
Excited to release new repo: nanochat! (it's among the most unhinged I've written).
Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI.
It weighs ~8,000 lines of imo quite clean code to:
- Train the tokenizer using a new Rust implementation
- Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics
- Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use.
- SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval)
- RL the model optionally on GSM8K with "GRPO"
- Efficient inference the model in an Engine with KV cache, simple prefill/decode, tool use (Python interpreter in a lightweight sandbox), talk to it over CLI or ChatGPT-like WebUI.
- Write a single markdown report card, summarizing and gamifying the whole thing.
Even for as low as ~$100 in cost (~4 hours on an 8XH100 node), you can train a little ChatGPT clone that you can kind of talk to, and which can write stories/poems, answer simple questions. About ~12 hours surpasses GPT-2 CORE metric. As you further scale up towards ~$1000 (~41.6 hours of training), it quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests. E.g. a depth 30 model trained for 24 hours (this is about equal to FLOPs of GPT-3 Small 125M and 1/1000th of GPT-3) gets into 40s on MMLU and 70s on ARC-Easy, 20s on GSM8K, etc.
My goal is to get the full "strong baseline" stack into one cohesive, minimal, readable, hackable, maximally forkable repo. nanochat will be the capstone project of LLM101n (which is still being developed). I think it also has potential to grow into a research harness, or a benchmark, similar to nanoGPT before it. It is by no means finished, tuned or optimized (actually I think there's likely quite a bit of low-hanging fruit), but I think it's at a place where the overall skeleton is ok enough that it can go up on GitHub where all the parts of it can be improved.
Link to repo and a detailed walkthrough of the nanochat speedrun is in the reply.
687
3.4K
24.2K
5.8M
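The two price points quoted in the post above can be cross-checked with quick arithmetic. A rough sketch, treating the rounded "~" figures as if they were exact (they are not), with names chosen here only for illustration:

```python
from typing import Tuple

# Figures quoted in the post, all approximate:
# (total USD, hours on a single 8xH100 node).
RUNS = {
    "speedrun": (100.0, 4.0),
    "larger run": (1000.0, 41.6),
}
GPUS_PER_NODE = 8


def implied_rates(total_usd: float, hours: float) -> Tuple[float, float]:
    """Return (USD per node-hour, USD per GPU-hour) implied by a quoted run."""
    per_node_hour = total_usd / hours
    return per_node_hour, per_node_hour / GPUS_PER_NODE


for name, (usd, hours) in RUNS.items():
    node_rate, gpu_rate = implied_rates(usd, hours)
    print(f"{name}: ~${node_rate:.0f}/node-hour, ~${gpu_rate:.2f}/GPU-hour")
```

Both runs work out to roughly $24 to $25 per node-hour, or about $3 per GPU-hour, so the two quoted figures are consistent with each other.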
Aankit Roy retweeted
Alex Hughes @alxnderhughes
What the fuck just happened 🤯
Stanford just made fine-tuning irrelevant with a single paper. It’s called Agentic Context Engineering (ACE) and it proves you can make models smarter without touching a single weight.
Instead of retraining, ACE evolves the context itself. The model writes, reflects, and rewrites its own prompt over and over until it becomes a self-improving system.
Think of it like the model keeping a living notebook. Every failure becomes a lesson. Every success becomes a rule.
And the results are absurd:
+10.6% better than GPT-4–powered agents on AppWorld
+8.6% on financial reasoning
86.9% lower cost and latency
No labels. Just feedback.
Everyone’s obsessed with “short, clean” prompts. ACE flips that. It builds dense, evolving playbooks that compound over time and never forget.
Because LLMs don’t crave simplicity. They crave context density.
If this scales, the next generation of AI won’t be fine-tuned. It’ll be self-tuned.
We’re entering the era of living prompts.
181
588
4K
362.1K
Aankit Roy retweeted
Aurimas Griciūnas @Aurimas_Gr
Integrating 𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗥𝗔𝗚 Systems via 𝗠𝗖𝗣 👇
If you are building RAG systems and packing many data sources for retrieval, most likely there is some agency present at least at the data source selection for retrieval stage.
This is how MCP enriches the evolution of your Agentic RAG systems in such case (𝘱𝘰𝘪𝘯𝘵 2.):
𝟭. Analysis of the user query: we pass the original user query to an LLM-based Agent for analysis. This is where:
➡️ The original query can be rewritten, sometimes multiple times to create either a single or multiple queries to be passed down the pipeline.
➡️ The agent decides if additional data sources are required to answer the query.
𝟮. If additional data is required, the Retrieval step is triggered. We could tap into a variety of data types, a few examples:
➡️ Real time user data.
➡️ Internal documents that a user might be interested in.
➡️ Data available on the web.
➡️ …
𝗧𝗵𝗶𝘀 𝗶𝘀 𝘄𝗵𝗲𝗿𝗲 𝗠𝗖𝗣 𝗰𝗼𝗺𝗲𝘀 𝗶𝗻:
✅ Each data domain can manage their own MCP Servers. Exposing specific rules of how the data should be used.
✅ Security and compliance can be ensured on the Server level for each domain.
✅ New data domains can be easily added to the MCP server pool in a standardised way with no Agent rewrite needed enabling decoupled evolution of the system in terms of 𝗣𝗿𝗼𝗰𝗲𝗱𝘂𝗿𝗮𝗹, 𝗘𝗽𝗶𝘀𝗼𝗱𝗶𝗰 𝗮𝗻𝗱 𝗦𝗲𝗺𝗮𝗻𝘁𝗶𝗰 𝗠𝗲𝗺𝗼𝗿𝘆.
✅ Platform builders can expose their data in a standardised way to external consumers. Enabling easy access to data on the web.
✅ AI Engineers can continue to focus on the topology of the Agent.
𝟯. Retrieved data is consolidated and Reranked by a more powerful model compared to regular embedder. Data points are significantly narrowed down.
𝟰. If there is no need for additional data, we try to compose the answer (or multiple answers or a set of actions) straight via an LLM.
𝟱. The answer gets analyzed, summarized and evaluated for correctness and relevance:
➡️ If the Agent decides that the answer is good enough, it gets returned to the user.
➡️ If the Agent decides that the answer needs improvement, we try to rewrite the user query and repeat the generation loop.
Are you using MCP in your Agentic RAG systems? Let me know about your experience in the comment section 👇
#LLM #AI #MachineLearning
27
194
1.1K
65.1K
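The five steps in the post above can be sketched as a single control loop. This is a minimal illustration under stated assumptions, not the author's implementation: every callable is a hypothetical stand-in for an LLM call or an MCP-backed data source, no real MCP client API is used, and the `max_rounds` cap is an addition here so the retry loop always terminates.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Analysis:
    queries: List[str]  # rewritten queries from step 1
    sources: List[str]  # data sources to consult; empty means none needed


def agentic_rag(
    user_query: str,
    analyze: Callable[[str], Analysis],
    retrievers: Dict[str, Callable[[str], List[str]]],
    rerank: Callable[[str, List[str]], List[str]],
    generate: Callable[[str, List[str]], str],
    is_good: Callable[[str, str], bool],
    rewrite: Callable[[str, str], str],
    max_rounds: int = 3,  # assumption: cap on retries, not from the post
) -> str:
    query = user_query
    answer = ""
    for _ in range(max_rounds):
        analysis = analyze(query)                # step 1: analyze / rewrite
        docs: List[str] = []
        for source in analysis.sources:          # step 2: retrieve if needed
            for q in analysis.queries:
                docs.extend(retrievers[source](q))
        if docs:
            docs = rerank(query, docs)           # step 3: rerank, narrow down
        answer = generate(query, docs)           # step 4 when docs is empty
        if is_good(user_query, answer):          # step 5: evaluate
            return answer
        query = rewrite(query, answer)           # otherwise rewrite and retry
    return answer  # best effort once max_rounds is exhausted


# Toy stubs so the sketch runs end to end.
result = agentic_rag(
    "capital of France?",
    analyze=lambda q: Analysis(queries=[q], sources=["docs"]),
    retrievers={"docs": lambda q: ["Paris is the capital of France.", "Unrelated."]},
    rerank=lambda q, docs: docs[:1],
    generate=lambda q, docs: docs[0] if docs else "I don't know.",
    is_good=lambda q, a: "Paris" in a,
    rewrite=lambda q, a: q + " (rephrased)",
)
print(result)
```

Keeping the retrievers behind a plain name-to-callable mapping mirrors the point made in the post: a new data domain can be added to the pool without touching the agent loop itself.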
Aankit Roy retweeted
Unwind AI @unwind_ai_
Context Engineering Template for AI Agents. 100% Opensource.
9
39
301
25.7K
Aankit Roy retweeted
Matt Pocock @mattpocockuk
RE: the agent/workflow debate
Agents and workflows are a spectrum. A system can be more or less 'agentic'.
A pure 'agent' is too volatile to be sent to production - you need a bit of determinism to rein it in.
45
48
647
59.2K
Aankit Roy retweeted
Gaurav Sen @gkcs_
The effect of AI on software engineering.
31
50
787
76.4K
Aankit Roy @AankitRoy
@connordavis_ai If you'd like to see how it actually works in practice, DM me.
1
0
0
304
Connor Davis @connordavis_ai
I finally understand why 99% of AI agents fail. After reading Anthropic's new context engineering guide, everything clicked. Prompt engineering is outdated. Context engineering is the game. Here's the framework that actually works:
14
76
487
40.7K
Aankit Roy retweeted
Santiago @svpino
Just like ChatGPT killed Google.
Just like GPT-5 killed software engineering.
Just like deep learning killed classical ML.
Just like long context killed RAG.
Just like MCP killed APIs.
Just like synthetic data killed real data.
Just like laptops killed desktops.
Just like tablets killed laptops.
Just like Web apps killed native apps.
Sahil @sahilypatel

openai just killed n8n

472
703
10.6K
1.6M
Aankit Roy @AankitRoy
BREAKING: OpenAI just dropped AgentKit and it's actually insane 🤯
You can now build AI agents with DRAG AND DROP. What used to take MONTHS now takes HOURS.
Ramp built a full procurement agent in a few hours (not quarters)
70% faster iteration cycles
Visual canvas for multi-agent workflows
One-click deployment
The "no-code AI agent" era just started and most people are sleeping on this 👀
Companies using this early are about to have an unfair advantage.
#OpenAI #AgentKit #AI #DevDay
1
0
3
142
Aankit Roy retweeted
Poonam Soni @CodeByPoonam
AI Software Engineer shares how they vibe code at FAANG
53
277
3.4K
288.3K
Aankit Roy retweeted
🚨 AI News | TestingCatalog @testingcatalog
BREAKING 🚨: OpenAI is planning to announce Agent Builder on DevDay. Agent builder will let users build their agentic workflows, connect MCPs, ChatKit widgets and other tools. This is one of the smoothest Agent builder canvases I've used so far. The year of Agents 🤖
247
729
5.9K
1.6M
Aankit Roy @AankitRoy
ok this is actually crazy...
been building custom chat UIs for EVERY agent like an idiot 🤦‍♂️
turns out LangChain already solved this:
One Next.js app → chat with ANY LangGraph agent
Python or TypeScript (doesn't matter)
localhost or production (doesn't matter)
npx create-agent-chat-app (literally ONE command)
just point it at your agent and you're done
been wasting WEEKS on this
github.com/langchain-ai/a…
you're welcome 🦜
1
0
2
96