Krishanu

66 posts

Krishanu

@thysel55307

I'm an AI Engineer with 7+ years of experience in LLMs, deep learning, and AI solutions. Passionate about building scalable AI systems and driving innovation.

Katılım Aralık 2023

85 Takip Edilen25 Takipçiler

Sabitlenmiş Tweet

Krishanu@thysel55307·8 Mar

Built something I’m genuinely excited about: SwarmMate Audit Workpaper Adapter 🔥 This project started as a more general MCP / Claude Cowork Lite idea, but after digging into a real accounting-firm workflow, the direction became much clearer: The real problem is not “how do we give people more AI?” The real problem is: how do we stop AI from generating generic slop and instead make it produce structured, grounded, review-ready work? So I turned the idea into a custom workflow adapter for audit and tax teams. 1. It takes messy workpapers 2. Retrieves relevant evidence from source files 3. Drafts a structured audit artifact 4. Adds citations 5. Runs a quality check 6. Saves the final report This is the part I love most: it’s not trying to replace legacy systems like CCH. It’s solving the messy, manual, high-friction work that happens before those systems. That’s where the real opportunity is for small firms. So yes, MCP is part of the architectural thinking here — but the real value is building a tool around an actual workflow, actual pain points, and actual business constraints. From generic AI coworker ➝ focused workflow adapter From cool concept ➝ real business use case This is exactly the kind of direction I want to keep building in: AI systems that are not just impressive, but genuinely useful. 👇👇👇👇 #AI #GenerativeAI #AgenticAI #MCP #ModelContextProtocol #LangGraph #RAG #AgenticRAG #WorkflowAutomation #AIEngineering #Audit #TaxTech #AccountingTech #FinTech #DocumentAI #LLM #OpenAI #Gradio #Python #EnterpriseAI #AIAgents #KnowledgeWork #WorkflowAdapter #Automation #Innovation #BuildInPublic

English

Krishanu@thysel55307·15 Mar

Just built a cat shop MCP server and connected it to ChatGPT Developer Mode 🐱 Implemented: ✅ product + cart tools ✅ OAuth authentication ✅ SQLite-backed persistence ✅ ngrok deployment ✅ custom tool: update_cart_quantity This project was a great way to understand: 1. why OAuth matters for MCP servers 2. how MCP differs from A2A 3. how AI clients can securely interact with real tools 4. Loved how practical and hands-on this was. #MCP #A2A #AI #Python #OAuth #ChatGPT

English

Krishanu@thysel55307·11 Mar

Built a Fireworks AI powered RAG app and then benchmarked it against an OpenAI gpt-4.1-mini equivalent using RAGAS + LangSmith 🔥 In this walkthrough, I cover: 1. what Fireworks AI is and why it’s useful 2. how I used it for both chat + embeddings 3. how I made the RAG pipeline provider-aware 4. how I evaluated retrieval quality and answer correctness 5. and how I compared token usage and cost visibility across providers Final run snapshot: 🔥 Fireworks Context Precision: 1.0 🔥 OpenAI Context Precision: 1.0 🔥 Fireworks Factual Correctness: 0.555 🔥 OpenAI Factual Correctness: 0.530 Really enjoyed this one because it goes beyond “the app works” and gets into how well it works, how much it costs, and how to evaluate RAG systems properly. #AI #GenAI #LLM #RAG #RAGAS #LangSmith #FireworksAI #OpenAI #GPT41Mini #VectorSearch #Embeddings #Qdrant #LangChain #LangGraph #AIEngineering #ML #MachineLearning #PromptEngineering #Evaluation #LLMOps #AIBuilders #OpenSourceAI #AIProjects #Python #Developer #Tech #ArtificialIntelligence #DataScience #MLOps #AIAgents

English

Krishanu@thysel55307·10 Mar

Built and deployed a LangGraph project end-to-end 🔥🔥 In this walkthrough, I show: 👉 how the codebase is structured 👉 why langgraph.json is the heart of the deployment 👉 how multiple graphs are registered and served locally with uv 👉 how I added a custom evaluation graph 👉 how LangGraph Studio helps visualize node-by-node execution This was a great hands-on way to understand how agent workflows move from Python code to a runnable, inspectable graph. Would love to hear how others are designing evaluator loops and custom graph patterns. #LangGraph #LangChain #AIEngineering #GenerativeAI #AgenticAI #LLM #Python #AIDeployment #AIWorkflow #LangGraphStudio #OpenAI #MLOps #AIAgents #BuildInPublic #Developer

English

Krishanu@thysel55307·8 Mar

I walked through my Jupyter notebook where I combined custom X tools with GitHub MCP tools to build a real agentic workflow end to end. The agent can fetch X posts, summarize themes, use memory across steps, generate markdown files, create branches, commit code, and open pull requests — all from natural language instructions. This is exactly the kind of workflow that makes MCP + LangGraph so exciting for practical AI engineering. @AIMakerspace #MCP #ModelContextProtocol #LangGraph #LangChain #Jupyter #AIAgents #AgenticWorkflows #GenerativeAI #GenAI #LLM #Python #GitHub #GitHubMCP #XAPI #ToolCalling #AIEngineering #DeveloperTools #MachineLearning #Automation #TechTwitter

English

Krishanu@thysel55307·8 Mar

Just shared my demo on MCP (Model Context Protocol) — one of the most important concepts for building real-world AI agents. In this video, I explain how MCP standardizes tool access, why it matters, and how it works with LangGraph, LangChain, GitHub MCP, and custom API tools. 🔥🔥🔥 @AIMakerspace #MCP #ModelContextProtocol #AIAgents #AgenticWorkflows #LangGraph #LangChain #GitHubMCP #GenAI #LLM #LLMs #AIEngineering #Python #MachineLearning #Developer #AIDemo #TechTwitter #ArtificialIntelligence #OpenSource #ToolCalling #AIBuilder

English

Krishanu@thysel55307·6 Mar

Built an Advanced Retrieval with LangChain notebook and compared multiple RAG retrieval strategies on a health & wellness corpus. Covered: 1. Naive vector retrieval 2. BM25 3. Contextual compression / reranking 4. Multi-query retrieval 5. Parent document retrieval 6. Ensemble retrieval 7. Semantic chunking Also generated a synthetic golden dataset with Ragas and evaluated retrievers using: 1. Context Precision 2. Context Recall 3. Context Entity Recall Big takeaway: the “best” retriever depends on the tradeoff between performance, latency, and cost. Dense retrieval isn’t always enough, and lexical + hybrid strategies still matter a lot. Great exercise for understanding how retrieval choices shape RAG quality. #AIEngineering #LangChain #RAG #RetrievalAugmentedGeneration #LLM #MachineLearning #GenerativeAI

English

Krishanu@thysel55307·2 Mar

Just recorded a walkthrough of 2 Jupyter notebooks where I evaluate: 👉RAG systems with Ragas 👉Tool-using agents with Ragas Agent metrics Notebook 1: Chunking + overlap Baseline vs improved RAG Why rerankers matter (retrieve k=20 → rerank top_n=5) Metrics: context recall, faithfulness, factual correctness, answer relevancy, noise sensitivity Notebook 2: What an agent trace is Tool Call Accuracy vs Goal Accuracy vs Topic Adherence How to build test cases that actually catch failures If you’re building production AI apps, evaluation can’t be “vibes” — it needs measurable reliability.

English

Krishanu@thysel55307·18 Şub

Posting a quick video walkthrough of my Jupyter notebook where I built an end-to-end RAG evaluation loop: 1. Synthetic testset generation with Ragas (custom query distributions) 2. Experiments + dashboards in LangSmith 3. LLM-as-judge evals (QA correctness, helpfulness, “dopeness”) 4. Chain comparisons + why metrics shift (style vs retrieval vs latency) If you’re building production RAG systems, this is the workflow I’d recommend. Video walkthrough below 👇👇👇👇👇👇👇 @LangChainAI @OpenAI @AnthropicAI @GoogleDeepMind @MetaAI @mistralai @cohere #RAG #GenAI #LLM #AIEngineering #MLOps #MachineLearning #NLP #VectorDatabase #Embeddings #RetrievalAugmentedGeneration #LangChain #LangSmith #RAGAS #Evaluation #LLMOps #PromptEngineering #AI #Python #DataScience #Qdrant #OpenAI #Claude #Gemini #Llama #Mistral #Cohere #Hiring #AIJobs #TechCareers

English

114

Krishanu@thysel55307·10 Şub

Just posted a video walkthrough of my Open Deep Research notebook 👇👇👇👇👇 What I cover: 📷 Multi-level agent design (Agent ↔ Supervisor ↔ Researcher) 1. Why splitting state beats “one huge state” (clarity + less context rot) 2. Tool-driven research loops + compression/summarization 3. Parallel vs sequential researchers (speed vs cost/control) 4. Packaging the logic as modules (clean notebooks + reusable components) Big takeaway: good research agents = orchestration + disciplined context management, not just “more tokens.” @AIMakerspace @RLanceMartin @amazon @AnthropicAI @OpenAI #OpenDeepResearch #AgenticAI #MultiAgent #LLM #RAG #LangGraph #LangChain #AIEngineering #GenAI #LLMOps #PromptEngineering #ToolCalling #ContextEngineering #ResearchAutomation #Python #BuildInPublic #OpenSource

English

Krishanu@thysel55307·9 Şub

Just posted a video walkthrough of my Deep Agents notebook 👇👇 What I built + learned: 📷 Planning with todos (and why planning can slow you down if it’s too granular): 1. Context management: large docs + daily check-ins + weekly summaries 2. Subagents: specialist roles (exercise/nutrition/mindset) + least-privilege tools 3. Memory: persist user preferences across sessions (multi-user ready) 4. Production thinking: safety guardrails, observability, and cost controls @AIMakerspace @amazon @OpenAI @AnthropicAI Big takeaway: multi-agent apps work best when you keep tools scoped, prompts lean, and context intentional (avoid “context rot”). 👉👉 If you’re building agentic systems, this pattern is a clean blueprint. #DeepAgents #AgenticAI #MultiAgent #LangGraph #LangChain #LLM #RAG #GenAI #AIEngineering #LLMOps #PromptEngineering #ToolCalling #ContextEngineering #Memory #AIApps #Python #MLOps #SoftwareEngineering #BuildInPublic #DevCommunity #OpenSource #AI

English

Krishanu@thysel55307·3 Şub

Just dropped a walkthrough of my Agent Memory notebook 📷📷 I built a “production-shaped” wellness assistant that uses 4 memory types (not just chat history): Short-term memory (checkpointer): keeps the current conversation coherent Long-term memory (store): persists user profile + preferences across sessions Semantic memory: retrieves relevant advice via embeddings Episodic memory: recalls what worked before for the user Procedural memory: adapts the coaching style (concise vs friendly vs coach) Plus a simple Wellness Memory Dashboard that tracks metrics over time (mood/energy/sleep), references historical data in responses, and generates a personalized summary. If you’re building LLM apps, memory design is where “cool demo” becomes “real assistant.” 📷 Thanks @AIMakerspace #AgentMemory #LangGraph #LangChain #RAG #SemanticSearch #Embeddings #EpisodicMemory #ProceduralMemory #ContextWindow #LLM #GenAI #AIEngineering #Python #BuildInPublic #MLOps

English

Krishanu@thysel55307·2 Şub

Just recorded a walkthrough of my Multi-Agent LangGraph notebook 🎯🎯 What I built: 📷 Supervisor → routes to specialists (exercise / nutrition / sleep / stress) 1. Handoffs pattern vs Supervisor pattern 2. Added KB retrieval + optional web search 3. Built a hierarchical “Wellness Director” (teams + aggregation) Key lessons: • Routing is a product decision (speed vs consistency) • Keep specialists narrow + tools scoped to their domain • “Context rot” is real → pass only what each agent needs • KB-first improves reliability; web search boosts freshness but adds risk • Modular graphs = portability (easy to plug new agents in) Thanks, @AIMakerspace !! #LangGraph #MultiAgent #AgenticAI #LLM #RAG #AIEngineering #GenAI #Python #LLMOps #PromptEngineering #ToolCalling #AIApps #VectorDB #RetrievalAugmentedGeneration #OpenAI #MLOps #SoftwareEngineering #AI #BuildInPublic #DevCommunity

English

Krishanu@thysel55307·26 Oca

Built an Agentic RAG notebook with real conversation memory (LangGraph + MemorySaver) 📷📷 Here’s the walkthrough 📷 1/ Goal: build a RAG agent “from scratch” that can: 📷 call tools (KB search + calculator) 📷 loop until it’s done 📷 remember prior turns across invocations 2/ Core concept: an agent is a loop, not a single prompt. User → LLM → (tool_calls?) → run tools → append ToolMessages → LLM again → stop when no more tool_calls 3/ State design matters. I use AgentState with: messages: Annotated[List[BaseMessage], add_messages] This makes LangGraph automatically append new messages to conversation history. 4/ Tools are the agent’s “actions”. I added: search_wellness_knowledge(query, top_k) → returns evidence snippets + KB ids calculate(expression) → for quick math Docstrings + schemas are key so the model knows when/how to call them. 5/ Tool execution has a strict rule (easy to miss): If the model outputs tool_calls, the very next messages must be ToolMessages for each tool_call_id. No extra assistant “planning” messages in between — or you’ll get a 400 error. 6/ The routing logic is simple but powerful: should_continue(state) if last AI msg has tool_calls → go to tools else → end That’s your agent loop gate. 7/ Memory across runs = checkpointing. I compile the graph with: checkpointer = MemorySaver() Then call .invoke() with: config={"configurable": {"thread_id": ""}} Same thread_id = same conversation memory. 8/ Result: The agent can answer: Turn 1: “What is progressive overload?” (retrieves from KB) Turn 2: “Based on that, give me a weekly progression” (uses prior context) Turn 3: “If I stall, what should I change?” (builds on earlier answers) 9/ Trade-off: This is more code than create_agent, but you get: 📷 full control over routing 📷 custom tool execution (logging, rate limits, error handling) 📷 explicit memory + production behavior If you’re building agents for real workflows, this pattern scales. #BuildInPublic #AIBuilder #GenAIEngineering Thanks @AIMakerspace

English

Krishanu@thysel55307·26 Oca

Just finished an “Agent Loop” notebook build 📷📷 What I implemented: 1. Tools (calculator + current time + wellness KB search) 2. Agent loop: User → LLM → (tool_calls?) → run tool → feed result back → LLM → … → final answer The stop condition is simple: 📷 tool_calls present → keep looping 📷 no tool_calls → return final response 3. Middleware for production behavior: log_before_model: capture inputs/context (and redact if needed) log_after_model: capture outputs + which tools were requested ModelCallLimitMiddleware: prevents infinite loops + controls cost/latency Then I extended it with 3 more tools: 📷 BMI calculator 📷 calorie needs estimator (TDEE) 📷 workout plan generator Big lesson: adding tools isn’t enough — you must rebuild the agent with the updated tool list, or it’ll never call them. Happy to share patterns for tool schemas, gating, and evaluation if you’re building agentic RAG. Thanks @AIMakerspace

English

Krishanu@thysel55307·26 Oca

Agent Engineering is becoming its own discipline. We’re moving from: 📷 “prompting” → to → 📷 designing systems that act. An Agent isn’t just an LLM response. It’s a loop: User → Model → decide next step → call tools → observe results → repeat → final answer. So the core skill isn’t “writing prompts” — it’s: tool design (schemas + docstrings) orchestration (state, routing, retries) guardrails (limits, permissions) observability (logs, traces) evaluation (does it actually solve tasks?) And RAG isn’t the product — it’s a tool. Retrieval becomes one action the agent can take when it needs evidence: “Do I already know enough?” If not → retrieve → cite → answer. Agentic RAG = letting the agent decide when to retrieve, what to retrieve, and when to stop. This is software engineering for intelligence. Thanks @AIMakerspace

English

Keşfet

@AIMakerspace @OpenAI @AnthropicAI @GoogleDeepMind @MetaAI @mistralai @cohere @RLanceMartin