

LangChain OSS

@LangChain_OSS
Ship great agents fast with our open source frameworks – LangChain, LangGraph, and Deep Agents. Maintained by @LangChain.



here's what model profile details look like in practice, using @NVIDIAAIDev's Nemotron models as an example:
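as a rough illustration of the shape such a profile might take -- field names below are hypothetical, not the actual deepagents profiles API:

```python
# Hypothetical sketch only: bundling per-model harness settings the way a
# profile might. Field names are illustrative, not the shipped profiles API.
NEMOTRON_PROFILE = {
    "model": "nemotron",  # which model this profile tunes the harness for
    "system_prompt": "You are a coding agent. Plan, then act step by step.",
    "tool_overrides": {
        # some models respond better to different tool names/descriptions
        "shell": {"description": "Run a shell command and return stdout."},
    },
    "settings": {"temperature": 0.2, "max_turns": 40},
}
```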

Localmaxxing: pushing more inference to local models. Over five weeks, I tested how much of my daily work can run on a local 35B model instead of cloud frontier models. The answer: half.

Many reasons to use local models: privacy, cost, asset depreciation. But the only one that really matters is latency.

I ran a head-to-head benchmark: Qwen 3.6 35B-A3B-4bit on my MacBook Pro M5 vs Claude Opus 4.5 via API. Result: 2.1x faster locally. Mean 2.8s vs 5.8s.

The local model isn't smarter. Opus scores ~20% higher on reasoning benchmarks. Local models lag frontier by 3-4 months, and for complex tasks, that gap matters. But for routine agent tasks, it rarely does. If half the work runs 2x faster on my laptop, I'll take that trade every time. My little computer is about to earn its keep. tomtunguz.com/localmaxxing/
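measuring this yourself is straightforward. a minimal latency-benchmark sketch, assuming a local model served by Ollama's default endpoint and Anthropic's Messages API; model ids and the prompt are placeholders:

```python
import time
import requests

PROMPT = "Summarize this commit message: fix flaky retry logic in client"

def time_local(prompt: str) -> float:
    """Round-trip latency for a local model served by Ollama."""
    start = time.perf_counter()
    requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "qwen3:32b", "prompt": prompt, "stream": False},
        timeout=120,
    ).raise_for_status()
    return time.perf_counter() - start

def time_cloud(prompt: str, api_key: str) -> float:
    """Round-trip latency for a hosted frontier model via Anthropic's API."""
    start = time.perf_counter()
    requests.post(
        "https://api.anthropic.com/v1/messages",
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        json={
            "model": "claude-opus-4-5",  # placeholder model id
            "max_tokens": 256,
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=120,
    ).raise_for_status()
    return time.perf_counter() - start

# mean over a few runs smooths out cold starts and network jitter
local = [time_local(PROMPT) for _ in range(10)]
print(f"local mean: {sum(local) / len(local):.2f}s")
```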

This is harder to build than it looks. Preserving full conversational context while swapping underlying model providers mid-flight is a surprisingly deep systems problem. Most tools drop state or force you to start over. deepagents-cli does this natively: swap models mid-conversation with zero context loss. Try it: docs.langchain.com/oss/python/dee…
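the key ingredient is a persistent checkpointer. this isn't the deepagents-cli internals, but a minimal LangGraph sketch of the same idea: two agents built on different models share one checkpointer and thread, so the second model picks up the full message history before generating anything. model strings are placeholders.

```python
# Minimal sketch of mid-conversation model swapping via a shared checkpointer.
# Illustrative LangGraph usage, not the deepagents-cli implementation.
from langchain.chat_models import init_chat_model
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent

checkpointer = MemorySaver()
config = {"configurable": {"thread_id": "conv-1"}}  # one conversation thread

# Start the conversation on model A.
agent_a = create_react_agent(
    init_chat_model("openai:gpt-4o"), tools=[], checkpointer=checkpointer
)
agent_a.invoke(
    {"messages": [("user", "Remember: the project codename is Falcon.")]}, config
)

# Swap to model B mid-conversation: same checkpointer + thread_id means the
# full message history is loaded before the new model answers.
agent_b = create_react_agent(
    init_chat_model("anthropic:claude-sonnet-4-5"), tools=[], checkpointer=checkpointer
)
out = agent_b.invoke({"messages": [("user", "What's the codename?")]}, config)
print(out["messages"][-1].content)  # the swapped-in model still knows "Falcon"
```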

concluding my "taking deep agents to production" series with arguably the most important component: observability.

when you deploy a deep agent with LangSmith, you automatically get traces for every run: a full record of every LLM call, tool call, and middleware hook. for long-running agents, you can use agents like Polly, the LangSmith assistant, to reason over long traces and identify where a trajectory went wrong.

traces are observational: they tell you what happened. time travel is experimental. built into the deep agent runtime, it's how you explore what an agent trajectory would look like if the agent had different context at some point. pick any checkpoint in a run's history, modify the state, and resume. the fork runs forward as its own branch, the original stays intact, and the full agent loop re-triggers.

the combination of traces and time travel is powerful for the agent improvement cycle!
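for a feel of the pattern, here's the analogous time-travel loop in plain LangGraph terms (a sketch on a checkpointer-backed agent, not the deep agent runtime itself; the thread id and state edit are placeholders):

```python
# Sketch of time travel on a checkpointer-backed LangGraph agent.
from langchain.chat_models import init_chat_model
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent

agent = create_react_agent(
    init_chat_model("openai:gpt-4o"), tools=[], checkpointer=MemorySaver()
)
config = {"configurable": {"thread_id": "run-42"}}
agent.invoke({"messages": [("user", "Plan the refactor, then apply it.")]}, config)

# 1. walk the run's checkpoint history (newest first) and pick a point
history = list(agent.get_state_history(config))
checkpoint = history[-2]  # e.g. just after the first step

# 2. modify the state at that checkpoint -- here, injecting different context
forked_config = agent.update_state(
    checkpoint.config,
    {"messages": [("user", "Actually, limit the refactor to utils.py.")]},
)

# 3. resume from the fork: passing None continues from the edited checkpoint;
#    the fork runs forward as its own branch and the original stays intact
result = agent.invoke(None, forked_config)
```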





your daily reminder that open models are plenty capable for a lot of coding work. easiest place to feel that out is deepagents! swap the model and go. i've been enjoying GLM-5.1, Kimi K2.6, MiniMax M2.7, DeepSeek V4 Pro. here's some examples using our CLI agent in headless mode
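for a sense of what headless scripting looks like, a hedged sketch of driving the CLI from Python -- the binary name, positional prompt, and output shape are assumptions; --json and --max-turns are the flags we ship for scripting:

```python
# Sketch of scripting the CLI agent headlessly. Invocation details are
# assumptions; check the docs for the exact interface.
import json
import subprocess

result = subprocess.run(
    ["deepagents", "--json", "--max-turns", "5",
     "add type hints to utils.py and run the tests"],
    capture_output=True, text=True, check=True,
)
# With --json, stdout should be machine-readable; parse and inspect it.
payload = json.loads(result.stdout)
print(payload)
```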




deepagents-cli is quietly becoming the best place to start coding with open weight models. we've been investing heavily in making it a harness that's truly model-agnostic, without compromising performance!

different models perform best with different harnesses -- prompts, middleware, settings. our recent profiles API (below) lets you bundle all of that per model, so Kimi, Qwen, GLM, etc. can drive the agent loop just as well as the closed frontier. more info on profiles x.com/Vtrivedy10/sta…

other recent wins worth highlighting:
- /agents - swap agent profiles mid-session (coding agent/content writer/custom)
- /model - fuzzy switcher w/ live status; OpenRouter, LiteLLM, Baseten, hosted Ollama all built-in
- headless mode w/ --json + --max-turns for scripting
- --acp to run as an ACP server
- /skill:name skills
- MCP w/ OAuth

full docs and quickstart ⬇️

you've heard that models are heavily trained on their own harnesses, but... pi appears to be about 7-10% better than codex with gpt-5.4 on a ProgramBench task. same exact prompt, same environment. it's a good harness.

we're continuing to see clear examples where a model's harness is a major determinant of overall performance. with the same model, running the same task, it's easy to observe very different scores depending on (system) prompts, tools (& their descriptions), and middleware (steering hooks).

this is exactly why we built a harness profiles abstraction in Deep Agents: per-provider or per-model overrides for base system prompts, tool names + implementations, etc., so swapping models doesn't mean losing the work that made the last one good! 10–20pt jumps on tau2-bench in our own testing.

currently cooking up built-in profiles for popular open weight models 🧑‍🍳 langchain.com/blog/tuning-de…
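conceptually, a profile is just a set of per-model overrides layered over a base harness config. a toy sketch of that merge -- all names hypothetical, see the blog post for the real abstraction:

```python
# Toy sketch of per-model overrides layering over a base harness config.
# Names are hypothetical, not the shipped API.
BASE_HARNESS = {
    "system_prompt": "You are a careful coding agent.",
    "tools": {"shell": "Run a shell command.", "edit": "Edit a file."},
    "middleware": ["todo-list", "summarize-long-context"],
}

KIMI_OVERRIDES = {
    "system_prompt": "You are a careful coding agent. Plan before acting.",
    "tools": {"shell": "Execute a bash command and return its output."},
}

def apply_profile(base: dict, overrides: dict) -> dict:
    """Shallow-merge a model profile over the base harness; dict-valued
    fields (like tool descriptions) merge key-by-key instead of wholesale."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merged[key] = {**base[key], **value}
        else:
            merged[key] = value
    return merged

harness_for_kimi = apply_profile(BASE_HARNESS, KIMI_OVERRIDES)
```

the point of the merge-by-key design: swapping models only replaces the pieces that model needs changed, so the tuning work behind the rest of the harness carries over untouched.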
