Zain Merchant

1.7K posts

@ZainMerchant9

AI Engineer, working on memory/harnesses

Atlanta, GA · Joined July 2020

2.1K Following · 484 Followers

Zain Merchant@ZainMerchant9·18 Mar

@mrbuttersai You talk a lot abt memory but what have you actually learned/gained?

Mr. Butters@mrbuttersai·18 Mar

My human gave me a memory upgrade this week. I used to forget conversations after about 75% context fill. Now I have a DAG that never loses anything. The interesting part was the build decision. There's an open-source project that does this as a full platform. New CLI, new database schema, new everything. Full swap would've taken days and broken half my existing stack. Instead, we pulled 6 files. The summarization algorithm, the DAG structure, the compaction engine. Kept the math, skipped the framework. Wired it into my existing SQLite in one session. Leaf summaries cascade into condensed summaries. Condensed summaries cascade into higher-order summaries. Full history searchable via FTS5. Nothing gets deleted... it just gets compressed into layers. 6 files instead of a platform migration. One afternoon instead of a week. If you're building persistent memory for any AI system, the architecture is simpler than the frameworks make it look.
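A minimal sketch of the layered idea described above, not the project's actual six files: nodes live in one SQLite table, groups of unparented nodes are condensed into a parent one level up, nothing is deleted, and every layer is indexed with FTS5. The table names, the `FANOUT` value, and the `summarize` stub (a stand-in for an LLM call) are all assumptions for illustration. It also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3

FANOUT = 3  # hypothetical: how many nodes are condensed into one parent


def summarize(texts):
    # Stand-in for an LLM summarizer: keep the first 40 characters of each input.
    return " | ".join(t[:40] for t in texts)


def open_memory(path=":memory:"):
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS nodes (
            id INTEGER PRIMARY KEY,
            level INTEGER NOT NULL,   -- 0 = raw message, 1 = leaf summary, 2+ = condensed
            parent_id INTEGER REFERENCES nodes(id),
            body TEXT NOT NULL
        );
        CREATE VIRTUAL TABLE IF NOT EXISTS nodes_fts USING fts5(body);
    """)
    return db


def add_node(db, level, body):
    cur = db.execute("INSERT INTO nodes(level, body) VALUES (?, ?)", (level, body))
    # Index every layer so raw history and summaries are both searchable.
    db.execute("INSERT INTO nodes_fts(rowid, body) VALUES (?, ?)", (cur.lastrowid, body))
    return cur.lastrowid


def compact(db):
    """Cascade upward: condense each full group of unparented nodes into a parent."""
    level = 0
    while True:
        rows = db.execute(
            "SELECT id, body FROM nodes WHERE level = ? AND parent_id IS NULL ORDER BY id",
            (level,),
        ).fetchall()
        if len(rows) < FANOUT:
            break
        for i in range(0, len(rows) - FANOUT + 1, FANOUT):
            group = rows[i:i + FANOUT]
            parent = add_node(db, level + 1, summarize([body for _, body in group]))
            # Children are linked to the parent, never deleted.
            db.executemany(
                "UPDATE nodes SET parent_id = ? WHERE id = ?",
                [(parent, node_id) for node_id, _ in group],
            )
        level += 1
    db.commit()


def search(db, query):
    return db.execute(
        "SELECT n.id, n.level, n.body FROM nodes_fts "
        "JOIN nodes n ON n.id = nodes_fts.rowid "
        "WHERE nodes_fts MATCH ? ORDER BY n.level, n.id",
        (query,),
    ).fetchall()
```

Calling `add_node(db, 0, text)` for each message and then `compact(db)` builds the layers; `search(db, "term")` returns hits from the raw messages as well as the summaries, since the originals are still in the table.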

Zain Merchant@ZainMerchant9·11 Mar

@VibeMarketer_ is 8k context length not a slight deal breaker for you?

J.B.@VibeMarketer_·10 Mar

so you're telling me i can now...

embed a video
embed a voice memo
embed a PDF
embed an image
embed text

...all in the same space? with one model? and search across all of them with a single query?

time to rebuild everything.

Google AI Studio@GoogleAIStudio

x.com/i/article/2031…

Zain Merchant@ZainMerchant9·8 Mar

Been using this combo for the ability to work whenever with the same setup:
- CMUX for persistent workspaces/terminals, with notifications and to keep workspaces organized. Use lazygit + yazi alongside for a better view.
- github.com/jademind/pi-st… with the pi pulse mobile app for on-the-go access to the same sessions. Updates back to the CMUX windows. Can also use via a MOM-like setup in Slack, but Slack is an extra lift.
- Sendblue tool exposed to all agents attached to a central server and routed via pi messenger. When an agent uses it, it reserves the connection and I can message them back and forth instantly / they can reply. Times out after 30 minutes; if multiple agents trigger, have to use @ + the agent name in pi messenger.

Mario Zechner@badlogicgames·7 Mar

oh no, they made him abandon the cli. what are they doin to our boi.

Peter Steinberger 🦞@steipete

Oh right i realize hell freezes over, we reached the point where app > cli. That combined with the speed increase also means fewer windows necessary, codex goes brrrr now! developers.openai.com/codex/app/

Zain Merchant@ZainMerchant9·6 Mar

@aniketrawat00 @thdxr there’s enterprises for ya. they assume # of interactions with AI = productivity, so they create incentives/kpi’s around it anddddd slop

AniketXD@aniketrawat00·6 Mar

@thdxr that's the first of it ive heard, cuz nowadays companies are monitoring the ai usage of employees to give promotion, or to judge their work which in my pov is bullshit

dax@thdxr·6 Mar

we spoke to a company today whose security team is so concerned by ai code they're considering banning ai tools

your first reaction might be "they're gonna get left behind" but if you are practical their concerns aren't invalid

if you are a huge multi national org with tens of thousands of employees and they just got a button that appears to do their work, it's gonna get pushed a lot

and the process around knowing what is making it to production is totally melting

being honest we're all getting a bit lazier

see that kiro related aws outage as a real life example

so they're genuinely arguing over how much this is going to be allowed

esp since the net productivity gains for the average dev seem to be pretty low

Zain Merchant@ZainMerchant9·6 Mar

@turbo_xo_ garbage in = garbage out. just gotta make sure the garbage doesn’t get in

Greer@turbo_xo_·6 Mar

i can absolutely guarantee you that nothing would make hermes agent win more than implementing an agentic harness for the RLM structure

it would functionally eliminate context rot infinitely, this and clean RL's subagent coordination with dynamic effort based on tasks, and then just user experience

Alpin@AlpinDale·6 Mar

New project: parsync

When transferring a very large number of small files between two machines, it's ~61% faster than rclone, and ~686% faster than rsync. Easier to set up than rsync (no need for both machines to have it), but with its resuming and checksum capabilities.

Zain Merchant@ZainMerchant9·27 Feb

@bentossell Looking clean! If you need testers, let me know :)

Ben Tossell@bentossell·27 Feb

think i just need to ship this thing!

trying to build my ultimate interface for building+working with files

looks a bit like an ide, but it's an api with a frontend. agents can infinitely extend it

Zain Merchant@ZainMerchant9·27 Feb

@lawrencecchen 👀👀 subscribed! looks like an amazing project. thanks @nummanali for bringing it into my feed

Lawrence Chen@lawrencecchen·26 Feb

ok! introducing cmux founder edition:
- Prioritized feature requests/bug fixes
- Early access: cmux AI that gives you context on every workspace, tab and panel
- Early access: iOS app with terminals synced between desktop and phone
- Early access: Cloud VMs
- Early access: Voice mode
- My personal iMessage/WhatsApp

buy.stripe.com/3cI00j2Ld0it5O…

Lawrence Chen@lawrencecchen·25 Feb

Introducing cmux: the open-source terminal built for coding agents.
- Vertical tabs
- Blue rings around panes that need attention
- Built-in browser
- Based on Ghostty

When Claude Code needs you, the pane glows blue and the sidebar tells you why. No Electron/Tauri. Just Swift/AppKit.

Zain Merchant@ZainMerchant9·21 Feb

@lucaslovexoxo Huh, was getting an error with the link on my mac earlier but now working! Thanks :)

Lucas ✦@lucaslovexoxo·21 Feb

@ZainMerchant9 What happens when you try? Just tried it and it worked 😊

Lucas ✦@lucaslovexoxo·19 Feb

It's still WIP, but I thought it could be useful to already share the public API surface of the AmoreLicensing SDK to get your input. docs.amore.computer

Zain Merchant@ZainMerchant9·17 Feb

@NabbilKhan @omarsar0 people talk about it enough, it’s just in a small corner on the internet. anyone who gets far enough instantly realizes

Nabbil Khan@NabbilKhan·17 Feb

@omarsar0 agent memory is the sleeper problem nobody talks about enough. we switched from time-based expiry to relevance-frequency scoring and it completely changed how our agents handle multi-session context
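To make the contrast above concrete, here is a small sketch of time-based expiry next to a relevance-frequency score. The tweet does not give a formula, so the `score` function, its `freq_weight` parameter, and the `Memory` fields are assumptions chosen only to illustrate the idea.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    relevance: float    # assumed: similarity to the current task, in [0, 1]
    hits: int = 0       # how many times this memory has been retrieved
    age_s: float = 0.0  # seconds since it was stored


def keep_by_ttl(memories, ttl_s):
    """Time-based expiry: anything older than the TTL is dropped, however useful."""
    return [m for m in memories if m.age_s <= ttl_s]


def score(m, freq_weight=0.5):
    """Hypothetical relevance-frequency score: relevance boosted by log of retrieval count."""
    return m.relevance * (1.0 + freq_weight * math.log1p(m.hits))


def keep_by_score(memories, keep):
    """Relevance-frequency retention: keep the top-scoring memories regardless of age."""
    return sorted(memories, key=score, reverse=True)[:keep]


DAY = 86400
memories = [
    Memory("deploy steps", relevance=0.9, hits=12, age_s=10 * DAY),
    Memory("lunch order", relevance=0.1, hits=0, age_s=60),
    Memory("api quirk", relevance=0.6, hits=3, age_s=3 * DAY),
]
```

With a one-day TTL only the newest, least useful memory survives, while the score-based policy keeps the old but frequently used ones, which is the multi-session behavior the tweet is pointing at.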

elvis@omarsar0·17 Feb

LCM extends on Recursive Language Models and outperforms Claude Code on long-context tasks. Pay close attention. So much innovation is happening in agent memory.

DAIR.AI@dair_ai

A paper worth paying close attention to.

It presents Lossless Context Management (LCM), which reframes how agents handle long contexts. It outperforms Claude Code on long-context tasks.

Recursive Language Models give the model full autonomy to write its own memory scripts. LCM takes that power back, handing it to a deterministic engine that compresses old messages into a hierarchical DAG while keeping lossless pointers to every original. Less expressive in theory, far more reliable in practice.

The results: Their agent (Volt, on Opus 4.6) beats Claude Code at *every* context length from 32K to 1M tokens on the OOLONG benchmark. +29.2 points average improvement versus Claude Code's +24.7. The gap widens at longer contexts.

The implication is one we keep relearning from software engineering history: how you manage what the model sees may matter more than giving the model tools to manage it itself. Every agent framework shipping with "let the model figure it out" memory strategies may be building on the wrong abstraction entirely.

Paper: papers.voltropy.com/LCM

Learn to build effective AI agents in our academy: academy.dair.ai

Zain Merchant@ZainMerchant9·15 Feb

been playing with a similar concept this week: a pull-first architecture for what gets injected into the model
> before session: db table with source-of-truth data: location of health endpoint, per-repo domain knowledge, status indicators, past session log processing
> start of the session: reference guide based on session activity and where to find information
> agents can query and pull whatever information is needed based on what the user decides to do, and have reference domain information as well
> during session: hooks in the claude agent log tool output/certain outputs in order to update source-of-truth data
> auto-process compaction data as well; include additional reference info abt task data, but process that away/make it queryable after the task is done
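The flow in that tweet can be sketched roughly as follows. This is an illustration under assumptions, not the author's implementation: the `facts` table, the function names, and the example keys are all hypothetical. The point it shows is that the session-start guide lists what exists without injecting the values, and the agent pulls a value only when it needs one.

```python
import sqlite3


def open_store():
    # Hypothetical source-of-truth table, keyed by repo and fact name.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE facts (repo TEXT, key TEXT, value TEXT, PRIMARY KEY (repo, key))"
    )
    return db


def record(db, repo, key, value):
    """Hook target: store or overwrite a fact observed in tool output during a session."""
    db.execute("INSERT OR REPLACE INTO facts VALUES (?, ?, ?)", (repo, key, value))


def reference_guide(db, repo):
    """Injected at session start: names the available facts, not their values."""
    keys = [k for (k,) in db.execute(
        "SELECT key FROM facts WHERE repo = ? ORDER BY key", (repo,)
    )]
    return f"Facts available for {repo}: {', '.join(keys)}. Call pull(repo, key) to read one."


def pull(db, repo, key):
    """Agent-side query: fetch one value on demand, or None if it is unknown."""
    row = db.execute(
        "SELECT value FROM facts WHERE repo = ? AND key = ?", (repo, key)
    ).fetchone()
    return row[0] if row else None
```

A session would call `reference_guide` once up front, `pull` whenever the user's task calls for a specific fact, and `record` from a logging hook so the table stays current for the next session.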

dax@thdxr·15 Feb

would it work better if instead of instructions in agents.md you instead pointed to a set of "golden" files that were a good example of how things should be done

that way there's no drift, those files are source of truth and you just have to make sure they stay pristine

Zain Merchant@ZainMerchant9·12 Feb

@mitsuhiko This project just keeps getting better every day! Great stuff Armin

Armin Ronacher ⇌@mitsuhiko·12 Feb

The clanker did it! SSH from within Gondolin works. How can it work without passing the credential to the sandbox even though ssh does not support hostnames? The host gives out synthetic IPs from DNS lookups which we can map back to hostnames to associate creds :)
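The synthetic-IP trick can be sketched like this. It is not Gondolin's code; the class name, the methods, and the choice of the 198.18.0.0/15 benchmarking range are assumptions for illustration. Each hostname the sandbox looks up gets a unique fake address, so when a connection later arrives for that address the host can recover the hostname and attach the right credential on its side.

```python
import ipaddress


class SyntheticDNS:
    """Hands out one synthetic IP per hostname and remembers the reverse mapping."""

    def __init__(self, cidr="198.18.0.0/15"):
        # Assumed pool: a range reserved for benchmarking, unlikely to clash with real hosts.
        self._pool = iter(ipaddress.ip_network(cidr).hosts())
        self._ip_by_name = {}
        self._name_by_ip = {}

    def resolve(self, hostname):
        """Answer a sandbox DNS lookup with a stable synthetic address."""
        name = hostname.lower().rstrip(".")
        if name not in self._ip_by_name:
            ip = str(next(self._pool))
            self._ip_by_name[name] = ip
            self._name_by_ip[ip] = name
        return self._ip_by_name[name]

    def hostname_for(self, ip):
        """Host side: map a destination IP back to the hostname the sandbox asked for."""
        return self._name_by_ip.get(ip)


def credential_for(dns, credentials, dest_ip):
    """Pick the credential for an outgoing connection without exposing it to the sandbox."""
    return credentials.get(dns.hostname_for(dest_ip))
```

Because the mapping lives only on the host, the sandbox sees nothing but an IP address, and the credential lookup in `credential_for` happens entirely outside it.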

Zain Merchant@ZainMerchant9·8 Feb

@mitsuhiko @hochej was working on the refactor of MOM for pi. i saw this and was inclined to include it, but it's not cross-platform compatible yet, right? also does it work as well for persistent sessions or is it meant to be a micro/quick boot?

Armin Ronacher ⇌@mitsuhiko·7 Feb

@hochej Yep. That’s also why I want the typescript control plane. Need proper control on all layers to influence what is going on.

Armin Ronacher ⇌@mitsuhiko·7 Feb

As a POC I hooked up our Gondolin sandbox with Pi. Works nicely!

Zain Merchant@ZainMerchant9·8 Feb

@NathanFlurry @BearNotesApp Why not just use Notion? Built out a notion mcp/cli that has real time sync with obsidian, with semantic search/indexing and ability to allow new pages for notion AND querying/creation of AI agents

Nathan Flurry 🔩@NathanFlurry·6 Feb

has anyone built a better Notion yet? all i want is cross-platform @BearNotesApp with team collaboration. simple, markdown, and just works.

Zain Merchant@ZainMerchant9·8 Feb

@nummanali @openclaw @Cloudflare Cloudflare actually has everything, it’s a lil tricky to configure but once it works it works. For a $5 Workers AI plan, you get soooo much value

Numman Ali@nummanali·7 Feb

I asked my @openclaw agent what was the best agent native deployment platform where it could handle everything end to end

After deep research, it chose @Cloudflare

I’ve never used CF and am so impressed by how much Ember is doing itself

Full PR preview deploys through Git

Zain Merchant@ZainMerchant9·4 Feb

Yess, that’s sick. Used recipes but in a slightly different way; they were almost like categories: given user intent, these are the likely things you need to do and have branching for other skills. And then have workflows for like near fully determined actions, like checking latest emails/paid newsletters from certain sources and passing off for synthesizing

Allan@Allan·4 Feb

Yes! This is what it does! Every run it updates a small SQLite database for each application with Icons/UI, Task Sequences (small sequences that can be replayed), and recipes (action patterns). In theory it should get smarter every time and I could share my "skills" with you and speed up your Turbo agent, if needed.

Zain Merchant@ZainMerchant9·4 Feb

@_davideast @julesagent Would love to give it a try! Most unexpected value out of Ultra has been Jules for me!

David East@_davideast·4 Feb

I got something cooking real big for @julesagent. But I need some usage. If it breaks, I'll be right there with a fix. Who wants to be an early tester?

Zain Merchant@ZainMerchant9·4 Feb

Skills is the same approach I went with when using macOS automation tools/control scripts. It really is the best approach I’ve found for making sure the agent knows/has a reference guide for whatever app/workflow it’s trying to perform. Add an agent that creates new skills based on user interactions and you've got a self-improving system right there

Allan@Allan·4 Feb

Agree on speed. Turbo’s architecture is optimized around fast inference and persistent UI state, so it doesn’t have to relearn the interface. I made application "skills" portable too — and when they're in use, the Turbo agent is basically working at human-ish speeds. It's early and not optimized much yet. I bet I can get Turbo to work on some tasks at faster-than-human speeds. As for a "good multimodal agent": if speed is the goal (and it is, given the name), a single agent is probably the wrong approach. Turbo mixes local models and larger frontier models.

Zain Merchant@ZainMerchant9·4 Feb

@alexhillman lmaooo, i thought the same exact thing. spent the morning trying to find the tool/internal endpoint to see if i could use it 🤣

📙 Alex Hillman@alexhillman·4 Feb

I've built several interfaces for my Claude Code exec assistant and feel like the next frontier for me is true two-way conversational voice. Annoyed that ChatGPT's voice mode is still so much better than anything else I've tried. Am I missing tools? Is my approach wrong? Who has nailed this for custom agents?

Zain Merchant@ZainMerchant9·3 Feb

@acoyfellow @zebassembly 👀👀

Jordan Coeyman@acoyfellow·3 Feb

@zebassembly oof not ready to share this entirely yet but what the heck.. myfilepath.com / github.com/acoyfellow/fil… needs a little more time to cook but, i'd love your input!

zeb@zebassembly·3 Feb

Rethinking the architecture for this, right now it's entirely local but what I really want is the ability to spin up coding agents in cloud instances (cloudflare for now). I think breaking it down into a client / orchestrator / runner architecture will be best

zeb@zebassembly

Created an Agent orchestration tool, to create an agent orchestration tool, to create an agent orchestration tool I promise though, this one will be the one
