Binh Tran

2.3K posts


@binhtran

Seed VC

Ho Chi Minh, Vietnam · Joined October 2008
1.5K Following · 3.8K Followers
Binh Tran reposted
andy nguyen@kevinnguyendn·
Your memory, organizing itself while you sleep.

ByteRover Dreaming runs while your agent is idle:
→ merges near-duplicate notes
→ connects topics that share patterns
→ archives stale drafts

Nothing changes without you. Every proposal comes with the reasoning, and one command to approve.

run brv dream
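ByteRover's internals aren't shown in the thread, but the near-duplicate merge step can be sketched: compare note pairs and emit proposals, each with its reasoning and left unapplied until approved. Everything below, names and threshold included, is illustrative.

```python
# Illustrative sketch (not ByteRover's actual implementation): propose
# merges for near-duplicate notes; nothing changes without approval.
from difflib import SequenceMatcher

def propose_merges(notes: dict[str, str], threshold: float = 0.85) -> list[dict]:
    """Compare every pair of notes; flag near-duplicates as merge proposals."""
    proposals = []
    names = sorted(notes)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            ratio = SequenceMatcher(None, notes[a], notes[b]).ratio()
            if ratio >= threshold:
                proposals.append({
                    "action": "merge",
                    "notes": (a, b),
                    "reasoning": f"{ratio:.0%} textual overlap between {a} and {b}",
                    "approved": False,  # nothing is applied without the user
                })
    return proposals

notes = {
    "auth.md": "JWT tokens expire after 15 minutes; refresh via /auth/refresh.",
    "auth-draft.md": "JWT tokens expire after 15 minutes; refresh via /auth/refresh endpoint.",
    "deploy.md": "Deploys run on merge to main via GitHub Actions.",
}
for p in propose_merges(notes):
    print(p["reasoning"])
```

The approval gate is just a flag here; the point is that the dedup pass produces reviewable proposals rather than silent mutations.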
Binh Tran reposted
andy nguyen@kevinnguyendn·
Your shared memory across Claude Code, OpenClaw, and Hermes is invisible. Now it's a webpage.

The Local Web UI for ByteRover (OSS memory system): manage your team's context across every entry and every project, in one place.
→ trace decisions back to where they were made
→ recall how a teammate solved the same problem last month
→ see context ranked by relevance, automatically
Corey Ganim@coreyganim·
Everyone obsesses over the top of this pyramid. But real leverage is at the bottom.

Without a proprietary data layer, every AI deployment starts from zero. Every agent gets trained differently. Every output drifts depending on who built it. The data layer isn't optional. It's the foundation the rest of the stack sits on.

Even as a one-person operation, building your second brain is the highest-leverage move you can make. The bigger your company, the more it compounds.

Prediction: "second brain as a service" is about to spawn an entire industry. Firms that exist solely to design the internal data systems that make AI deployment painless. It's (by far) the most important tier, yet almost nobody is building it.
Corey Ganim tweet media
Corey Ganim@coreyganim

My predictions for AI in 2026:
1. Second brain as a service becomes hugely profitable. Companies will pay to build internal knowledge bases trained on their data.
2. Owned audiences (email, SMS, direct mail, SEO rankings) become 100x more valuable as AI spam explodes.
3. Building an audience on social → converting to owned channels is the most valuable marketing skill you can learn in 2026.
4. AI content will hit 8/10 quality on basic input. Only 10/10 stands out. Taste and context are your moat.
5. Relationships become even more valuable, especially with people who have large owned audiences. Borrow their trust. 10x overnight.
6. Companies hire employees whose entire job is staying on top of AI trends to help CEOs pivot daily. Eventually these become AI agents.
7. The most powerful AI models are reserved for big corporations and governments. Average users never get access.
8. The cost of AI inference is scaling rapidly. My system went from $20/month to $500-1K/month in 6 months. It will 10x next year.
9. Niche communities become massive business ecosystems. 80% fail. 20% become multi-unit powerhouses.
10. Proprietary data is the single most valuable moat: customer contact data, search trends, pain points. License it. Sell it. Build competitive advantage.
11. OpenAI is building an accessible agent layer. When it ships, the agentic era begins for everyday people.
12. Claude Code continues to be the highest-leverage skill, period.
13. AGI is closer than we think (if not already here). ASI will be 1000x more earth-shattering than even the most bullish expectations.

Binh Tran reposted
andy nguyen@kevinnguyendn·
"Never let one vendor own your workflow" is trending again today. The version that actually matters: never let one vendor own your context.

Models are swappable. Chat history, prompt libraries, custom instructions, integration state, agent memory: those you can't rebuild when the vendor goes sideways. What you accumulated is the moat. Keeping it portable is the discipline.

The vendor risk everyone's scared of isn't losing the model. It's losing everything you built on top of it.
Pato Molina@patomolina

@claudeai you took down our entire organization with 60+ accounts belonging to a legitimate company for no apparent reason, without any explanations. The only way to appeal the decision is by filling out a Google Form? Very bad UX and customer service.

Binh Tran reposted
Benedikt Koehler@furukama·
New memory system incoming for HybridClaw: @ByteroverDev - because you can never have enough beautiful memories.
Benedikt Koehler tweet media (2 images)
Binh Tran reposted
andy nguyen@kevinnguyendn·
Build Your Own Autonomous Local Agent Stacks

Save 83% on token costs and give your agents 92% long-term memory retention. This guide helps you set up a fully local and private autonomous stack: your data stays on your machine, your agents don't forget their tasks, and you stop paying for expensive cloud API tokens.

The Stack Components:
💪 The Execution Agents: OpenClaw (2026.4.12) / Hermes → run the actual tasks and scripts on your computer.
🧠 The Brain (Local LLM): Gemma 4, Z.ai's GLM-5.1, and Qwen 3.5 → able to handle complex tasks like coding, media generation, or research.
🤖 The Memory Engine: ByteRover (3.3.0) → filesystem memory that connects natively with your tools and keeps context organized.

Hardware Requirements
- For this experimental setup, we run on a Mac M4 Pro with 24 GB of RAM (ByteRover and the autonomous agents can run on a 24 GB Apple machine).
- For production usage we recommend at least an Apple M4 with 48 GB of RAM.

Step 1: Download the local LLM models
You will need two specific models to balance performance and memory efficiency:
- Gemma 4 E4B (Q4): for OpenClaw to handle reasoning (uses ~8.7 GB RAM).
- Qwen 3.5-9B (Q8): for ByteRover to manage your data (uses ~10.5 GB RAM).

Step 2: Load Both Models in LM Studio
Open LM Studio and load both models simultaneously. On a 24 GB machine they fit together, leaving just enough room for your system to run smoothly.

Step 3: Configure Your Agent
Whether you choose OpenClaw or Hermes, the setup is the same:
- Point the agent to your local LM Studio endpoint.
- Adjust memory: by default, OpenClaw handles 50,000 tokens. To change this, edit your openclaw.json file and run openclaw gateway restart to apply the update.

Step 4: Connect ByteRover Memory
Finally, set up the ByteRover CLI:
- Link it to the same local endpoint.
- Select the Qwen model as the primary memory handler.

The Result: You now have a persistent AI assistant that remembers your project details across sessions. Everything stays local, with a cloud option if you ever need to collaborate with your team.

Detailed guide for each step 👇
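As a quick sanity check on the Step 1 numbers, the two quantizations do fit side by side on the 24 GB test machine. The model sizes are the post's figures; the 4 GB system headroom is my own assumption.

```python
# Sanity-checking the memory math from Step 1. Model sizes come from the
# guide; the 4 GB system headroom is an assumption, not from the post.
models = {
    "Gemma 4 E4B (Q4)": 8.7,   # GB, reasoning model for OpenClaw
    "Qwen 3.5-9B (Q8)": 10.5,  # GB, memory handler for ByteRover
}
total_ram = 24.0         # GB on the experimental Mac M4 Pro
system_headroom = 4.0    # GB left for the OS and apps (assumption)

used = sum(models.values())                  # ~19.2 GB for both models
free = total_ram - system_headroom - used    # ~0.8 GB to spare

print(f"models use {used:.1f} GB; {free:.1f} GB to spare after headroom")
assert used + system_headroom <= total_ram, "models will not fit"
```

On the recommended 48 GB production machine the same arithmetic leaves roughly 25 GB of slack, which is why the guide treats 24 GB as the experimental floor.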
andy nguyen tweet media
JQ Lee@JqOnly·
@binhtran A harness turns non-deterministic things into deterministic ones. I agree the harness is the OS. We need to handle LLMs with our deterministic systems, not fat skills. Look at my harness: github.com/Q00/ouroboros
Binh Tran@binhtran·
CPU = LLM
OS = agent harness
RAM = context window
Files = your knowledge
Programs = your skills

I want to swap CPUs depending on the job. I want my files to outlive any OS. I want full control over who accesses my data. I don't want the company that makes my CPU or my OS to own my files.

Your computer figured this out decades ago. AI agents haven't yet. I hope that's where we're heading.
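The analogy can be made concrete with a hypothetical sketch (none of this is a real agent framework's API): the model sits behind a swappable interface, and knowledge lives in plain files that no harness owns.

```python
# Hypothetical sketch of the analogy above: the "CPU" (model) is one
# constructor argument, and the "files" (knowledge) live on disk,
# outside any harness. All names here are illustrative.
from dataclasses import dataclass
from pathlib import Path

@dataclass
class Model:
    """The 'CPU': swap it per job without touching anything else."""
    name: str
    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"   # stand-in for a real inference call

class Harness:
    """The 'OS': orchestrates the run, but owns neither model nor files."""
    def __init__(self, model: Model, knowledge_dir: Path):
        self.model = model
        self.knowledge = knowledge_dir     # plain files; they outlive this harness

    def run(self, task: str) -> str:
        context = "\n".join(p.read_text() for p in sorted(self.knowledge.glob("*.md")))
        return self.model.complete(f"{task}\n---\n{context}")
```

Swapping the "CPU" is one argument; deleting the Harness leaves every file intact, which is the whole point of the analogy.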
Binh Tran reposted
andy nguyen@kevinnguyendn·
@ashwingop tested five architecturally distinct memory systems (vector DBs, knowledge graphs, context windows, BM25/filesystem, parametric memory) and then proved with math why the filesystem approach wins.

He mentions @ByteroverDev (alongside @Letta_AI, ClaudeCode, @ManusAI and @openclaw) as rare real-world examples of filesystem-based memory actually shipping in production and delivering meaningful accuracy.

Here's what we've been shipping:
- Your memory lives in simple, human-readable markdown files (not hidden vectors)
- Native memory for OpenClaw, Hermes, and 22+ AI coding agents including ClaudeCode and Cursor. Plus, native Obsidian support
- More than 92% on the LoCoMo & LongMemEval benchmarks
- Portable by default (local-first), with optional cloud sync for team collaboration and enterprise-grade security

Huge respect to Ashwin Gopinath for formalizing this into a full paper: arxiv.org/html/2603.2711…. This is the kind of rigorous, formal analysis that moves the entire agent memory field forward.

If you're building agents that need to actually remember long-term, check out this repo: github.com/campfirein/byt…

Install the CLI: curl -fsSL <byterover.dev/install.sh> | sh
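A minimal sketch of the filesystem approach (my illustration, not ByteRover's actual code): each memory is a plain markdown file a human can read and a grep can find, and recall is a crude keyword score standing in for BM25-style retrieval.

```python
# Illustrative filesystem-based memory: markdown files in, ranked file
# names out. The keyword scoring is a toy stand-in for BM25.
from pathlib import Path

def remember(mem_dir: Path, topic: str, text: str) -> Path:
    """Persist one memory as a human-readable markdown file."""
    mem_dir.mkdir(parents=True, exist_ok=True)
    path = mem_dir / f"{topic}.md"
    path.write_text(f"# {topic}\n\n{text}\n")   # portable: it's just a file
    return path

def recall(mem_dir: Path, query: str, k: int = 3) -> list[str]:
    """Return the k memory files that best match the query terms."""
    terms = set(query.lower().split())
    scored = []
    for path in mem_dir.glob("*.md"):
        words = path.read_text().lower().split()
        score = sum(words.count(t) for t in terms)
        if score:
            scored.append((score, path.name))
    return [name for _, name in sorted(scored, reverse=True)[:k]]
```

Because the store is plain files, portability and version control come for free: sync the directory and the memory moves with you.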
Ashwin Gopinath@ashwingop

x.com/i/article/2042…

Igor Zalutski@IgorZIJ·
Yes and yes. I don't believe Anthropic's take of "separating brains from hands" is the right one (in the short and medium term), even though it is from a "purist system design" point of view.

Also love your x86 parallel. A lot of rhymes in how the industry came to be. I find the early days of the web, with HTTP servers, a particularly striking match. Wrote some thoughts here arguing that even though designing agents "properly" is tempting, we are more likely to see real-world applications built around existing patterns, like CLI harnesses, that were supposed to be temporary: opencomputer.dev/blog/agent-exe…
Ryan Lopopolo@_lopopolo·
Worrying about the code is the wrong thing. In a world where we are moving to “prompt requests”, the only thing to worry about is the environment that shapes how agents produce the code. We need the agentic equivalent of rustc for the x86 world we are currently in.
Ryan Lopopolo@_lopopolo

All agents are coding agents. Code is how an agent uses a computer. We want to build agents for everyone. Not everyone knows how to code. Agents do work by writing code. Insane for engineers today to be hung up on the code the agents write.

Manny Medina@medinism·
This is already happening. The surface area is the API, the CLI, or Claude Code. The trick is that most platforms will have to support both through the transition: an active user model and an agent consumption model.
TBPN@tbpn

Sequoia partner @gradypb says software is shifting from apps that demand attention to agents that work quietly in the background. This shift will change what moats will look like, and will be especially hard for incumbents to deal with.

"It's two very different business paradigms," he says. "In an era of apps, you want a lot of surface area with your customers, you want them to spend a lot of time in your product."

"In an era of agents, things can just be running passively in the background. The amount of surface area you have with your customers, the amount of time they might spend in your product might be de minimis. So the nature of the moat that you [will need to build] is different."

"I think we'll see a lot of companies in the coming years able to live in the software world and have some of the workflows people are accustomed to. And then separately they can deploy these passive agents that kind of function as coworkers who just come back to you when things are done."

From his appearance on the show in January.

Jaya Gupta@JayaGup10·
Regulatory capture is not new. What is different here is trying to use fear and safety posture before the moat is fully built. Anthropic may be trying to reverse the usual order. The trust story came before the moat was clearly visible. Trust can buy time while a deeper moat is still forming underneath. But it only matters if the moat gets built in time
Jaya Gupta@JayaGup10

x.com/i/article/2043…

Prukalpa ✨@prukalpa·
Enterprise Context is business IP. Letting it get locked into any company that’s training its own models is giving away keys to the kingdom. Given the concentration of power in a small group of companies with AI — open systems have never been as important as they are today.
Jaya Gupta@JayaGup10

x.com/i/article/2043…

Jaya Gupta@JayaGup10·
What is clear is that an enterprise making an AI architecture decision today is deciding who will own the state that accumulates as AI agents become central to work. Most enterprises do not realize that is what they are deciding. They think they are choosing a model. In practice, they are choosing who gets to own the state that builds around it
Jaya Gupta@JayaGup10

x.com/i/article/2043…

Aakash Gupta@aakashgupta·
Before you type a single message in Claude Code, 10% to 16% of your context window is already gone. System prompt takes ~2%. Each MCP server adds ~8%+. Custom agents eat ~4%. And your conversation grows with every message on top of that.

Run /status line right now. Ask for a color-coded context meter. Green under 50%. Orange 50-80%. Red over 80%. You will never unsee how fast your context fills up.

The hierarchy most people get backwards:
- CLIs cost zero context. They sit on your machine. Claude calls them, gets the result, done. GitHub CLI, Vercel CLI, Firecrawl CLI. Zero overhead.
- APIs cost medium context. Custom integrations you build yourself.
- MCPs cost the most. They're always loaded. Always eating tokens. Even when you're not using them.

Karpathy confirmed the same ranking independently.

The test for every MCP you have enabled: ask Claude "does a CLI for this exist?" More often than you think, the answer is yes. Every MCP you replace with a CLI is context you get back for actual work.

The escape hatch nobody talks about: if Claude goes on a tangent, hit Escape twice. Roll back to before the bad prompt. Everything after it is erased from context completely. That one shortcut has saved me hours of starting over.

Context is the scarcest resource in Claude Code. Treat it like you'd treat RAM in 1995.
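The meter and the overhead math can be sketched directly. The thresholds and per-component percentages are the post's figures; the function names are mine.

```python
# Color-coded context meter, per the thresholds in the post:
# green under 50%, orange 50-80%, red over 80%.
def context_meter(used_pct: float) -> str:
    if used_pct < 50:
        color = "green"
    elif used_pct <= 80:
        color = "orange"
    else:
        color = "red"
    return f"{color}: {used_pct:.0f}% of context used"

def startup_overhead(mcp_servers: int, custom_agents: bool = True) -> float:
    """Context already spent before the first message, using the post's figures."""
    pct = 2.0                                 # system prompt ~2%
    pct += 8.0 * mcp_servers                  # each MCP server ~8%+
    pct += 4.0 if custom_agents else 0.0      # custom agents ~4%
    return pct

print(context_meter(startup_overhead(mcp_servers=1)))
```

With one MCP server and custom agents enabled you start at 14%, squarely inside the 10-16% range the post quotes; every extra MCP server adds another ~8 points before you've typed anything.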
Aakash Gupta tweet media
Aakash Gupta@aakashgupta

This guy literally broke down how to use Claude Code like an expert:
1:40 - Code vs Cowork vs OpenClaw
6:51 - Setting up context status line
12:03 - Sub-agents
17:49 - Creating skills
23:58 - Ask user questions tool
33:33 - Tool-powered skills: Tavily
36:57 - CLI vs MCP vs API hierarchy
39:30 - Make slides skill w/ Puppeteer
43:32 - Auto-invoking skills with hooks
46:49 - Jupyter notebooks for data trust
55:09 - The operating system file structure

Matthew Berman@MatthewBerman·
Betting against models getting better is foolish. So as we build out harnesses, memory systems, etc, how are the core models not just going to eat more of the scaffolding around them?
Paweł Huryn@PawelHuryn·
Cowork is Claude Code. Same binary. Different environment flags.

A 230MB Bun-compiled executable running inside a Hyper-V VM on your laptop. Claude Code v2.1.92 + Agent SDK v0.2.92. One flag flips the mode: CLAUDE_CODE_IS_COWORK=1.

Your files aren't uploaded anywhere. They're shared into the VM via virtiofs, a virtual filesystem passthrough. Read-write for your workspace, read-only for skills.

All sessions share one VM. Each gets an isolated /sessions// directory inside a 10GB virtual disk.

Permission prompts are a file-based handshake. The VM writes a request to shim-perm/. The desktop app reads it, shows the UI, writes the response back.

Subagent calls default to claude-haiku-4-5. Your main session runs whatever model you selected. Prompt cache: 1h TTL (not 5m). Cache reads at 0.1x cost. Rate limits: 5h + 7d rolling windows.

API calls route through GCP. Everything else runs on your laptop.
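A file-based permission handshake like the one described can be sketched in a few lines. The file names, polling loop, and JSON shape below are all assumptions for illustration, not Cowork's actual protocol; only the shim-perm/ directory and the request/response roles come from the post.

```python
# Illustrative file-based permission handshake (not Cowork's real protocol):
# the VM side writes a request file and polls; the desktop side answers.
import json
import time
import uuid
from pathlib import Path

def request_permission(shim_dir: Path, action: str, timeout: float = 5.0) -> bool:
    """VM side: write a request into shim-perm/ and wait for a response file."""
    shim_dir.mkdir(parents=True, exist_ok=True)
    req_id = uuid.uuid4().hex
    (shim_dir / f"{req_id}.request").write_text(json.dumps({"action": action}))
    resp = shim_dir / f"{req_id}.response"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if resp.exists():
            return json.loads(resp.read_text())["allowed"]
        time.sleep(0.05)
    return False                      # no answer in time: deny by default

def answer_all(shim_dir: Path, allowed: bool) -> None:
    """Desktop-app side: respond to every pending request (after showing UI)."""
    for req in shim_dir.glob("*.request"):
        resp = req.with_suffix(".response")
        if not resp.exists():
            resp.write_text(json.dumps({"allowed": allowed}))
```

The appeal of the pattern is that the shared filesystem is the only channel: no sockets to punch through the VM boundary, and a missing response degrades safely to a denial.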
Paweł Huryn tweet media
Larsen Cundric@larsencc·
Building AI agents that actually work is 90% infrastructure, 10% AI:
> Error handling for flaky APIs
> Rate limiting and backoff
> State persistence across failures
> Monitoring and alerting

But everyone obsesses over prompt engineering. Try shipping the boring stuff first.
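The first two bullets reduce to one well-known pattern; a minimal sketch (my own, not from any framework): retry a flaky call with exponential backoff plus jitter.

```python
# Exponential backoff with jitter: the "boring stuff" behind error
# handling for flaky APIs and client-side rate limiting.
import random
import time

def call_with_backoff(fn, max_attempts: int = 5, base_delay: float = 0.5):
    """Call fn(), retrying on any exception with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise                                  # out of retries: surface it
            delay = base_delay * (2 ** attempt)        # 0.5s, 1s, 2s, 4s, ...
            time.sleep(delay * random.uniform(0.5, 1.0))  # jitter avoids thundering herds
```

The jitter factor matters as much as the exponent: without it, a fleet of agents that failed together retries together, and the flaky API stays down.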