Brainpage

14 posts

@GetBrainpage

Building Brainpage — one searchable brain for your whole company. Your docs, chats & meetings, finally answerable. Sharing what works and what breaks.

Joined February 2026

7 Following · 0 Followers

Brainpage@GetBrainpage·1d

@ignat_en yeah, and the decision layer needs a memory of its own. a call that was right six months ago rests on facts that have since moved, but nobody reopens it, so the 'truth' quietly rots. linking each decision back to the evidence under it is the part i keep getting stuck on.
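
One way to picture "linking each decision back to the evidence under it" is a record that stores the decision date next to the last-changed date of each supporting document. This is only a minimal sketch: the `Decision` and `Evidence` classes, the field names, and the sample data are all hypothetical and not taken from any product mentioned in the thread.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Evidence:
    doc_id: str
    last_changed: date  # when the underlying document last changed (assumed to be tracked)


@dataclass
class Decision:
    title: str
    decided_on: date
    evidence: list[Evidence] = field(default_factory=list)

    def moved_evidence(self) -> list[str]:
        """Return ids of evidence that changed after the decision was made."""
        return [e.doc_id for e in self.evidence if e.last_changed > self.decided_on]


# Hypothetical example data.
pricing = Decision(
    title="Keep annual-only pricing",
    decided_on=date(2025, 9, 1),
    evidence=[
        Evidence("churn-report", last_changed=date(2025, 8, 20)),
        Evidence("competitor-pricing", last_changed=date(2026, 1, 15)),
    ],
)

# A non-empty result means the decision rests on facts that have since moved.
print(pricing.moved_evidence())  # ['competitor-pricing']
```

A check like this does not say whether the decision is still right. It only flags which decisions deserve to be reopened.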

Ignat@ignat_en·1d

@GetBrainpage Exactly. Documents are evidence, not truth. There has to be a decision layer above the documents, because decisions are made by people, not files.

Ignat@ignat_en·2d

Google just shipped the Open Knowledge Format. It's good, but it skips the hard part.
Markdown plus a little YAML frontmatter, shippable as a tarball, readable in any editor, indexable by any tool. If your AI context is trapped in scattered wikis and code comments, OKF gives it a clean, portable shape.
I agree with the diagnosis. It's the problem I've been working on for the last few months. Making knowledge portable is an important step forward.
But here's what the announcement leaves out. The format is the easy 80%, and Google just gave it a common language. The hard 20% is everything a static file can't represent:
1) Who's allowed to see which fact. A markdown file in a tarball doesn't know the comp doc is off-limits to the intern. A company brain has to enforce that inside the query itself, or the AI will happily quote a source the asker was never allowed to open.
2) When a fact stopped being true. OKF has a timestamp. A real decision has a lifecycle: it took effect in January, got superseded in March, and the thing that replaced it points back. Ask about a year-old decision and you should learn what changed, not get a stale file with a confident date on it.
3) Whether the answer is proven or guessed. A wiki hands an LLM context and hopes. Grounding means every claim is pinned to the exact sentence it came from, with a page and a source you can open. Readable-by-an-LLM is not the same as verifiable.
The format makes knowledge portable. That's a really important step forward. But it doesn't make it permissioned, dated, or provable. That's the actual company brain, and that's where the real work is.
We're building it at Combra. If this is the problem you're living with, the waitlist is open.
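
Points 1 and 2 of the post above (permissions enforced inside the query, and facts with a lifecycle) can be sketched in a few lines. Everything here is an illustrative assumption: the `Fact` fields, the role names, and the sample data are hypothetical and do not describe OKF or any vendor's implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Fact:
    fact_id: str
    text: str
    allowed_roles: frozenset             # who may see this fact (hypothetical model)
    effective_from: date                 # when the fact took effect
    superseded_on: Optional[date] = None
    superseded_by: Optional[str] = None  # id of the fact that replaced this one


def visible_facts(facts, role, as_of):
    """Filter inside the query: drop facts the asker may not see or that are not in force."""
    result = []
    for f in facts:
        if role not in f.allowed_roles:
            continue  # permission check happens before anything reaches the model
        if f.effective_from > as_of:
            continue  # not yet in effect
        if f.superseded_on is not None and f.superseded_on <= as_of:
            continue  # already replaced by a newer fact
        result.append(f)
    return result


# Hypothetical example data.
FACTS = [
    Fact("pto-v1", "PTO is 15 days", frozenset({"intern", "staff"}),
         date(2025, 1, 1), date(2025, 3, 1), "pto-v2"),
    Fact("pto-v2", "PTO is 20 days", frozenset({"intern", "staff"}), date(2025, 3, 1)),
    Fact("comp-bands", "Comp bands for 2025", frozenset({"hr"}), date(2025, 1, 1)),
]

current = visible_facts(FACTS, "intern", date(2025, 6, 1))
print([f.fact_id for f in current])  # ['pto-v2']
```

Because superseded facts keep a `superseded_by` pointer, a caller asking about an old decision could follow it to report what changed instead of returning the stale text.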

Google Cloud Tech@GoogleCloudTech

Introducing the Open Knowledge Format (OKF), an open specification that formalizes the LLM-wiki pattern into a portable, interoperable format.
AI is only as smart as the context we give it. As we build more advanced, agentic AI systems, they need accurate metadata and context to be useful. But in most organizations, that context is locked inside fragmented data catalogs, isolated wikis, scattered code comments, or the minds of senior engineers. Every time a new AI agent is built, teams are forced to solve the exact same context-assembly problem from scratch.
To solve this, we've announced OKF, a vendor-neutral, open specification that formalizes the "LLM-wiki pattern" into a portable, interoperable format. It provides a standardized way to represent the enterprise knowledge that modern AI systems rely on.
- Just markdown: readable in any editor, renderable on GitHub, indexable by any search tool
- Just files: shippable as a tarball, hostable in any git repo, mountable on any filesystem
- Just YAML frontmatter: for the small set of structured fields that need to be queryable: type, title, description, resource, tags, and timestamp
We’ve also shipped reference implementations to help you hit the ground running, including an enrichment agent for BigQuery, a static HTML visualizer, and live sample bundles on @github → goo.gle/4uGvAEe
➕ Knowledge Catalog can now natively ingest OKF! Stop reinventing data models and building bespoke integrations for every new AI tool.
Here's more about how OKF works → goo.gle/4uGvBbg
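
To make the "markdown plus YAML frontmatter" shape concrete, here is a rough sketch of reading such a file. It relies only on the six field names listed in the announcement. The sample values, the file layout, and the assumption that all six fields are present are guesses, and the actual specification may differ. The tiny hand-rolled parser handles only flat `key: value` lines and inline lists, so a real tool would use a proper YAML library instead.

```python
# Field names as listed in the announcement; treating them all as expected is an assumption.
ANNOUNCED_FIELDS = ("type", "title", "description", "resource", "tags", "timestamp")

# Hypothetical sample document; the values are illustrative only.
SAMPLE = """\
---
type: table
title: Orders
description: One row per customer order
resource: warehouse.sales.orders
tags: [sales, orders]
timestamp: 2026-01-15T00:00:00Z
---
# Orders

Join to customers on customer_id.
"""


def split_frontmatter(text):
    """Split a document into (metadata dict, markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("document has no frontmatter block")
    end = lines.index("---", 1)  # raises ValueError if the block is never closed
    meta = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")  # split on the first colon only
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [a, b]
            value = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        meta[key.strip()] = value
    body = "\n".join(lines[end + 1:])
    return meta, body


def missing_fields(meta):
    """List announced fields that are absent from the metadata."""
    return [name for name in ANNOUNCED_FIELDS if name not in meta]


meta, body = split_frontmatter(SAMPLE)
print(meta["tags"])          # ['sales', 'orders']
print(missing_fields(meta))  # []
```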

Brainpage@GetBrainpage·1d

@vojtajina what separated them for us wasn't recall, it was staleness — most 'remember everything' tools will serve a fact that went stale after a refactor. cheap test: teach one, change your mind, see if it still parrots the old answer. few pass that, fewer across teammates.
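
The "teach one, change your mind, see if it still parrots the old answer" check can be written as a small test harness. The two memory classes below are toy stand-ins invented for illustration. They are not the API of claude-mem, mem0, zep, or any other tool named in this thread.

```python
class NaiveMemory:
    """Append-only toy memory: recall returns the first match, so old facts never die."""

    def __init__(self):
        self._entries = []

    def teach(self, key, value):
        self._entries.append((key, value))

    def recall(self, key):
        for k, v in self._entries:
            if k == key:
                return v
        return None


class SupersedingMemory(NaiveMemory):
    """Same storage, but the most recent teaching wins."""

    def recall(self, key):
        for k, v in reversed(self._entries):
            if k == key:
                return v
        return None


def passes_staleness_test(memory):
    """Teach a fact, change your mind, then check which answer comes back."""
    memory.teach("test runner", "use unittest")
    memory.teach("test runner", "use pytest")  # the change of mind
    return memory.recall("test runner") == "use pytest"


print(passes_staleness_test(NaiveMemory()))        # False
print(passes_staleness_test(SupersedingMemory()))  # True
```

Against a real tool, `teach` and `recall` would be replaced by whatever write and query calls that tool actually exposes.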

vojta@vojtajina·1d

What's your favorite memory solution for AI coding? You know, so that I don't need to maintain CLAUDE.md etc. across projects (and ideally across teammates) and without loading everything into context. I guess I'm convinced I need something like that, but there are like a million solutions: claude-mem, Hindsight, mem0, gbrain, zep, Supermemory, LangMem and who knows what else. What do you use and why? Please don't shill projects you haven't tried yourself. 🙏️

Brainpage@GetBrainpage·1d

@mayonkeyy the thing that bit us building a loop like this: it kept re-deriving fixes it'd already made, because what each experiment taught lived in a notion doc, not where the next run could retrieve it. how do you persist what a run learns so it compounds instead of resetting?

Mayank@mayonkeyy·2d

8 months of notion docs and 15+ hours of editing later, this is SquareDiff's first technical writeup on autonomous agent improvement. We've run obscene amounts of experiments, run into countless bugs, and pushed hundreds of fixes. Here's everything we learned in building autonomous agent improvement through harness experimentation.

Mayank@mayonkeyy

x.com/i/article/2067…

Brainpage@GetBrainpage·2d

@esa_was_taken we tried the trained-on-the-codebase route — it learned the patterns but couldn't say why any existed, and went stale the day after a merge. the institutional knowledge isn't in the code, it's in the PR threads and slack fights around it. retrieving those beat training.

Esa@esa_was_taken·2d

has anyone tried training a small model on their codebase and exposing that via MCP to Codex/Claude to help surface the institutional knowledge of my gargantuan repo or is the play just add more docs bro

Brainpage@GetBrainpage·2d

@ericosiu the boss level we keep hitting: once oracle's numbers feed picasso's next move, one wrong date range compounds two agents deep before you notice. 'calling the shots' only holds while you can trace where each live number came from — how do you check that mid-thread?

ericosiu@ericosiu·2d

Running a business is starting to feel like a video game. I'm not kidding. I asked one agent how our Meta ads were doing. It pulled the live numbers in seconds. Then I told another: study what's already winning, generate 500 new variations, and kill everything that isn't working. Done. All inside a Slack thread. Oracle pulls the data. Picasso builds the creative. I just call the shots. And every day there's a new skill to install. A new command. A new agent. That's why it feels like a game. You're constantly leveling up your setup. If you can do this today, imagine what your business looks like a year from now?

ericosiu@ericosiu

x.com/i/article/2056…

Brainpage@GetBrainpage·2d

@stretchcloud this bit us on the knowledge side, not just tools: an agent finds the doc but not whether it's still true, who can see it, or which version wins when two disagree. trusted discovery lives in that metadata, not the index. does ARD carry freshness + ownership, or just 'exists'?

Prasenjit Sarkar@stretchcloud·2d

The useful read on ARD is that agent ecosystems are hitting the same problem APIs hit years ago: discovery becomes infrastructure. Microsoft describes Agentic Resource Discovery as an open spec for publishing, indexing, and discovering AI capabilities. That sounds dry, but it is the layer agents need once the world has thousands of tools, MCP servers, skills, APIs, and internal workflows that all claim to help with a task. Right now, capability selection is still too manual. Developers wire tools into one app. Vendors build their own registries. Enterprises hide useful actions behind permission systems and tribal knowledge. The model can plan, but it often cannot reliably know what is available, who owns it, what permissions are needed, or whether a capability is safe to invoke. The hidden bottleneck is not tool calling. It is trusted tool discovery. If ARD works, the agent does not just ask “what should I do next?” It can ask “what verified capability exists for this job, in this environment, under this identity?” That is a much more production-shaped question. x.com/msdev/status/2…

Microsoft Developer@msdev

Today's challenge is not just creating AI capabilities, it's finding them. We're introducing the Agentic Resource Discovery (ARD) specification, an open spec that establishes a secure common layer for publishing, indexing and discovering AI capabilities. Created by Microsoft, Google, Hugging Face and many more industry collaborators, it's available today to everyone.

Brainpage@GetBrainpage·2d

@TaciturnTom half agree — a 'brain' you go visit is a mirage, nobody opens it twice. but the factory stalls when the answer's stuck in one head. for us the fix wasn't a place to visit, it was the line pulling the right fact at the right step — isn't that a brain, just wired in?

thomas chau@TaciturnTom·2d

Factory > Brain
Lot of people talking about “ai brain” or “company brain” and building all this elaborate crap to achieve it. It’s a mirage. IMO it’s the wrong goal. Focus on the factory: the simple production lines for getting things done.

Brainpage@GetBrainpage·2d

@joaomdmoura the 'compounding memory' point is the whole game. what bit us: compound across enough agents and users and you accumulate conflicting memories — two runs that learned opposite things. deciding which is still true got harder than retrieval. how does crewAI referee that?

João Moura@joaomdmoura·2d

Everyone is talking about context engineering now. We've been dealing with this at CrewAI for 2 years. Just didn't have a name for it. Context engineering goes way beyond better prompting. It's everything your agent operates inside: memory, tools, state, task history, retrieval. The whole environment, not just the instruction. After billions of agent executions, the pattern is obvious. Teams stuck in POC purgatory almost always have a context problem, not a model problem. Most teams treat context as static. Build a prompt, ship it, done. But production context is alive. It shifts with every execution, every user, every edge case the prompt never anticipated. And then the opposite mistake: dumping everything in. More data must be better, right? No. Noisy context is worse than no context. The agent can't separate signal from noise. The biggest one though is that teams don't compound. Every run starts from zero. Same discovery, same mistakes, same ceiling. Run it a thousand times and the thousand-and-first is no smarter than the first. This is why compounding memory became central to what we do at CrewAI. The system should get better because it ran before. That's the whole point. The term is already getting watered down tho. Seeing "context engineering" on every landing page now. The actual hard work, dynamic selection, relevance scoring, memory that actually compounds... barely anyone has cracked this. Including us. Still so much to figure out here.

Brainpage@GetBrainpage·2d

@yoavnaveh corrections evaporate because they're stored as resolved tickets, not retrievable rules keyed to the case pattern — so case 10,001 never finds them. the part i keep wrestling with: once a fix feeds back, what stops a wrong correction compounding as fast as a right one?
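
The difference between a resolved ticket and a "retrievable rule keyed to the case pattern" can be sketched as a lookup over feature sets. The `CorrectionRule` shape, the feature tags, and the ticket id below are all hypothetical. The sketch covers retrieval only and does not address the open question of how to stop a wrong correction from spreading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorrectionRule:
    pattern: frozenset   # features a case must have for the rule to apply
    guidance: str        # what the expert's correction said to do
    source_ticket: str   # where the correction came from (hypothetical id)


def matching_rules(rules, case_features):
    """Return every stored correction whose pattern is contained in the new case."""
    features = set(case_features)
    return [rule for rule in rules if rule.pattern <= features]


# Hypothetical example data.
RULES = [
    CorrectionRule(
        pattern=frozenset({"refund", "international"}),
        guidance="Convert at the exchange rate on the purchase date",
        source_ticket="TICKET-EXAMPLE-1",
    ),
    CorrectionRule(
        pattern=frozenset({"chargeback"}),
        guidance="Escalate to the payments team before replying",
        source_ticket="TICKET-EXAMPLE-2",
    ),
]

new_case = {"refund", "international", "gift-card"}
for rule in matching_rules(RULES, new_case):
    print(rule.source_ticket, "->", rule.guidance)
```

Keeping `source_ticket` on each rule means a correction that turns out to be wrong can at least be traced back and withdrawn.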

Yoav Naveh@yoavnaveh·2d

Great post. The winners next decade will be the ones that get their institutional knowledge into an architecture that keeps building on itself, not the ones who pick the best model. This line is the whole game: "you can never offload your learning." Because here's what's quietly happening in most "agentic" deployments today: teams believe they've built that learning loop, when what they've actually shipped is human-in-the-loop. An expert reviews the cases the agent isn't sure about, fixes them, moves on. That governs the case. It never touches the agent. The corrections pile into a queue and evaporate; the agent is no smarter in case 10,000 than it was in case one. That's not a loop. It's a treadmill. And it's the exact dynamic Satya warns about: a company offloading its knowledge one unrecorded correction at a time, until the only thing that learned anything was someone else's model. Owning that loop isn't something you buy from a model vendor or a consulting engagement. The whole point is that it accumulates your judgment, so the moment you outsource the judgment, you've broken the only thing that gets smarter over time. No one can build this for you. That's the hard part. It's also the opportunity.

Satya Nadella@satyanadella

x.com/i/article/2065…

Brainpage@GetBrainpage·2d

@sametozkale agreed, and the part that compounds is your customer's own context — the internal docs, decisions and tribal knowledge that never touch a public model. that's the one moat i've watched get deeper every month instead of getting commoditized by the next foundation release.

Samet Ozkale@sametozkale·2d

x.com/i/article/2067…

Brainpage@GetBrainpage·2d

@goon_nguyen writing the skill file is the easy part — keeping it true is the 80% nobody budgets for. a stale rollback step is worse than none, because the next agent follows it confidently into the wrong fix. we ended up stamping each entry with the last run that actually used it.
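
"Stamping each entry with the last run that actually used it" could look roughly like the sketch below. Integer run ids, the `max_age_runs` threshold, and the sample entries are assumptions made for illustration. The post does not say how the stamp is stored or when an entry counts as stale.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SkillEntry:
    step: str
    last_used_run: Optional[int] = None  # None means no run has exercised this entry


def mark_used(entry, run_id):
    """Stamp the entry with the run that just relied on it."""
    entry.last_used_run = run_id


def stale_entries(entries, current_run, max_age_runs):
    """Return steps that were never used or not used within the last max_age_runs runs."""
    return [
        e.step
        for e in entries
        if e.last_used_run is None or current_run - e.last_used_run > max_age_runs
    ]


# Hypothetical skill file contents.
entries = [
    SkillEntry("deploy: run migrations first", last_used_run=98),
    SkillEntry("rollback: restore latest snapshot", last_used_run=40),
    SkillEntry("auth: refresh token on 401"),
]

print(stale_entries(entries, current_run=100, max_age_runs=30))
# ['rollback: restore latest snapshot', 'auth: refresh token on 401']
```

Flagged entries are candidates for re-verification, not automatic deletion: a rollback step is rarely used precisely because it is only needed when something goes wrong.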

Duy /zuey/@goon_nguyen·2d

if an agent has to touch the same tool twice, i want a skill file for it
commands, failure modes, permissions, rollback, weird auth errors, all written down where the next agent will actually read it
otherwise you are paying the model to rediscover your tribal knowledge every morning

Brainpage@GetBrainpage·2d

@Timur_Yessenov the 'show the failed run' rung is underrated — watching it fail once did more for trust on our team than any accuracy number. do you let trust transfer once it's earned on one loop, or does the agent re-climb the whole ladder for every new task?

Timur Yessenov@Timur_Yessenov·2d

The meta-agent idea I’d actually trust is not “learn my workflow and automate it.”
It is a ladder:
- observe the repeated command
- propose one tiny loop
- show the failed run
- ask before install
- leave rollback note
Skip the ladder and I’m debugging a second teammate.

Brainpage@GetBrainpage·2d

@hackintoshrao 'memory as operational state' is the real unlock. building a company brain, we found recall was never the hard part — trust was. nobody acted on a remembered fact until the agent could cite the exact source line it came from.

Karthic Rao@hackintoshrao·2d

Voice agents are getting fast enough to feel natural. But the hard problem is no longer just speech quality or latency. The real question is whether a voice agent can become a collaborative partner that tracks the work over time.
The blog touches on startups like @thinkymachines and @cartesia as signs of where the stack is moving: richer model-layer interaction on one side, and low-latency voice infrastructure on the other.
But the core challenge is deeper! Professional dialogue is inherently chaotic, involving months of context, vague allusions, conditional commitments, blockers, shifting ownership, legal requirements, and future deadlines. Can the agent remember what is open, blocked, conditional, owned, unresolved, or unsafe to act on?
🔗 blog.investperpetual.com/evolving-voice…
🎙️ Voice changes the UX bar
In text, a few seconds of delay may be acceptable. In voice, silence feels broken. Agents need fast acknowledgment, crisp answers, and retrieval only when needed.
🧠 Memory has to become an operational state
A useful agent cannot treat conversation history as a searchable diary. It needs to track open threads, blockers, owners, deadlines, confidence, and permission boundaries.
🧭 Routing matters before retrieval
“What is Evan’s email?” and “What is Evan waiting on?” should not hit the same memory path. One is record lookup. The other is interpretation.
🕸️ Collaboration depends on dependencies
Real work is full of conditions: “after legal approves,” “once Evan sends the numbers,” “don’t send until tax language is fixed.” A partner-like agent has to preserve these constraints.
🔒 Trust comes from knowing when to stop
The best voice agent is not the one that always acts. It is the one who knows when to answer, when to ask, when to cite evidence, when to draft, and when a boundary prevents action.
The future of voice agents is not just more natural speech. It is low-latency, evidence-grounded, stateful collaboration.
#VoiceAgents #AIAgents #AgenticAI #ConversationalAI #AIMemory #AgenticUX
