Grant Gochnauer

13.6K posts

@GrantGochnauer
Dad, Co-Founder & CTO at @Vodori, Engineer, Builder, Geek, Guitar, Music, LGBT, Philanthropy, ESTJ. Relentlessly Curious.

Chicago, IL · Joined February 2008
472 Following · 722 Followers
Grant Gochnauer @GrantGochnauer
@trondw @NotebookLM @GoogleLabs Oh btw, roadmap request: NotebookLM MCP or CLI tool so I can wire up my Hermes agent to my saved YouTube, aggregate in NotebookLM, and then push into my PKM system
Trond Wuellner @trondw
I have some news: I’ve started a new chapter helping lead product for @NotebookLM at @GoogleLabs NotebookLM is a genuine partner for research, learning, and project organization, built entirely from your own sources. That transparency is why I believe it’s a core pillar of Google’s AI future. My mission is to scale this product while ensuring our commitment to grounding and user trust remains our North Star. Huge thanks to @joshwoodward, @tokumin and the NLM team for the incredible foundation of trust. I’m excited to build in the open, stay close to your feedback, and continue building this with you. Let's get to work! 🚀
Grant Gochnauer reposted
Graeme @gkisokay
The LLM Cheat-Sheet for OpenClaw + Hermes agents (04.17.26)

Claude Opus 4.7 just dropped and replaces 4.6 at the same $5/$25 pricing. SWE-Verified jumps to 87.6%, SWE-Pro to 64.3%, plus task budgets and xhigh effort for agentic loops. One caveat: the new tokenizer may use up to 35% more tokens per request, so it's the same price with a higher effective cost. Watch your limits.

MiMo V2 Pro joins Role 2: Xiaomi's agent-native model with 1T+ params, 42B active, and 1M context, built specifically for OpenClaw/Hermes workflows.

Here's the full landscape: 19 models, 4 roles, every one earning its place.

Role 1 — Frontier
- Claude Opus 4.7: #1 SWE-Verified and SWE-Pro, 3.75MP vision
- GPT-5.4: best Terminal-Bench in Role 1, super app capabilities announced
- GLM-5.1: #1 SWE-Pro globally, 8-hour autonomous execution, MIT license

Role 2 — Execution
- MiniMax M2.7: 97% skill adherence, built for agents
- MiMo V2 Pro: purpose-built for OpenClaw, ClawEval approaches Opus 4.7, 1M context
- Kimi K2.5: long-horizon stability, agent swarm
- DeepSeek V3.2: frontier reasoning at 1/50th the cost

Role 3 — Balanced
- Claude Sonnet 4.6: 98% of Opus at 1/5 the cost
- GPT-5.4 mini: 93.4% tool-call reliability, runs on OAuth
- Grok 4.20: lowest hallucination rate on the market, native multi-agent, 2M context
- Gemini 3.1 Pro: only option with native video + audio; pick it if your stack needs multimodal
- Qwen3.6 Plus: near-frontier coding and reasoning
- Llama 4 Maverick: open-weight, self-host at zero marginal cost
- Mistral Small 4: one model replacing three (reasoning, vision, and agentic coding), Apache 2.0

Role 4 — Local / $0 for 16GB/32GB (unquantized)
- Qwen3.5-9B: always-on subconscious loop, 16GB RAM, beats models 13x its size
- Qwen3.5-27B: stronger instruction following, 32GB RAM
- Gemma 4 31B: best local reasoning, Apache 2.0, commercial-ready
- DeepSeek R1 distill: best chain-of-thought at $0
- GLM-4.5-Air: purpose-built for agent tool use and web browsing, not a trimmed general model

Full breakdown with benchmarks, costs, and use cases in the table ↓
Graeme tweet media
Graeme@gkisokay

The LLM Cheat-Sheet for Hermes + OpenClaw Agents (04.12.26)

The community has flagged Claude Opus 4.6 underperforming lately, while GLM 5.1 has exploded onto the scene to claim frontier capabilities. A lot has changed since the last version. Here's what moved:

GLM-5.1 just proved its frontier capabilities with #1 SWE-Pro globally, 8-hour autonomous execution, and cheaper input than Opus. It earns a Tier 1 spot.

Grok 4.20 enters Tier 2 with the lowest hallucination rate of any tested model, a native multi-agent API running up to 16 parallel agents, and a 2M context window.

Gemini 3.1 Pro drops to Tier 3. The price and multimodal story is strong, but the new frontier bar left it behind on reasoning.

Mistral Small 4 joins Tier 3: one model replacing three specialist pipelines (reasoning, vision, agentic coding) at $0.15/M input. Apache 2.0.

Here's the full landscape: 18 models in 4 tiers.

Tier 1 - Frontier Models
- Claude Opus 4.6: #1 agentic terminal coding; watch for inconsistency reports
- GPT-5.4: superhuman computer use, real planning, and a newly introduced $100/month plan
- GLM-5.1: #1 SWE-Pro globally, 8-hour autonomous execution, MIT license

Tier 2 - Execution
- MiniMax M2.7: 97% skill adherence, built for agents. API only, not open weights
- Kimi K2.5: long-horizon stability, agent swarm
- Grok 4.20: lowest hallucination rate on the market, native multi-agent, 2M context
- DeepSeek V3.2: frontier reasoning at 1/50th the cost

Tier 3 - Balanced
- Claude Sonnet 4.6: 98% of Opus at 1/5 the cost
- GPT-5.4 mini: 93.4% tool-call reliability, runs on OAuth
- Gemini 3.1 Pro: best multimodal value, native video+audio in one call
- Qwen3.6 Plus: near-frontier coding, completely free via OpenRouter
- Llama 4 Maverick: open-weight, self-host at zero marginal cost
- Mistral Small 4: one model replacing three; reasoning, vision, agentic coding, Apache 2.0

Tier 4 - Local / $0 - Runs on 32GB RAM or less
- Qwen3.5-9B: always-on subconscious loop, 16GB RAM, beats models 13x its size
- Qwen3.5-27B: stronger instruction following, 32GB RAM
- Gemma 4 31B: best local reasoning, Apache 2.0, commercial-ready
- DeepSeek R1 distill: best chain-of-thought at $0
- GLM-4.5-Air: purpose-built for agent tool use and web browsing, not a trimmed general model

Full breakdown with benchmarks, costs, and use cases in the table ↓

Grant Gochnauer @GrantGochnauer
I get this when I try it: Mem0 is not working as-is. Both `mem0_search` and `mem0_profile` are failing with the same error:

> "Filters are required and cannot be empty. Please refer to docs.mem0.ai/api-reference/…"

This is an API change on the mem0 side — their v2 API now requires filters (like user ID, org ID, or agent ID) on every call. The Hermes tool wrappers aren't passing them. Two things to look into:

1. **Do we need to configure mem0 credentials/filters?** There may be a config file (likely `~/.hermes/config.yaml` or similar) where a `user_id` or `filters` block needs to be set so the tools automatically include them.
2. **This might be a Hermes-internal fix** — if the filters are supposed to be auto-populated but aren't, it's a bug in how the mem0 plugin talks to their API.
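If the second hypothesis is right, the fix on the wrapper side could look something like the sketch below. It is illustrative only: `with_default_filters` is a hypothetical helper, not actual Hermes code, and the `{"AND": [{"user_id": ...}]}` shape follows mem0's documented v2 filter syntax under that assumption.

```python
# Hypothetical sketch of the wrapper-side fix: guarantee a non-empty
# filters block before the request ever reaches mem0's v2 API.
# The helper name and config flow are assumptions, not Hermes internals.

def with_default_filters(params: dict, user_id: str) -> dict:
    """Return request params with a filters block guaranteed non-empty."""
    if params.get("filters"):
        # Caller already supplied filters; pass them through untouched.
        return params
    # Inject a default user-scoped filter (mem0 v2 AND/OR filter syntax).
    return {**params, "filters": {"AND": [{"user_id": user_id}]}}

# Example: a bare search request picks up the default user filter.
request = with_default_filters({"query": "saved YouTube notes"}, user_id="grant")
```

A `user_id` read from a config file (hypothesis 1) would simply be threaded into this call, so the two fixes are complementary rather than exclusive.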
Teknium 🪽 @Teknium
Hermes Agent now supports @plastic_lab's Honcho, @mem0ai, @openvikingai, @Vectorizeio's Hindsight, @retaindb, and @ByteroverDev memory systems! Try them now with `hermes update` then `hermes memory setup`. We have overhauled our memory system to be much more maintainable and pluggable, so anyone can make their own memory system to build on top of Hermes easily and cleanly with a special class of plugin! Which memory system is your favorite?
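The post describes a "special class of plugin" for memory backends but doesn't show the interface. As a loose illustration of what a pluggable memory-system design typically looks like, here is a minimal sketch; every name in it (`MemoryPlugin`, `register`, `store`, `recall`) is hypothetical and not taken from Hermes.

```python
from abc import ABC, abstractmethod

# Illustrative sketch only: Hermes' real plugin API is not shown in the
# post, so this is a generic registry-based plugin pattern, not their code.

class MemoryPlugin(ABC):
    registry: dict = {}  # maps backend name -> plugin class

    @classmethod
    def register(cls, name: str):
        """Class decorator that adds a backend to the registry."""
        def wrap(plugin_cls):
            cls.registry[name] = plugin_cls
            return plugin_cls
        return wrap

    @abstractmethod
    def store(self, text: str) -> None: ...

    @abstractmethod
    def recall(self, query: str) -> list: ...

@MemoryPlugin.register("in_memory")
class InMemoryPlugin(MemoryPlugin):
    """Trivial backend: keeps memories in a list, recalls by substring."""
    def __init__(self):
        self.items = []

    def store(self, text: str) -> None:
        self.items.append(text)

    def recall(self, query: str) -> list:
        return [t for t in self.items if query.lower() in t.lower()]
```

Under this pattern, adding a new memory system (Honcho, mem0, etc.) means writing one adapter class and registering it, which matches the "anyone can make their own memory system" claim in the post.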
Teknium 🪽 tweet media
Grant Gochnauer reposted
am.will @LLMJunky
Introducing Subagents for Codex - as a skill! While we wait for the official release of Codex Subagents (they are coming), I created an intelligent skill that will automatically be called in your session any time a request is likely to "waste" a significant number of tokens in the intermediate steps. In other words, it should invoke itself any time a token-hungry task is being launched where you only care about the final output. Things like:
- Extensive file search in your codebase
- Web searching
- Long-running tasks
- Documentation generation
- Multi-step workflows
- Writing long-form content
- Test suite analysis
- Log error analysis
- Migration and refactoring
- Dependency audits
- API exploration
And to take it a little further, for any task that doesn't require additional intelligence (like web search or simple file exploration), Codex 5.1 Mini is used to save usage. If it's a multi-step workflow that requires intelligence, it will inherit the parent's model/reasoning. Or the agent/user can change the model/reasoning if you choose. But if it's just searching, it'll use 5.1 Mini. You may invoke this skill, but it should work on its own. I'm not the first or the last to think of something like this, but my aim here was to make it as simple as possible to implement right into your existing Codex installation. Feedback welcome. OSS Codex Skill Installer available: github.com/am-will/codex-… npx @am-will/codexskills --user am-will/codex-skills/skills/codex-subagent
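The routing rule described above (cheap model for search-like tasks, parent model inherited when real reasoning is needed, with a manual override) can be sketched in a few lines. This is a paraphrase of the post's logic, not the skill's actual code; the function name and task categories are invented for illustration, while the model names come from the post.

```python
# Sketch of the subagent model-routing rule described in the post.
# Task categories here are illustrative; the model names are as stated.

CHEAP_TASKS = {"file_search", "web_search", "log_scan", "dependency_audit"}

def pick_subagent_model(task_kind: str, parent_model: str, override: str = None) -> str:
    if override:
        # The agent/user explicitly chose a model; honor it.
        return override
    if task_kind in CHEAP_TASKS:
        # No extra intelligence needed: route to the cheap model.
        return "codex-5.1-mini"
    # Multi-step work that needs reasoning inherits the parent's model.
    return parent_model
```

The design choice worth noting is that routing happens per task, not per session, so a single session can mix cheap search subagents with full-strength reasoning subagents.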
am.will tweet media
Grant Gochnauer @GrantGochnauer
@nummanali This sounds incredibly exciting - I have been thinking a lot about similar ideas - especially as it relates to how AI can be applied to create more leverage in our schools. Also founder + CTO here - would love to learn more about what you are thinking. How can I help?
Numman Ali @nummanali
My 2026 New Year's Resolution as a Product-focused, AI-supercharged CTO: I am going to create a movement on X, globally, to empower millions of people to utilise AI to better their lives and those around them. There will be:
- a global community
- a new org for good
- online events
- AI-powered discord servers
- in-person meets
- guides and courses
- ambassadors
Tell me what else? Drop a comment if you're joining me and what you want to do - I'll DM what I've already started
Grant Gochnauer reposted
Numman Ali @nummanali
Prompting GPT 5.2 Codex for Continuity. It excels at long-running tasks, but without explicit guidance it can lose track of outcomes. Put this at the top of your AGENTS.md file; it will let Codex work on even larger-scale tasks. It's how I let it run for 3 hours coherently.
Numman Ali tweet media
Grant Gochnauer @GrantGochnauer
@bryan_johnson @pumpdotscience Can you elaborate on what kind of sleep disturbances you were seeing? RHR, HRV, temps? I take 500mcg at 4am. Curious what your dose was too
Bryan Johnson @bryan_johnson
Protocol updates this past week:
+ stopped small molecule SLU-PP-332 due to it causing sleep disturbances; was using it for possible mitochondria improvements
+ got my semen microplastics results back, sharing tomorrow
+ decided to pass on doing cryotherapy as a longevity intervention
+ sourcing a skin elasticity device to measure effects of skin therapy
+ 10th week of metformin off stage
+ 6th week of PEMF
Simon Kubica @simonkubica
It’s official – I’m excited to introduce Alloy (@alloyapp), the world’s first tool for prototypes that look exactly like your product. All year, PMs and designers have struggled with off-brand prototypes – built with “app builder” tools that look nothing like their existing app. They’re left with confused stakeholders, prototypes they can’t show customers, and demos where they’re apologizing for the design. Your prototypes should look like your actual product. Starting today, they can. Alloy is AI Prototyping built for Product Management: ➤ Capture your product from the browser in one click ➤ Chat to build your feature ideas in minutes ➤ Share a link with teammates and customers ➤ 30+ integrations for PM teams: Linear, Notion, Jira Product Discovery, and more In lab results, Alloy delivers 3-5x more detail than alternatives when you start from an existing product. It’s powered by groundbreaking technology you won’t find in any other tool. Alloy is now available for you to try for free. Comment “ALLOY” and I’ll DM you an invite with instant access and extra credits.
Grant Gochnauer @GrantGochnauer
@Peptide_researc just downloaded the app. Does pretty much everything I need! One issue is that I’m unable to add a blend peptide.
Grant Gochnauer reposted
Aadit Sheth @aaditsh
A senior Google engineer just dropped a 424-page doc called Agentic Design Patterns. Every chapter is code-backed and covers the frontier of AI systems: → Prompt chaining, routing, memory → MCP & multi-agent coordination → Guardrails, reasoning, planning This isn’t a blog post. It’s a curriculum. And it’s free.
Aadit Sheth tweet media
Grant Gochnauer reposted
Wes Roth @WesRoth
OpenAI DevDay 2025 Summary

Growth and Adoption
• User base: ChatGPT has jumped from 100 million weekly users (2023) to 800 million+ (2025).
• Developer base: weekly active developers doubled from 2 million to 4 million in the same period.
• Throughput: the platform now handles 6 billion tokens/minute on the API (vs. 300 million/minute in 2023).

Apps Inside ChatGPT
• Apps SDK (preview), built on Model Context Protocol, lets anyone create fully interactive, personalized apps that run natively in ChatGPT.
• Launch partners: Booking, Canva, Coursera, Expedia, Figma, Spotify, Zillow (live today for all non-EU Free, Go, Plus, Pro users).
• Monetization: Agentic Commerce Protocol enables instant, in-chat checkout.
• Roadmap: app submissions open later this year, a public app directory, and expansion to ChatGPT Business, Enterprise, Edu, and the EU.

Building Agents
• AgentKit
• Agent Builder (visual drag-and-drop canvas, beta)
• ChatKit (embeddable chat components, GA today)
• Evals expansion (datasets, trace grading, automated prompt tuning, 3rd-party model support, GA)
• Connector Registry (beta) unifies data sources — Dropbox, Drive, SharePoint, Teams, custom MCP servers — under one admin console.
• Guardrails: open-source safety layer for PII masking, jailbreak detection, and other defenses.

Coding With Codex
• General availability: Codex graduates from research preview, adds Slack integration, an SDK, and admin dashboards.
• Adoption: daily Codex usage is up 10x since August; GPT-5-Codex has served 40 trillion tokens in three weeks.
• Internal impact: 70% more PRs merged per week; automatic reviews on almost every PR.
• Quota change: starting Oct 20, cloud tasks count against Plus/Pro usage limits (local messages unaffected for now).

API Line-Up and Pricing Highlights
• gpt-5-pro-2025-10-06 — deep reasoning for finance, legal, and healthcare. Price: $15 (input) / $120 (output) per million text tokens. Highest-accuracy GPT-5 tier.
• gpt-realtime-mini-2025-10-06 — fast, low-cost text and voice. Price: $0.60 / $2.40 per million text tokens; $10 / $20 per million audio tokens. About 70% cheaper than the advanced voice model.
• gpt-audio-mini-2025-10-06 — cost-efficient audio processing. Same pricing as gpt-realtime-mini; tuned for transcription and text-to-speech.
• sora-2 / sora-2-pro — video generation with synchronized sound. Price: $0.10–$0.50 per second, depending on resolution and tier. Flexible control over length, aspect ratio, and quality.
• gpt-image-1-mini — economical vision model. Price: $2 per M text input tokens, $2.50 per M image input tokens, $8 per M image output tokens, or $0.005–$0.015 per image (quality-dependent). Roughly 80% cheaper than the larger vision model.
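The per-million-token prices quoted above can be turned into a quick cost estimate. The helper below is just arithmetic (not an OpenAI SDK call), using the gpt-5-pro figures from the summary ($15/M input, $120/M output) as defaults.

```python
# Worked cost example using the gpt-5-pro-2025-10-06 prices quoted above:
# $15 per million input tokens, $120 per million output tokens.

def request_cost(input_tokens: int, output_tokens: int,
                 in_per_m: float = 15.0, out_per_m: float = 120.0) -> float:
    """Dollar cost of one request at per-million-token rates."""
    return input_tokens / 1e6 * in_per_m + output_tokens / 1e6 * out_per_m

# A 20K-token prompt with a 2K-token answer:
cost = request_cost(20_000, 2_000)  # 0.30 + 0.24 = 0.54 dollars
```

Swapping in the mini-tier rates ($0.60/$2.40) shows why routing cheap tasks away from the pro tier matters: the same request drops to about two cents.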
Wes Roth tweet media
Grant Gochnauer @GrantGochnauer
Absolute must read if you are at all interested in AI and how AI will impact our future - and your role in it.
Carlos E. Perez @IntuitMachine

Everyone "knows" that as AI gets better, humans become less valuable. Except three economists just proved the exact opposite using math from 1973 and Steve Jobs. And it explains something that's been driving researchers crazy...

Why did computers make inequality WORSE but ChatGPT is making it BETTER? The data is bizarre. In the 1990s, computers widened wage gaps everywhere they appeared. But study after study shows AI helping struggling workers more than experts. I spent the morning with this research paper and... the answer flips our entire mental model.

Think about how you use ChatGPT. You don't just type once and walk away, right? You iterate. You refine. You spot opportunities to improve. That back-and-forth? That's the key to everything.

The researchers decomposed ALL cognitive work into three parts:
- Implementation (doing the task)
- Opportunity judgment (seeing what could be better)
- Payoff judgment (knowing what actually matters)

Here's where it gets wild... AI is really good at implementation. Like, scary good. A junior coder with Cursor can suddenly write like they have 5 years' experience. But that's not the interesting part... The better AI gets at implementation, the MORE valuable your judgment becomes. It's multiplicative, not substitutive.

Imagine you're a designer. AI can now execute any design in seconds. But knowing WHICH design to make? When to iterate? What the client actually needs? That's all you. The math proves something counterintuitive: as tools get more powerful, the gap between someone who can spot opportunities and someone who can't gets BIGGER.

But wait - why is AI currently reducing inequality then? Because we're in phase one. Right now, AI is compensating for skill differences. The struggling workers get huge boosts. The experts? They were already good at implementation.

Phase two is coming though... Once implementation is basically free (think: anyone can code, design, write), the ONLY thing that matters is judgment. Who sees the opportunity? Who knows what's valuable? And that's when inequality explodes again. The paper even calculates the exact turning point.

Here's what broke my brain: better AI makes full automation LESS likely, not more. Why? Because automated systems have fixed judgment. They can't adapt. A radiologist AI might be 99% accurate, but it can't realize "wait, this patient's case is weird, I should think differently." The flexibility to adjust your judgment in real time? That's uniquely human. And it gets MORE valuable as the tools improve.

Even crazier: this changes how teams should work. The paper shows that as AI improves, control should shift from people who are good at DOING to people good at SEEING opportunities. We're already seeing this. That study about Microsoft's Kinect? Machine vision experts suddenly mattered less than generalists who could spot novel uses.

You know what this reminds me of? The shift from craftsmen to designers during industrialization. The machines could make anything. The value moved to knowing WHAT to make. We're about to see the same thing with cognitive work.

Next time you use ChatGPT, try this: instead of focusing on getting it to do the task perfectly, focus on recognizing opportunities to iterate. That skill - seeing what could be better - that's your moat. The researchers call it "opportunity judgment" and it's about to become the most valuable skill in the economy.

Quick test: give two people the same AI tool and the same task. The output difference? That's pure judgment. And that gap is about to get a lot wider.

One finding haunts me: the paper shows task-based predictions (like "AI will replace X jobs") are missing the point entirely. They measure what people do TODAY. But the whole point is that AI changes what the job even IS. A lawyer's job won't be "writing contracts." It'll be "knowing which contract variation creates the most value in this specific situation." Completely different skill.

The paper maps out exactly when to automate vs. augment. The formula is complex but the intuition is simple:
- If judgment variance is high → augment
- If tasks are predictable → automate
- If stakes are high → definitely augment

Here's my take: we're training for the wrong future. Everyone's learning to prompt better. But prompting is just implementation. The real skill is recognizing when the output could be better and knowing what "better" means for your specific context.

Schools teaching "AI literacy"? They're teaching people to be better bicycles. We should be teaching people to be better riders. (That's literally where the paper's title comes from - Jobs called computers "bicycles for the mind".)

Last thought that changes everything: the paper proves that in high-judgment work, making AI 10x better might make humans 100x more valuable. Because you can iterate faster. Test more ideas. Explore more opportunities. Your judgment gets amplified.

So the question isn't "will AI replace me?" It's "am I developing the judgment to ride increasingly powerful bicycles?" Because the bicycles are about to get VERY fast. And the gap between good riders and bad ones is about to become a chasm.

What patterns are you starting to notice in your field that others are missing? That's your future edge. And it's about to matter more than ever. /end

PS - If you're curious about the math, the paper actually derives the exact inequality curve. It's U-shaped. We're at the bottom of the U right now. The climb up is coming. Makes you wonder what other "obvious" things about AI we have completely backwards...

Wes Roth @WesRoth
It's so over / we're so back?
Wes Roth tweet media
Grant Gochnauer @GrantGochnauer
Hey #codex you are drunk, go home. Codex decided it wanted to do a "git reset --HARD" and lose all its changes. When I called it out, it denied that it was going to run the command or even wanted to! #vibecoding #openai
Grant Gochnauer tweet media
Grant Gochnauer @GrantGochnauer
If you are using AI to build software, you are probably aware how critical context is to producing high-quality code/design. I stumbled upon one of the best summaries of the problem, with clear strategies and examples for working efficiently with AI coding agents, from @humanlayer_dev: Getting AI to Work in Complex Codebases: github.com/humanlayer/adv…
eric zakariasson @ericzakariasson
until this is native in the product (which is soon), here's an approach you can take to implement the research → plan → implement pattern
eric zakariasson tweet media