Derek Matzen

478 posts

@derekmatzen

CEO // High Tide building Luna (personal AI) + Ra (infra AI)

Pittsburgh, PA · Joined June 2011

2.4K Following · 317 Followers

Derek Matzen@derekmatzen·10h

@DanielMiessler @MatthewBerman This is the premise of what I’m building - a system that grows with you and is able to recognize your taste. Easier said than done though

ᴅᴀɴɪᴇʟ ᴍɪᴇssʟᴇʀ 🛡️@DanielMiessler·10h

There are two primary types of Harness Engineering right now.
1. Telling the system exactly how to do things
2. Telling the system exactly what good looks like to you
The first is the most common, and comes from early Prompt Engineering. It's the stuff that will get eaten. (See Bitter Lesson Engineering)
But the latter will still remain. Explaining who you are, what you're about, what you're working on. What you like and don't like. And what good, bad, and excellent look like to you.
That's future-proof Harness Engineering, because no matter how smart the models get, it can still perform better by knowing this. And this applies to both personal and enterprise AI.

Matthew Berman@MatthewBerman·11h

Betting against models getting better is foolish. So as we build out harnesses, memory systems, etc, how are the core models not just going to eat more of the scaffolding around them?

Derek Matzen@derekmatzen·10h

@runfusion @Teknium @NousResearch Very clean

Fusion ⚫️@runfusion·11h

@NousResearch @Teknium Fusion! Built w/Hermes and Paperclip

Nous Research@NousResearch·20h

Weekends are for hacking! What are you building with Hermes?

Derek Matzen@derekmatzen·18h

Benchmarks alone can be gamed. The real value of domain-specific use case testing is capturing and tracing reasoning through edge case analysis. When you combine domain-specific use case testing with benchmarks and self-improvement loops, your returns compound. You create a system that identifies points of failure, reflects on the gaps, and iterates based on those findings. The key is to avoid hardcoded solutions where practical.
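
The loop described in this post (identify points of failure, reflect on the gaps, iterate) can be sketched roughly as below. This is a minimal illustration, not the author's implementation: `run_loop`, `make_agent`, `reflect`, and the toy cases are all hypothetical names, and the rule-based `reflect` only stands in for what would presumably be a model-driven reflection step.

```python
from typing import Callable, List, Tuple

Case = Tuple[str, Callable[[str], bool]]  # (input, check applied to the output)

def run_loop(make_agent, cases: List[Case], reflect, max_rounds: int = 3):
    """Evaluate, collect failures, reflect on them, rebuild the agent, repeat."""
    notes: List[str] = []    # guidance accumulated from reflection
    history: List[int] = []  # number of failing cases in each round
    for _ in range(max_rounds):
        agent = make_agent(notes)
        failures = []
        for text, check in cases:
            output = agent(text)
            if not check(output):
                failures.append((text, output))
        history.append(len(failures))
        if not failures:
            break
        for note in reflect(failures):
            if note not in notes:
                notes.append(note)
    return notes, history

# Toy stand-ins (assumptions for illustration only).
def make_agent(notes):
    def agent(text: str) -> str:
        out = text
        if "trim whitespace" in notes:
            out = out.strip()
        if "lowercase" in notes:
            out = out.lower()
        return out
    return agent

def reflect(failures):
    """Turn failing outputs into notes; a real system might ask a model here."""
    notes = []
    for _, output in failures:
        if output != output.strip():
            notes.append("trim whitespace")
        if output != output.lower():
            notes.append("lowercase")
    return notes

cases = [
    ("  hello ", lambda out: out == "hello"),
    ("WORLD", lambda out: out == "world"),
]
```

Calling `run_loop(make_agent, cases, reflect)` returns the accumulated notes and the per-round failure counts, so it is easy to see whether each round of reflection actually reduced failures.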

Derek Matzen@derekmatzen·18h

So what does domain-specific use case testing look like? It follows a simple structure: For this specific use case, what happens and why? Does this accomplish the primary goal? Is the solution dynamic?
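
One way to picture that three-question structure is as a small record per use case plus a judge that answers each question. The sketch below is an assumption-laden illustration: `UseCase`, `evaluate`, and `keyword_judge` are hypothetical names, and the keyword-based judge is only a placeholder for an LLM-as-a-judge call.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# The three questions from the post, asked of every use case.
QUESTIONS = (
    "For this specific use case, what happens and why?",
    "Does this accomplish the primary goal?",
    "Is the solution dynamic?",
)

@dataclass
class UseCase:
    name: str
    primary_goal: str
    trace: str  # hypothetical: the agent's recorded reasoning for this case

# A judge maps (question, use case) to (passed, rationale).
Judge = Callable[[str, UseCase], Tuple[bool, str]]

def evaluate(case: UseCase, judge: Judge) -> dict:
    """Ask every question about one use case and aggregate the verdicts."""
    answers = [(q, *judge(q, case)) for q in QUESTIONS]
    return {
        "use_case": case.name,
        "passed": all(ok for _, ok, _ in answers),
        "answers": answers,
    }

def keyword_judge(question: str, case: UseCase) -> Tuple[bool, str]:
    """Placeholder judge; a real setup would presumably prompt a model instead."""
    trace = case.trace.lower()
    if "dynamic" in question:
        ok = "hardcoded" not in trace
        return ok, "no hardcoded branch found" if ok else "trace mentions a hardcoded branch"
    if "primary goal" in question:
        ok = case.primary_goal.lower() in trace
        return ok, "goal appears in trace" if ok else "goal missing from trace"
    return bool(trace), "trace recorded" if trace else "no trace captured"
```

Keeping the rationale next to each verdict is what makes the result traceable: a failing case shows which question failed and why, rather than only a score.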

Derek Matzen@derekmatzen·20h

What evals are you using for your agent harnesses? I’ve been exploring domain-specific use case testing with LLM-as-a-judge to navigate some of the explicit limitations of golden data sets. Paired with benchmark testing and self-improvement loops, this has been promising

Derek Matzen@derekmatzen·20h

@garrytan The biggest enabling factor is allowing the agents to write their own tools and skills. Loop architecture with reflective evals is the key.

Garry Tan@garrytan·20h

Applied research at the app level happens in the open with open source now Just in time software is a brave new world

λux@novasarc01

i was going through the hermes agent architecture and codebase and one thing that really stood out to me is that hermes is taking a much more explicit route to self-improvement than most agent systems usually imply. like it is not doing some offline trajectory mining where you collect lots of traces, run some separate extraction pipeline, cluster behaviors and then distill them into skills later. instead i think hermes feels much more like agent-mediated procedural distillation: the model itself notices that a workflow is reusable and writes it out into a durable artifact through the skill interface. in fact there is no separate skill-extraction model, no embedding-based clustering pass and no dedicated replay-style learning loop in the main design (hermes is doing act, notice, write, reuse all in one loop). also the interesting part is that the same runtime that acts is also the runtime that writes down its own reusable procedures.

Derek Matzen@derekmatzen·1d

@techNmak Digging the evals set

Tech with Mak@techNmak·1d

Someone just dropped a 9-layer production AI architecture and it's the most honest breakdown I've seen.
services/ - RAG pipeline, semantic cache, memory, query rewriter, router. Not one file. Five.
agents/ - document grader, decomposer, adaptive router. Self-correcting by design.
prompts/ - versioned, typed, registered. Never hardcoded.
security/ - input, content, output. Three guards not one.
evaluation/ - golden dataset, offline eval, online monitor. Most people skip this entire layer and ship blind.
observability/ - per-stage tracing, feedback linked to traces, cost per query.
.claude/ - agent context so your AI coding assistant knows the codebase before it touches a file.
The demo is one file. Production is this.

Derek Matzen@derekmatzen·2d

@ionleu http://localhost:3000

John@ionleu·3d

drop ur startup link

Derek Matzen@derekmatzen·2d

@MiniMax_AI There goes my weekend plans

MiniMax (official)@MiniMax_AI·2d

MiniMax Music 2.6 is live. A few things worth knowing:
🎬 Original BGM in minutes. No more hunting for "probably fine to use" tracks. Describe your scene, get something fully yours.
🎭 Structure that actually follows your prompt. You can now write "open with tension, build toward awakening, explode into triumph", and the model follows, beat by beat. For the first time, AI music generation feels less like rolling the dice and more like directing.
🎤 Intentional imperfection. In lo-fi, indie folk, jazz: the breathiness that makes a track feel human, not generated.
Also shipping with 2.6:
→ First audio in under 20s: write a prompt, take a breath, it's ready
→ Improved low-mid frequency response: tighter bass for House, Trap, Drum & Bass
→ Style transfer & remixing: reimagine your own melody in a completely different genre
14-day free global beta starts today (500 songs/day).
👉 Try now: minimax.io/audio/music

Derek Matzen@derekmatzen·3d

@CliftonSellers Scale infra for AI demand

Clifton Sellers@CliftonSellers·3d

You got 5 words Sell me your service

Derek Matzen@derekmatzen·4d

@forgebitz People naturally underestimate complexity and overestimate simplicity

Klaas@forgebitz·4d

a lot of ai saas companies will get lost in making generic agents: "marketing agents" that aren't really good at any marketing, or "finance agents" that can barely parse a csv file.
right now all the problems are so extremely complex on their own that just solving one thing is the key.
customers are not interested in generic agents that can't perform well; they would rather have a dedicated agent/saas that absolutely nails one thing.
you only care about the best ones. if you have to pick between generic marketing agents or one that is 20% better at just running ads or writing content, what do you pick?

Derek Matzen@derekmatzen·4d

@thekitze Local LLMs is all you need

kitze 🛠️ tinkerer.club@thekitze·4d

listen to me!!! download lm studio + gemma 4 you'll *never* know when you might need a local llm for something my hotel had INCREDIBLY shitty wifi and i couldn't connect my beryl router gemma walked me through so many combinations of things to try and finally i'm in and properly connected and have super fast connection 🤓

Derek Matzen@derekmatzen·4d

@nic_carter I think this highlights the importance of being able to use AI to help you learn to become an expert

nic carter@nic_carter·4d

It should be pretty obvious at this point that AI is a "force multiplier" not a "labor substitute". It helps experts be better at things they are already good at. It doesn't let beginners match experts. If you can't write, anything you write with AI will be unmitigated slop. If you aren't a software engineer, anything you vibecode with AI will have security holes and won't be able to scale past a toy demo. If you blindly trust AI to deliver on a research task without knowing the subject matter, you won't be able to fact-check it. There's this weird misconception of AI as something that completely levels the playing field. I don't see it that way at all. There are mathematicians deriving novel lemmas with off-the-shelf models. Normal people can't do that. AI is a tool that makes experts better. It doesn't make everyone into an expert.

Derek Matzen@derekmatzen·4d

Are there any companies offering perks & benefits for AI startups? 👀

Derek Matzen reposted

Ashpreet Bedi@ashpreetbedi·5d

x.com/i/article/2041…

Derek Matzen reposted

Sajeel Purewal 🇨🇦 🇵🇰@Sajeel_Purewal·Apr 3

Build Robots Build Drones Build Hexapods Build Glasses Build Radios Build Clocks Build Rovers Build Wearables Build Rockets Build Exoskeletons Build Sensors Build it all blueprint.am

Derek Matzen reposted

Charlie O'Neill@oneill_c·Apr 1

x.com/i/article/2039…

Derek Matzen reposted

Pau Labarta Bajo@paulabartabajo_·31 Mar

60-minute deep dive on how to fine-tune a Small Language Model for browser control Enjoy ↓ youtube.com/watch?v=gKQ08y…

Derek Matzen reposted

Boris Cherny@bcherny·30 Mar

I wanted to share a bunch of my favorite hidden and under-utilized features in Claude Code. I'll focus on the ones I use the most. Here goes.

Derek Matzen@derekmatzen·28 Mar

@mem0ai Compaction

mem0@mem0ai·27 Mar

We’re starting a @mem0ai article series on AI agent memory & context engineering, in context. Which memory system should we cover next? Drop it below 👇
