Derek Matzen

478 posts

@derekmatzen

CEO // High Tide building Luna (personal AI) + Ra (infra AI)

Pittsburgh, PA · Joined June 2011
2.4K Following · 317 Followers
Derek Matzen
Derek Matzen@derekmatzen·
@DanielMiessler @MatthewBerman This is the premise of what I’m building: a system that grows with you and can recognize your taste. Easier said than done, though.
0
0
0
68
ᴅᴀɴɪᴇʟ ᴍɪᴇssʟᴇʀ 🛡️
There are two primary types of Harness Engineering right now.
1. Telling the system exactly how to do things
2. Telling the system exactly what good looks like to you
The first is the most common, and comes from early Prompt Engineering. It's the stuff that will get eaten. (See Bitter Lesson Engineering)
But the latter will still remain. Explaining who you are, what you're about, what you're working on. What you like and don't like. And what good, bad, and excellent look like to you.
That's future-proof Harness Engineering, because no matter how smart the models get, it can still perform better by knowing this. And this applies to both personal and enterprise AI.
2
2
39
1.6K
Matthew Berman
Matthew Berman@MatthewBerman·
Betting against models getting better is foolish. So as we build out harnesses, memory systems, etc, how are the core models not just going to eat more of the scaffolding around them?
74
6
194
20.2K
Nous Research
Nous Research@NousResearch·
Weekends are for hacking! What are you building with Hermes?
213
19
600
55.6K
Derek Matzen
Derek Matzen@derekmatzen·
Benchmarks alone can be gamed. The real value of domain-specific use case testing is capturing and tracing reasoning through edge case analysis. When you combine domain-specific use case testing with benchmarks and self-improvement loops, your returns compound. You create a system that identifies points of failure, reflects on the gaps, and iterates based on those findings. The key is to avoid hardcoded solutions where practical.
0
0
0
12
Derek Matzen
Derek Matzen@derekmatzen·
So what does domain-specific use case testing look like? It follows a simple structure: For this specific use case, what happens and why? Does this accomplish the primary goal? Is the solution dynamic?
1
0
0
16
Derek Matzen
Derek Matzen@derekmatzen·
What evals are you using for your agent harnesses? I’ve been exploring domain-specific use case testing with LLM-as-a-judge to navigate some of the explicit limitations of golden data sets. Paired with benchmark testing and self-improvement loops, this has been promising
1
0
0
36
Derek Matzen
Derek Matzen@derekmatzen·
@garrytan The biggest enabling factor is allowing the agents to write their own tools and skills. Loop architecture with reflective evals is the key.
0
0
0
67
Tech with Mak
Tech with Mak@techNmak·
Someone just dropped a 9-layer production AI architecture and it's the most honest breakdown I've seen.
services/ - RAG pipeline, semantic cache, memory, query rewriter, router. Not one file. Five.
agents/ - document grader, decomposer, adaptive router. Self-correcting by design.
prompts/ - versioned, typed, registered. Never hardcoded.
security/ - input, content, output. Three guards not one.
evaluation/ - golden dataset, offline eval, online monitor. Most people skip this entire layer and ship blind.
observability/ - per-stage tracing, feedback linked to traces, cost per query.
.claude/ - agent context so your AI coding assistant knows the codebase before it touches a file.
The demo is one file. Production is this.
28
264
2.1K
122K
John
John@ionleu·
drop ur startup link
400
3
142
17.1K
MiniMax (official)
MiniMax (official)@MiniMax_AI·
MiniMax Music 2.6 is live. A few things worth knowing:
🎬 Original BGM in minutes. No more hunting for "probably fine to use" tracks. Describe your scene, get something fully yours.
🎭 Structure that actually follows your prompt. You can now write "open with tension, build toward awakening, explode into triumph", and the model follows, beat by beat. For the first time, AI music generation feels less like rolling the dice and more like directing.
🎤 Intentional imperfection. In lo-fi, indie folk, jazz: the breathiness that makes a track feel human, not generated.
Also shipping with 2.6:
→ First audio in under 20s: write a prompt, take a breath, it's ready
→ Improved low-mid frequency response: tighter bass for House, Trap, Drum & Bass
→ Style transfer & remixing: reimagine your own melody in a completely different genre
14-day free global beta starts today (500 songs/day).
👉Try now: minimax.io/audio/music
32
89
740
60.5K
Clifton Sellers
Clifton Sellers@CliftonSellers·
You got 5 words Sell me your service
708
3
294
45.1K
Derek Matzen
Derek Matzen@derekmatzen·
@forgebitz People naturally underestimate complexity and overestimate simplicity
0
0
2
39
Klaas
Klaas@forgebitz·
a lot of ai saas companies will get lost in making generic agents: "marketing agents" that aren't really good at any marketing, or "finance agents" that can barely parse a csv file. right now all the problems are so extremely complex on their own that just solving one thing is the key. customers are not interested in generic agents that can't perform well; they would rather have a dedicated agent/saas that absolutely nails one thing. you only care about the best ones. if you have to pick between generic marketing agents or one that is 20% better at just running ads or writing content, what do you pick?
70
4
139
11K
kitze 🛠️ tinkerer.club
listen to me!!! download lm studio + gemma 4. you'll *never* know when you might need a local llm for something. my hotel had INCREDIBLY shitty wifi and i couldn't connect my beryl router. gemma walked me through so many combinations of things to try, and finally i'm in and properly connected and have super fast connection 🤓
44
11
494
34.8K
Derek Matzen
Derek Matzen@derekmatzen·
@nic_carter I think this highlights the importance of being able to use AI to help you learn to become an expert
0
0
0
389
nic carter
nic carter@nic_carter·
It should be pretty obvious at this point that AI is a "force multiplier" not a "labor substitute". It helps experts be better at things they are already good at. It doesn't let beginners match experts. If you can't write, anything you write with AI will be unmitigated slop. If you aren't a software engineer, anything you vibecode with AI will have security holes and won't be able to scale past a toy demo. If you blindly trust AI to deliver on a research task without knowing the subject matter, you won't be able to fact-check it. There's this weird misconception of AI as something that completely levels the playing field. I don't see it that way at all. There are mathematicians deriving novel lemmas with off-the-shelf models. Normal people can't do that. AI is a tool that makes experts better. It doesn't make everyone into an expert.
183
489
3.9K
231.9K
Derek Matzen
Derek Matzen@derekmatzen·
Are there any companies offering perks & benefits for AI startups? 👀
0
0
2
77
Derek Matzen retweeted
Sajeel Purewal 🇨🇦 🇵🇰
Sajeel Purewal 🇨🇦 🇵🇰@Sajeel_Purewal·
Build Robots
Build Drones
Build Hexapods
Build Glasses
Build Radios
Build Clocks
Build Rovers
Build Wearables
Build Rockets
Build Exoskeletons
Build Sensors
Build it all
blueprint.am
42
411
3.3K
204.1K
Derek Matzen retweeted
Boris Cherny
Boris Cherny@bcherny·
I wanted to share a bunch of my favorite hidden and under-utilized features in Claude Code. I'll focus on the ones I use the most. Here goes.
554
2.5K
23.1K
3.9M
mem0
mem0@mem0ai·
We’re starting a @mem0ai article series on AI agent memory & context engineering, in context. Which memory system should we cover next? Drop it below 👇
23
4
82
9K