
Jonathan Hefner
140 posts

Jonathan Hefner
@hefnerdotpro
Working on https://t.co/ur0WT8vRhs | All my issues are skill issues
Joined August 2024
122 Following · 32 Followers

@andrewqu I can imagine this being awesome for quickly interacting with agents (at fine granularity).

@dreamsofcode_io That is a useful framing because it suggests a design strategy: relentlessly ask “what can go wrong?” and iterate until the answer is acceptable (based on product requirements).

@sarahwooders That's an interesting point. See a dev blog describing a best practice you want to adopt? Point your memory-augmented agent at it and tell it to do that from now on.

@OfficialLoganK More benchmarks also means more well-defined targets for frontier labs, so it's a bit of a win-win.

@BenjaminBadejo My own short version of this is "Do you understand what I mean?"
In addition to making sure we're on the same page, I feel like having the AI state its understanding in its own words improves adherence during implementation.
Jonathan Hefner retweeted

I’ve seen people on X dunking on folks like @garrytan @doodlestein and others for sharing SKILL dot md files they've built. They are dismissing these files as "just a markdown file.”
I think this misses the point entirely and I'll try to address that here. Quick thread:
A bad skill file is just text, sure.
A good skill file is compressed expertise, packaged in a format an agent can actually use.
The value is not just in the “markdown file.” The value is the interaction between:
a huge neural network with latent capabilities
a precise, reusable, agent-readable procedure that steers those capabilities toward a specific outcome
That combination is the product.
Saying “it’s just markdown” is like saying Hamlet is “just ink on paper,” or Einstein’s relativity paper was “just a text.”
Technically true. Intellectually useless.
The medium is simple. The content is what matters. And more importantly, the effect of that content on the reader is what matters.
With humans, a book, a coach, a lecture, or painting can change how someone thinks and acts.
With LLMs, text is also the control surface. These models were trained on text, reason through text, call tools through text, and follow procedures through text.
So yes, the skill is “just text.”
But it is text designed to be read by an enormous neural net.
That matters.
A good skill is agent-ergonomic. It does not merely say “do this better.” It encodes workflow, constraints, examples, edge cases, tool usage, failure modes, and success criteria in a way the agent can reliably execute.
That is very different from a casual prompt.
A prompt is often a one-off request.
A skill can be reused, versioned, tested, improved, shared, and loaded at the exact moment an agent needs it.
That turns “vibes-based prompting” into something closer to operational knowledge.
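To make "agent-ergonomic" concrete, here's a minimal sketch of what such a skill file might look like. The name, frontmatter fields, and steps are illustrative, not taken from any particular repo or spec:

```markdown
---
name: refactor-extract-function
description: Extract a well-named function from duplicated or oversized code.
---

# Extract Function

## When to use
The same logic appears in 2+ places, or a function exceeds ~50 lines.

## Workflow
1. Identify the duplicated or oversized block.
2. Name the new function after its effect, not its mechanics.
3. Move the block, pass dependencies as parameters, update call sites.
4. Run the test suite; if anything fails, revert and retry with a smaller extraction.

## Constraints
- Never change observable behavior in the same commit as the extraction.
- Keep the new function in the same module unless a caller outside it exists.

## Success criteria
- All tests pass.
- Each call site is shorter than before.
```

Note how it encodes workflow, constraints, and success criteria rather than a vague "refactor this better."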
Another way to think about it:
We have built these massive models, but much of their power is latent. Different people can extract very different levels of performance from the same model.
A good skill is a way to actualize a specific slice of that latent capability.
A refactoring skill.
A research skill.
A legal review skill.
A math explanation skill.
A codebase-navigation skill.
Each one can make the same model behave very differently.
I think of Cus D’Amato and Mike Tyson.
Tyson had enormous latent potential. But Cus gave him a system, a style, a discipline, a way to channel that potential.
That’s what good skills are for agents.
They are not magic. They are not all equally valuable. Many will be mediocre or useless.
But dismissing them right off the bat because they are “just markdown” shows a misunderstanding of what LLMs are.
Text is how we trained these systems (for the most part).
Text is how we steer them.
Text is how we unlock parts of what they can do.
The question is not whether a skill file is “just text.”
The question is whether the text reliably makes the model perform better at a valuable task.
If yes, then it is not “just markdown.”
It is leverage.

@mattpocockuk If you want to host them on a website, you can use the Agent Skills `.well-known` URI: github.com/cloudflare/age… / github.com/agentskills/ag…
It's supported by `npx skills add`.

Nearly 23K stars for a collection of markdown files I wrote
I guess they must be pretty good
I want to invest more time in this repo. So, folks who starred it, what can I do to make these skills more obvious to you?
- A docs site for the skills?
- Send them to plugin marketplaces?
Help me help you
github.com/mattpocock/ski…

@zeeg The honest answer is that it was copied from Claude Code Skills early on in the standardization process. Since then, CC has introduced even more features, which we decided to wait to standardize until they get more consensus. allowed-tools is a wart, but mostly inconsequential.

@hefnerdotpro the spec suggests things that no one implements - why are they part of the spec if they're not going to exist?
allowed-tools is unlikely to ever function, so not sure why it was added in the first place

can we talk about how absurd it is that there's this SKILL.md spec on agentskills.io that is not implemented by anyone
and that some of the spec can't even work? allowed-tools for example

@andrewqu @zeeg I think there are multiple ways skill support can be implemented (i.e., no single "right" way).
If anyone is looking for guidance on how to implement skills support, see agentskills.io/client-impleme…, which covers some of the choices.

@hefnerdotpro @ndrewpignanelli Bruh, it doesn't solve it. Think of it this way: how would you use grep when your documents are in a blob store or IndexedDB? That was the point: you can't use it as a service.

people don’t understand this take because they don’t understand what’s happening in AI memory.
Everything is moving to git-backed files accessible via grep-type systems, or semantic search plus grep, which isn’t very defensible to offer as a service. In other words… the SOTA approaches to memory are now just agent plus terminal.
And all the fancy approaches like knowledge graphs are getting rekt by an agent plus a terminal. Your fancy agent structure is getting rekt by a model that can keep track of anything over 1000+ terminal calls.
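The "files plus grep" pattern above can be sketched in a few lines. This is a hypothetical layout (the function names and the per-topic `.md` file convention are my own illustration, not any product's API): the agent appends notes to plain-text files and retrieves them later by pattern search, exactly as it would with `grep -rn` in a terminal.

```python
import re
from pathlib import Path


def remember(root: Path, topic: str, note: str) -> None:
    """Append a note to a per-topic memory file (creates dirs/files as needed)."""
    root.mkdir(parents=True, exist_ok=True)
    with open(root / f"{topic}.md", "a") as f:
        f.write(note.rstrip() + "\n")


def recall(root: Path, pattern: str) -> list[str]:
    """Return 'file:line:text' matches, mimicking `grep -rn PATTERN root/`."""
    hits = []
    for path in sorted(root.rglob("*.md")):
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if re.search(pattern, line):
                hits.append(f"{path.name}:{lineno}:{line}")
    return hits
```

The point is the interface, not the implementation: any model that can call a shell already has this, which is why it's hard to sell as a standalone service.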
Satyam@KlausCodes
I believe the AI memory startups need to pivot now

@EwoofCMD @ndrewpignanelli The terminal interface is just an interface. It doesn't need to be an actual terminal. For example, it can be implemented with something like github.com/vercel-labs/ju….

@ndrewpignanelli eh. the issue is the agent-terminal setup is not scalable through the web. LLMs are already expensive; now you need to give each agent you make a computer?

@Presidentlin Same. I also make them use `any` for all types -- saves a ton of tokens.

@simonw I imagine Eye of the Tiger is playing in the background.

These pelicans are kind of angry looking!
Left is deepseek-v4-flash, right is deepseek-v4-pro - both generated using OpenRouter via my LLM tool


DeepSeek@deepseek_ai
🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models. 🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice. Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today! 📄 Tech Report: huggingface.co/deepseek-ai/De… 🤗 Open Weights: huggingface.co/collections/de… 1/n

@marine28041 @GoogleCloudTech @github Yes, Gemini CLI supports skills. Many other clients too! See agentskills.io/clients for a (non-exhaustive) list.

Our official Agent Skills repository on @github is here!
Skills are a simple, open format for giving agents new capabilities and expertise. Think of a skill as compact, agent-first documentation for a specific tech or task.
Learn more → goo.gle/4eCsZqu #GoogleCloudNext


@lucas59356 @GoogleCloudTech @github That's an option. It depends on how you want to architect progressive loading for your context. If there is a single entry point for the model, then it can be a single skill with many resource files. But if there are multiple entry points, then you'd want a skill for each.
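One way to picture that trade-off (the skill names and layout here are hypothetical, only for illustration):

```
# Option A: single entry point -> one skill, many resource files
skills/deploy/SKILL.md              # the only file loaded up front
skills/deploy/resources/rollback.md # pulled in later, only if needed
skills/deploy/resources/secrets.md

# Option B: multiple entry points -> one skill per task
skills/deploy/SKILL.md              # each skill is independently
skills/rollback/SKILL.md            # discoverable and loadable
```

Option A keeps the initial context small and lets the model fetch detail progressively; Option B makes each capability separately addressable.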

@GoogleCloudTech @github Why not one skill with multiple resources in it?