Jonathan Hefner

140 posts

@hefnerdotpro

Working on https://t.co/ur0WT8vRhs | All my issues are skill issues

Joined August 2024
122 Following · 33 Followers
Jonathan Hefner@hefnerdotpro·
@andrewqu I can imagine this being awesome for quickly interacting with agents (at fine granularity).
0
0
1
179
Andrew Qu@andrewqu·
Slack - but you can react to sections of long messages instead of the whole message
15
4
134
17.9K
Jonathan Hefner@hefnerdotpro·
@dreamsofcode_io That is a useful framing because it suggests a design strategy: relentlessly ask “what can go wrong?” and iterate until the answer is acceptable (based on product requirements).
0
0
0
35
Dreams of Code@dreamsofcode_io·
I’ve come to the conclusion that LLMs by themselves don’t make software less stable. What they do, however, is amplify Murphy’s Law to a level most developers have never seen. I think we’re going to see a real difference between software “development” and “engineering”.
10
2
43
2.7K
Hieu Pham@hyhieu226·
Is it possible to distill a specific domain skill from an LLM into a smaller one? For instance: Codex can code everything, but if I only want CUDA kernels, may I get away with a much smaller model? What are the intuitions? How would things fall apart? 😅
65
12
637
114.3K
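The usual recipe behind the question above is knowledge distillation: train the small model to match the big model's temperature-softened output distribution over domain-specific prompts (here, CUDA-kernel tasks). A minimal sketch of the loss, assuming you have logit access to both models; the function names and temperature value are illustrative, not from any particular framework:

```python
import numpy as np

def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Temperature-softened softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits, temperature: float = 2.0) -> float:
    """KL(teacher || student) over softened token distributions, averaged over positions.

    The temperature**2 factor keeps gradient magnitudes comparable as the
    temperature changes (the standard trick from the distillation literature).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return float(kl.mean()) * temperature**2
```

The intuition for "how things fall apart" then has a concrete handle: the small model can only match the teacher's distribution on the prompt distribution you train on, so behavior off that distribution (non-CUDA requests, unusual kernel patterns) is where quality degrades first.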
Jonathan Hefner@hefnerdotpro·
@sarahwooders That's an interesting point. See a dev blog describing a best practice you want to adopt? Point your memory-augmented agent at it and tell it to do that from now on.
0
0
1
44
Sarah Wooders@sarahwooders·
When agents have memory, they can just learn to automatically do the things you’d otherwise need some UI/UX for (e.g. in your ADE/IDE):
- create worktrees for new tasks
- open files in zed/vscode/cursor
- link the conversation from PRs
This is why the Letta Code app is quite minimal
6
0
17
1K
Jonathan Hefner@hefnerdotpro·
@OfficialLoganK More benchmarks also means more well-defined targets for frontier labs, so it's a bit of a win-win.
0
0
1
111
Logan Kilpatrick@OfficialLoganK·
Every company building on top of AI should be making their own benchmarks. This is the way if you want model progress to disproportionately benefit your company.
134
97
1.9K
140.5K
Jonathan Hefner@hefnerdotpro·
@BenjaminBadejo My own short version of this is "Do you understand what I mean?" In addition to making sure we're on the same page, I feel like having the AI state its understanding in its own words improves adherence during implementation.
0
0
1
19
Ben Badejo@BenjaminBadejo·
Believe it or not, the key to getting good results when building with AI is to say to your AI harness, “Before you begin, state back to me clearly what you think you are being asked to do, and ask me any questions you may have.”
3
2
12
855
Jonathan Hefner reposted
Oussama Sekkat@osekkat·
I’ve seen people on X dunking on folks like @garrytan @doodlestein and others for sharing SKILL.md files they've built. They are dismissing these files as "just a markdown file.” I think this misses the point entirely and I'll try to address that here. Quick thread:

A bad skill file is just text, sure. A good skill file is compressed expertise, packaged in a format an agent can actually use. The value is not just in the “markdown file.” The value is the interaction between:
- a huge neural network with latent capabilities
- a precise, reusable, agent-readable procedure that steers those capabilities toward a specific outcome

That combination is the product. Saying “it’s just markdown” is like saying Hamlet is “just ink on paper,” or Einstein’s relativity paper was “just a text.” Technically true. Intellectually useless. The medium is simple. The content is what matters. And more importantly, the effect of that content on the reader is what matters.

With humans, a book, a coach, a lecture, or a painting can change how someone thinks and acts. With LLMs, text is also the control surface. These models were trained on text, reason through text, call tools through text, and follow procedures through text. So yes, the skill is “just text.” But it is text designed to be read by an enormous neural net. That matters.

A good skill is agent-ergonomic. It does not merely say “do this better.” It encodes workflow, constraints, examples, edge cases, tool usage, failure modes, and success criteria in a way the agent can reliably execute. That is very different from a casual prompt. A prompt is often a one-off request. A skill can be reused, versioned, tested, improved, shared, and loaded at the exact moment an agent needs it. That turns “vibes-based prompting” into something closer to operational knowledge.

Another way to think about it: we have built these massive models, but much of their power is latent. Different people can extract very different levels of performance from the same model. A good skill is a way to actualize a specific slice of that latent capability. A refactoring skill. A research skill. A legal review skill. A math explanation skill. A codebase-navigation skill. Each one can make the same model behave very differently.

I think of Cus D’Amato and Mike Tyson. Tyson had enormous latent potential. But Cus gave him a system, a style, a discipline, a way to channel that potential. That’s what good skills are for agents.

They are not magic. They are not all equally valuable. Many will be mediocre or useless. But dismissing them right off the bat because they are “just markdown” shows a misunderstanding of what LLMs are. Text is how we trained these systems (for the most part). Text is how we steer them. Text is how we unlock parts of what they can do.

The question is not whether a skill file is “just text.” The question is whether the text reliably makes the model perform better at a valuable task. If yes, then it is not “just markdown.” It is leverage.
10
7
53
11.3K
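A skill file of the kind described above might look something like this minimal sketch (the skill name, frontmatter fields, and section layout here are hypothetical illustrations, not taken from any published spec):

```markdown
---
name: cuda-kernel-review
description: Review CUDA kernels for correctness and performance before merge.
---

# CUDA kernel review

## Workflow
1. Read the kernel source and its launch configuration.
2. Check global memory accesses for coalescing and shared memory for bank conflicts.
3. Run the project's benchmark target before and after any suggested change.

## Constraints
- Never alter numerical semantics without calling it out explicitly.

## Failure modes
- Reporting a speedup without verifying correctness against a reference output.

## Success criteria
- Review comments cite specific lines and name the hazard (race, divergence, register spill).
```

The point of the thread is visible even in a toy like this: the file encodes workflow, constraints, failure modes, and success criteria the agent can follow, not just "do this better."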
Matt Pocock@mattpocockuk·
Nearly 23K stars for a collection of markdown files I wrote. I guess they must be pretty good.

I want to invest more time in this repo. So, folks who starred it, what can I do to make these skills more obvious to you?
- A docs site for the skills?
- Send them to plugin marketplaces?

Help me help you github.com/mattpocock/ski…
135
171
2.4K
191.2K
dax@thdxr·
@Hacksore i forgot which one was ours
10
2
501
16.3K
Hacksore@Hacksore·
Tell me the diff in these icons
[two attached icon images]
13
0
183
30.1K
Jonathan Hefner@hefnerdotpro·
@zeeg The honest answer is that it was copied from Claude Code Skills early on in the standardization process. Since then, CC has introduced even more features, which we decided to wait to standardize until they get more consensus. allowed-tools is a wart, but mostly inconsequential.
0
0
1
70
David Cramer@zeeg·
@hefnerdotpro the spec suggests things that no one implements - why are they part of the spec if they're not going to exist? allowed-tools is unlikely to ever function, so not sure why it was added in the first place
1
0
0
152
David Cramer@zeeg·
can we talk about how absurd it is that there's this SKILL.md spec on agentskills.io that is not implemented by anyone and that some of the spec can't even work? allowed-tools for example
10
0
37
6.5K
Andrew Qu@andrewqu·
@zeeg I think a big gap is that there’s no de-facto reference architecture for how to implement every part of skills/plugins. Pi or some minimal agent could be the first consumer of any new skills spec feature, and every new coding agent could be rebased on top
2
0
4
286
Ewoof@EwoofCMD·
@hefnerdotpro @ndrewpignanelli Bruh, it doesn't solve it. Think of it this way: how would you use grep when your documents are in a blob store or in IndexedDB? That was the point: you can't use it as a service.
1
0
1
47
andrew pignanelli@ndrewpignanelli·
people don’t understand this take cause they don’t understand what’s happening in AI memory. Everything is moving to git-backed files accessible via grep-type systems, or semantic search plus grep, which isn’t very defensible to offer as a service.

In other words… the SOTA approaches to memory are now just agent plus terminal. And all the fancy approaches like knowledge graphs are getting rekt by an agent plus a terminal. Your fancy agent structure is getting rekt by a model that can keep track of anything over 1000+ terminal calls.

Satyam@KlausCodes
I believe the AI memory startups need to pivot now
89
76
1.7K
246.6K
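The "agent plus terminal" memory pattern described above fits in a few lines. A toy sketch, assuming memory is plain Markdown files in a git-backed directory; the function name and file layout are illustrative, not any product's API:

```python
import re
from pathlib import Path

def grep_memory(root: str, pattern: str) -> list[tuple[str, int, str]]:
    """Search plain-Markdown memory files on disk for a pattern.

    Returns (relative path, 1-based line number, matching line) tuples --
    roughly what an agent gets back from running `grep -rn` in a terminal.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for path in sorted(Path(root).rglob("*.md")):
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path.relative_to(root)), lineno, line.strip()))
    return hits
```

The "not defensible as a service" argument follows from how little is here: the state is ordinary files (versionable with git), and retrieval is a regex scan any agent with shell access already has.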
Ewoof@EwoofCMD·
@ndrewpignanelli Eh. The issue is that the agent-plus-terminal setup is not scalable through the web. LLMs are already expensive; now you need to give each agent you make a computer?
2
0
2
4K
Lincoln 🇿🇦@Presidentlin·
I no longer have the LLM write tests. Cost of tokens going up. We have to be frugal.
7
0
30
1.6K
Jonathan Hefner@hefnerdotpro·
@simonw I imagine Eye of the Tiger is playing in the background.
0
0
0
94
Google Cloud Tech@GoogleCloudTech·
Our official Agent Skills repository on @github is here! Skills are a simple, open format for giving agents new capabilities and expertise. Think of a skill as compact, agent-first documentation for a specific tech or task. Learn more → goo.gle/4eCsZqu #GoogleCloudNext
[image attachment]
49
748
5.4K
447.5K
Jonathan Hefner@hefnerdotpro·
@lucas59356 @GoogleCloudTech @github That's an option. It depends on how you want to architect progressive loading for your context. If there is a single entry point for the model, then it can be a single skill with many resource files. But if there are multiple entry points, then you'd want a skill for each.
0
0
0
67
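The single-entry-point layout described above can be sketched as follows. This is a toy illustration of progressive loading, assuming a directory with one SKILL.md entry file and a resources/ folder; the file layout and helper names are assumptions, not the Agent Skills spec:

```python
from pathlib import Path

def load_skill_entry(skill_dir: str) -> str:
    """Load only the skill's entry file; this is all the model sees up front."""
    return Path(skill_dir, "SKILL.md").read_text()

def load_resource(skill_dir: str, name: str) -> str:
    """Load one resource file on demand, after the entry file points the model to it.

    Keeping resources out of the initial context is the "progressive loading"
    idea: one entry point, many lazily read resource files.
    """
    return Path(skill_dir, "resources", name).read_text()
```

With multiple entry points there is no single SKILL.md for the model to start from, which is why that case is better served by one skill per entry point.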