Charles Pick

1.9K posts

@c_pick

Founder of @codemixdotcom - codemix makes sure humans and coding agents always build the right thing.

York, United Kingdom · Joined August 2010

711 Following · 760 Followers

Charles Pick@c_pick·2h

@liran_tal @mattpocockuk gpt-tokenizer is just too slow for a lot of use cases. It's obviously a lot more accurate for OpenAI models, but I found that the accuracy wasn't worth the performance penalty for my needs

Liran Tal@liran_tal·3h

@c_pick @mattpocockuk but the ergonomics of that are painful, although Claude can likely figure out the bash for it by piping sed and awk. If you want to try something cleaner, though, I built tokenu: github.com/lirantal/tokenu

Matt Pocock@mattpocockuk·7 Mar

Has anyone made du, but for tokens? I want to make a rule for my agent where if a file is more than 5K tokens, split it up.
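
A minimal sketch of the "du, but for tokens" idea, assuming a crude four-characters-per-token heuristic instead of a real tokenizer (it trades accuracy for speed, in the spirit of the reply above). The names `token_du`, `estimate_tokens` and `CHARS_PER_TOKEN` are hypothetical and chosen for illustration; this is not how tokenu or gpt-tokenizer work.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # assumption: rough average, not a real tokenizer
TOKEN_LIMIT = 5_000  # the 5K threshold mentioned in the post


def estimate_tokens(text: str) -> int:
    """Cheap estimate: character count divided by the assumed ratio, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def token_du(root: str, limit: int = TOKEN_LIMIT) -> list[tuple[int, str]]:
    """Return (estimated_tokens, relative_path) for files over the limit, largest first."""
    oversized = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        tokens = estimate_tokens(text)
        if tokens > limit:
            oversized.append((tokens, str(path.relative_to(root))))
    return sorted(oversized, reverse=True)
```

Calling `token_du(".")` lists the files an agent rule could flag for splitting. Swapping `estimate_tokens` for a real tokenizer would improve accuracy at the cost of speed.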

Charles Pick@c_pick·1d

> it's essentially the same model as postinstall scripts

These are gradually being phased out of package managers because they're so dangerous. Honestly it feels like this needs rethinking. I doubt anyone reads their SKILL.md files thoroughly, and markdown is not a well-known threat vector - I guess with things like this it will become well known soon.

Lydia Hallie ✨@lydiahallie·1d

@itechnologynet Ah this fails by default, it only runs if the skill's frontmatter declares something like allowed-tools: Bash/Bash(rm *) etc. Just make sure to check what the skill allows if you're downloading anything external, it's essentially the same model as postinstall scripts

Lydia Hallie ✨@lydiahallie·2d

if your skill depends on dynamic content, you can embed !`command` in your SKILL.md to inject shell output directly into the prompt. Claude Code runs it when the skill is invoked and swaps the placeholder inline, so the model only sees the result!
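
Since the thread above recommends checking what a downloaded skill allows, here is a rough sketch of a pre-install audit. It assumes the frontmatter is a block delimited by `---` lines at the top of the file and that embedded commands use the !`command` form quoted in the post. `audit_skill` is a hypothetical helper, not part of Claude Code.

```python
import re

# Matches the !`command` placeholder form quoted in the post above.
INLINE_COMMAND = re.compile(r"!`([^`]+)`")


def audit_skill(markdown: str) -> dict:
    """Report the allowed-tools declaration and embedded shell commands in a SKILL.md."""
    lines = markdown.splitlines()
    allowed_tools = None
    body_start = 0
    # Assumption: frontmatter is delimited by '---' lines at the top of the file.
    if lines and lines[0].strip() == "---":
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                body_start = i + 1
                break
            key, sep, value = line.partition(":")
            if sep and key.strip() == "allowed-tools":
                allowed_tools = value.strip()
    body = "\n".join(lines[body_start:])
    return {
        "allowed_tools": allowed_tools,
        "commands": INLINE_COMMAND.findall(body),
    }
```

A reviewer could print the returned `commands` list for every third-party skill before accepting it. That does not make the skill safe, but it surfaces what would otherwise go unread.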

Charles Pick@c_pick·1d

@kentcdodds *British spy. FWIW I don't usually run into this problem either, the dev tools got quite good at compaction (though it is slow and sometimes I start fresh just to avoid that). It's a real thing when using LLMs in apps though, and you do need good compaction strategies there too
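
For the in-app case, one simple compaction strategy can be sketched as follows: keep the newest messages verbatim within a token budget and collapse everything older into a single summary. This is an illustrative assumption, not how Cursor or any other tool compacts. `compact` is a hypothetical function, `summarize` is a caller-supplied callable (in practice probably an LLM call), and the token estimate is a rough four-characters-per-token heuristic.

```python
from typing import Callable


def estimate_tokens(text: str) -> int:
    """Rough estimate, assuming about four characters per token."""
    return -(-len(text) // 4)  # ceiling division


def compact(
    messages: list[str],
    budget: int,
    summarize: Callable[[list[str]], str],
) -> list[str]:
    """Keep the newest messages that fit in `budget` tokens; summarize the rest."""
    total = sum(estimate_tokens(m) for m in messages)
    if total <= budget:
        return list(messages)
    kept: list[str] = []
    used = 0
    # Walk backwards so the most recent messages are preserved verbatim.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    older = messages[: len(messages) - len(kept)]
    return [summarize(older)] + kept
```

Note that the summary itself is not counted against the budget here; a real implementation would need to reserve room for it.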

Kent C. Dodds ⚡@kentcdodds·1d

Maybe this is the "German spy raises three fingers meme" but I'm telling you, it's just not been a significant issue for me in those tools. I don't use Claude Code or Open Code so my exposure to this problem might be limited?

Kent C. Dodds ⚡@kentcdodds·1d

Everyone knows that the last 40% of the context window of AI models starts to get pretty unreliable... Except I don't experience this at all 🤔 My two primary AI tools are Cursor (mostly GPT 5.4) and ChatGPT with very long conversations. Cursor compacts and it still does fine.

Charles Pick@c_pick·1d

@threepointone very good, next step is give it a language server for tsc

sunil pai@threepointone·1d

fully rewritten from scratch, now executed (and secured!) inside a dynamic worker, a lot lighter and more powerful. much, MUCH more to come. (sharing here early because you know how we do)

sunil pai@threepointone

what if we gave every cloudflare agent a file system (sqlite/r2), tools to operate on it (shell), powered (and secured!) by codemode, all powered by workers ai (or byom)

Charles Pick@c_pick·1d

@threepointone @supabase @kiwicopple these people are the worst

sunil pai@threepointone·1d

hey @supabase @kiwicopple I think Tyler's account's been compromised

Charles Pick@c_pick·1d

@chribjel those that don't eventually regret it

Christoffer Bjelke@chribjel·1d

Do you think you are expected to keep yourself up to date on your field of work outside of work hours? E.g. learn new technologies, ai-tools etc if you are a developer?

Charles Pick@c_pick·1d

@sebastienlorber most of what we do is not speccing programming languages though. The thing that people seem to be missing is the idea of "gradual specification" - you can specify things in close detail when you want fine-grained control, and in broad strokes the rest of the time

Seb ⚛️ ThisWeekInReact.com@sebastienlorber·1d

Here's a sufficiently detailed spec from a JS std method (referencing many other specified things) Wouldn't you rather write the code directly? 😆

gabby@GabriellaG439

New blog post: "A sufficiently detailed spec is code" I wrote this because I was tired of people claiming that the future of agentic coding is thoughtful specification work. As I show in the post, the reality devolves into slop pseudocode haskellforall.com/2026/03/a-suff…

Charles Pick@c_pick·1d

Hate MCP all you want, but the sheer potential for malicious SKILL.md files should keep you up at night. That is a wolf in sheep's clothing.

Charles Pick@c_pick·1d

@dillon_mulroy I had one of my worst days with agents in ages, and the variance is the worst aspect. Consistently good would be ideal, but I could also live with consistently bad. This middle ground feels like a casino

Charles Pick@c_pick·1d

@mattzcarey @garrytan but they will accept it, no doubt

Charles Pick@c_pick·1d

@mattzcarey @garrytan like, no one is reading that carefully before they accept it

Matt Carey@mattzcarey·1d

so now we chill executing untrusted bash through a md file

Lydia Hallie ✨@lydiahallie

Charles Pick@c_pick·1d

@DavidKPiano @combdn also, also, curious how this plays with the cache

Charles Pick@c_pick·1d

@DavidKPiano @combdn the UI is a lot simpler and easier to understand if you omit deep threading. Also, how do you know when to "return" to the parent thread? (If you're thinking about replacing sub-agents, this seems important.)

David K 🎹@DavidKPiano·2d

Rough diagram:

David K 🎹@DavidKPiano

Been thinking about this a lot (and prototyping). Rough thoughts on how I think conversations should work; brain-dump for sake of sharing:

- Tree data structure, nodes have parent IDs
- Threads can either trace full path to conversation root or summarize from "branch" point
- Threads can have threads, inf recursive
- Agent still sees linear conversation (path to root or summary)
- Threads are first-class primitives with their own metadata (name, purpose, etc), not just implicit from tree
- Threads can run in parallel: agents can do work in multiple threads simultaneously
- Messages have sender identity, not roles: "user" / "assistant" is a two-player limitation IMO
- Messages can reference other messages and threads by ID: agents can cite, link, and build on each other's work
- Events are the source of truth (very actor-model coded); messages and threads are derived views
- Tool calls for creating, reading, and summarizing threads
- Compatible across providers, of course

---

I also have some ideas around multi-agent participation, state machine-driven agent behavior, & structured conversation flow that make this significantly more interesting
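
The core of the thread model described above (nodes with parent IDs, with the agent still seeing a linear path to the root) can be sketched roughly as below. The `Message` fields and the `path_to_root` helper are hypothetical names for illustration, not the author's prototype. Summaries, thread metadata and event sourcing are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    id: str
    sender: str  # sender identity rather than a fixed user/assistant role
    text: str
    parent_id: Optional[str] = None  # None marks the conversation root


def path_to_root(messages: dict[str, Message], leaf_id: str) -> list[Message]:
    """Walk parent IDs from a leaf up to the root; return messages oldest-first."""
    path: list[Message] = []
    seen: set[str] = set()
    current: Optional[str] = leaf_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at message {current!r}")
        seen.add(current)
        node = messages[current]
        path.append(node)
        current = node.parent_id
    path.reverse()  # the agent sees a linear conversation, root first
    return path
```

Two branches that share a root produce two different linear views, which is what lets threads run in parallel without seeing each other's messages.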

Charles Pick@c_pick·2d

Q. which model has been lobotomised today? A. all of them

Charles Pick@c_pick·2d

biggest LLM-UI tell - <Card><Card><Card><Card></Card></Card></Card></Card>

Charles Pick@c_pick·2d

@threepointone @max__drake

QME

sunil pai@threepointone·2d

@max__drake I like that. we'll do it.

sunil pai@threepointone·2d

what if this is only part of the story? Think about it

sunil pai@threepointone

what if we gave every cloudflare agent a file system (sqlite/r2), tools to operate on it (shell), powered (and secured!) by codemode, all powered by workers ai (or byom)

Charles Pick@c_pick·2d

seems like there are 2 different approaches emerging:

1. Maximum freedom for the agents, letting them run with little oversight - the claws, and now hermits of this world
2. Extensive guardrails, tight controls and restrictions that keep agents on track towards specific goals

As models improve it's going to be interesting to see which wins

Ben Schrauwen@benschrauwen·2d

Over the weekend, I used Hermit, the autonomous application framework, to:

- have Hermit manage its own website and come up with a CRM demo
- task it to keep track of and manage all the loose ends around the house
- create a librarian running my book collection
- set up a personal coach to help me train for this summer's big swimrun event
- onboard a couple of people to help them use it as assistants for their small business

It is quite amazing that such a small codebase can design and manage its own datastore, create the app, write skills, ... The most fun is that you design it by describing the job that Hermit has. It is not just code writing or question answering, it is an agent that creates its own environment to be able to do its job. You should give it a spin: it runs by default in a secure sandbox, and I added an easy getting-started walkthrough for macOS. hermit-ai.com

Charles Pick@c_pick·2d

@alexdotjs

QME

Alex / KATT 🐱@alexdotjs·2d

my peak is now 8 concurrent coding agents at the same time
