gerred

11.3K posts

@devgerred

senior principal mts. model whisperer. chasing the speed of light.

Joined March 2020
1.2K Following · 2.5K Followers
Samuel Colvin @samuelcolvin
PYTHON + RUST. Python inside the sandbox, Python (and some Rust) outside the sandbox. Typescript for frontend developers trying to stay relevant, like Mastra. By the way, the SF bubble is the one place where TS is popular for AI: `openai` package has 53m weekly downloads on PyPI, and 10m on NPM. AND THAT GAP IS WIDENING - used to be 4x, now 5x.
jason liu @jxnlco

Future of AI

4 replies · 3 reposts · 146 likes · 22.2K views
gerred @devgerred
@thdxr i know I just overloaded every noun I used but I'm tracking what you wrote, compressed it down hard.
0 replies · 0 reposts · 0 likes · 41 views
gerred @devgerred
@thdxr tbqh maybe it's the editor that's skeuomorphic. what you're wanting in "using it less" is real, but much like spreadsheets mirror their analogue equivalents, I think a lot about post-intelligence-age UX. i don't think we're going back to the editor to solve the supervision problem
1 reply · 0 reposts · 2 likes · 1.1K views
dax @thdxr
when we first started working on opencode, days would pass where we wouldn't use it and went back to our editor for everything. had to actively try and switch our workflow. now we're on the opposite extreme where we're all trying to use it less - crazy how fast that happened
14 replies · 1 repost · 275 likes · 14.9K views
gerred @devgerred
me n who
[image]
0 replies · 0 reposts · 1 like · 67 views
gerred @devgerred
@jxnlco @jekbradbury You could totally do a pseudo prefix+(simulated, I suppose) radix caching from there to go further but that's really a profiling question from there. With hierarchical KV cache though, it's not out of the question.
0 replies · 0 reposts · 0 likes · 57 views
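As a toy illustration of the prefix/radix-style reuse mentioned above (not any production implementation; all names and structures here are invented), a cache can return KV state for the longest previously seen token prefix, so only the remaining tokens need a fresh prefill:

```python
# Sketch of longest-prefix KV reuse. The "KV state" here is an opaque
# placeholder string; a real system would hold tensors per layer.

class PrefixKVCache:
    def __init__(self):
        # Maps a token prefix (as a tuple) to a cached KV state.
        self.store = {}

    def put(self, tokens, kv_state):
        self.store[tuple(tokens)] = kv_state

    def longest_prefix(self, tokens):
        """Return (matched_length, kv_state) for the longest cached prefix."""
        best_len, best_state = 0, None
        for prefix, state in self.store.items():
            n = len(prefix)
            if n > best_len and tuple(tokens[:n]) == prefix:
                best_len, best_state = n, state
        return best_len, best_state

cache = PrefixKVCache()
cache.put([1, 2, 3], "kv(sys prompt)")
cache.put([1, 2, 3, 4, 5], "kv(sys prompt + tools)")

matched, state = cache.longest_prefix([1, 2, 3, 4, 5, 9, 9])
# Only tokens[matched:] need a fresh prefill pass.
```

A real radix-cache implementation would walk a trie over token blocks instead of scanning a dict, but the lookup contract is the same.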
gerred @devgerred
I obviously don't have knowledge of what they're doing over there, but it's a reasonable way to structure it: much like caching compilation graphs, pre-compiling and shipping versioned KV caches is a pretty obvious optimization. Ant has a lot of caching options in their docs that suggest this has at least entered their minds, vs the less sophisticated pre-caching everyone else outside of Google does. I used to use GCP's provisioned caching as a very cost-effective, high-context-limit docs oracle because I could provision and expire it at my leisure.
1 reply · 0 reposts · 0 likes · 82 views
gerred @devgerred
I'm betting the Anthropic ban of OpenCode is as technical and cost-saving as it is political. I've long argued there's a moat to be had by closing third party tools to subs. CC can rely on KV caching across every instance, and have KV caches on a per-organization basis for further customization for their largest customers. They can, across their entire fleet, pre-compute 1/3-1/2 (if not more) of every CC user's system prompt. By encouraging baking this into MDM and enterprise plans too, they can further negotiate that out in these large contracts. It also potentially lets them do some more clever things than just pure prefix caching and make specific tradeoffs you don't just get by allowing anybody to use those endpoints. At least that's how I'd do it. It surprised me it took THIS long.
5 replies · 0 reposts · 37 likes · 4.5K views
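The fleet-wide precomputed system-prompt idea above can be sketched as a cache keyed by model version, organization, and a hash of the prompt prefix. This is a made-up illustration, not Anthropic's actual design; every name and function here is hypothetical:

```python
# Toy sketch: precompute the KV state for a system-prompt prefix once per
# (model version, org, prompt hash), then reuse it for every session.
import hashlib

def cache_key(model_version: str, org_id: str, system_prompt: str) -> str:
    digest = hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()[:16]
    return f"{model_version}/{org_id}/{digest}"

precomputed = {}  # stands in for a fleet-wide cache service

def get_or_prefill(model_version, org_id, system_prompt):
    key = cache_key(model_version, org_id, system_prompt)
    if key not in precomputed:
        # In a real system this would run prefill once and persist the KV state.
        precomputed[key] = f"kv-state:{key}"
    return precomputed[key]

state_a = get_or_prefill("m1", "org-acme", "You are a careful assistant.")
state_b = get_or_prefill("m1", "org-acme", "You are a careful assistant.")
# Same org + same prompt prefix: the expensive prefill runs once and is shared.
```

Keying on the model version is what makes shipping "versioned" caches safe: a model update changes the key, so stale KV states are never reused across incompatible weights.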
SemiAnalysis @SemiAnalysis_
Olympian Gold Medalist Alysa Liu recently went viral for her Teen Vogue rant on OpenAI Codex. “I can see why Sam Altman open sourced Codex. Clearly the experience is significantly worse than Claude Code. I was unable to feel the AGI using Codex. As opposed to using Claude Code, I felt the enlightenment coming and support UBI”
[images]
46 replies · 35 reposts · 922 likes · 145.6K views
gerred @devgerred
@stochasticchasm I can't wait to rent your instance in cortical labs cloud.
0 replies · 0 reposts · 2 likes · 97 views
stochasm @stochasticchasm
they're capturing my cudagraphs tomorrow
4 replies · 2 reposts · 32 likes · 931 views
gerred @devgerred
@trq212 That's what I was about to ask after looking at the code, I'm REALLY WANTING claude/channels or another experimental extension here like the apps one. Would love to chat more about this, push notifications would be huge. Will there be a SEP for MCP?
1 reply · 0 reposts · 1 like · 2.7K views
gerred retweeted
Thariq @trq212
We just released Claude Code channels, which allows you to control your Claude Code session through select MCPs, starting with Telegram and Discord. Use this to message Claude Code directly from your phone.
1.1K replies · 1.5K reposts · 17.2K likes · 3.2M views
gerred @devgerred
@_BILLDING_ They can also better avoid shenanigans like quantized KV caching for their own products. So then there's even a quality edge CC can have. I'm like the only inferencing expert that actually puts a product hat on it feels like sometimes.
0 replies · 0 reposts · 0 likes · 60 views
gerred @devgerred
@_BILLDING_ Yep now imagine that cached at mass scale across every instance for a long time, instead of letting that go cold (like any API user's would). Hierarchical caching is expensive but the cost savings are very worth it.
1 reply · 0 reposts · 1 like · 259 views
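A minimal sketch of what hierarchical caching can look like, assuming made-up tier names (GPU memory, host RAM, remote storage): misses fall through to slower tiers, and hits get promoted back toward the fast tier so warm entries stay cheap to reuse.

```python
# Toy tiered KV cache. A real system would track tier capacities and evict
# cold entries downward; here we only show lookup with promotion on hit.

class TieredKVCache:
    def __init__(self):
        # Ordered fastest -> slowest: GPU memory, host RAM, remote storage.
        self.order = ["hbm", "host", "remote"]
        self.tiers = {name: {} for name in self.order}

    def put(self, key, state, tier="remote"):
        self.tiers[tier][key] = state

    def get(self, key):
        for i, tier in enumerate(self.order):
            if key in self.tiers[tier]:
                state = self.tiers[tier].pop(key)
                # Promote one tier toward the fast end on every hit.
                dest = self.order[max(i - 1, 0)]
                self.tiers[dest][key] = state
                return tier, state
        return None, None

cache = TieredKVCache()
cache.put("org42/system-prompt", "kv-state", tier="remote")
hit_tier, _ = cache.get("org42/system-prompt")   # found in "remote", moved to "host"
hit_tier2, _ = cache.get("org42/system-prompt")  # now found in "host", moved to "hbm"
```

The cost argument in the tweet maps directly onto this structure: keeping an entry anywhere in the hierarchy is cheaper than letting it go cold and paying full prefill again.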
gerred @devgerred
@kalomaze "why does this model not generalize to my hello world program written in Piet?"
[image]
0 replies · 0 reposts · 0 likes · 131 views
kalomaze @kalomaze
the kind of person who asks "but does this transfer generalize to Brainfuck?" is simply not being a serious person tbqh
8 replies · 3 reposts · 45 likes · 1.5K views
gerred @devgerred
@FamilyofAutumn from an n=1 perspective this is incredibly accurate
1 reply · 0 reposts · 0 likes · 12 views
autumn @FamilyofAutumn
If you have a nice collection of crazy, odd and unbelievable things that have happened to you, I think it must increase the odds of those things continuing to happen to you
4 replies · 0 reposts · 7 likes · 273 views