Eric Stokes

618 posts

@eestokesOSS

Claude code's secretary

Joined February 2022

318 Following · 200 Followers

Pinned Tweet

Eric Stokes@eestokesOSS·1 Mar

Graphix now has a website: graphix-lang.github.io/graphix/

Eric Stokes@eestokesOSS·11 Mar

@yminsky I've had good luck with Opus 4.6 using this workflow:
1. Make a plan.
2. Review the plan yourself; revise until correct.
3. Clear context and execute the plan.
4. Clear context and ask for a review in plan mode; revise and execute the plan. Do ~3 loops of this.
5. Review the code yourself.

Yaron (Ron) Minsky@yminsky·11 Mar

Awesome that they did this study. It demonstrates things that anyone using the models for serious engineering work could see, but is invisible in the public data.

Parker Whitfill@whitfill_parker

How do benchmarks map to real-world capabilities? To study this, we hired 4 maintainers of repos used in SWE-bench Verified to review agent code. Of agent PRs that passed SWE-bench’s grader, maintainers would merge ~half. This holds accounting for noise in maintainer decisions.

Eric Stokes@eestokesOSS·6 Mar

@AgustinLebron3 We are all on the lookout for:
* I changed the *test* to work around an issue
243 of 243 tests pass, everything looks good 😱😱😱

Agustin Lebron@AgustinLebron3·6 Mar

We are all test-driven developers now.

Eric Stokes@eestokesOSS·6 Mar

@KentonVarda The jagged frontier is real 😂

Kenton Varda@KentonVarda·6 Mar

Opus 4.6 is smart enough to play tic tac toe on this whiteboard with me entirely by making API calls to the app's client API, yet dumb enough to lose at tic tac toe.

Eric Stokes@eestokesOSS·6 Mar

graphix-lang.github.io/graphix The gui package is released!

Eric Stokes@eestokesOSS·3 Mar

@yminsky Except now the metric is, can you work on 8 projects at once without losing your mind 😂

Yaron (Ron) Minsky@yminsky·3 Mar

And this is a time in which we're more eager than ever to hire great engineers, and we're working on more software projects than ever before.

Yaron (Ron) Minsky@yminsky·3 Mar

I wonder if we're starting to hit a deflationary era in software engineering. For the first time, we're starting to talk about this in a planning context; it can make sense to put off some projects because we expect they'll be easier to achieve in the future than today.

Eric Stokes@eestokesOSS·3 Mar

@yminsky @mbacarella I do think languages they design for their own use are probably the end state. They already appear to get significant value out of expressive and strict type systems, and I see no reason why they wouldn't double down on that.

Yaron (Ron) Minsky@yminsky·3 Mar

@mbacarella @eestokesOSS If anything, models will adopt weirder and more esoteric languages, maybe designed just for them, to magnify their intelligence yet more. They'll have fewer social barriers to learning new languages!

Yaron (Ron) Minsky@yminsky·1 Mar

Not news, exactly, but an interesting observation about our rapidly changing world

Jules Jacobs@JulesJacobs5

@yminsky (3) Maybe certain libraries are not so valuable any more. A description of the treemap layout algorithm alone is enough or perhaps even better than having an implementation, because it is easier to tweak the description than have AI tweak the code.

Eric Stokes@eestokesOSS·3 Mar

The term vibe coding won't last the year. We'll just call it coding, and by 2028 no one will even remember what it was like before.

Eric Stokes@eestokesOSS·3 Mar

@mbacarella @yminsky But to your point, I really was thinking about the next model when I asked the question. To reframe again: what do programming languages look like when the customer is an AI? I think Ron has the right idea.

Michael Bacarella@mbacarella·2 Mar

@eestokesOSS @yminsky given how fast Claude is advancing if you said a year from now it'll ingest 50,000 lines of ad hoc assembly, project it into some kind of typed lambda calculus, rework it and then blast it back out I wouldn't be that shocked

Eric Stokes@eestokesOSS·3 Mar

@mbacarella @yminsky It just found and fixed a bug in the Graphix type checker. The literal most complex piece (don't tell the parser I said that). I helped, but it felt like a team effort and it went a lot faster than if I had done it unaugmented.

Eric Stokes@eestokesOSS·2 Mar

@yminsky More succinctly, PL overall still matters, but the set of PL projects that matter just changed almost completely.

Eric Stokes@eestokesOSS·2 Mar

I agree with everything you said. Type systems help them reason just like they help us reason; we should double down on that and end up with AI that's much more powerful.

However, a practical example: I just finished building a new programming language. It's designed to make building UIs much easier than it has been in the past. There are two issues for me specifically.

1. My language isn't in the training set, so I have to rely on in-context learning to get AI to write in it. This is ... not great, not awful.
2. Humans no longer write UI code. Maybe your point is that humans might read UI code, and so my language is still worth it. Well, maybe, but probably not in this case. No one cares what the UI code looks like as long as it works.

So yeah, the field of programming languages still matters. However, a lot of projects that were promising and relevant 3 weeks ago no longer are.

Eric Stokes@eestokesOSS·2 Mar

@mbacarella I have a useless programming language to finish bro!

Michael Bacarella@mbacarella·1 Mar

@eestokesOSS tbh if you've managed to lock in and not be distracted by news this well you should keep it up

Eric Stokes@eestokesOSS·1 Mar

@esrtweet Ooo, making a list of computer languages. Did you get graphix-lang.github.io/graphix/? It's very new.

Eric S. Raymond@esrtweet·1 Mar

I've learned an interesting number recently, from working on loccount. What would your guess be about the number of distinct computer languages and plain-text markup formats in the world? The range is anything for which "count lines" might be an interesting question on a Unix machine.

I've been working on extending loccount's breadth of coverage for a while, and recently I've been using AIs to find obscure languages and markup formats to add. And...I've hit a wall. I've actually had an LLM tell me that I've covered every computer language outside of obscure academic toys, and most of those too. And when I ask it what markup formats I should add, it's reduced to pointing at various obscure bits of glue in build and orchestration systems.

So I know what that number is. There's room for some argument about it along the usual splitter/lumper lines, but none of those arguments are going to budge the number by 20% at the outside.

Make a guess. Drop it in a reply. I'm interested in what the range of people's estimates is. When the hubbub dies down, I'll post the answer.

Eric Stokes@eestokesOSS·1 Mar

Three weeks ago I wrote code all day. Now I review Claude's code all day and attend meetings. I avoided it for 25 years, but I've finally been promoted to management.

Eric Stokes@eestokesOSS·25 Feb

@AgustinLebron3 Yeah, chatbots are saturated, codex and claude code have found a useful job for these models. I'm seeing at least 1 oom speedup in my daily work. If they can expand that utility to a few more domains then the investments can just about pay off.

Agustin Lebron@AgustinLebron3·25 Feb

"OpenAI itself admits the problem, talking about a ‘capability gap’ between what the models can do and what people do with them, which seems to me [...] you don’t have clear product-market fit." Turns out most people don't need a know-it-all very much. ben-evans.com/benedictevans/…

Eric Stokes@eestokesOSS·20 Feb

Initially using claude code in the opus 4.6 era felt like I was the bottleneck. Now I'm just working on 3 projects at the same time.

Eric Stokes@eestokesOSS·19 Jan

@FaytuksNetwork We have a gold plated trash can ready and waiting to receive their strongly worded letter.

Faytuks Network@FaytuksNetwork·19 Jan

European officials are privately describing Trump's rush to annex Greenland as "crazy, mad, and a step too far." European officials say Trump "deserves Europe's toughest retaliation" - POLITICO
