Stephen Edginton

2.2K posts

@StephenEdginton

Chief Product and Technology Officer @Dext | ex Founder - Technology | Business | Fitness

England, United Kingdom · Joined January 2012

5.3K Following · 620 Followers

Stephen Edginton@StephenEdginton·4d

@drewhouston @mitchellh We will have fine-tuned smaller models, specialised by language and OS, that will bring this down

Drew Houston@drewhouston·4d

@mitchellh Looking like GLM 5.2 is truly Opus-tier -- to run it fast (>100 tok/sec) you'll need 8x RTX 6000 pros minimum ($125-150k), but achievable now

1.9K

Mitchell Hashimoto@mitchellh·4d

We've gone really quickly from "local models are dogshit" to "local models are good actually" (like, a 12 month window from A to B). I don't think they're actually good ENOUGH yet. We need an Opus 4.5 quality local model. When that happens, I think the world will spill over. Opus 4.5 is/was amazing, and is more than good enough for almost all tasks still as long as you pair with a frontier-level planner/judge. It'll still require a hugely expensive machine to run it, I'm sure, like a $5K or more laptop or mac studio. But, that's going to be pennies compared to the API costs plus all the benefits of guaranteed privacy and so on.

177

202

3.9K

248.1K

Stephen Edginton@StephenEdginton·4d

@charliebcurran So good, SpaceX should buy you too

101

Charles Curran@charliebcurran·4d

I used AI to explain SpaceX to my girlfriend, with fruit.

411

807

7.1K

556.4K

Stephen Edginton@StephenEdginton·5d

@msdev Wow that’s incredible

379

Microsoft Developer@msdev·5d

Meet the Majorana 2, a next-generation topological quantum chip developed with the help of Microsoft Discovery’s agentic AI.

107

712

62.1K

Stephen Edginton@StephenEdginton·13 Jun

@DavidSacks Let’s hope we can get this solved quickly. We know it’s just slowing down the inevitable

David Sacks@DavidSacks·13 Jun

I’ve had a number of conversations with folks inside and outside government about the current situation with Anthropic, and here is what I believe to be true:
- As we know, Anthropic publicly released its Mythos class models earlier this week under the commercial name Fable.
- Fable is Mythos with guardrails. But if those guardrails fail, then you’ve exposed Mythos and its advanced cyber capabilities to people who shouldn’t have them. (Keep in mind that Anthropic itself widely promoted the idea that Mythos was a cyberweapon and needed to be regulated as such. They asked for government regulation of Mythos and championed the guardrails on Fable. If there is a vulnerability, big or small, it is Anthropic’s responsibility to patch.)
- A highly credible trusted partner of both Anthropic and the USG who was testing Fable came forward with a jailbreak of those guardrails. The Admin asked Dario to fix the jailbreak or de-deploy the model. Dario refused.
- In their blog post, Anthropic defended its decision by saying the jailbreak isn’t serious. That is not what the trusted partner and the USG believe; nor is that kind of minimizing language consistent with Anthropic’s brand as the AI safety company. It’s difficult to fathom how they could claim a jailbreak allowing operability of a cyber weapon could be defined as not “serious.”
- In the past, Anthropic has always said that safety must be top priority and taken super seriously. In this case, Anthropic prioritized the continued offering of the consumer model over safety.
- In reaction, the Admin issued the export control. The Admin did this reluctantly. It’s been very surprised that Anthropic hasn’t wanted to cooperate with a reasonable safety request (ie fixing the jailbreak issue). Anthropic’s reaction is very much at odds with their branding and ethos as a safe AI research community.
- The Admin’s hope now is that Anthropic remediates the safety issue, the export control is lifted, and Fable goes back into general release. The Admin wants all of this to happen as soon as possible. It is frankly bewildered that Anthropic hasn’t wanted to comply with safety requests that it previously said were its highest priority.
- Those trying to misdirect and tie this action to the prior DoW/Anthropic issues are wrong. The Admin values Anthropic’s technical capabilities and feels that this issue, while serious, should be easily resolved. The ball is in Anthropic’s court.

2.2K

3.2K

25.5K

7.9M

Stephen Edginton@StephenEdginton·13 Jun

@matthewclifford Agree, we need to control our own destiny: sovereign AI

Matt Clifford@matthewclifford·13 Jun

I do find it extraordinary that current events in AI don’t make the top ~30 stories on the BBC News homepage

119

1.6K

176.2K

Stephen Edginton retweeted

Xenova@xenovacom·13 Jun

I gave Fable 5 one job: write custom WebGPU kernels for Gemma 4 inference. It climbed to 84 tok/s, then hit a wall, insisting further optimization was impossible. Hours later, Anthropic rolled back invisible LLM development safeguards, and it hit 255 tok/s. The next day, access to Fable 5 was suspended globally.

146

370

5.3K

1.1M

Stephen Edginton@StephenEdginton·13 Jun

@AnthropicAI Oh great just when things were starting to get good

Anthropic@AnthropicAI·13 Jun

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…

12.6K

25.8K

88.3K

91.4M

Stephen Edginton retweeted

pabs@pabloberlangab·7 Jun

Introducing Pemba. The first humanoid to climb to 20,000ft. Everest next. More below.

152

117

800

234.1K

Scott Hanselman 🌮@shanselman·6 Jun

VibeOS - Fully Hallucinated Operating System from Microsoft BUILD #msbuild by @stevensanderson (genius) (relax, it's a joke)

383

81.3K

Stephen Edginton@StephenEdginton·6 Jun

@shanselman @stevensanderson Crazy

Stephen Edginton@StephenEdginton·4 Jun

@_catwu I like the hook idea will intent that for our dbt repos

269

cat@_catwu·4 Jun

Excited to share how Anthropic's data team has automated 95% of business analytics queries with Claude. Blog post covers how we approach evals, ablations, and online validation!

ClaudeDevs@ClaudeDevs

How do we automate business analytics with Claude? New blog post covering our best practices for skills, data foundations, and evaluations when building agents to perform data analysis: claude.com/blog/how-anthr…

120

880.3K

Stephen Edginton@StephenEdginton·1 Jun

@thdxr Already did

110

dax@thdxr·1 Jun

if you're setting up a new linux machine pick btrfs instead of ext4 trust me

217

2.5K

284.9K

Stephen Edginton@StephenEdginton·30 May

@thdxr Agree

dax@thdxr·30 May

i have seen enough proof now that using a coding agent is a deep skill it's confusing because the people you see heavily using them produce horrible results but that's because it's a skill! you can get better and the ceiling seems pretty high - this is very exciting to me

321

395

6.5K

379.5K

Stephen Edginton@StephenEdginton·29 May

@MikushRab @Thom_Wolf Maybe this will become the new "are you a human" test: pick which one matches

392

Michael Rabinovich@MikushRab·29 May

Opus 4.8 just dropped and I ran it through our CAD tasks. 4.6 → 4.7 → 4.8 side by side. The results are unexpected!

198

193

3.5K

708.1K

Stephen Edginton@StephenEdginton·28 May

@claudeai @IrenaCronin Still has a tendency to stop and declare defeat

Claude@claudeai·28 May

Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors. Available today at the same price.

3.7K

8.6K

67.4K

15.3M

Stephen Edginton@StephenEdginton·28 May

@ycombinator @IrenaCronin @LightconePod @koomen This was very good. Love the fact you're all leading the way internally and on side projects. Agree with the direction here: all the primitives are being rediscovered and refactored. I’m going to feed the transcript to my company agent now to dream about

651

Y Combinator@ycombinator·27 May

Over the past year, we've been building our own internal agent infrastructure at YC: over 350 tools, self-improving skill loops, and a shared organizational brain that gets smarter overnight.

In this episode of the @LightconePod, we sat down with YC General Partner Pete @koomen to talk about how he led the effort from the ground up. We cover how giving agents unrestricted access to one database was the key unlock, the self-improving skill loops that get smarter overnight, and why he thinks we've arrived at the personal computer moment for AI.

00:39 - YC's AI Stack
02:15 - The Finance Team Problem That Started It All
05:07 - SQL Access Changes Everything
07:20 - One Database to Rule Them All
09:14 - Jevons Paradox
10:07 - Denormalizing for Agents
12:15 - The Single-Player Era of Agents
14:16 - 350 Tools and a Shared Registry
16:24 - Skillify, DRY, and MECE Resolvers
18:23 - The Self-Improving Dream Cycle
20:26 - The Two-Sentence Pitch Skill
23:06 - How Super Intelligence Compounds
25:10 - Recording Everything as a Building Layer
27:10 - The Shared Organizational Brain
29:18 - Trust-Default Culture as a Requirement
30:44 - Raising the Floor for New Employees
32:35 - Horseless Carriages
34:24 - Why Chat Is the Best Interface for Agents
38:50 - Just-in-Time Software
40:49 - Centralizing vs. Decentralizing AI
43:32 - The Personal AI Revolution

118

805

751.7K

Stephen Edginton@StephenEdginton·27 May

@haoailab @ltx_model Wow impressive work

504

Hao AI Lab@haoailab·27 May

🚀Generate a 30-second 1080p video in just 7 seconds! We’re open-sourcing FastVideo Dreamverse: real-time vibe directing for video generation on a single NVIDIA B200 GPU with LTX-2 model @ltx_model Repo: github.com/hao-ai-lab/Fas… Blog: haoailab.com/blogs/fastvide…

115

732

223.6K

Stephen Edginton@StephenEdginton·26 May

@justindross Maybe that’s the missing trick: we will need AI psychologists to help solve and manipulate the agents, and charge them for coaching

327

JD Ross@justindross·26 May

@StephenEdginton finetune it on David Goggins?

3.3K

JD Ross@justindross·25 May

This technology is so weird. Our CTO ran an agent overnight that decided “to sleep” for 4 hours at 2am before starting back on the task again. Hope the computer is less tired now.

1.5K

69.5K

Stephen Edginton@StephenEdginton·22 May

@BritishArmy @NATO Why would we be shouting about this? Makes zero sense

610

British Army 🇬🇧@BritishArmy·22 May

We've been conducting a major @NATO exercise in London this week - and no one above ground suspected a thing. Hundreds of soldiers have been testing how they would run a major NATO command post, hidden deep beneath one of the busiest cities on earth. Read more ⬇️ bit.ly/4dMwHfs

107

210

1.2K

172.3K
