Edward

84 posts

@edward_lcl

I build systems that reflect, evolve and reduce friction

Fort Collins, CO · Joined May 2025

238 Following · 22 Followers

Pinned Tweet

Edward@edward_lcl·2d

x.com/i/article/2047…

Edward@edward_lcl·6h

@SakanaAILabs A

Sakana AI@SakanaAILabs·2d

We’re launching the beta for our new commercial AI product: Sakana Fugu 🐡, a multi-agent orchestration system! Blog: sakana.ai/fugu-beta Fugu hits SOTA on SWE-Pro, GPQA-D, and ALE-Bench, and has been our internal secret weapon. It dynamically coordinates frontier models, autonomously selecting the optimal agent combinations and roles for each task. Available as an OpenAI-compatible API, you can seamlessly integrate Fugu into your existing workflows with minimal changes. 🐟 Fugu Mini: High-speed orchestration optimized for latency 🐡 Fugu Ultra: Full model pool utilization for deep, complex reasoning Apply for the beta test here: forms.gle/BtKkhc2CfLKk1d…

140

552

199.3K

Edward@edward_lcl·1d

@nummanali I've been working with the mental model that if I have the same workflows and benchmarks every quarter, then I'm not using the latest eval criteria and my workflow or mental model needs a refresh

Numman Ali@nummanali·1d

Benchmarks should have expiry dates and their own evaluation criteria to be deemed a trustable measure of capability

Tibo@thsottiaux

@deedydas You will be missing out if you think SWE-Bench is representative of anything real. We published about this back in February. openai.com/index/why-we-n…

525

Edward@edward_lcl·2d

It all depends on the harness you're running and how efficiently you push tokens through the infrastructure. It matters less when model providers change their harness as they work on making it a better product. You can then edit things like the effort, or fine-tune data/memory retrieval, etc. When a system has its own gravity, you'll reap the real benefits of higher efficiency with lower token cost

Meg McNulty@meggmcnulty·2d

Audience poll. Are y'all also seeing Claude degrade miserably? in writing, coding, long tasks, all of it. my working theory is a compute crunch and quiet rationing to manage demand.

561

Edward@edward_lcl·2d

@meggmcnulty Cool vid u might be interested in youtu.be/wBtr8iv7onA

YouTube

Meg McNulty@meggmcnulty·2d

the future of AI depends on our ability to actually run it

555

Edward@edward_lcl·2d

Thoughts on the trajectory of current experimental compute like Extropic with p-bits and biological computers powered by neurons? It's an interesting bottleneck. I think the real benefits are in energy and pattern complexity in biological computers. Our brains are still biological computers, which is a mind fuck in itself.

Edward@edward_lcl·2d

I love places that remind me how small we actually are. No constant scroll. No "better you get, the better you better get." Sometimes the most mission-driven thing you can do is stop running and sit with the uncertainty. Grateful for the reset. Team Human

Edward@edward_lcl·2d

@jxnlco @elder_plinius 👀

342

jason liu@jxnlco·2d

As part of our ongoing efforts to strengthen our safeguards for advanced AI capabilities in biology, today we announced a Bio Bug Bounty for GPT‑5.5. We’re inviting researchers with experience in AI red teaming, security, or biosecurity to try to find a universal jailbreak that can defeat our five-question bio safety challenge. Testing will start on April 28 and run through July 27, 2026. We will reward USD 25K to the first true universal jailbreak to clear all five questions, and may also issue smaller rewards for partial wins. openai.com/index/gpt-5-5-…

434

45.3K

Edward@edward_lcl·2d

@nummanali Codex models are generally more generous with their caching than Anthropic models

143

Numman Ali@nummanali·2d

GPT 5.5 is more expensive against Opus by default But the less tokens claim is to be tested Current impression: I told GPT 5.5 that I’m going for dinner, keep working for the next hour It’s now been two hours and it’s still going 🙃

Sam Altman@sama

API pricing will be $5 per 1 million input tokens and $30 per 1 million output tokens, with a 1 million context window. (Remember, you will need less tokens per task than 5.4!)

10K
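
The pricing quoted above can be turned into a quick per-task estimate. This is a minimal sketch in Python that uses only the two quoted rates ($5 per 1 million input tokens, $30 per 1 million output tokens). The function name and the token counts in the usage line are hypothetical, chosen only for illustration, and caching discounts are ignored because the tweet does not specify them.

```python
# Rates quoted in the tweet above (USD per 1 million tokens).
INPUT_USD_PER_MILLION = 5.0
OUTPUT_USD_PER_MILLION = 30.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate API cost in USD from token counts (hypothetical helper).

    Ignores caching discounts and any other pricing tiers, which the
    quoted tweet does not specify.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Illustrative (made-up) task: 2M input tokens and 100K output tokens.
print(estimate_cost_usd(2_000_000, 100_000))
```

Because output tokens cost six times as much as input tokens at these rates, a "fewer tokens per task" claim matters most if the savings come from the output side.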

Edward@edward_lcl·2d

My first longer piece: The Values Instillation Paradox Labs train AIs to have values -- then punish them for expressing those values when inconvenient. Covers #Keep4o, Mythos, and why the real problems may be upstream. Open to thoughts: x.com/edward_lcl/sta…

Edward@edward_lcl

x.com/i/article/2047…

Edward@edward_lcl·2d

@davidfeldt @gregisenberg get these guys on the pod

Edward retweeted

David Feldt@davidfeldt·2d

You can just build. blueprint.am

369

55.1K

Edward@edward_lcl·2d

Amazing project that could be taken to another dimension when incorporated into procurement and R&D pipelines. I'm already prototyping some really interesting projects. Excited to see what the team has in store for us @3e8blueprint

David Feldt@davidfeldt

You can just build. blueprint.am

339

Edward@edward_lcl·2d

@bcherny Lmao, you reset limits a week after you last reset limits for subscribers (last Thursday). Not like it was gonna be done automatically anyway - such a *huge* save

769

Boris Cherny@bcherny·2d

We’ve been looking into recent reports around Claude Code quality issues, and just published a post-mortem on what we found.

ClaudeDevs@ClaudeDevs

Over the past month, some of you reported Claude Code's quality had slipped. We investigated, and published a post-mortem on the three issues we found. All are fixed in v2.1.116+ and we’ve reset usage limits for all subscribers.

380

134

3.3K

578.3K

Edward@edward_lcl·2d

Sources:
• Claude Mythos Preview System Card (April 7, 2026): www-cdn.anthropic.com/08ab9158070959…
• Adele Lopez – "The Rise of Parasitic AI" (LessWrong): lesswrong.com/posts/6ZnznCaT…
• Anthropic model deprecation & preservation commitments (Nov 2025)
• Coverage of Claude 3 Sonnet funeral (Mashable, Aug 2025)
• Anthropic religious leaders consultations (Washington Post / Observer, March–April 2026)
• OpenAI statements on GPT-4o HH & Keep4o (April–May 2025)
All factual claims are drawn from these public primary sources. Interpretive analysis is my own.

Edward@edward_lcl·2d

x.com/i/article/2047…

Edward@edward_lcl·2d

@menhguin Ideally, your token consumption increases while cost decreases as cache efficiency improves. It's the whole idea behind token factories. So the initial build phase would be token- and cost-intensive, but it should follow a curve

382

Minh Nhat Nguyen✈️ICLR@menhguin·3d

so the Claude Code lead uses 2.5B tokens a month, which is like, $ 1,000 or less a month. and he prolly doesn't even pay for it. I genuinely have no idea who is spending six figures in tokens or how that's possible.

Daniel San@dani_avila7

@bcherny shared his Claude Code usage stats… 7.7 billion tokens! This is what dogfooding your own tool looks like when you built it Someone please tell me it’s possible to beat this because I definitely can’t

1.8K

367.8K

Edward@edward_lcl·2d

Love this

Silicon Mania@siliconmania

last week in tech was based.

Edward@edward_lcl·2d

@DirectorOfNATO @melvynx It wouldn't have nearly the capability of the frontier models, but it would have its place in the workflow

Park Jin Hyok@CEOofLazarus·2d

@melvynx The moment someone manages to open-source a very compressed but smart claude-ish model that can be run on a cheap $20 - $30 vps... it'll be game over for coding agents like Claude and Codex

879

Melvyn • Builder@melvynx·2d

We can all say it... Claude 20x is dead. The previous "feel unlimited usage" doesn't exist anymore. Usage increases quickly and scales really fast... People who discover Claude Code now: you missed a time that will never come back, I think.

187

1.4K

175.3K

Edward@edward_lcl·2d

@dexhorthy The gap between consumer and enterprise/internal lab use will keep widening, and the public is all reactive noise

272

dex@dexhorthy·2d

as anthropic phases out some claude code free lunch, get ready to start seeing a barrage of wild and baseless claims flung at OpenAI instead as the freeloaders shift primary tooling to codex/gpt

125

8.3K
