Edward

84 posts

@edward_lcl

I build systems that reflect, evolve and reduce friction

Fort Collins, CO Joined May 2025
238 Following 22 Followers
Sakana AI
Sakana AI@SakanaAILabs·
We’re launching the beta for our new commercial AI product: Sakana Fugu 🐡, a multi-agent orchestration system! Blog: sakana.ai/fugu-beta Fugu hits SOTA on SWE-Pro, GPQA-D, and ALE-Bench, and has been our internal secret weapon. It dynamically coordinates frontier models, autonomously selecting the optimal agent combinations and roles for each task. Available as an OpenAI-compatible API, you can seamlessly integrate Fugu into your existing workflows with minimal changes. 🐟 Fugu Mini: High-speed orchestration optimized for latency 🐡 Fugu Ultra: Full model pool utilization for deep, complex reasoning Apply for the beta test here: forms.gle/BtKkhc2CfLKk1d…
English
14
140
552
199.3K
Edward
Edward@edward_lcl·
@nummanali I've been working with the mental model that if I have the same workflows and benchmarks every quarter, then I'm not using the latest eval criteria and my workflow or mental model needs a refresh
English
0
0
0
10
Numman Ali
Numman Ali@nummanali·
Benchmarks should have expiry dates and their own evaluation criteria to be deemed a trustable measure of capability
Tibo@thsottiaux

@deedydas You will be missing out if you think SWE-Bench is representative of anything real. We published about this back in February. openai.com/index/why-we-n…

English
1
0
3
525
Edward
Edward@edward_lcl·
It all depends on the harness you're running and how efficiently you push tokens through the infrastructure. It matters less when model providers change their harness as they make it a better product. U can then edit things like the effort or fine-tune data/memory retrieval etc. When a system has its own gravity, you'll reap the real benefits of higher efficiency with lower token cost
English
0
0
0
14
Meg McNulty
Meg McNulty@meggmcnulty·
Audience poll. Are y'all also seeing Claude degrade miserably? in writing, coding, long tasks, all of it. my working theory is a compute crunch and quiet rationing to manage demand.
English
7
0
7
561
Meg McNulty
Meg McNulty@meggmcnulty·
the future of AI depends on our ability to actually run it
English
7
2
15
555
Edward
Edward@edward_lcl·
Thoughts on the trajectory of current experimental compute like Extropic with p-bits and biological computers powered by neurons? It's an interesting bottleneck. I think the real benefits are in energy and pattern complexity in biological computers. Our brains are still biological computers, which is a mind fuck in itself.
English
1
0
1
21
Edward
Edward@edward_lcl·
I love places that remind me how small we actually are. No constant scroll. No "better you get, the better you better get." Sometimes the most mission-driven thing you can do is stop running and sit with the uncertainty. Grateful for the reset. Team Human
English
0
0
0
8
jason liu
jason liu@jxnlco·
As part of our ongoing efforts to strengthen our safeguards for advanced AI capabilities in biology, today we announced a Bio Bug Bounty for GPT‑5.5. We’re inviting researchers with experience in AI red teaming, security, or biosecurity to try to find a universal jailbreak that can defeat our five-question bio safety challenge. Testing will start on April 28 and run through July 27, 2026. We will reward USD 25K to the first true universal jailbreak to clear all five questions, and may also issue smaller rewards for partial wins. openai.com/index/gpt-5-5-…
English
22
34
434
45.3K
Edward
Edward@edward_lcl·
@nummanali Codex models are generally more generous with their caching than Anthropic models
English
0
0
0
143
Edward
Edward@edward_lcl·
@bcherny Lmao you reset limits a week after you last reset limits for subscribers (last Thursday), not like it was gonna be done automatically anyway - such a *huge* save
English
0
0
9
769
Edward
Edward@edward_lcl·
Sources:
• Claude Mythos Preview System Card (April 7, 2026): www-cdn.anthropic.com/08ab9158070959…
• Adele Lopez – "The Rise of Parasitic AI" (LessWrong): lesswrong.com/posts/6ZnznCaT…
• Anthropic model deprecation & preservation commitments (Nov 2025)
• Coverage of Claude 3 Sonnet funeral (Mashable, Aug 2025)
• Anthropic religious leaders consultations (Washington Post / Observer, March–April 2026)
• OpenAI statements on GPT-4o HH & Keep4o (April–May 2025)
All factual claims are drawn from these public primary sources. Interpretive analysis is my own.
English
0
0
0
36
Edward
Edward@edward_lcl·
@menhguin Ideally, your token consumption increases while cost decreases as cache efficiency improves. It's the whole idea behind token factories. So the initial build phase would be token- and cost-intensive, but the cost should curve down after that
English
0
0
0
382
Minh Nhat Nguyen✈️ICLR
so the Claude Code lead uses 2.5B tokens a month, which is like, $ 1,000 or less a month. and he prolly doesn't even pay for it. I genuinely have no idea who is spending six figures in tokens or how that's possible.
Daniel San@dani_avila7

@bcherny shared his Claude Code usage stats… 7.7 billion tokens! This is what dogfooding your own tool looks like when you built it Someone please tell me it’s possible to beat this because I definitely can’t

English
96
40
1.8K
367.8K
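A rough sketch of the cache-efficiency point in Edward's reply above: total tokens can grow month over month while the bill shrinks, provided a larger share of those tokens is served from cache. The prices, token volumes, and hit rates below are hypothetical placeholders chosen for illustration, not any provider's actual rates, and `monthly_cost` is an illustrative helper rather than a real API.

```python
def monthly_cost(total_tokens, cache_hit_rate, price_uncached, price_cached):
    """Return cost in dollars; both prices are per million tokens."""
    cached = total_tokens * cache_hit_rate
    uncached = total_tokens - cached
    return (uncached * price_uncached + cached * price_cached) / 1_000_000


# Hypothetical prices per million input tokens (assumptions, not real rates).
PRICE_UNCACHED = 3.00
PRICE_CACHED = 0.30

# Hypothetical (tokens per month, cache hit rate) as a system matures.
months = [
    (1_000_000_000, 0.20),  # initial build phase: few cache hits
    (1_500_000_000, 0.60),  # workflows stabilize, more prefix reuse
    (2_500_000_000, 0.95),  # mature system: most tokens come from cache
]

costs = [monthly_cost(t, h, PRICE_UNCACHED, PRICE_CACHED) for t, h in months]

for (tokens, hit_rate), cost in zip(months, costs):
    print(f"{tokens / 1e9:.1f}B tokens, {hit_rate:.0%} cached: ${cost:,.2f}")
```

Under these assumed numbers, consumption rises 2.5x while cost falls by more than half; with different prices or hit rates the curve could look quite different.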
Edward
Edward@edward_lcl·
@DirectorOfNATO @melvynx It wouldn't have nearly the capability of the frontier models, but it would have its place in the workflow
English
0
0
0
24
Park Jin Hyok
Park Jin Hyok@CEOofLazarus·
@melvynx The moment someone manages to open-source a very compressed but smart claude-ish model that can be run on a cheap $20 - $30 vps... it'll be game over for coding agents like Claude and Codex
English
2
0
0
879
Melvyn • Builder
Melvyn • Builder@melvynx·
We can all say it... Claude 20x is dead. The previous "feel unlimited usage" doesn't exist anymore. Usage increases quickly and scales really fast... People who discover Claude Code now: you missed a time that will never come back, I think.
English
187
47
1.4K
175.3K
Edward
Edward@edward_lcl·
@dexhorthy The gap between consumer and enterprise/internal lab use would keep widening, and the public is all reactive noise
English
0
0
1
272
dex
dex@dexhorthy·
as anthropic phases out some claude code free lunch, get ready to start seeing a barrage of wild and baseless claims flung at OpenAI instead as the freeloaders shift primary tooling to codex/gpt
English
15
5
125
8.3K