
kerimkaya
1.6K posts

@kerimrocks
Beauty and craft in the coming abundance of software. Co-creating Kai, the Continuous Codebase Engineer @driaforall


Introducing GPT-5.5. A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done. Now available in ChatGPT and Codex.

I just fell into a parallel universe where modern slavery is marketed in "utopia" packaging. We are facing a mindset that shelves work-life balance entirely and proudly presents 18-hour workdays, seven days a week, as "dedication." Forgetting your private life, your health, and your family to waste your life on someone else's dream is not visionary; it is plain corporate shackles. Run from toxic work cultures like this, which legitimize exploiting your labor with the fairy tale of "we are changing the world with great passion," and do not even look back.

wtf

New research result: we use Claude to make fully autonomous progress on scalable oversight research, as measured by performance gap recovered (PGR). Claude iterates on a number of different techniques and ends up significantly outperforming human researchers for $18k in credits.

New post: Systems Engineering. Coding agents have lowered the barrier to writing code, but they haven't lowered the requirements of production software. Agentic software is just software. The agent replaces business logic. Everything else is the same. ashpreetbedi.com/articles/syste…




It's only a matter of time before only the model creators have access to the most powerful models. The rest get access to smaller, distilled versions, or access the models through first-party apps and services that don't provide direct access to the token path. The investment needs for training are too high, and distillation too effective, to warrant any other future.

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing

Over the past few months, we have integrated @googledeepmind's AlphaEvolve into our computational lithography. Enabled by AlphaEvolve's algorithmic leaps, we are now printing complex patterns in a single exposure that would otherwise require multiple. substrate.com/information-to…



Claude Code and Cursor... but they improve themselves. Autonomously. Meta Harness is wild. Had to make a video about it...


