Christophe Henner
36.1K posts

@schiste
Currently building in the shadows, all day long - Former chair of @Wikimedia & @Wikimedia_Fr - he/him
Paris, joined March 2007
1.5K Following, 2.9K Followers

"Autocompact is thrashing: the context refilled to the limit within 3 turns of the previous compact, 3 times in a row. A file being read or a tool output is likely too large for the context window. Try reading in smaller chunks, or use /clear to start fresh." @ClaudeDevs

@Google @antigravity Open source it, that would be a great showcase :)

We asked our agents to build a working operating system from scratch using @Antigravity 2.0 and Gemini 3.5 Flash.
It took:
⏱️ 12 hours
🤖 93 parallel sub-agents
🔄 15k+ model requests
🧠 2.6B tokens processed
💸 Less than $1K in API credits
To build a functioning OS from scratch.
#GoogleIO

@jun_song Which open-source AI labs? Ai2 or the Swiss AI Initiative?

@asmah2107 Anyone who cracks mathematical code evaluation will be redefining what code is.

Hot take: AI code generation doesn't actually save you that much time.
If you have to painstakingly review and debug every line of AI-generated code, you're just trading writing time for reading time.
The real holy grail? Verification.
When AI can mathematically prove its code is 100% correct, you can confidently deploy it without ever looking at the source file.

@Kevouuz On the other hand, you can parallelize a lot more, so the cost (in time) of the constrained environment is "compensated" by the parallel tracks.
But the price is a fried brain. Maybe that's the secret of the four-day week!

@Kevouuz Unless you do permanent refactoring with insane pre-commit and pre-push gates.
That, for me, is where the productivity gap between a POC and a real project comes from. In a POC you get x20 or more; in a real project, maybe x5, and with a huge amount of engineering work.

The guy doesn't look at the code the LLM produces; I can't even imagine the state of the mess in a few months
Louis van Proosdij @LvP
@mikiane You shouldn't have to look at the code. In Codex I systematically plan, discuss, and then give it the plan as a /goal. It is required to maintain docs, keep a history of changes, create tests, and ship only when it's bulletproof

12 months ago, Gemini had 7% of the AI chatbot market.
Today: nearly 30%. (Source: SimilarWeb, April 2026)
Meanwhile, ChatGPT has gone from 87% to 64%.
No monopoly holds in AI. Not Google, not OpenAI, not anyone.
The only constant: whoever stops innovating gets eaten within 12 months.
And that is good news for everyone.

@YashHustle_22 Both, Claude on « horizontal » tasks and Codex on « vertical » tasks.
Claude is much more holistic but loops way faster when it encounters a problem.

@zuess05 AI is drastically changing software development, and what was expensive (code generation) has become cheap. But engineering is still there and is actually even more important.
As code cannot be reviewed at that pace, engineering needs to be about crafting all the checks and balances.

@zuess05 Small tooling with only subscriptions, maybe.
A full-fledged SaaS with RBAC, tenant isolation, etc.? No. I've spent the equivalent of 2.5 months on one and I'm just reaching my v1.
And that is what a lot of people (both builders and buyers) are starting to discover.

Serious question.
For 20 years, a "Software Engineer" was someone who spent thousands of hours mastering complex syntax, logic, and architecture.
Now, a 19-year-old can vibe-code a production-ready SaaS in a weekend using plain English and a $20 Claude subscription.
What does the title "Software Engineer" even mean right now?

@Kirsten3531 Yes, in math we have more and more empirical proof (Lean helps a lot) that LLMs can generate content that is not directly in their training data.
And the "average answer" and "next token" framings are the most harmful simplifications of LLMs that people have made in the last 4 years.

@Console_buche I have Codex and Claude 200; of the two, Claude is the one that hits rate limits fastest. So if you aren't maxing out Claude, you won't max out Codex.

@paulabartabajo_ The addition of a 19th-century books dataset is the starting point, but what most probably made it so prominent is humans doing training evaluations.

@paulabartabajo_ I investigated open.substack.com/pub/schiste/p/… and it comes from a combination of things

@Montfor34889695 @ALeaument Firefox is one of the most tested codebases; whatever was found there, you'll find things 100 times worse in every poorly maintained piece of proprietary software.

@ALeaument Just for your information, Anthropic's famous Mythos LLM has shown that this free software is riddled with security flaws…

@hsuehwei2000 @madiator Codex can still do a live fetch in the CLI, and you can even force it. Same for Claude Code. In the web versions it's much harder. They are most probably also using indexes to complement their own crawling.

@miramurati I never thought latency nerdiness from cloud gaming computing would make its way into AI research, but I'm longing for it. At the very least, thanks for that very fun moment reading your paper :)

@miramurati Not sure where they are at nowadays, but detecting idle frames (very low repaint ratio) helps a lot.
The paper paints a very "fun" way of making interactions more interactive. I can feel all the model sizing, latency and hardware challenges baked into the approach!

Today we're sharing our work on interaction models. A new class of model trained from scratch to handle real-time interaction natively, instead of gluing it onto a turn-based one.
youtu.be/A12AVongNN4
