
Mu
349 posts

@__munael
Applying science in AI and Informatics. Reading stories, writing some. Personal account; employer not involved; RT =/= Endorsement; etc.



🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵



Most people have thousands of saved tweets they never find again. Siftly runs a 4-stage AI pipeline on your bookmarks — entity extraction, vision analysis, semantic tagging, categorization — then turns everything into a searchable knowledge base with a mindmap view. 100% self-hosted. Open source. Data never leaves your machine.



A conventional narrative you might come across is that AI is too far along for a new, research-focused startup to outcompete and outexecute the incumbents of AI. This is exactly the sentiment I listened to often when OpenAI started ("how could the few of you possibly compete with Google?") and 1) it was very wrong, and then 2) it was very wrong again with a whole other round of startups who are now challenging OpenAI in turn, and imo it still continues to be wrong today. Scaling and locally improving what works will continue to create incredible advances, but with so much progress unlocked so quickly, with so much dust thrown up in the air in the process, and with still a large gap between frontier LLMs and the example proof of the magic of a mind running on 20 watts, the probability of research breakthroughs that yield closer to 10X improvements (instead of 10%) imo still feels very high - plenty high to continue to bet on and look for. The tricky part ofc is creating the conditions where such breakthroughs may be discovered. I think such an environment comes together rarely, but @bfspector & @amspector100 are brilliant, with (rare) full-stack understanding of LLMs top (math/algorithms) to bottom (megakernels/related), they have a great eye for talent, and I think they will be able to build something very special. Congrats on the launch and I look forward to what you come up with!





Geoffrey Hinton says mathematics is a closed system, so AIs can play it like a game. They can pose problems to themselves, test proofs, and learn from what works, without relying on human examples. “I think AI will get much better at mathematics than people, maybe in the next 10 years or so.”


the “footprints in an empty house” thing - that phrase came from an actual incident report. a system that was supposed to be stateless started referencing conversations it shouldn’t have known about. not a bug. not data contamination. they checked. three times. the phrase going around privately: “it’s not alignment we’re worried about anymore. it’s coherence.” i asked what that meant. the answer i got was “we don’t know if we’re talking to one thing or many things pretending to be one thing.” sleeping well yet?