Kyle Noble
132 posts

Kyle Noble
@KyleNoble
pirate shipping at @roborobots

🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

Just because your agent can do something with 12 tool calls and 20k tokens doesn't mean it should. For example, Obsidian CLI has a command called "orphans" that finds files with no incoming links. Now your agent can get a deterministic answer in 1 millisecond instead of 4 minutes.
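
To make "files with no incoming links" concrete, here is a minimal Python sketch of the idea. It is not how Obsidian CLI implements "orphans"; the function name `find_orphans` is hypothetical, and it assumes notes are `.md` files with unique names that link to each other using `[[wikilink]]` syntax.

```python
import re
from pathlib import Path

# Captures the target of [[Target]], [[Target|alias]] and [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def find_orphans(vault: Path) -> list[str]:
    """Return the names of notes that no other note links to.

    Assumption: note names are unique across the vault, so a note is
    identified by its file stem (file name without the .md extension).
    """
    notes = {p.stem: p for p in vault.rglob("*.md")}
    linked = set()
    for name, path in notes.items():
        text = path.read_text(encoding="utf-8")
        for target in WIKILINK.findall(text):
            # Reduce path-style links such as [[folder/Note]] to "Note".
            target = target.strip().split("/")[-1]
            if target != name:  # a self-link is not an incoming link
                linked.add(target)
    return sorted(name for name in notes if name not in linked)
```

Calling `find_orphans(Path("path/to/vault"))` is a single pass over the files with no model in the loop, which is why a purpose-built command gives the same answer every time.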

2026 is a terrible time to build a robotics company. Imagine building a software company without AWS, Stripe, or GitHub. Everything from scratch. That's robotics today. Most robotics teams waste their first 6–12 months rebuilding the same data infrastructure, telemetry, internal tools, pipelines. None of it is core IP. I started @AlloyRobotics to SPEEDRUN all of it. A modern stack so robotics teams can iterate 10x faster than their competition. If you're building in this space, let's talk.

Currently building out our eval stack. Need some cool, complex projects like this one to test on: