
Aidarbek Suleimenov
312 posts

@idarbek
building fintech for the world of atoms 🚛 ex Palantir, Meta



ARC-AGI-3 is out now! We've designed the benchmark to evaluate agentic intelligence via interactive reasoning environments. Beating ARC-AGI-3 will be achieved when an AI system matches or exceeds human-level action efficiency on all environments, upon seeing them for the first time. We've done extensive human testing that shows 100% of these environments are solvable by humans, upon first contact, with no prior training and no instructions. Meanwhile, all frontier AI reasoning models do under 1% at this time.



AI (or any human) will never get 100% in ARC-AGI-3. Let me introduce you to the worst game mechanic you can find in a puzzle game: fog of war. At the start, if you go right instead of down, you waste many moves. Your score on this level literally depends on a coin flip!




When @karpathy built MenuGen (karpathy.bearblog.dev/vibe-coding-me…), he said: "Vibe coding menugen was exhilarating and fun escapade as a local demo, but a bit of a painful slog as a deployed, real app. Building a modern app is a bit like assembling IKEA furniture. There are all these services, docs, API keys, configurations, dev/prod deployments, team and security features, rate limits, pricing tiers."
We've all run into this issue when building with agents: you have to scurry off to establish accounts, clicking things in the browser as though it's the antediluvian days of 2023, in order to unblock its superintelligent progress. So we decided to build Stripe Projects to help agents instantly provision services from the CLI. For example, simply run:
$ stripe projects add posthog/analytics
And it'll create a PostHog account, get an API key, and (as needed) set up billing.
Projects is launching today as a developer preview. You can register for access (we'll make it available to everyone soon) at projects.dev. We're also rolling out support for many new providers over the coming weeks. (Get in touch if you'd like to make your service available.)





William on how an early-stage employee takes way more risk than a founder: "If I'm making $400-500K at Google or Meta and go to an early-stage company to get 1% of this company and make $90,000, I've now changed the trajectory of my life. That's a lot of risk. But as a founder, you're not. There's a much higher likelihood that at the next round, regardless of your company, you'll be able to sell some secondary. If it shuts down, you can get employed at a great company, and you have a CEO on your resume. That first employee, they have first employee at a failed company. That's actually not a great resume line item. So we've de-risked the founder, but we haven't de-risked the early-stage employee."


these photos are so much funnier to me knowing that they are not wearing shoes





BREAKING: Thoma Bravo just released their LP meeting slides. The world's largest software PE firm thinks the market has it completely wrong on software right now. Public markets are panic-selling software based on AI fear. Here's what they're seeing:


We're excited to announce our partnership with @PrimeIntellect to allow anyone to train browser agents. General-purpose models aren't optimized for your browser workflows; BrowserEnv lets you train one that is. Check out browserenv.com and train your own custom model in a few hours.

We are excited to welcome @OpenAI to the AIE Expo for the first time as Platinum sponsors for AIE EU! OAI has shipped SO much for AI Engineers this year alone, and this is the best place to catch up:
- Meet the team at the Ask OpenAI lounge (bring your hardest tasks and best questions!)
- Hear keynotes from @steipete and @lopopolo
- Get hands-on with in-depth Codex workshops from @kagigz and @reach_vb!
See you April 8-10 in London! AI Engineers 💙 @OpenAIDevs!



In the early-mid 2010s, if your search history was really good, Google would automatically invite you to foo bar, and solving that would get you an interview at Google. Now, if your agent history is really good on GStack, YC will (soon) automatically fill your YC application, and that would get you into YC. YC is the agent-native YC.





AI has become the justification for every layoff. It's the perfect excuse card, but there is a lot of spin involved. Every layoff is some combo of the following five very different AI stories.
1. Nothing changed, we just realized we have too many people. We are going to blame AI, but we are bullshitting. This is AI as an excuse; it was really sloppy hiring, and we are just blaming AI. (See Block)
2. Growth has gone away, so now we have too many people. This may be because of AI if you are a SaaS company. All the customer love is now going to AI. But it's less AI as a productivity lift, and more about you just building a less ambitious growth company. (See Salesforce and almost every SaaS company)
3. We spent our money on capex to build AI, so now we can't afford as many people. Management may say it's about AI making us productive (4 below), but my gut is a lot of it is about Nvidia getting our money, so now there is none for you. (See Meta and Oracle)
4. We are really using AI the way god intended us to. We don't need as many people. This is the ONLY version of the story that is actually about a productivity increase. It's real, it's happening, but I wonder if it is even the majority of the layoffs. (See some software engineering departments right now)
5. @jasonlk raised a fifth reason that doesn't get talked about enough: we just have the wrong people. Maybe we don't need 20 engineers who all know C++, but rather eight who have strong AI skills. This, I think, should be happening everywhere.
Every time a layoff announcement comes out, I try and mentally categorize per the above.
