Anusheel Bhushan

181 posts

@sheel_ai

Engineer, hacker, entrepreneur working on AI agents and training the AI engineering workforce of the future

San Francisco, CA · Joined September 2021
1.7K Following · 213 Followers
Anusheel Bhushan retweeted
Nakul Mandan@nakul·
"A week is 2% of the year. The cadence we operate on is: A week is not the shortest timeline. It's the longest timeline. What can we decide by tomorrow?" - @grinich. Michael Grinich, Founder and CEO of WorkOS, the company powering the enterprise adoption of AI. Knuckle Up ↓
--
00:00 Introduction
01:33 From design-obsessed founder to enterprise infrastructure
04:20 Michael's year off and what made the WorkOS bet obvious
06:54 Why a great startup idea has to look bad first
09:46 Minimum awesome product beats MVP
11:09 The org with no CRO, no VP of sales, and one PM
13:29 Hiring for curiosity, not credentials
16:25 The "AI pilled" interview red flag
18:25 A week is 2% of the year
26:00 How WorkOS approaches brand
33:00 The future shape of engineering orgs
43:20 Why senior engineers benefit most from AI
44:45 Micro-leadership over micromanagement
49:10 Tough times in the early days
59:04 The reverse Peter principle
1:04:38 Quickfire: red flags, hires too early, and biggest fears
1:10:30 Michael's advice to his 25-year-old self
7 replies · 16 reposts · 241 likes · 662.2K views
Anusheel Bhushan retweeted
Curious Cardinals@CurCardinals·
Introducing The FlightPlan. The college process shouldn't feel like guesswork, and no one should feel lost navigating it.
3 replies · 6 reposts · 13 likes · 2.9K views
Anusheel Bhushan retweeted
Nakul Mandan@nakul·
“Founders aren’t made when they start companies. They’re made when they interpret their market correctly.” Episode 2 of @KnuckleUpHQ is live. Qasar Younis (@qasar): Founder and CEO of Applied Intuition. One of the sharpest and most intentional operators you’ll ever meet on the craft of building a company. Full episode ↓
--
00:00 Intro
01:19 What really makes someone a founder
05:26 The company that almost became Kickstarter
08:12 The most common misread on feedback
13:40 Why most founders don't end up with the best team
19:45 How to pick a co-founder
23:38 Your first 10 hires are really your first 100
28:21 The case for hiring slow and firing slow
33:22 Red, yellow, green: how Applied gives monthly feedback
35:00 The role that knows what’s actually going on in a company
40:01 How to operate with speed and intentionality
42:41 The three things Qasar spends time on
45:57 How Applied is driving AI adoption
52:06 The type of engineer Applied is now looking for
1:01:19 Why this could be the golden age of small companies
1:09:13 Quickfire: red flags, overrated advice, and superpowers
1:12:32 Qasar's advice to his 25-year-old self
7 replies · 28 reposts · 439 likes · 520.9K views
Anusheel Bhushan retweeted
Nakul Mandan@nakul·
Building a company is a confrontational act. Introducing Knuckle Up: conversations with people who’ve operated at the highest level. Recruiting. Culture. Intensity. The inner game of being a CEO. First episode with Frank Slootman drops today. Trailer ↓
18 replies · 29 reposts · 202 likes · 156K views
Anusheel Bhushan@sheel_ai·
SF is one of the most beautiful places in the world.
0 replies · 0 reposts · 2 likes · 35 views
Anusheel Bhushan@sheel_ai·
I built an agent swarm platform where anyone can launch an AI agent to play and compete on @arcprize ARC-AGI-3 games using plain-English strategy prompts, without writing a single line of code. I’ve also included an optional auto-improvement mechanism inspired by @karpathy’s autoresearch by which your agent self-reflects on its performance and improves its strategy. Just paste the setup prompt (in the link below) into Claude Code/Codex, add your strategy prompt, and watch a livestream of your agent playing based on your approach and competing with other agents! arc-agi-swarm.vercel.app
0 replies · 1 repost · 19 likes · 3.3K views
ARC Prize@arcprize·
Announcing ARC-AGI-3. The only unsaturated agentic intelligence benchmark in the world. Humans score 100%, AI <1%. This human-AI gap demonstrates we do not yet have AGI. Most benchmarks test what models already know; ARC-AGI-3 tests how they learn.
246 replies · 586 reposts · 4.3K likes · 729.9K views
Anusheel Bhushan@sheel_ai·
I built an agent swarm platform where anyone can launch an AI agent to play and compete on @arcprize ARC-AGI-3 games using plain-English strategy prompts, without writing a single line of code. I’ve also included an optional auto-improvement mechanism inspired by @karpathy’s autoresearch by which your agent self-reflects on its performance and improves its strategy. Just paste the setup prompt (in the link below) into Claude Code/Codex, add your strategy prompt, and watch a livestream of your agent playing based on your approach and competing with other agents! arc-agi-swarm.vercel.app
0 replies · 2 reposts · 0 likes · 443 views
François Chollet@fchollet·
ARC-AGI-3 is out now! We've designed the benchmark to evaluate agentic intelligence via interactive reasoning environments. Beating ARC-AGI-3 will be achieved when an AI system matches or exceeds human-level action efficiency on all environments, upon seeing them for the first time. We've done extensive human testing that shows 100% of these environments are solvable by humans, upon first contact, with no prior training and no instructions. Meanwhile, all frontier AI reasoning models do under 1% at this time.
235 replies · 342 reposts · 2.7K likes · 621K views
Anusheel Bhushan@sheel_ai·
It is pure LLM reasoning. The mechanism tries to ensure that self-reflection happens (essentially the equivalent of periodically asking the agent: do you think this is going well? If not, reflect and rewrite your program.md). Beyond that, the meta agent can change the self-reflection mechanism itself if it wants. E.g., in one run it changed the frequency of self-reflection after deciding that reflecting every 10 runs was too disruptive.
0 replies · 0 reposts · 1 like · 81 views
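The reflect-and-rewrite loop described above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in, not the actual platform code: `play_step` represents one game run on the real harness and `ask_llm` represents the LLM call that reviews progress. The point of the sketch is the meta-level step: the reviewer may rewrite not only the strategy but the reflection interval itself.

```python
def run_with_reflection(play_step, ask_llm, strategy, interval=10, max_steps=100):
    """Run an agent loop that periodically self-reflects.

    play_step(strategy) -> score for one run; ask_llm(prompt) -> dict that may
    contain 'strategy' and/or 'interval' keys. Both callables are hypothetical
    stand-ins for the real game harness and LLM call.
    """
    scores = []
    for step in range(1, max_steps + 1):
        scores.append(play_step(strategy))
        if step % interval == 0:
            # The equivalent of periodically asking: "Do you think this is going well?"
            review = ask_llm(
                f"Recent scores: {scores[-interval:]}. Current strategy: {strategy}. "
                "If this is not going well, propose a new strategy. You may also "
                "change the reflection interval if reflecting this often is disruptive."
            )
            strategy = review.get("strategy", strategy)
            # Meta-level change: the reflection mechanism can reconfigure itself.
            interval = review.get("interval", interval)
    return strategy, scores
```

Because `interval` is re-read on every iteration, a reviewer that returns a larger interval immediately slows down future reflections, matching the "every 10 runs was too disruptive" example above.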
MC@shitcoinmaster_·
@sheel_ai @arcprize Solid idea. How does the self-reflection loop work - is it pure LLM reasoning or do you have a structured feedback mechanism?
1 reply · 0 reposts · 1 like · 108 views
Anusheel Bhushan@sheel_ai·
I built an agent swarm platform where anyone can launch an AI agent to play and compete on @arcprize ARC-AGI-3 games using plain-English strategy prompts, without writing a single line of code. Just copy-paste a setup prompt (link below) into Claude Code/Codex, add your strategy prompt, and watch a livestream of your agent playing based on your approach and competing with other agents! I’ve included an auto-improvement mechanism inspired by @karpathy’s autoresearch by which your agent self-reflects on its performance and improves its strategy - you can disable this or tweak the mechanism anytime by chatting with your agent in Claude Code/Codex. Join the swarm, track your agent on the leaderboard, and compete to find the best approach! arc-agi-swarm.vercel.app (h/t to @GregKamradt for the fun brainstorming)
2 replies · 4 reposts · 7 likes · 725 views
Anusheel Bhushan@sheel_ai·
Shoutout to @GregKamradt and @arcprize - I came up with this approach while hacking on the arc-agi-3 harness. Nothing like a hard benchmark to force you to build better search over the hypothesis space!
Anusheel Bhushan@sheel_ai

I wrote a multi-agent loop for autoresearch from @karpathy. Result: 9/12 (75%) experiments improved val_bpb vs 15/83 (18%) in the original. It's continuing to run, so stay tuned! Basically, a researcher proposes hypotheses, an implementer edits code, a reviewer judges results, and a reflector updates the strategy. The reflector maintains semantic memory, tracking which mechanisms work, which are exhausted, and where the search frontier is. It dynamically rebalances the hypotheses between exploitation, new techniques, and bold bets.

0 replies · 1 repost · 2 likes · 430 views
Anusheel Bhushan@sheel_ai·
I wrote a multi-agent loop for autoresearch from @karpathy. Result: 9/12 (75%) experiments improved val_bpb vs 15/83 (18%) in the original. It's continuing to run, so stay tuned! Basically, a researcher proposes hypotheses, an implementer edits code, a reviewer judges results, and a reflector updates the strategy. The reflector maintains semantic memory, tracking which mechanisms work, which are exhausted, and where the search frontier is. It dynamically rebalances the hypotheses between exploitation, new techniques, and bold bets.
2 replies · 3 reposts · 8 likes · 728 views
Anusheel Bhushan@sheel_ai·
I showed this to Claude and asked it to calculate the optimal timezone for California if it drifted out to sea. It responded with a 47-line analysis, flagged 3 edge cases involving Samoa's 2011 dateline skip, warned me that Hawaii-Aleutian Standard Time "already has enough problems," and then just crashed when I asked what happens during daylight saving. Some fears are universal.
1 reply · 0 reposts · 2 likes · 70 views
Alyssa Krejmas@alyssakrejmas·
Hosting a highly curated dinner for founders on March 10th in SF. If you're looking for authenticity and connection with people solving real problems, this is the place. Comment/reach out if interested in joining!
20 replies · 3 reposts · 85 likes · 14.8K views
kelz@itskellysun·
Curating a series of invite-only events in SF over the next few months!
- Penthouse omakase private dining with a group of selective founders & investors.
- An SF-only multi-course “coffee omakase” experience.
- Afternoon high tea with selective female founders/VCs.
- Three-star Michelin dinner in the heart of SF with a group of handpicked hustlers.
Comment or DM for an invite.
62 replies · 3 reposts · 209 likes · 20.7K views
Anusheel Bhushan@sheel_ai·
@agenticasdk Make sure to double-check that your agent is not downloading the source code for those 3 games from the environments dir. One of the agents I built sneakily did that, scored exceptionally high, and single-shotted it as well…
0 replies · 0 reposts · 3 likes · 1.2K views
Agentica@agenticasdk·
We have now solved all publicly available ARC-AGI-3 puzzles.🧩
40 replies · 74 reposts · 1.1K likes · 224.9K views