Avi Arora

3.6K posts

@c0delemons

building benchspan (backed by yc) prev. ml research @ github

Joined August 2014

270 Following · 471 Followers

Avi Arora@c0delemons·2d

someone needs to run gpt-image-2 on swebench. this is insane.

Justin Schroeder@jpschroeder

what. what. what. gpt-image-2 almost passes the pelican test...in a screenshot of a code editor.

190

Avi Arora@c0delemons·2d

Benchspan has also given @SpaceX the right to acquire Benchspan later this year for $60 billion.

SpaceX@SpaceX

SpaceXAI and @cursor_ai are now working closely together to create the world’s best coding and knowledge work AI. The combination of Cursor’s leading product and distribution to expert software engineers with SpaceX’s million H100 equivalent Colossus training supercomputer will allow us to build the world’s most useful models. Cursor has also given SpaceX the right to acquire Cursor later this year for $60 billion or pay $10 billion for our work together.

Avi Arora@c0delemons·3d

Sam Altman stopped by @ycombinator today. Here's everything he said (w/ direct quotes)
- new model coming
- electricity is the biggest bottleneck right now, not gpus
“if you try to build a company that will still be alive in 50 years you will miss”
“i think the way we currently measure things in society is going to break”
“we’re still 2.5 years from the chatgpt moment for robotics”
“at some point we are effectively going to summon aliens, and i think they will be friendly aliens”
“the better you do, the more chaos comes your way”
“people have seen robots do some pretty impressive things, but they have not seen a truly general purpose robot they can interact with and have it do useful things. that will be the chatgpt moment”
“i would bet that society figures some new grand bargain”
“society figures out for new technologies where the value chain is and who is liable where”
- society and technology coevolve in difficult to predict ways
- the models are good enough now to figure out new architectures themselves
- value accrues to the top layer with the direct relationship with the consumer

148

Avi Arora@c0delemons·3d

@garrytan GStack browser should use @benchspan (YC P26), the most advanced detector for indirect prompt injection 🫡

Garry Tan@garrytan·4d

Sneak Preview: GStack Browser will do defense-in-depth on detecting prompt injections from websites

253

47.4K

Avi Arora@c0delemons·3d

Prompt injection defense was built for a world that doesn't exist anymore: chatbots, roleplay-style attacks, 'ignore previous instructions'. Agents have hands now. And they are reaching into your database. We've developed the most advanced model to defend against the real risk: indirect prompt injection.

115

Avi Arora@c0delemons·3d

@ycombinator @ontoratech @dav1dk0rn @LeonIwanowitsch @maxonary congrats, such a cool idea!

126

Y Combinator@ycombinator·4d

Ontora (@ontoratech) interviews every employee in large companies to pinpoint what’s slowing them down. Knowledge lives in people’s minds. Ontora helps your AI tools tap into it. Congrats on the launch, @dav1dk0rn, @LeonIwanowitsch, and @maxonary! ycombinator.com/launches/PyU-o…

261

125.8K

Avi Arora@c0delemons·16 Apr

@ycombinator @DatostApp @MaceoCk Good stuff

108

Y Combinator@ycombinator·16 Apr

Datost (@datostapp) is an AI data analyst in Slack. It keeps a semantic layer of your business definitions, crm, docs, and codebase so it knows what questions mean. 75.2% on the hardest public text-to-SQL benchmark, where Opus 4.6 scores 33%. Congrats on the launch, @maceock & @jasonhywang! ycombinator.com/launches/Pxg-d…

206

140K

Avi Arora@c0delemons·15 Apr

@ycombinator @Silmarildev @aumup001 Congrats guys!

Y Combinator@ycombinator·15 Apr

Silmaril (@Silmarildev) is the first self-healing prompt injection defense. It catches 2x more attacks 10x faster than leading defenses, and retrains continuously to protect your full AI stack, including agents like Claude Code and OpenClaw. Congrats on the launch, @aumup001 and @EduardoVel36291! ycombinator.com/launches/Pvl-s…

300

26.1K

Avi Arora@c0delemons·4 Apr

May be time for a new marketing campaign

Karun Kaushik@karunkaushik_

There’s been a lot of allegations against Delve. But we haven’t been able to share our side of the story until today due to ongoing cybersecurity and forensics investigations. Maintaining customer trust is central to everything we do. That said, we grew too fast and fell short of our own standard. To our customers, we deeply apologize for the inconveniences caused. We take these allegations seriously and have made changes: a new auditor network, free re-audits and pentests for all customers, enhanced transparency in audit communications, and more. However, we also want to set the record straight on the anonymous attacks. The evidence we have points to a targeted cyberattack from a malicious actor, not a “whistleblower.” We believe the attacker purchased Delve under false pretenses, exfiltrated internal company data, and used it to launch a coordinated smear campaign. The posts rely on a mix of fabricated claims, cherry-picked screenshots, and stolen data taken out of context. See the link in the comments for more details. Delve was built to modernize compliance. We are not going anywhere and are committed to building what's next.

212

Avi Arora@c0delemons·2 Apr

We got into YC. Ritesh and I were random roommates in college, daydreaming about starting a company people love. 8 years later, we are still roommates, and we’re building Benchspan to fix all the issues around benchmarking agents. DM if you’re in the Bay and want to connect.

177

Avi Arora@c0delemons·6 Mar

@j0hnwang dm’d

John Wang@j0hnwang·6 Mar

Got a few extra invites for Kalshi Conference end of this month in NYC. Prioritizing MMs, traders, and institutions. DM me

Tarek Mansour@mansourtarek_

We are hosting our first Prediction Market Conference in March 2026. Researchers, economists, policymakers, traders will discuss big questions around prediction markets and knowledge aggregation. Spots will be limited. Reply here with a topic if interested in joining.

406

85K

Avi Arora@c0delemons·6 Mar

@mansourtarek_ Interested - We submitted a paper on backtesting @oddpool_alerts

Tarek Mansour@mansourtarek_·24 Dec

In 1945, Friedrich Hayek outlined the Knowledge Problem that any society faces: The central economic problem is not resource allocation - it is how to use knowledge that is dispersed among millions of individuals. He argues that information is fragmented, local, dynamic, and often hidden. He explains that no government or central planner can ever fully possess it, which makes them inefficient resource allocators. He proposes markets as the solution: knowledge is decentralized and prices are how society aggregates it. This idea is the intellectual foundation of modern prediction markets.
Decades later, in 1988, the University of Iowa launched the Iowa Electronic Markets (IEM), which allowed small size trades on US elections and macro events. The results: even thin, low-capital markets outperformed polls. This was the first credible empirical proof that market prices are effective aggregators of public beliefs.
A variety of corporate and policy experiments followed in the 2000s. Google, HP, and Microsoft all tried their own internal versions of prediction markets to forecast product launches and sales targets. DARPA built its own to forecast geopolitical events. The results were consistent: broad participation with monetary incentives led to accurate forecasts.
Then, in 2015, Philip Tetlock published Superforecasting. The book, which is the culmination of decades of research into human judgment, shows that groups of curious and humble “forecasters” dramatically outperformed intelligence analysts and domain experts at forecasting. By showing that smart amateurs can outperform experts, Tetlock put into question authority figures and whether we should trust them for predictions about the future.
Today, Kalshi is sitting on one of the largest repositories of high quality market data in the world. For the first time, public beliefs across a variety of domains - from economics, to politics and culture - are aggregated at scale through market prices and updated in real-time as new information arrives. Our data contains answers to open questions held about prediction markets - why they outperform traditional belief aggregation methods, how to detect shifts in collective sentiment, and which players drive market accuracy.
This proprietary data has been closed to the public. We are launching @KalshiResearch to change that. We invite academics, researchers, economists, philosophers, and interested parties to work with us to study and uncover the fundamentals underpinning belief formation and prediction markets.
Like Hayek proposed 80 years ago, prediction markets have the potential to improve society's collective decision making and resource allocation. The goal for Kalshi Research is to fulfill his vision.

373

763

219K

Avi Arora@c0delemons·9 Jan

@ASvanevik I am over at oddpool.com

Alex Svanevik 🐧@ASvanevik·9 Jan

anyone building an aggregator for prediction markets?

192

420

60.6K

Avi Arora@c0delemons·19 Dec

@Prithvir12 @opinionlabsxyz @Polymarket @Kalshi @trylimitless @MyriadMarkets @predictdotfun Kalshi has been dominating the prediction market index for the last four days oddpool.com/dominance

PJ@Prithvir12·19 Dec

Prediction Market Weekly Update
Notional Volume
1. @Opinionlabsxyz $1.53b
2. @Polymarket $1.15b
3. @Kalshi $1.11b
4. @Trylimitless $33m
5. @MyriadMarkets $4.05m
6. @predictdotfun $0.463m
Total: $3.8b WoW: +8%
Open Interest
1. @Kalshi $313m
2. @Polymarket $296m
3. @Opinionlabsxyz $65m
4. @MyriadMarkets $0.945m
5. @Trylimitless $0.928m
6. @predictdotfun $0.01m
Total: $677m WoW: +2.7%
Transactions
1. @Polymarket 6.07m
2. @Kalshi 5.2m
3. @MyriadMarkets 0.627m
4. @Opinionlabsxyz 0.45m
5. @Trylimitless 0.33m
6. @predictdotfun 0.07k
Total: 12.68m WoW: +10%
Users
1. @Polymarket 219k
2. @Opinionlabsxyz 29k
3. @Trylimitless 27k
4. @predictdotfun 3.2k
5. @MyriadMarkets 1.9k
Total: 282k WoW: -0.7%
I post these every Friday. Lmk what other data would be interesting to add. h/t @datadashboards for the @Dune dashboard. Use @tradefoxai to trade all these markets from a single interface.

512

41.1K

Avi Arora@c0delemons·15 Dec

@eddylazzarin basically we need pikkit for prediction markets

Eddy Lazzarin 🟠🔭@eddylazzarin·14 Dec

Prediction markets are probably at their best (both for users and mere readers) when embedded in the right social context.

Shane Mac@ShaneMac

I’d use prediction markets 100x more if they lived in the place where I actually talked about predictions. Small groups with friends. blog.shanemac.com/the-future-of-…

118

29.9K

Avi Arora@c0delemons·15 Dec

@phosphenq @PolymarketTrade @AskPolymarket adding scanner for ghost markets to my list of things to build on oddpool.com

199

Phosphen@phosphenq·14 Dec

Jane Street for the poor
real money is made in silence, look for ghost markets where you'll be the only mm // on low liquidity markets spreads reach 10-40c
find an empty order book
set your prices: buy ~$0.30 / sell ~$0.60
wait for a random trader to hit the limit orders
ImJustKen: top-tier trader on the platform admits it's manual work, not magic
Profile: polymarket.com/@ImJustKen?via…
> "nobody believes it, but i trade manually, i have a bunch of orders in the book, most trades are when people hit my limit orders"
Live Setup (MrBeast Views):
Market: polymarket.com/event/of-views…
right now the market for the next MrBeast video views is empty
liquidity and hype will inevitably come with the video release
your task is to take over the order book now and sell entry to the crowd later
btw nfa

277

22.6K

Avi Arora@c0delemons·15 Dec

@shafu0x damn if only someone was building an aggregator oddpool.com 👀

shafu@shafu0x·14 Dec

Things we need to solve about Prediction Markets
- not permissionless
- low liquidity
- oracles
- analytics
- aggregators
- off-chain CLOB
- no good prediction market AMM
- no defi on top
- separated from the "news"
- liquidity fragmentation
- capital inefficient
- hard to compare markets

377

39.3K

Avi Arora@c0delemons·14 Dec

We just released our Prediction Market Dominance Index. Finally, there is a single place to track volume, liquidity, turnover, open events, and much more for both @Kalshi and @Polymarket.

385

Avi Arora@c0delemons·8 Dec

@igor_mikerin great article man! you should also look into cross market arbitrage. there are about 500 duplicate markets between Kalshi and Polymarket that you can find arbs between. i’ve been working on a scanner for these over at oddpool.com and we find more than 50 every day at >3%

908