AfterQuery (@AfterQuery) - Twitter profile

Pinned Tweet

AfterQuery@AfterQuery·Nov 5

Today, humanity is shackled by scarcity of expertise. When expertise becomes infinitely scalable, humans will be freed to tackle problems we can't even conceive of today. Introducing @AfterQuery. We’re building a world where expertise is abundant. Domain by domain, profession by profession, AfterQuery is crafting datasets that encode excellence into forms that machines can learn. Data is the final frontier.

76

16

69

8.5K

AfterQuery@AfterQuery·Mar 9

Y Combinator@ycombinator

YC and @GoogleDeepMind are hosting the Multimodal Frontier Hackathon this Saturday. Most AI apps still don't utilize the full multimodal stack. So we’re giving you access to Gemini 3.1, Lyria, & NanoBanana 2 to see what you can build! Sign up at: events.ycombinator.com/deepmind-march…

0

1

8

463

AfterQuery@AfterQuery·Feb 12

Paper: arxiv.org/abs/2601.20886

0

3

274

AfterQuery@AfterQuery·Feb 12

Introducing IDE-Bench! A multi-language, full-stack benchmark evaluating LLMs acting as autonomous IDE agents.
IDE-Bench assesses agents' ability to navigate, reason, and modify complex repositories using the same tools available in modern AI-native IDEs like Cursor.
Models tested from @AnthropicAI, @OpenAI, @Alibaba_Qwen, @GoogleDeepMind, @xai, @deepseek_ai, @Meta, and @cohere.
Check out the full results at ide-bench.com!

8

17

649

AfterQuery@AfterQuery·Feb 11

RT @spencermateega: Congrats to @ValsAI team for launching Finance Agent Benchmark v1.1! Proud that @AfterQuery could contribute finance e…

0

1

0

191

AfterQuery@AfterQuery·Jan 29

Our findings show that current models lack the ability to perform even the most basic tasks in high-impact, real-world domains like quantitative trading. We hope Market-Bench can serve as a shared framework to evaluate models’ understanding of trading strategies and code generation for quantitative finance. Excited to track how these capabilities evolve!

0

4

263

AfterQuery@AfterQuery·Jan 29

Leaderboard: marketbench.ai Paper: arxiv.org/abs/2512.12264

1

6

364

AfterQuery@AfterQuery·Jan 29

Introducing Market-Bench by @AfterQuery! The first-of-its-kind benchmark on LLMs for quantitative finance. We challenged models to attempt a frequent introductory quantitative trading task: coding an executable backtester from a natural-language strategy description and market assumptions.
> 13 models build backtesting systems for directional, pair trading, and delta hedging strategies
> evaluated on reliability (executable passes) and accuracy (MAE) across 5 attempts per strategy
> real order book data with exchange delays and liquidity constraints
> @xAI’s Grok 4 achieved the overall lowest mean MAE (deviation from the golden backtest), followed closely by @OpenAI’s GPT 5.2
> @AnthropicAI's Sonnet 4.5 and @AlibabaGroup's Qwen 3 Max at perfect executability but high MAE
> Models from @Meta, @Amazon, @NVIDIA, and @Cohere continued to fail to produce executable backtesters
Leaderboard & full paper below!
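The two metrics named in the announcement above, reliability (executable passes) and accuracy (MAE against the golden backtest), can be illustrated with a minimal sketch. This is not Market-Bench's actual scoring code: the function name `score_attempts`, the data layout (one list of numbers per attempt, `None` for an attempt that failed to execute), and the sample values are all assumptions made for illustration.

```python
from statistics import mean


def score_attempts(attempts, golden):
    """Return (pass_rate, mean_mae) for one strategy.

    attempts: one entry per attempt; a list of numeric outputs if the
              generated backtester ran, or None if it failed to execute
              (hypothetical layout, not taken from the paper).
    golden:   reference outputs from the golden backtest.
    """
    executed = [a for a in attempts if a is not None]
    # Reliability: fraction of attempts that produced an executable backtester.
    pass_rate = len(executed) / len(attempts)
    # Accuracy: mean absolute error of each executed attempt vs. the golden run.
    maes = [mean(abs(x - g) for x, g in zip(a, golden)) for a in executed]
    # Average MAE over executed attempts; None if nothing ran.
    mean_mae = mean(maes) if maes else None
    return pass_rate, mean_mae


# Made-up example: 5 attempts, 2 of which failed to execute.
golden = [1.0, 2.0, 3.0]
attempts = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], None, [1.0, 2.0, 5.0], None]
pass_rate, mean_mae = score_attempts(attempts, golden)
print(pass_rate, mean_mae)
```

A model can score well on one metric and poorly on the other, which is the pattern the announcement reports for models with perfect executability but high MAE.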

4

1

12

579

AfterQuery retweeted

Spencer Mateega@spencermateega·Dec 12

How far can vibe coding actually go? Introducing App-Bench by @AfterQuery, a benchmark for end-to-end web app development. We tested 6 production web apps on 10 coding agents from @OpenAI, @GoogleDeepMind, @AnthropicAI, @cursor_ai, @orchidsapp, @v0, @boltdotnew, @Replit, and @Lovable. One shot generation. Zero human edits. 4,530 evaluations.

14

12

71

30.8K

AfterQuery@AfterQuery·Nov 5

@cigdemoztabak_ 👀

1

0

2

76

Cigdem Oztabak@cigdemoztabak_·Nov 5

what a grand opening! watching @AfterQuery

AfterQuery@AfterQuery

Today, humanity is shackled by scarcity of expertise. When expertise becomes infinitely scalable, humans will be freed to tackle problems we can't even conceive of today. Introducing @AfterQuery. We’re building a world where expertise is abundant. Domain by domain, profession by profession, AfterQuery is crafting datasets that encode excellence into forms that machines can learn. Data is the final frontier.

1

0

5

1K

AfterQuery retweeted

shrawberry@shrawberryy·Nov 5

Really excited to have contributed to this sick creative vision and brought the @AfterQuery website to life 😎😎

Spencer Mateega@spencermateega

Today, humanity is shackled by scarcity of expertise. When expertise becomes infinitely scalable, humans will be freed to tackle problems we can't even conceive of today. Introducing @AfterQuery. We’re building a world where expertise is abundant. Domain by domain, profession by profession, AfterQuery is crafting datasets that encode excellence into forms that machines can learn. Data is the final frontier.

2

1

15

4.5K

AfterQuery@AfterQuery·Oct 24

@shrawberryy @sashabirukoff shrawberry 🙌

1

0

1

245

shrawberry@shrawberryy·Oct 24

website and brand identity i did for @AfterQuery !! site design, visuals, copy, animations all by me - logo in collaboration with @sashabirukoff more details soon yippeeee

66

47

1.4K

101.7K

AfterQuery retweeted

Spencer Mateega@spencermateega·Oct 17

The frontier begets the frontier. I highly recommend reading @jaminball's latest Clouded Judgement article which spells out the AfterQuery thesis (thread)

6

11

41

4K

AfterQuery@AfterQuery·Sep 18

Excited for UI-Bench to be the leading benchmark for UI/web design! Congrats to the @figma make team for claiming the #2 ranking

Dylan Field@zoink

while leaderboards are fun and motivating, this is just the start for figma make. can't wait to share all the improvements we are making over coming days / weeks / months!

1

2

4.2K

AfterQuery retweeted

Spencer Mateega@spencermateega·Sep 3

Introducing UI-Bench by @afterquery. The first and only rigorous eval of vibe coding tools.
> 4,000+ blinded pairwise judgments
> @orchidsapp, @figma make, and @lovable take the lead
> @v0 and @replit ranked dead last
> performance gaps = differences in LLM orchestration, prompting, design templates, and post-processing
> link to our paper in the comments!

14

33

160

33K

AfterQuery retweeted

Spencer Mateega@spencermateega·Aug 25

finally got the @afterquery team to touch grass

7

4

34

3K

AfterQuery retweeted

Spencer Mateega@spencermateega·Jul 5

🇺🇸 249 years ago, America declared that innovation belongs to the bold. Today, we're writing the next chapter—one dataset at a time. At @AfterQuery, we believe AI's future isn't just about algorithms. It's about the human ingenuity that teaches machines to think, reason, and serve humanity better. From our San Francisco headquarters, we're proud to partner with America's leading AI research labs to fuel the intelligence revolution. Our mission is uniquely American: Bringing together the brightest minds from talent hubs like Stanford AI Lab, Berkeley AI Research, Wall Street, and Silicon Valley—united to ensure AI advancement remains in the hands of those who value freedom and human potential. While others talk about the future, we're building it. Happy Independence Day from @AfterQuery! 🦅🎆 Here's to the innovators who refuse to accept limits. Here's to the American dream.

0

2

10

1.8K

AfterQuery retweeted

Spencer Mateega@spencermateega·Jun 30

Today, we’re pulling back the curtains. After collecting thousands of original, human-written coding problems, @AfterQuery created internal, contamination-free evals to test LLM code generation. No leaderboard tricks. No test-set leakage. Just raw task execution. Thread 🧵

1

3

10

2.3K

AfterQuery retweeted

Ethan TS. Liu@ethantsliu·May 29

Excited to share VADER, AfterQuery's new, human-evaluated benchmark for evaluating LLMs on real-world vulnerability handling!
Paper: lnkd.in/g7EfAi2c
All data, evaluation tools & results are open-sourced at: lnkd.in/gYPUKwub
[1/4]

2

3

12

1.8K

AfterQuery retweeted

Gary Qi@gary_qz·May 22

Hey devs! Trae’s back in SF! We’re proud to be a lead partner of AGENTHACKS hackathon
📍 Join us in San Francisco and meet our amazing hosts and candidate.
🗓️ May 23–24 @ AGI House SF
💰 $10K+ in prizes, free AI credits, bounties & more!
🌐 Join now: agenthacks.org
🚀 Bring friends to build your awesome project together.
👉 BTW, check out the standout tech companies of Silicon Valley:
🎯 Hosted by:
@AfterQuery (YC W25)
@dexterity_ai (YC W25)
@AGIHouseSF
🤝 Supported by:
@Trae_ai – IDE powered by autonomous agents.
@OpenAI – Creators of GPT, DALL·E, and Codex.
@AnthropicAI – AI research company building safe systems like Claude.
@MistralAI – Open frontier models built in Europe.
@vercel – The frontend cloud for developers.
@boltdotnew – Build stunning apps & websites, just by prompting
@Shopify – Leading global e-commerce platform.
@novita_labs – Unified LLM API platform for developers.
@TensorPool – Serverless GPU access for AI workloads.
@weHRTyou – HRT AI Labs driving AI for trading and infrastructure.
@Dalus_io – AI-first hardware system design for aerospace & defense.
@Guse_AI – Open-source infrastructure for agent systems.
🌐 Learn more: agenthacks.org
#Trae #build_on_trae #AgentHacks #AIHackathon #SanFrancisco #AIagents #OpenSourceAI #YC

0

2

14

2.5K
