
Michael Liao
258 posts

@michaelcfix
@mercor_ai, ex-scale ai, founder, @uoft
San Francisco · Joined March 2022
183 Following · 92 Followers
Michael Liao retweeted

Traditional coding benchmarks do not reflect how software is actually built and maintained.
That's why we built a new benchmark, APEX-SWE, in partnership with @cognition. It measures whether AI models can perform complex, real-world software engineering work to ship systems that work and debug them when they don't.
@OpenAI GPT 5.3 Codex (High) tops the leaderboard at 41.5% on Pass@1.

I just left @xai
It was not an easy decision. The past three months were an absolute blast - I've been in many trenches in my life and can say this was by far one of the most intense warzones.
I love fighting. Especially being in the trenches with my friends, working on problems that will actually advance humanity.
But the current environment wasn't serving my growth. And that's a really hard thing to admit - I've always looked up to Elon, and I genuinely believe xAI will win. I still do.
One thing I'll say: don't stay somewhere just because of the name. If you're unhappy, and you know you can't grow 100x where you are - it's the right call to leave.
What's next? Get some sleep back. Then find the next trench worth fighting in.
I'll always be meeting exceptional people - that was never because of a recruiting title. I just love finding smart people and helping however I can. Many more side quests to come!!!

@dopabees yes but grade inflation in canada is also horrendous

waterloo requires a 98% avg to get in
they have 2 yrs of co-op exp guaranteed and their entire curriculum is real SWE
they have every right to be sought after
Dhravya Shah @DhravyaShah
average waterloo guy

Michael Liao retweeted

Today, we're releasing our first version of the AI Consumer Index (ACE).
ACE tests what people actually ask, and expect, AI to do for them in their personal life. From shopping for a gift to tackling home projects, people are turning to AI for recommendations and step-by-step guidance.
ACE contains realistic and challenging evals, split across shopping, food, gaming, and DIY. The results show that models routinely fail on consumer tasks:
- @OpenAI GPT 5 is the top model but scores only 56.1% overall.
- No model scores over 50% on Shopping tasks, an opportunity worth $5+ trillion globally.
- Frontier models frequently hallucinate web content they were supposed to retrieve, getting numbers or a link wrong between 29% and 62% of the time.

Michael Liao retweeted

We’ve raised our $350M Series C at a $10B valuation from @felicis, @benchmark, and @generalcatalyst.
Just 2 years after starting, Mercor is paying $1.5 million per day to experts in our marketplace.
We’re creating a new category of work in the AI economy, where software engineers, bankers, lawyers, and other professionals earn based on their experience while advancing the frontier of AI.
While most new categories take time to build momentum, we’ve broken every growth record. For comparison, in their first 2 years:
- Uber paid out just over $1 million to drivers
- Airbnb paid out $10 million to hosts
We are unlocking human potential in the AI economy.


today i learned im getting underpaid
Ryan Hu @ryanhu20
1 cold email really is all you need. @bruce_wang15 told me he delivered at his last internship. Now he’s in SF helping push copious amounts of revenue at the fastest growing startup of all time. He’s 19 making more than most senior engineers now

@cynthwangg depends on what you want
it’s still good for an above average corporate career
for starting or running a company, probably not so much

life update: I’ve joined @andocorporation as a founding engineer!
after spending the summer working on aisdk at Vercel, I was looking for the next ambitious problem to tackle. then I met @saradu and the team, and it just clicked


@ella_schlags you must really hate startup culture right now then









