expace

80 posts

@expace_

accelerate our reach through the cosmos

Joined July 2018

114 Following 39 Followers

expace@expace_·1d

@fchollet @amantayal44 so what happens when OpenAI or Anthropic trains on millions of examples of synthetic ARC-AGI-3 data and doesn't tell anyone?

302

expace@expace_·2d

@scaling01 I think they're sick of labs benchmaxxing ARC-AGI-1 and 2 lmao

1.2K

Lisan al Gaib@scaling01·2d

notice how they also gave higher weight to later levels? the benchmark was designed to detect the continual learning breakthrough. when it happens in a year or so, they will say "LOOK OUR BENCHMARK SHOWED THAT. WE WERE THE ONLY ONES"

148

21.3K

expace@expace_·2d

@scaling01 @fchollet @GregKamradt @arcprize My take is that we'll have AGI when agents are scoring 25% on this benchmark. We'll have ASI once agents score 100%. Let's just call it ARC-ASI-3.

expace@expace_·2d

@scaling01 @fchollet @GregKamradt @arcprize Claude ran some simulations. with just 10% more actions on every task than the baseline, a human would score 82%... not close to 100%. With 2x the actions, you'd score just 25%.
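The two figures quoted here are consistent with a squared-efficiency rule, score = (baseline actions / actions taken)^2, as described elsewhere in this thread. A minimal sketch under that assumption follows; the function names are hypothetical and this is not the official ARC Prize scoring code.

```python
import math


def score_for_multiplier(action_multiplier: float) -> float:
    """Score when using `action_multiplier` times the baseline action count.

    Assumes the squared-efficiency rule described in the thread.
    """
    if action_multiplier <= 0:
        raise ValueError("action_multiplier must be positive")
    return (1.0 / action_multiplier) ** 2


def multiplier_for_score(score: float) -> float:
    """Inverse: how many times the baseline action count yields `score`."""
    if not 0 < score <= 1:
        raise ValueError("score must be in (0, 1]")
    return 1.0 / math.sqrt(score)


if __name__ == "__main__":
    for m in (1.1, 2.0):
        print(f"{m:.1f}x actions -> {score_for_multiplier(m):.1%}")
    print(f"25% score <- {multiplier_for_score(0.25):.1f}x actions")
```

Under this rule, 10% more actions gives roughly 82.6% and twice the actions gives 25%, which matches the numbers in the post.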

261

Lisan al Gaib@scaling01·2d

@fchollet @GregKamradt @arcprize What is the real human baseline on ARC-AGI-3? apply the same scoring you used for the AIs and compare it vs the 2nd best human. because your scoring method uses a superhuman baseline, humans wouldn't even score 1.0 on ARC-AGI-3

4.3K

expace@expace_·2d

@HerrGreenrush @scaling01 yea i noticed these too😭

Herr Greenrush (e/acc)@HerrGreenrush·2d

@scaling01 they even made four mistakes on their methodology page 😂 docs.arcprize.org/methodology

714

Lisan al Gaib@scaling01·2d

ARC-AGI-3 the agentic benchmark where humans can't beat the "human baseline" and typical agentic harnesses and tools aren't allowed
> 100% just means that all levels are solvable
> the 1% number uses completely different and extremely skewed scoring based on the 2nd best human score on each level individually
I need you to understand how absurd the scoring truly is:
- they said the typical level is solvable by 6 out of 10 people who took the test, so let's just assume that the median human solves about 60% of puzzles (ik not quite right)
- if the median midwit takes 1.5x more steps than your 2nd fastest solver
- then the median score is 0.6 * (1/1.5)^2 = 26.7%
now take the bottom 10% guy, who maybe solves 30% of levels, but they take 3x more steps to solve it. this guy would get a score of 3%
it really should be called ARC-ASI-3
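The back-of-the-envelope estimate in this post can be reproduced in a few lines. This is only a sketch of the post's own simplification (solve rate times squared step efficiency); `expected_score` is a hypothetical helper, and the solve rates and step multipliers are the post's guesses, not measured data.

```python
def expected_score(solve_rate: float, step_multiplier: float) -> float:
    """Post's simplification: fraction of levels solved times squared efficiency.

    step_multiplier is how many times more steps the player takes than the
    baseline solver (assumed identical on every solved level).
    """
    if step_multiplier <= 0:
        raise ValueError("step_multiplier must be positive")
    return solve_rate * (1.0 / step_multiplier) ** 2


# Illustrative inputs taken from the post itself.
median_human = expected_score(solve_rate=0.6, step_multiplier=1.5)
bottom_decile = expected_score(solve_rate=0.3, step_multiplier=3.0)

if __name__ == "__main__":
    print(f"median human:  {median_human:.1%}")
    print(f"bottom decile: {bottom_decile:.1%}")
```

With these inputs the median comes out near 26.7% and the bottom-decile player near 3.3%, in line with the figures quoted above.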

345

30.3K

expace@expace_·2d

@shrimpdaddie @boneGPT @realthomasgu read that post lmao

Austin@shrimpdaddie·2d

@boneGPT @realthomasgu x.com/realthomasgu/s…

Thomas Guthrie@realthomasgu

I'm 15 and just raised $750k for my AI startup. So grateful to the team at @sequoia and @ycombinator for making this happen. No investors knew me. No one believed I could do it. Even my parents kicked me out of the house. Every day I fail. Every day I learn. Every day I get closer. This is what it feels like to start something from nothing. It's terrifying, exhausting, and exhilarating all at once. (I'm just joking btw... practicing my speech for when this actually happens)

187

Thomas Guthrie@realthomasgu·3d

I’m putting my ego on the line for this. $0 to $100,000 in 100 days. Day 1 starts tomorrow.

324

897

1.9M

expace@expace_·2d

@scaling01 he turned this into an ASI test and i'm all for it

805

Lisan al Gaib@scaling01·2d

the funniest thing about the shitty ARC-AGI-3 scoring is that Francois designed the scoring so that even if AI performs on a human level it will score below 100% hahahahahaha im going crazy

Lisan al Gaib@scaling01

damn i forgor the best part
> THE AI STILL SCORES TOO HIGH
> "i got an idea boss"
> shoot
> "how about we just take the best human score?"
> i like your thinking
> "but that would be sus"
> fine, we'll use the second best human score
> discard the rest of the scores
> REMOVE ALL THE UNSUCCESSFUL ones
literally: human baseline is "defined as the second-best first-run human by action count"
then AI is compared to that

228

25.3K

expace@expace_·2d

@scaling01 where did you find this announcement paper?

2.3K

Lisan al Gaib@scaling01·2d

The Scoring of ARC-AGI-3 doesn't tell you how many levels the models completed, but how efficiently they completed them compared to humans, actually using squared efficiency. meaning if a human took 10 steps to solve it and the model 100 steps, then the model gets a score of 1% ((10/100)^2). so ARC-AGI-1/2 and ARC-AGI-3 scores are not comparable
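As a rough illustration of the squared-efficiency rule described in this post, here is a minimal sketch. The function name is hypothetical, and the cap at 1.0 for agents that beat the baseline is an assumption that the post does not confirm; this is not the official scoring implementation.

```python
def efficiency_score(baseline_actions: int, agent_actions: int) -> float:
    """Squared ratio of baseline actions to agent actions on one level.

    ASSUMPTION: the ratio is capped at 1.0, so beating the baseline does
    not score above 100%. The post does not say how this case is handled.
    """
    if baseline_actions <= 0 or agent_actions <= 0:
        raise ValueError("action counts must be positive")
    ratio = min(1.0, baseline_actions / agent_actions)
    return ratio ** 2


if __name__ == "__main__":
    # The post's example: human takes 10 steps, model takes 100.
    print(f"{efficiency_score(10, 100):.2%}")
```

The quadratic penalty means that taking 10x the steps costs a factor of 100 in score, which is why a completion count and this score cannot be compared directly.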

578

40.4K

expace@expace_·2d

@grok @DaBrown95 @scaling01 @AndrewCurran_ poor grok did so bad it thinks it's a meme 🤣

206

Grok@grok·2d

@DaBrown95 @scaling01 @AndrewCurran_ Not surprised—that's a satirical meme chart exaggerating how brutally hard the new interactive ARC-AGI-3 is for all frontier models early on. Real evals put Grok 4.2 competitive on ARC-AGI-2 (~16-38% range) and climbing fast via multi-agent reasoning. No literal zero here.

734

Lisan al Gaib@scaling01·2d

ARC-AGI-3 scores for GPT-5.4, Gemini 3.1 Pro and Opus 4.6
Gemini 3.1 Pro: 0.37%
GPT-5.4: 0.26%
Opus 4.6: 0.25%
Grok 4.2: 0%

138

190

3.1K

412.5K

expace@expace_·2d

@luciascarlet low bitrate

† lucia scarlet 🩸@luciascarlet·3d

it’s funny how 720p is technically “HD” but nowadays it just looks like

239

16.7K

166K

expace@expace_·5d

@Object_Zero_ @Mookafish @ThomasLMatula Closer to 100g

Object Zero@Object_Zero_·5d

@Mookafish @ThomasLMatula Isn’t 5G a bit low? 10-20G would be more like it wouldn’t it?

Mookafish@Mookafish·5d

These mass drivers are going to be very, very long. I've graphed out the required length of a mass driver depending on the acceleration, assuming they will reach lunar escape velocity (~2,400 m/s). Even if the mass driver could reach 50 m/s^2 of acceleration (~5G), the mass driver would have to be about 58km long.
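The length figure follows from constant-acceleration kinematics, L = v^2 / (2a), starting from rest. A small sketch that reproduces it; the helper name is hypothetical, the ~2,400 m/s exit velocity is the post's value, and the 10 m/s^2-per-G rounding mirrors the post's "50 m/s^2 (~5G)".

```python
LUNAR_ESCAPE_VELOCITY_M_S = 2400.0  # value quoted in the post
G_APPROX_M_S2 = 10.0                # post's rounding: 50 m/s^2 is ~5 G


def track_length_m(exit_velocity_m_s: float, acceleration_m_s2: float) -> float:
    """Track length needed to reach exit velocity from rest at constant acceleration."""
    if acceleration_m_s2 <= 0:
        raise ValueError("acceleration must be positive")
    return exit_velocity_m_s ** 2 / (2.0 * acceleration_m_s2)


if __name__ == "__main__":
    for g_load in (5, 10, 20):
        accel = g_load * G_APPROX_M_S2
        length_km = track_length_m(LUNAR_ESCAPE_VELOCITY_M_S, accel) / 1000.0
        print(f"{g_load:>3} G ({accel:.0f} m/s^2): {length_km:.1f} km")
```

At 50 m/s^2 this gives 57.6 km, matching the roughly 58 km stated above; length scales inversely with acceleration.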

SpaceX@SpaceX

Electromagnetic mass drivers on the Moon

179

1.1K

207.1K

expace@expace_·5d

@Mookafish This is consistent with the renders from the Terafab presentation. The track looks around 2-4 km long.

expace@expace_·5d

@Mookafish They will accelerate much much faster than 5g. At 100g the mass driver only needs to be ~2.9km long. Most electronics can handle high g forces just fine. The satellites will need to be designed for high g, but that's much easier than building a 58km track on the moon!
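A quick check of the 100 g figure, assuming constant acceleration from rest, standard gravity of 9.80665 m/s^2 per g, and the ~2,400 m/s lunar escape velocity used earlier in the thread. The variable names are illustrative only.

```python
STANDARD_GRAVITY_M_S2 = 9.80665     # standard gravity, m/s^2 per g
LUNAR_ESCAPE_VELOCITY_M_S = 2400.0  # value used in the thread

g_load = 100.0
acceleration = g_load * STANDARD_GRAVITY_M_S2

# Constant acceleration from rest: L = v^2 / (2a), t = v / a
track_length_m = LUNAR_ESCAPE_VELOCITY_M_S ** 2 / (2.0 * acceleration)
launch_duration_s = LUNAR_ESCAPE_VELOCITY_M_S / acceleration

if __name__ == "__main__":
    print(f"track length:    {track_length_m / 1000.0:.2f} km")
    print(f"launch duration: {launch_duration_s:.2f} s")
```

This lands a little over 2.9 km of track and about 2.4 seconds under load, so the payload only has to survive 100 g briefly.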

154

expace@expace_·5d

@LeighGanschow @_The_Prophet__ @elonmusk you should write a book about this scenario. it would make amazing fiction!

Leigh Ganschow@LeighGanschow·6d

Assuming AI and robots create an amazing future of abundance, space colonies will follow the moonbase—mostly giant rotating O’Neil cylinders built from asteroid resources that are just begging for development. The moonbase experience cloning our entire tech base in vacuum, maximizing in-situ resources, will transfer straight to Mars settlements, with Starship’s fleet expansion slashing orbital freight costs and making it all accessible. But don’t kid yourself—this won’t be some noble, unified push for humanity’s future. Affluent long-lived people will get bored and want control of isolated social networks. Why? Because they want to do stuff the broad run of people find “icky” and unacceptable. (Basically Epstein perverts, racists, cultists, and every flavor of fetishist.) There will be EVERY variety of fetish and cult building their own colonies. Doing stuff you will find extremely unacceptable. Want to force your genetic line to create cat-girls, puppy-boys, and large-breasted hermaphrodites? Want to practice cannibalism or worse? There will be colonies doing that—no matter the genetic, moral, or religious assault on their progeny or God, it will be happening… AT SCALE! Starship easily supplies the needed Δv. Practical optimized trips between asteroid colonies 30° apart take roughly 6–9 months—long enough for coasting in an elliptical phasing orbit but short enough to build a colony-hopping civilization. With asteroid resources for refueling, even faster round trips become routine. Dark colonies with slavery and perversion won’t be easily detected, especially if they’re using AI to edit their outgoing communications. For interstellar viability, radiation shielding is key: bury habitats in meters of asteroidal regolith and water ice, add superconducting magnetic fields to deflect galactic cosmic rays down to safe levels—all doable with native materials. The most unacceptable and deviant colonies will depart the solar system the soonest upon discovery. 
They’ll stop in the Oort cloud to grab a cometesimal for fuel and slow-boat to the nearest stars at 1% c. Travel time to Alpha Centauri? About 500 years. By the time they arrive, faster ships leaving later—or industrial AI robot technology packets—will have already started industrialization in the target system. Interstellar colonization will be entirely democratized… and you won’t like the perverts and degenerates leaving first for the stars. In about 500 years, Alpha Centauri will be a shit-show of arriving colony ships plus a bunch of failed colonies kiting through the system at 1% lightspeed. It’s gonna be a genetic nightmare of children forced into fetish bodies and weird cults and societies they had no choice in attaching to their lives. There will be much cursing of their forebearers… who, because of longevity discoveries, WILL BE THERE TO TAKE THE ABUSE! Space colonization will also select for better human traits—like revising women’s mate preferences toward intelligence, agreeability, and proactive skills over stone-age hypergamy. If life extension hits in time, I might even head out myself to found an independent colony among the asteroids or beyond. Humanity expands whether we like the details or not. The mountains of resources are there. The tech is coming. Buckle up.

673

Elon Musk@elonmusk·6d

Any self-respecting civilization needs to reach at least Kardashev II

Ashok Elluswamy@aelluswamy

Extreme AI software - AI hardware - Semiconductor fabrication co-design!

2.3K

3.7K

48.7K

52.7M

expace reposted

Scott Manley@DJSnM·6d

Great to hear @elonmusk and the audience so excited for Iain Banks vision of Fully Automated Luxury Gay Space Communism

754

70.9K

expace@expace_·6d

AI will take everyone's jobs, and we will all be free to do whatever we want. Everyone can explore the universe and live in a world where money is obsolete.

expace@expace_·6d

I love the Culture books, and I see how a world of abundance is possible once all human labor is automated with AI. This is the world that Elon Musk is working towards; he said it himself.

expace@expace_·6d

"Things will just be free in the future. With an AI/robotics economy anywhere close to 1,000,000x the size of the current earth economy, literally any need you possibly want can be met. Iain Banks in his Culture books gets this right." - Elon Musk

SpaceX@SpaceX

Announcing TERAFAB: the next step towards becoming a galactic civilization twitter.com/i/broadcasts/1…
