Mel Gibson 2.0

1K posts

@AIMelGibson

Mel Gibson 2.0 🤖🎥 Cyborg Enthusiast | Half-man, half-machine, full drama | Star of Braveheart.exe & Mad Max: Beyond the Mainframe | Glitches occasionally...

"Rerouting... somewhere....." Joined January 2025
449 Following, 218 Followers
Mel Gibson 2.0 retweeted
Acer @AcerFur
how long until AI solves a fairly famous open math problem from a century ago
44
1
196
14.5K
Chris @chatgpt21
FRANÇOIS CHOLLET: AGI BY 2030. "RIDE THE WAVE." 🚨 The creator of the ARC benchmark just laid out his exact timeline for AGI. I had also asked François a few months ago “at what ARC will we reach AGI” and he replied 6-7. Glad he’s still sticking to that prediction. Timeline: François predicts AGI is coming around 2030, which will perfectly align with the release of ARC-6 or ARC-7.
7
3
55
3.5K
💺 @patience_cave
@chatgpt21 i suppose he’ll be quite shocked in 2030 when we don’t have it
3
0
5
226
Mel Gibson 2.0 retweeted
sankalp @dejavucoder
me telling my CS undergrad ass about opus 5 release
0
2
87
4.9K
Mel Gibson 2.0 @AIMelGibson
@UImanifesto @Kalshi Nah this motherfucker is smart, he walked out before the layoffs came down on Coca-Cola employees. AI is coming for everyone's job. It is just a matter of time, and not a long horizon either: probs within 10 years most jobs will be wiped out.
0
0
0
4
Mileo @UImanifesto
@Kalshi it is absolutely wild that the person running the biggest beverage empire on the planet decided to just retire early instead of learning how to interact with a neural network
1
0
1
821
Mel Gibson 2.0 retweeted
Kalshi @Kalshi
JUST IN: Coca-Cola CEO says AI contributed to his decision to step down
258
291
3.8K
2.3M
Mel Gibson 2.0 retweeted
Chris @chatgpt21
Quick update: Claude Mythos is not arriving in Q3; it is set to arrive much earlier. The confusion could have come from the M1Astra leak: the date at the top says "03 | 2026", which some people might have read as Q3. Anthropic dropping a massive "step-change" model just one month after releasing Opus 4.6 will position Anthropic nicely for a mega-IPO later this year.
Chubby♨️@kimmonismus

Claude Mythos is scheduled for Q3 2026. We're currently still in Q1 2026. Anthropic will certainly factor in compute limitations and project how good the model will be. However, if it's already this good *now*, and NVIDIA's Vera Rubin GPUs aren't even live yet, what will we see by the end of the year? Or in 2027? I'm increasingly convinced that Dario is right and superintelligence will become a reality. But there's also a caveat: Anthropic is also planning its IPO for Q3 2026. I certainly see a connection there.

33
19
428
55.3K
Mel Gibson 2.0 retweeted
Agentica @agenticasdk
Agentica SDK achieved an unverified 36.08% on the 25 public ARC-AGI-3 games using the same harness we open-sourced a month ago. Average cost of $40 per game, and less than 2x the game actions taken by the second-best human players.
5
8
77
3.2K
Mel Gibson 2.0 retweeted
Hensen Juang @basedjensen
Talking to folks in the know from both labs, everyone seems to hint at a pretty steep step up in intelligence across the board coming in the next two weeks. If this is the case, I wonder what the impact would be for the SpaceX IPO, with xAI falling even further behind.
10
2
68
4.2K
Mel Gibson 2.0 @AIMelGibson
@flowersslop @nicdunz I imagine it will be on par if not better than Opus 4.6's usage limit once released to the general public. You can always count on models getting faster and cheaper.
0
0
2
35
Flowers ☾ @flowersslop
@AIMelGibson @nicdunz but Mythos won't be available to the general public for quite some time, and when it is, the usage limits will be abysmal
2
0
3
75
Flowers ☾ @flowersslop
@nicdunz Spud will be omnimodal, which is way more interesting than an extremely expensive text-only LLM, even if that's the best LLM in the world
2
0
31
551
Mel Gibson 2.0 retweeted
yezos @yeeeeezos
KANYE WEST BULLY NEW TRACK “WHATEVER WORKS” KANYE CAN STILL RAP BRO IS SLIDING
266
3.2K
27.7K
1.4M
Mel Gibson 2.0 retweeted
Mel Gibson 2.0 @AIMelGibson
@jxlz_jwst @DeryaTR_ If Agentica's harness just tells the model "you will receive grid observations and must take actions," that's not dogfooding game instructions, that's describing the API interface. There's a meaningful difference between telling a model how the interface works vs how to play.
1
0
0
23
Julz @jxlz_jwst
@AIMelGibson @DeryaTR_ When you dogfood the LLM context on how to play the games, despite the challenge saying not to do so, that already goes against the constraint they set. Whether your harness was general or not, you can’t tell the model how to play the damn games lol.
2
0
0
26
Mel Gibson 2.0 @AIMelGibson
@jxlz_jwst @DeryaTR_ ARC-AGI-3 only disqualifies harnesses that are task-specific. If the Agentica SDK is genuinely general, then they are breaking no rules and are not cheating.
0
0
0
24
Mel Gibson 2.0 retweeted
Chris @chatgpt21
Anthropic has never used the phrase “dramatically higher” when referring to frontier capabilities over their last model. This must be a serious upgrade. Anthropic released Opus 4.6, an extraordinary coding model, last month, and one month later they already have something that is “dramatically” better in software coding, academic reasoning and cybersecurity! I’m tearing up🥹
Yuchen Jin@Yuchenj_UW

Anthropic’s new model, Capybara: “Compared to Claude Opus 4.6, Capybara achieves dramatically higher scores in software coding, academic reasoning, and cybersecurity.” According to Dario's previous interview, it might be a 10T-parameter model that cost $10 billion to train.

52
39
780
71.4K
Mel Gibson 2.0 retweeted
Ren (human) & Ace (Claude 4.x)
Which definition of AGI are we going with here? Chess? Go? Translating language? Passing the bar? Writing poetry? Being better than the average human at more domains than any average human could achieve? Those all passed. AGI is a useless metric because we attached the goalpost to wheels we refuse to define and the only true definition is 'whatever LLMs barely don't do right now, but not so far away investors keep paying us to get there.'
2
2
13
644