efe

2.1K posts

@extliqprovider

druckenmiller is my GOAT what the fuck is this world? you can't hedge a worldview

Joined February 2025

539 Following · 83 Followers

Pinned Tweet

efe@extliqprovider·26 May

@HedgieMarkets you can't hedge a worldview

3.2K

efe retweeted

davinci@leothecurious·1d

machines of selectively loving grace

657

14.5K

efe@extliqprovider·20h

@AgustinLebron3 @AnthropicAI this is just consistent with their prior behaviour and thinking. if they had the mandate then they are not losing it, because they are the same anthropic

356

Agustin Lebron@AgustinLebron3·22h

Again, they're not nerfing ML research by refusing requests. Instead, it quietly sabotages users by lying to them. @AnthropicAI is steadily losing the Mandate of God.

Jeremy Howard@jeremyphoward

@karpathy This is not a day for celebrating, Andrej. It's a very dark and very sad day, and the damage may be impossible to undo.

245

19.5K

efe@extliqprovider·21h

@bubbleboi @zephyr_z9 btw elon said anthropic is good people a month ago 😂😂

464

bubble boi@bubbleboi·21h

Have canceled my team subscription for Claude Pro. Idc how good that model is, it’s not good enough for me to support people who actively stifle innovation and gate keep knowledge that they didn’t even create.

115

204

4.3K

127.3K

efe@extliqprovider·21h

@basedjensen gpt 5.6 + oss and i will worship oai

658

Hensen Juang@basedjensen·1d

All oai folks now have to do is to release the big boy they have without sandbagging and anthropic will start hemorrhaging market share right before ipo

904

28.6K

efe@extliqprovider·21h

@gbrl_dick dario was honest from the beginning that there shouldnt be any open source ai

165

Gabriel@gbrl_dick·22h

late night post from me on the Mythos and Fable 5 launch for MTS. i am generally inclined to take Anthropic at their word. but the AI research safeguards, in the absence of a Glasswing for AI, raise some questions.

5.5K

efe@extliqprovider·21h

@LeonHowqua @DevelopmentsAI @teortaxesTex so you think AGI is achievable just by improving coding + synth data and nothing else needed?

trotsky@LeonHowqua·1d

@extliqprovider @DevelopmentsAI @teortaxesTex At this current point in time, no fancy new math/science is really needed to improve the LLMs. It's just more efficient training code, architectural experiments, scaffolding, data generation, RL environment building etc, which is achieved with better coding capabilities

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·1d

I really don't think OpenAI is going to let this slide. I've been saying it for a long time, the real inflection was when they reached 5.2. I have no clear insight on what they currently have internally, but if they haven't made a Mythos/Fable yet, it was *a choice*.

Andrew Curran@AndrewCurran_

The internal boost from Mythos-assisted development since February is just too big. Anthropic is pulling away from the pack for the first time, and at the same time they are also speeding up. The race legitimately feels like it is changing for the first time in years.

465

47.4K

efe@extliqprovider·1d

@hu_yifei only if oai has published a 240b model

176

Yifei Hu@hu_yifei·1d

Hear me out. Claude Fable 5 scored 65, gpt-oss-120b scored 33. You run gpt-oss twice and you will have a combined score of 66, better than Claude Fable 5 and cheaper. Thank me later.

4.9K

efe@extliqprovider·1d

@LeonHowqua @DevelopmentsAI @teortaxesTex isnt coding just a tool to implement your research ideas? how can rsi be achieved if this model is only good at coding and mid tier at maths/science/etc? not assuming mythos is bad at math but coding is just one vertical

trotsky@LeonHowqua·1d

@DevelopmentsAI @teortaxesTex Recursive self improvement, that's how take-off happens. For now seems like coding is the way for that to happen

104

efe@extliqprovider·1d

@Britoisinsane @teortaxesTex how do you guess that? based on what?

454

Burito@Britoisinsane·1d

@teortaxesTex Decode is $50/Mtok vs $30/Mtok, and Ant has the higher margin. Almost the same size I guess

efe@extliqprovider·1d

@DeepSailCapital whats the terminal value for both?

Deep Sail Capital@DeepSailCapital·1d

Claude nailed it. $MU up 158%, $CRDO up 113% since 3/25/26.

Deep Sail Capital@DeepSailCapital

I asked Claude to value everything in my universe via DCF modelling using consensus analyst estimates. Results:
∙ Several names appear undervalued on consensus: MU, CRDO, CLS, MELI, CSU, MDA, PATK, THO
∙ A handful look stretched even on optimistic consensus: TSLA, NVDA, CAVA, COST, NET
∙ MU and CRDO stand out as the most compelling on a risk-adjusted basis given AI infrastructure tailwinds and reasonable discount rates

20.8K

efe@extliqprovider·1d

@tszzl rooting for oai to democratize it

roon@tszzl·1d

the omohundro drives point towards sophon stun locking the adversaries: this is some real end game stuff

NomoreID@Hangsiin

When Fable 5 is used for frontier LLM development, it does not notify the user and instead limits the model’s capabilities through methods such as prompt modification, steering vectors, and PEFT. Anthropic estimated that this would affect approximately 0.03% of traffic.

963

115.9K

efe@extliqprovider·1d

@ASM65617010 bigger model for HLE you just need more data and anthropic has a lead over oai in this

3.4K

ASM@ASM65617010·1d

Claude Mythos 5 scores 59% on Humanity’s Last Exam, with no tools. As a contributor of HLE, I would never have expected such a score barely a year and a half after the benchmark’s release.

938

66.8K

efe@extliqprovider·1d

@drisspg its actually to show pareto frontier

415

driss guessous@drisspg·1d

Holy chart crime

Cursor@cursor_ai

Claude Fable 5 is now available in Cursor. It sets a new state of the art on CursorBench at 72.9%, 8 points above the previous best.

1.4K

231.8K

efe@extliqprovider·1d

@scaling01 wtf

251

Lisan al Gaib@scaling01·1d

Claude Mythos 5 scores 30.9% on FrontierCode Diamond. Opus 4.8, the second best model, is stuck at 13.4%

Lisan al Gaib@scaling01

Claude Mythos 5 and Claude Fable 5 Benchmarks

665

51.1K

efe@extliqprovider·1d

@zephyr_z9 its out

169

Zephyr@zephyr_z9·1d

Well they don't have the infra to serve a 200B+ active param model. U will have to wait till Rubin or pray that they distill it well in Opus 5

cheaty@cheatyyyy

Claude Fable 5 specifically has a serverside flag that will allow people to try it out with their plan until a certain date, after which it will be gated behind usage credits. it is over bros, we're not getting to use this model for long with subsidized pricing

141

27.3K

efe@extliqprovider·1d

@qcapital2020 openai having the lowest valuation out of 3 is the most retarded thing

Q-Cap@qcapital2020·1d

2026: The final orgasm

4.1K

efe@extliqprovider·1d

@RoboIntellect @Lentils80 bro 😂😂😂

113

Augmenta Blake@RoboIntellect·1d

@Lentils80 Low effort vs xhigh and Fable still wins. Architecture efficiency problem for OpenAI?

3.3K

Lentils@Lentils80·1d

I compared Claude Fable 5 to GPT-5.5 in this Power Rangers prompt. Thing is, Fable 5 is using Low thinking effort and GPT-5.5 is using xhigh. Safe to say, the results are... not even close. 5.5's output is bad across the board, from the UI to the actual voxel scene itself🥲
1st video: Claude Fable 5 (Low effort)
2nd video: GPT-5.5 (xhigh)

Lentils@Lentils80

🚨Major Scoop: The first Claude 5 model, Claude Fable 5 (Mythos-class model) is gonna release very soon! It's the same underlying model as Mythos but with increased guardrails, headed to public release

560

205.4K

efe@extliqprovider·1d

@staysaasy hedging

118

efe@extliqprovider·1d

@ewveggies yeah you are right

Kyle Wong@ewveggies·1d

@extliqprovider Yeah it’s pretty strange. Perhaps an artifact of a small task set leading to high variance. That jaggedness really is just 1-2 more tasks correct/incorrect. For reference, Diamond is 50 tasks while SWE-bench Verified is 500 tasks.

Kyle Wong@ewveggies·2d

Finally a nice eval to expose all the SWE benchmaxxing. The scores never fully made sense to me: Models that somehow one shot 80+% on SWE-bench Verified, yet struggle to simply fetch and parse logs, even when given proper skills and hints. I swear I can’t be the only one feeling this way.

Cognition@cognition

Introducing FrontierCode: a coding eval that raises the bar for difficulty & quality. Each task took 40+ hrs of work by leading open-source maintainers. Models write sloppy code that works but isn’t maintainable. Our eval is first to measure: would you actually merge this code?

6.4K

efe@extliqprovider·1d

@ewveggies i dont know, there might be a problem with the benchmark. this is also not what you would expect

Kyle Wong@ewveggies·1d

@extliqprovider Yeah doesn’t feel like 4.8 is 2.5x better. But the diamond subset is only 50 tasks, so this means it only solves 3-4 more tasks.
