Quantum Skull

47 posts


@SigmaMaleAnon

Joined December 2021

27 Following · 2 Followers

Quantum Skull@SigmaMaleAnon·2d

@MattZeitlin xAI isn't a frontier lab


Matthew Zeitlin@MattZeitlin·2d

we always hear about frontier labs being compute constrained, why is xai/spacex giving anthropic access to their compute?

Joe Weisenthal@TheStalwart

whoa


Quantum Skull@SigmaMaleAnon·3d

@ShakeelHashim @hamandcheese In retrospect I'm pretty sure that post was at least half a gambit for astralcodexten.com/p/support-your…


Shakeel@ShakeelHashim·4d

i swear to god if @hamandcheese turns out to have been right

Shakeel@ShakeelHashim

nytimes.com/2026/05/04/tec…


Quantum Skull@SigmaMaleAnon·4d

@emollick Epoch's Capabilities Index seems to do a similar job but better.


Ethan Mollick@emollick·5d

The Artificial Analysis index is a normalized score of several benchmarks (and has changed over time). It is fine for roughly comparing models, but it is not useful for trend analysis, and it is unclear what individual point differences in the scores mean.

Chris@chatgpt21

I pulled the current Artificial Analysis style index scores, looked at OpenAI’s release cadence and average raw score gains, then ran a conservative extrapolation by cutting the gain per release in half while keeping cadence the same. Even with that SLOWER path, GPT still trends toward a 90 index score by around 2029, which is why I think late decade AGI is starting to look like a base case rather than optimistic in my opinion. A 90 on the Index would be massive because it means the model is averaging near PhD performance across a diversified frontier basket like CritPt, HLE, SciCode, Terminal Bench Hard, and GDPval AA, rather than leaning on one saturated benchmark. Keep in mind, this graph already cuts the current raw progress rate (+3 per release) in half, so it is not the aggressive case. If gains speed up from better agents, test time compute, synthetic data, post training, or AI helping with AI research, that path to 90 could arrive much faster than this (albeit linear) conservative prediction.


Quantum Skull@SigmaMaleAnon·30 Apr

@RafaRuizdeLira 1- Can't calculate all the possible lines for a given board state 2- Guessing likely responses by the opponent. 3- At the top level, choosing what opening prep to do and guessing what prep your opponent did.


Rafael Ruiz ⏸️🔸@RafaRuizdeLira·29 Apr

what incomplete data all the pieces are there

Chess.com - India@chesscom_in

That's what Magnus learnt. What about you?


Quantum Skull@SigmaMaleAnon·30 Apr

@provisionalidea The main points I took from Rob's video were: 1- Among specific deployments there was 5% success out of all companies, but only 20% used AI, so 95% failure rate is misleading. 2- Separately from specific deployments, employees had widespread adoption. I think both hold?
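The arithmetic behind point 1 can be sketched as follows. It takes the 5% and 20% figures from the post at face value (they are the poster's summary of the video, not verified numbers) and assumes every successful company is also one that deployed AI. The function name is hypothetical.

```python
def success_rate_among_adopters(success_share_of_all: float,
                                adopter_share_of_all: float) -> float:
    """Re-base a success share measured against ALL companies onto adopters only.

    Assumes successes are a subset of adopters, so the conditional rate is
    simply the ratio of the two shares.
    """
    if not 0 < adopter_share_of_all <= 1:
        raise ValueError("adopter share must be in (0, 1]")
    if not 0 <= success_share_of_all <= adopter_share_of_all:
        raise ValueError("success share must be between 0 and the adopter share")
    return success_share_of_all / adopter_share_of_all


# Figures as quoted in the post: 5% of all companies succeeded, 20% deployed AI.
rate = success_rate_among_adopters(0.05, 0.20)
print(f"success among adopters: {rate:.0%}")
print(f"failure among adopters: {1 - rate:.0%}")
```

On those inputs, a quarter of the companies that actually deployed succeeded, so the failure rate among adopters is 75%, not 95%.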


James Rosen-Birch ⚖️🕊️@provisionalidea·30 Apr

@SigmaMaleAnon No, Rob is wrong, maliciously so. The study is a valuable business resource. You can go read it yourself.


James Rosen-Birch ⚖️🕊️@provisionalidea·29 Apr

I can't believe the AI hype people are trying to discredit one of the few good, impartial signals on market adoption, which is necessary for businesses to make informed decisions on how to go about improving adoption rates, just because they don't want the results to be true, and think people believing the data might negatively affect their valuations. this is so fucking sleazy. here's an idea -- why don't people actually listen to why barriers to adoption exist and do something about it, instead of plugging their ears and pretending all you need is hype?

Rob Wiblin@robertwiblin

'MIT Study Shows 95% of AI Projects Lose Money' was the #1 AI meme for the public (and politicians) last year. So I looked into this 'study'. It was… much worse than I would have guessed. And I suspect not by mistake. The authors had a hidden agenda from the start. I explain:


Quantum Skull@SigmaMaleAnon·29 Apr

@Tim_Hua_ @JeffDean Papa Sergey flexing those voting shares


Tim Hua 🇺🇦@Tim_Hua_·28 Apr

Hmmm I’m under the impression that many Google employees did not want this to happen? Including @JeffDean ?

Erin Woo@erinkwoo

SCOOP: Google has signed a deal with the Pentagon to allow the use of its AI for "any lawful government purpose." Google’s agreement also requires it to assist in adjusting its AI safety settings and filters at the government’s request.


Quantum Skull@SigmaMaleAnon·29 Apr

@lymanstoneky I'm pretty sure free lunch is with respect to the scale of the medicine operating on the human body, i.e. the sense that you can modify the behavior of the body outside natural parameters without negative side effects.


Lyman Stone 石來民 🦬🦬🦬@lymanstoneky·28 Apr

also calling a drug which has taken BILLIONS OF DOLLARS OF ADVANCED SCIENTIFIC RESEARCH to develop a "free lunch" is very strange

Crémieux@cremieuxrecueil

Brian has the best take on GLP-1 side effects: There's a huge one, we just wouldn't have liked it historically. It just so happens, however, that in our era, killing hunger is desirable rather than deadly


Quantum Skull@SigmaMaleAnon·28 Apr

@scaling01 Consequence of Dario being too conservative on acquiring compute.


Lisan al Gaib@scaling01·28 Apr

the aura loss anthropic has suffered in april is insane


Quantum Skull@SigmaMaleAnon·24 Apr

@VictorTaelin Gemini 3.1 is divinely endowed with the ability to benchmaxx everything without actually being useful.


Taelin@VictorTaelin·24 Apr

Introducing LamBench . . . You asked me to make a benchmark, so I made it. It is a simple, old style Q&A consisting of 120 fresh λ-calculus programming questions. Some are easy, like "implement add for λ-encoded nats". Some are harder, like "derive a generic fold for arbitrary λ-encodings". It measures:
- intelligence (% tasks completed)
- elegance (BLC-length of solutions)
- speed (completion time)
Basically what I care about, other than long context. I made it today because I was excited about GPT 5.5. It didn't do too well ): (My first-day impression is that I can't tell the difference between GPT 5.5 and GPT 5.4. I would be lying if I said otherwise. I'd not be able to distinguish in a blind test. I need more time. It is much faster though.) This is a new, simple bench, so expect bugs, especially on OpenRouter models. I'll retest soon. Also, it was born saturated. V2 will be harder... ↓ Link and more charts below ↓


Quantum Skull@SigmaMaleAnon·22 Apr

@ar0cket1 @scaling01 He's desperate b/c xAI isn't catching up and is if anything falling farther behind.


ar0cket1@ar0cket1·22 Apr

@scaling01 Cursor shouldn’t be worth anywhere near that. Weird that Elon did this.


Lisan al Gaib@scaling01·22 Apr

"Cursor has also given SpaceX [xAI] the right to acquire Cursor later this year for $60 billion or pay $10 billion for our work together" as expected. without coding data you are cooked.

SpaceX@SpaceX

SpaceXAI and @cursor_ai are now working closely together to create the world’s best coding and knowledge work AI. The combination of Cursor’s leading product and distribution to expert software engineers with SpaceX’s million H100 equivalent Colossus training supercomputer will allow us to build the world’s most useful models. Cursor has also given SpaceX the right to acquire Cursor later this year for $60 billion or pay $10 billion for our work together.


Quantum Skull@SigmaMaleAnon·20 Apr

@mattyglesias Mamdani is being viewed as revolutionary because his supporters aren't looking at the problem as 'how do Democrats win elections' but 'how does the progressive faction gain more power' and in that specific regard he has indeed been unusually successful.


Quantum Skull@SigmaMaleAnon·19 Apr

@zachpruckowski @GarrisonLovely Chatbots have some of the best user retention curves of online products in history.


@zachpruckowski where the Sky is Blue@zachpruckowski·19 Apr

@GarrisonLovely Yeah, but like what percent of those users are financially net-positive? Or consistent users? "11 trillion people have tried ChatGPT one time" is a pointless metric.


Garrison Lovely is in SF@GarrisonLovely·19 Apr

the hotter kenny omega@SamGreszes

People are quoting this saying it's fake bc they marketed the internet and cell phones but those marketing campaigns worked quickly and ended bc the tech was good. They've been trying to shove chatgpt down our throats since like 2019


Quantum Skull@SigmaMaleAnon·18 Apr

@shift_in2_turbo @Noahpinion The Nazi Germany comparison is particularly apt because the allies clearly did a bunch of bad stuff as well (Bengal Famine), yet Nazi Germany was still clearly worse and is morally indefensible to support.


Quantum Skull@SigmaMaleAnon·18 Apr

@shift_in2_turbo @Noahpinion It was a brutal authoritarian regime; same reason that supporting Nazi Germany is extreme. Wanting the USSR to have won the Cold War is analogous to wanting Nazi Germany to win WW2 (though not as extreme perhaps).


Karim is reading A Farewell to Arms@shift_in2_turbo·18 Apr

Supporting a free Palestine is not extreme

Noah Smith 🐇🇺🇸🇺🇦🇹🇼@Noahpinion

I’ve sat here for years and watched the Republicans embrace their worst extremists. I don’t want to see the Democrats do the same. noahpinion.blog/p/hasan-piker-…


Quantum Skull@SigmaMaleAnon·18 Apr

@BenjaminDEKR @Jaraff Dwarkesh asked Elon this question in his interview dwarkesh.com/i/186967347/02…


Benjamin De Kraker@BenjaminDEKR·18 Apr

@Jaraff My entire X feed is following how AI has improved. Everything we have today was not difficult to predict or see happening a year ago. We were solidly on this track one year ago


Benjamin De Kraker@BenjaminDEKR·18 Apr

I am genuinely curious what changed so dramatically that Elon went from "inflation and government spending will collapse America" to "government should write endless checks to all Americans and this will create deflation" in, like, 14 months. The supposed reason is "AI" but that didn't suddenly emerge in the last year. Yet his views on this have taken a full 180 in that same timeframe. To the point that it sounds like a different person entirely. Still have not seen a coherent explanation.


Quantum Skull@SigmaMaleAnon·17 Apr

@VileEdgelord @segyges I'm pretty sure the post is sarcastic.


Noble Savage@VileEdgelord·17 Apr

@segyges The whole point is that your labour is not needed anymore.


SE Gyges@segyges·17 Apr

as good leftists we must reject right wing ideology like the government giving people money unconditionally. this reactionary policy is often championed by fascists like elon musk, charles murray and andrew yang. good proletarians know that if you don't work you don't eat


Quantum Skull@SigmaMaleAnon·16 Apr

@Tim_Hua_ Cynically speaking, having some safety focus probably helps with recruiting / talent retention esp. since the parent company already has something of an antisocial reputation.


Tim Hua 🇺🇦@Tim_Hua_·15 Apr

Gotta hand it to Alexandr Wang I wasn't familiar with his game

Shakeel@ShakeelHashim

This is a huge change in how Meta does AI safety testing — or at least how it talks about it publicly.


Quantum Skull@SigmaMaleAnon·16 Apr

@ShakeelHashim "It is difficult to get a man to understand something, when his salary depends on his not understanding it." -Upton Sinclair


Shakeel@ShakeelHashim·16 Apr

I’m slightly surprised at how incoherent this is

Dwarkesh Patel@dwarkesh_sp

Distilled recap of the back-and-forth with Jensen on export controls:

Dwarkesh: Wouldn’t selling Nvidia chips to China enable them to train models like Claude Mythos with cyber offensive capabilities that would be threats to American companies and national security?

Jensen: First of all, Mythos was trained on fairly mundane capacity and a fairly mundane amount of it by an extraordinary company. The amount of capacity and the type of compute it was trained on is abundantly available in China.

Dwarkesh: With that, could they eventually train a model like Mythos? Yes. But the question is, because we have more FLOPs, American labs are able to get to this level of capabilities first. Furthermore, even if they trained a model like this, the ability to deploy it at scale matters. If you had a cyber hacker, it's much more dangerous if they have a million of them versus a thousand of them.

Jensen: Your premise is just wrong. The fact of the matter is their AI development is going just fine. The best AI researchers in the world, because they are limited in compute, also come up with extremely smart algorithms. DeepSeek is not an inconsequential advance. The day that DeepSeek comes out on Huawei first, that is a horrible outcome for our nation.

Dwarkesh: Currently, you can have a model like DeepSeek that can run on any accelerator if it's open source. Why would that stop being the case in the future?

Jensen: Suppose it optimizes for Huawei. Suppose it optimizes for their architecture. It would put others at a disadvantage. As AI diffuses out into the rest of the world, their standards and their tech stack will become superior to ours because their models are open.

Dwarkesh: Tesla sold extremely good electric vehicles to China for a long time. iPhones are sold in China. They didn't cause some lock-in. China will still make their version of EVs, and they're dominating, or smartphones, they're dominating.

Jensen: We are not a car. The fact that I can buy this car brand one day and use another car brand another day is easy. Computing is not like that. There's a reason why x86 still exists. There's a reason why Arm is so sticky. These ecosystems are hard to replace.

Dwarkesh: It's just hard to imagine that there's a long-term lock-in to the Chinese ecosystem, even if they have this slightly better open-source model for a while. American labs port across accelerators constantly. Anthropic's models are run on GPUs, they're run on Trainium, they're run on TPUs. There are so many things you can do, from distilling to a model that's well fit for your chips.

Jensen: China is the largest contributor to open source software in the world. China's the largest contributor to open models in the world. Today it's built on the American tech stack, Nvidia’s. Fact. All five layers of the tech stack for AI are important. The United States ought to go win all five of them. In a few years' time, I'm making you the prediction that when we want American technology to be diffused around the world—out to India, out to the Middle East, out to Africa, out to Southeast Asia—on that day, I will tell you exactly about today's conversation, about how your policy ... caused the United States to concede the second largest market in the world for no good reason at all.


Quantum Skull@SigmaMaleAnon·14 Apr

@AlejandroZarUrd @Noahpinion Efficiency gains in training are not spontaneous though; pausing AI (i.e. not using the compute and talent on improving the frontier) would also prevent us from developing algorithmic efficiency, or at least slow it down by a lot.


Alejandro Zarzuelo Urdiales@AlejandroZarUrd·14 Apr

@Noahpinion The efficiency of AI improves by 1.5-2 OOM per year. If we can get a superintelligence that costs a trillion dollars in 2030, then by 2036 at the latest someone on a 1k dollar computer would be able to train it. Should we ban all computers? A pause only barely delays the inevitable.
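The timeline in this post can be checked with a few lines of arithmetic. This is a minimal sketch that takes the post's own figures as given (a $1 trillion starting cost in 2030, a $1k target, and 1.5 to 2 orders of magnitude of cost reduction per year) and assumes the improvement rate stays constant, which is exactly the point the reply disputes. The function name is hypothetical.

```python
import math


def years_to_reach(start_cost: float, target_cost: float, oom_per_year: float) -> float:
    """Years for cost to fall from start_cost to target_cost at a constant
    rate of oom_per_year orders of magnitude per year."""
    if start_cost <= 0 or target_cost <= 0 or oom_per_year <= 0:
        raise ValueError("costs and rate must be positive")
    orders_of_magnitude = math.log10(start_cost / target_cost)
    return orders_of_magnitude / oom_per_year


START_YEAR = 2030        # year quoted in the post
START_COST = 1e12        # "a trillion dollars"
TARGET_COST = 1e3        # "a 1k dollar computer"

for rate in (1.5, 2.0):  # the post's 1.5-2 OOM/year range
    years = years_to_reach(START_COST, TARGET_COST, rate)
    print(f"{rate} OOM/year: {years:.1f} years, around {START_YEAR + years:.1f}")
```

The gap is nine orders of magnitude, so the slower rate takes six years, which matches the post's "2036 at the latest". The conclusion holds only while the yearly efficiency gain continues.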


Noah Smith 🐇🇺🇸🇺🇦🇹🇼@Noahpinion·14 Apr

I do think that IF there were a technology that had a significant chance of exterminating the human race, it would probably be smart to pause its development until we were very sure that its chance of exterminating the human race was very low.

🎭@deepfates

For the record I think AI safety is a very real problem and that artificial superintelligence has a non-zero chance of extinguishing human life. I just don't think it's a certainty. And the way to actually create safe human AI relations is to do science, not to start a jihad

