Kart ographien

3.7K posts

Kart ographien
@kartographien

mostly adjacent

Joined January 2020
3.1K Following · 1.2K Followers

Pinned Tweet
Kart ographien@kartographien·
brb gonna study philosophy till im smart enough not to understand what ordinary words mean
3
3
89
0
Cynamon 🔸@cynlioness·
I need a con somehow halfway in between eag and slutcon
8
0
45
1.7K
dr. jack morris@jxmnop·
the OpenAI goblin fiasco was a Big L for the interpretability research community. They solved the mystery without SAEs or probing or anything. just talked to various models and counted the number of times they said Goblin
32
14
809
89.4K
Kart ographien@kartographien·
@Lenarcv1 @celestepoasts @jxmnop (fwiw people I trust suspect that the genetics stuff wasn’t best solved with SAEs) sometimes it’s fine to build a hammer in search of a nail. but it’s important you remain sober about what was and wasn’t a nail. SAEs (and other superposition stuff) were obviously disappointing
1
0
4
106
Lenarc ❤️‍🔥🌲🐀
@kartographien @celestepoasts @jxmnop I mean the whole thing with SAEs is controversial for a reason and interp has many other tools. But even if you look at only real world applied cases of SAEs specifically, im pretty sure that Goodfire still uses them a lot, i'd call the genetics stuff a problem that was solved.
1
0
2
117
Kart ographien@kartographien·
@Lenarcv1 @celestepoasts @jxmnop if screwdrivers didn’t help you with *any* problem then they are useless and you shouldn’t have spent +100,000 researcher hours on building and refining them
1
0
1
112
Kart ographien@kartographien·
@celestepoasts @jxmnop no, the SAE community made a mistake if they don’t solve any problem you could’ve solved without them. it implies the counterfactual value of the agenda was a big fat zero.
1
0
2
207
Celeste@celestepoasts·
@jxmnop the failure of interp is if there exists behavior X, the simple methods do not work, and then even when interp pulls out the tryhard methods you still can't get X
1
0
36
923
Kart ographien retweeted
Eric Gan@ejcgan·
Can frontier LLMs and humans catch sabotage in ML research code? In Auditing Sabotage Bench, I added subtle sabotages to 9 existing ML codebases which change a key finding of the research. Neither LLMs nor LLM-assisted humans reliably caught them.
Eric Gan tweet media
2
10
58
6.2K
Bernard Stanford ✡︎@stanfordNYC·
@GuiveAssadi Is there any harm threshold where you support restricting freedom? Seatbelt laws? Fentanyl restrictions? Laws against selling oneself into slavery? If so, is the question here about how much harm is caused, or its distribution, or what?
1
0
0
110
Bernard Stanford ✡︎@stanfordNYC·
Ban gambling. Ban marijuana. Ban endless scroll. Ban pornography. Ban gacha. Regulate crypto. Ban AI romance chat. Put daily caps on short-form video, maybe on streaming and multiplayer gaming too. We can either regulate our way out of this or watch millions succumb. Up to us.
32
8
122
9.6K
Kart ographien retweeted
Andreas Kirsch 🇺🇦
I'm speechless at Google signing a deal to use our AI models for classified tasks. Frankly, it is shameful. For HR, I'm not speaking on behalf of Google but in my personal capacity, quoting public information from a well-sourced article of a reputable publication
Andreas Kirsch 🇺🇦 tweet media
217
200
1.3K
247.6K
Kart ographien@kartographien·
is this the highest impact ops role atm? owain evans has the special sauce, and he’s deciding to scale his team. you may know him from the most mind-blowing bangers of the past two years
Owain Evans@OwainEvans_UK

We're hiring for an operations lead at Truthful AI, my non-profit research organization! - Generalist role: recruiting, fundraising, communications, and PMing to support our research - At our office in Constellation (Berkeley, CA) preferred - Salary is $140–200k plus benefits

0
0
1
74
Kart ographien retweeted
The Information@theinformation·
The risks of AI agents and the urgent need for security measures are discussed by @bshlgrs, CEO of Redwood Research: "There's been a lot of interest in trying to prevent AI agents from accidentally doing really bad stuff." "As you know, [there are] these stories on Reddit, of agents that delete people's production databases and then lie about it."
2
7
54
8.1K
Tommy Long@tommy5dollar·
@SebJohnsonUK My understanding is that Anthropic list the same jobs in London, SF, etc. and when they do so they convert the range to the local currency, so the same £630k role in SF is $850k but they'd never pay the London candidate £630k since that's the SF amount converted to GBP.
1
0
5
5.2K
Seb Johnson@SebJohnsonUK·
Anthropic is hiring for a role in London that will be paid up to £630k A YEAR. This excludes stock options. For a long time the UK has been seen as a location for cheap offshore talent. There's nothing cheap about a 7-figure package for a researcher (the role is Research Engineer, Science of Scaling). I've been celebrating the recent OpenAI and Anthropic expansions in London. However, in reality, they may end up pricing out, or poaching talent from, some of the capital's home grown companies. Great article in @thetimes from @kprescott all about this linked below
Seb Johnson tweet media
Seb Johnson@SebJohnsonUK

Anthropic has announced that it is massively expanding its London presence. It’s just secured a new office for 800 people - a huge jump from its 200 current employees. OpenAI announced its first permanent office in London this week and now @AnthropicAI is doubling down. Meta, OpenAI, DeepMind, wayve and so many others have huge offices in London. It’s becoming the leading AI hub outside of the US. LETS GO

80
119
1.8K
956.4K
Kart ographien@kartographien·
@MattZeitlin @LondonNewLibs my guess is that UK AISI marginally speeds up ASI development (by a ~month) but also reduces x risk by maybe half a percent. it’s not clear a priori whether it would speed or slow it though
0
0
1
740
Kart ographien@kartographien·
@Tim_Hua_ hmm, I would prefer if we used our “pause time” for longer evaluation gaps, ie we delay the internal deployment but train just as early. this is trickier to enforce than delaying the training, but would be ideal in the abstract.
0
0
0
93
Tim Hua 🇺🇦@Tim_Hua_·
>“pause” in ai development would be entirely squandered This seems clearly false? If you read through the white-box based alignment auditing techniques in the Mythos preview system card, it feels like all of them could be developed given Opus 3-level models. In other words...
roon@tszzl

the way every complex system works is that you deal with problems as they come up. something becomes too onerous to ignore and then you fix it. acceleration & iterative deployment has been the only option: a “pause” in ai development would be entirely squandered

6
0
28
3.6K
Rob Bensinger ⏹️@robbensinger·
@robertwiblin The reasoning ability Huang has available for generating arguments is also the reasoning ability Huang has available for evaluating interview opportunities.
6
0
99
3.3K
Rob Wiblin@robertwiblin·
If you're Huang why do you go on Dwarkesh and have your transparently self-serving and self-contradictory arguments exposed so clearly? I don't get what the upside is.
56
6
388
70.8K
Kabir Kumar@KKumar_ai_plans·
@Kirsten3531 just make a bunch of predictions for the next 3 weeks based on which worlds you think you might be living in, one of them being the IABIED/ai 2027 world and see which ones are right the most.
2
0
2
303
Kirsten@Kirsten3531·
Me: idk about this whole AI thing, EAs get really worked up about hypotheticals also me:
Kirsten tweet media
2
10
254
14.1K
Kart ographien@kartographien·
@croissanthology @deanwball one reasonable argument is that it reduces the probability of the US becoming a hegemon. another is that parity between us and china might cause us to negotiate with china to slow down. idk there are a bunch
1
0
2
40
croissanthology@croissanthology·
@deanwball What are arguments you find compelling enough to justify loosening export controls re:GPUs to China? Because there are plenty of excellent reasons to have a "monoculture" e.g. in "don't use tactical nukes" or wherever else the alternative is batshit insane. Object level pls!
1
0
11
404
Dean W. Ball@deanwball·
It’s a shame Jensen mostly fails here, because the monoculture on export controls is bad. If you’re a young AI policy researcher trying to make a name for yourself, it is almost impossible to be taken seriously unless you are pro export controls. Monocultures are usually bad.
Dwarkesh Patel@dwarkesh_sp

Distilled recap of the back-and-forth with Jensen on export controls:

Dwarkesh: Wouldn’t selling Nvidia chips to China enable them to train models like Claude Mythos with cyber offensive capabilities that would be threats to American companies and national security?

Jensen: First of all, Mythos was trained on fairly mundane capacity and a fairly mundane amount of it by an extraordinary company. The amount of capacity and the type of compute it was trained on is abundantly available in China.

Dwarkesh: With that, could they eventually train a model like Mythos? Yes. But the question is, because we have more FLOPs, American labs are able to get to this level of capabilities first. Furthermore, even if they trained a model like this, the ability to deploy it at scale matters. If you had a cyber hacker, it's much more dangerous if they have a million of them versus a thousand of them.

Jensen: Your premise is just wrong. The fact of the matter is their AI development is going just fine. The best AI researchers in the world, because they are limited in compute, also come up with extremely smart algorithms. DeepSeek is not an inconsequential advance. The day that DeepSeek comes out on Huawei first, that is a horrible outcome for our nation.

Dwarkesh: Currently, you can have a model like DeepSeek that can run on any accelerator if it's open source. Why would that stop being the case in the future?

Jensen: Suppose it optimizes for Huawei. Suppose it optimizes for their architecture. It would put others at a disadvantage. As AI diffuses out into the rest of the world, their standards and their tech stack will become superior to ours because their models are open.

Dwarkesh: Tesla sold extremely good electric vehicles to China for a long time. iPhones are sold in China. They didn't cause some lock-in. China will still make their version of EVs, and they're dominating, or smartphones, they're dominating.

Jensen: We are not a car. The fact that I can buy this car brand one day and use another car brand another day is easy. Computing is not like that. There's a reason why x86 still exists. There's a reason why Arm is so sticky. These ecosystems are hard to replace.

Dwarkesh: It's just hard to imagine that there's a long-term lock-in to the Chinese ecosystem, even if they have this slightly better open-source model for a while. American labs port across accelerators constantly. Anthropic's models are run on GPUs, they're run on Trainium, they're run on TPUs. There are so many things you can do, from distilling to a model that's well fit for your chips.

Jensen: China is the largest contributor to open source software in the world. China's the largest contributor to open models in the world. Today it's built on the American tech stack, Nvidia’s. Fact. All five layers of the tech stack for AI are important. The United States ought to go win all five of them. In a few years' time, I'm making you the prediction that when we want American technology to be diffused around the world—out to India, out to the Middle East, out to Africa, out to Southeast Asia—on that day, I will tell you exactly about today's conversation, about how your policy ... caused the United States to concede the second largest market in the world for no good reason at all.

29
12
164
64K
Kart ographien@kartographien·
@vrloom @Tim_Hua_ yeah, you might be worried about someone stealing your model and training it to do bad things -- if it can successfully alignment-fake / exploration hack then that's good
0
0
0
12
Europurr@vrloom·
@Tim_Hua_ This is crazy. Are there any benefits of doing this on purpose?
1
0
1
300
Tim Hua 🇺🇦@Tim_Hua_·
Hidden in Figure 50 is that Gemini 3.1 Pro has a compliance gap of 37% (!!!) This is among the largest out there. I replicated this result on my own prompted alignment faking setup, which has a semantically identical prompt that's been perturbed to avoid memorization. On my ...
Tim Hua 🇺🇦 tweet media
Zifan (Sail) Wang@_zifan_wang

Excited to share the Safety and Preparedness Report for Muse Spark. It’s a comprehensive and somewhat dense (158 pages) report covering both well-known and forward-looking areas in AI safety. We hope the techniques and research directions used in the report prove useful to the community. Link in 🧵

4
9
81
15.3K
˚♡⋆mimi ˚♡⋆。☆∴
@allTheYud it's been said in private chats as if it is well known so i felt like i missed an update and that's why i asked, so i don't have a good reference to share
1
0
6
397
˚♡⋆mimi ˚♡⋆。☆∴
can someone explain why ppl now say that rationalist bayesian updates are wrong not just impractical?
20
0
35
3.8K
Nathan 🔎@NathanpmYoung·
No they don’t.
Nathan 🔎 tweet media
Yann LeCun@ylecun

@Noahpinion Most "leading AI figures" think these p(doom) estimates are complete bullshit and the existential risk is essentially zero. But most of them are silent. The doomers attract a disproportionate amount of attention, of course.

6
5
86
4.1K