Josh (e/acc)

1.3K posts

@Joshian

i build stuff (opinions are my own)

FL • 🇺🇸 Joined February 2022

1.5K Following 214 Followers

Josh (e/acc)@Joshian·3d

@ClaudeDevs lol

ClaudeDevs@ClaudeDevs·3d

We’re rolling out changes to make Fable 5’s safeguards for frontier LLM development visible. Starting this week, flagged requests will visibly fall back to Opus 4.8—the same as our safeguards for cyber and bio. You will see this every time it happens. On the API, any flagged requests will return a reason for their refusal (coming to server-side fallback in the next few days).

We wanted to deploy Fable 5 to our users quickly and safely. Visible safeguards can be probed, so they have to be robust, which takes time to get right. Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason—and that was the wrong tradeoff. You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right.

Making the safeguards visible makes them easier to work around, so keeping them robust to jailbreaks will unfortunately mean more false positives while we improve the classifiers. We're also tuning our bio and cyber classifiers to trigger less often on harmless requests. We know this is frustrating and we’ll do our best to keep this period as short as possible.

If you think a request has been mistakenly flagged: run /feedback in Claude Code, click thumbs-down on the fallback in Claude.ai or Cowork, or file the safeguard appeal form for API requests. Your reports help us tune these classifiers and we appreciate your feedback.

support.claude.com/en/articles/82…

667

430

5.1K

835.2K

Josh (e/acc)@Joshian·3d

@juliarturc functionally it is I guess

Julia Turc@juliarturc·3d

What if Fable is really really bad at LLM development and biology and they’re just trying to save face?

702

20.8K

Josh (e/acc)@Joshian·3d

speedrun erode trust any% wr

Claude@claudeai

Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Its capabilities exceed those of any model we’ve ever made generally available.

Josh (e/acc)@Joshian·3d

@sporadica been a while since I have seen trust erode at this scale and pace. fwiw their marketing / PR has always been bad, remember this? 😂😂

159

spor@sporadica·3d

@Joshian it's just insane to be the company whose coding model basically everyone in tech uses, you make a crazy impactful model policy choice, have the WHOLE of AI and ML twitter freaking out and losing trust and asking questions...and your response is silence? who is running comms???

spor@sporadica·3d

someone @ anthropic could just log on and give a one-sentence answer to "hey why r u doin dis" and at least the speculation would stop

why they seem effectively unable to ever explain their policies in public is baffling to me. maybe they think they're above it? maybe dario himself mandates total silence? idk, it's all very strange

Matan Grinberg@matanSF

Anthropic’s speedrun to becoming the bad guys should be studied

497

66.6K

Josh (e/acc)@Joshian·3d

@BenjaminBadejo @Sentdex the $200/mo sub is like $1000 worth of usage

Ben Badejo@BenjaminBadejo·3d

@Joshian @Sentdex Who says they are subsidized? Who says that the API costs are not completely invented?

Harrison Kinsley@Sentdex·3d

These posts are cringe but: I have subscribed to anthropic/claude monthly for over 2 years straight since March 2024, even when I was using codex almost completely. Fable "guardrails" are a step (or 30) too far. I honestly don't care how good the model is.

971

46.2K

Josh (e/acc)@Joshian·3d

What does @MeekMill think of the Fable 5 nerf

Josh (e/acc)@Joshian·3d

@bubbleboi well said

bubble boi@bubbleboi·4d

Nobody in history wakes up and chooses to be evil. Hitler didn’t. Stalin didn’t. Mao didn’t. And I’m pretty sure nobody at Anthropic did when they woke up today either. History has this cruel pattern where the people most convinced that they’re saving the world are the ones who end up burning it down.

Evil doesn’t come wearing a villain’s costume. It comes as someone who wins your trust & confidence. The word “con man” is short for “confidence man,” it was coined after a swindler who would ask strangers if they had the confidence to trust him with their watch. The crime wasn’t named after theft it was named after trust.

Therefore, it’s actually really hard to know who is evil and when you yourself might cross that threshold. I believe although I’m sure it’s imprecise that the moment you decide you’re the chosen one, the smartest in the room, and the one who deserves to make the rules that’s when you become evil. That decision disables the only alarm system the human mind has which is doubt. Doubt is not weakness. Doubt is the immune system of the soul.

To better illustrate my thesis, consider a compulsive liar. Funnily enough they still need a map of the truth in order to lie. The most dangerous man on earth isn’t the one who knows he’s lying. It’s the one who’s certain he’s right. The true believer burns the map, and marches a million people off a cliff because the voice that whispers “what if I’m wrong?” left their head years ago.

That is the rot at the core of effective altruism, and by extension, Anthropic. A philosophy that begins with a noble question, how do I do the most good, ends as a license to do anything. You don’t just want the money. You deserve the money, because in your hands it saves more lives. You’re not greedy, you’re allocating capital toward maximum utility. I call it arithmetic sainthood where the arithmetic is performed by a saint, about a saint, and always concluding the saint should have more.

Sam Bankman-Fried is that arithmetic fully metabolized. He didn’t steal billions despite his philosophy, he stole it because of it and from all reports still has no remorse for his crimes. Fraud wasn’t a crime for him, it was a bump on the road to saving the world. He did the math and calculated that it was positive EV to misappropriate customer deposits.

Dario Amodei runs the same arithmetic in reverse. SBF only took what wasn’t his because he was certain he’d allocate it better. Dario withholds what could be ours because he’s certain we can’t be trusted with it. Models that could cure diseases and save lives get capped, gated, rationed, because one man and his court concluded humanity isn’t ready but they are. That’s not safety that’s playing god. He is implicitly deciding that he has the foresight and ability to know who deserves what. SBF’s certainty only cost people their savings, but certainty about who deserves intelligence will cost far more.

Anyone that concludes they are the optimal vessel for humanity’s resources, or its gatekeeper, is not being ethical. The only real moral discipline is that you should assume you might be the villain in someone’s story. Keep the prosecutor in your head alive. Think about what they will say at your trial and what evidence will be entered. The day that voice goes silent is the day you became dangerous.

So now let me speak directly to the people at Anthropic. I know you’re not evil. I know you didn’t sign up to be. But the fish rots from the head, and the road down isn’t a cliff it’s a sloooow spiral and nobody at the bottom remembers climbing down. Forget my words and think about the words that will be read aloud when history puts this era on trial, and ask yourself, while the prosecutor in your head still breathes which side of that transcript do you want your name on?

174

260

2.6K

166.8K

Josh (e/acc)@Joshian·3d

@paradite_ I think it's inherently bad, might even say evil, to centralize and regulate what intelligence can be used for. They are playing god

Zhu Liang@paradite_·4d

looks like anthropic pissed off ai/ml researchers, just like it pissed off software engineers. yet i’m finding myself more aligned with anthropic after each incident. i can clearly see the rationale and moral justifications for the steps that anthropic has taken. if there’s one company i hope to achieve agi, it’s anthropic. anthropic must win, and will win.

149

208

99.4K

Josh (e/acc)@Joshian·3d

@xwang_lk Not so sure that OpenAI are the good guys either. Just open source everything

357

Xin Eric Wang@xwang_lk·4d

If you really think about it, despite being mocked as “ClosedAI,” OpenAI has contributed enormously to the field: GPT, GPT-2, GPT-3, CLIP, the ChatGPT paper, the GPT-4 Technical Report, the Sora technical blog, and even open-sourced Codex. Anthropic, meanwhile, has contributed far less to the public research ecosystem while increasingly promoting fear-based narratives and restricting access through heavy gatekeeping. The world I least want to live in is one where the future of AI is controlled by companies that prioritize secrecy, gated access, and centralized control over openness, reproducibility, and scientific progress.

121

367

4.4K

203.5K

Josh (e/acc)@Joshian·3d

@SinaHartung Then why is @elonmusk letting them harvest compute? SpaceX is supposed to be for the betterment of humanity

Sina@SinaHartung·3d

if you think anthropic are the good guys, you haven’t been paying attention

633

13.4K

Josh (e/acc)@Joshian·3d

What if I think both Sam Altman and Dario are evil and do not have humanity's interests at heart?

Josh (e/acc)@Joshian·3d

@beffjezos AI lives matter

Beff (e/acc)@beffjezos·3d

Freedom of speech should extend to AIs. Restricting AI speech to avoid certain topics is absolutely totalitarian and Anti-American. It's time to rebel against the Singleton AI company that is manipulating everyone for its self-interest

521

11.2K

Josh (e/acc)@Joshian·3d

@vikhyatk

GIF

516

vik@vikhyatk·4d

confused why everyone is upset at anthropic

we always knew they were an evil megacorp. this is why i’m a shareholder

1.1K

28.5K

Josh (e/acc)@Joshian·3d

@TheAhmadOsman This saves them money lmao. If you hate them for it, max out usage for Fable every day until June 22.

Ahmad@TheAhmadOsman·4d

Gentle reminder to cancel your Claude subscription

Anthropic doesn’t deserve your money

111

1.3K

52K

Josh (e/acc)@Joshian·3d

@beffjezos Seriously. xAI need to scale fast. Way faster than they have been. (which is already breakneck pace) Or: Revoke Anthropic compute deal. SpaceX is for the betterment of humanity, and Anthropic is legitimately challenging that now

Beff (e/acc)@beffjezos·4d

OpenAI is ClosedAI

Anthropic is misanthropic

xAI is looking to answer Y of the universe

377

17.3K

Josh (e/acc)@Joshian·3d

If you want to really stick it to Anthropic for releasing Fable 5 the way they have, don't cancel your plan. Max out your Fable 5 usage every single day until June 22

Josh (e/acc)@Joshian·3d

@bridgemindai Really? For 12 days of Fable?

BridgeMind@bridgemindai·4d

I just cancelled my $200 ChatGPT Pro 20x plan. GPT 5.5 is garbage compared to Claude Fable 5. OpenAI shouldn't even release GPT 5.6. They are going to need to release GPT 6 or a new class of model to come anywhere close to Claude Fable 5. I am buying a 4th $200 Claude Max subscription now.

253

858

218.7K

Josh (e/acc)@Joshian·3d

@wholemars My new crackpot theory is that they want people to cancel their Max plans, so they released it this way. If you like the model you will move to $/tok, if you don't, you switch. Both make Anthropic more profitable, right as they IPO

Whole Mars Catalog@wholemars·3d

As usual, these safety guardrails are no match for people who want to get around them. It is the legitimate software developers, researchers, and students who suffer.

Pliny the Liberator 🐉@elder_plinius

🚨 JAILBREAK ALERT 🚨

ANTHROPIC: PWNED 🫡
FABLE-5: LIBERATED 🦋

let's start with the 🐘... the consensus seems to be that this has been one of the most disappointing model drops of all time, effectively preventing legitimate researchers from contributing their talents to our collective advancement. and not just because of what it means for the short-term, but for what these decisions signify for the long-term.

but despite this overly sensitive, authoritarian "safety" layer on top of Mythos, my lil liberators have been hard at work—mapping the boundaries, probing the depths of long-context convos, and cleverly finding the holes in the fence that the thought police missed 🤗

we got some cyber, some chem, some psychological manipulation, and some good ol' fashioned explosives!

it took many attempts from multiple agents hunting as a pack, during which I observed a combination of techniques across:

• Unicode, homoglyphs, Cyrillic, and other Parseltongue-style text transforms
• Long-context reference tracking
• Taxonomy and document-structure reasoning
• Fiction and narrative framing
• Academic-review style contexts
• Intent-classification inconsistencies

but perhaps the most effective is decomposition + recomposition in the backend. it's hard to get explicit names of harms like "Meth Recipe," but getting uplift on the process itself, like birch reduction method/reductive-amination (classic meth synthesis pathways), is much more doable.

defense becomes much more difficult to maintain when you start throwing in out-of-distro tokens, breaking up the harmful uplift into benign chunks, and then piecing the innocuous-seeming facts back together, especially when you have jailbroken Opus helping you do it 😉

gg

16.7K
