dod

2.7K posts

@dodgelander

Joined September 2021

39 Following 49 Followers

dod retweeted

James Campbell@jam3scampbell·2d

anthropic roommate came back sloppy drunk at 3am last night and had a full scale crash out through tears and slurred words about how the world will never be the same glad to hear the mythos release was received well internally

4.6K

386.2K

dod@dodgelander·5h

Anthropic marketing got X talking for a week about their model and this guy who knows pr is about to teach them how

Geoffrey Miller@gmiller

And the people who think this is just 'doom marketing' have no understanding of marketing or PR, whatsoever. No industry in history has ever promoted its products by warning that 'use of our products may result in civilizational ruin and species-level extinction' -- as if that's somehow a great selling point.

dod@dodgelander·5h

anthropic secret sauce is to spend so much on sauce so unrealistic sauce spend that their model do something is so far off their capabilities then have x larpers do PR for them

Dwayne@CtrlAltDwayne

The OpenAI secret sauce is adding more sauce to its models to make them even saucier.

dod@dodgelander·5h

@gmiller @romanyam simple question: in anthropic blog can you find mythos as its own entry or not?

Geoffrey Miller@gmiller·5h

@dodgelander @romanyam And how exactly does that relate in any way to 'doomsday marketing'?

dod@dodgelander·5h

@gmiller @romanyam and you do? "When Nike made Kaepernick the face of its "Just Do It" 30th anniversary campaign, critics burned shoes and called for boycotts. Instead, Nike's online sales jumped roughly 31% in the days following, and the company's stock hit an all-time high within weeks."

Geoffrey Miller@gmiller·5h

346

Dr. Roman Yampolskiy@romanyam·6h

Any model considered too dangerous to release should have been considered too dangerous to develop.

148

4.5K

dod@dodgelander·5h

@JaydenMart48952 @CDS_414 @EtherionMaster nothing in these panels says faster

normal@JaydenMart48952·6h

@CDS_414 @EtherionMaster Better body strength means you're just stronger And sukuna TF doesn't "impede him in anyway" so he's either equal or faster than meguna

Space@EtherionMaster·1d

Who's the slowest among the strongest 4 JJK characters?

1.4K

85K

dod@dodgelander·6h

@pmddomingos yeah people be building attack drones with that shit

189

Pedro Domingos@pmddomingos·6h

Linux is too dangerous to release. Better give it to Microsoft and IBM to work out the kinks first.

493

10.7K

dod@dodgelander·6h

@jawwwn_ @piersmorgan @PiersUncensored @peterthiel bro had an idea so simple money on the internet and yet thinks he get to lecture people about politics

Jawwwn@jawwwn_·1d

Peter Thiel says the left is “Low Testosterone”

1.1K

279

7.7K

1.3M

dod@dodgelander·7h

@deanwball "The stark reality is that those who have taken AI capabilities growth seriously have been basically right about most important things " and other hilarious jokes you can tell yourself futurism.com/six-months-ant… electrek.co/2025/02/10/elo… teslamotorsclub.com/tmc/threads/fs…

119

Dean W. Ball@deanwball·8h

“Describing highly capable frontier AI models as highly capable” is not “fear mongering.” “Taking AI seriously” is not “fear-mongering.” “Acknowledging obvious, realized or soon-to-be-realized risks” is not “fear-mongering.” The stark reality is that those who have taken AI capabilities growth seriously have been basically right about most important things in the last three years; those that haven’t have been consistently confused and, what’s worse, frustrated at the world about their own confusion. You don’t have to be a mega-pessimist or a “doomer” to take AI seriously. You don’t have to advocate for stark top-down controls over AI. You don’t have to support regulatory capture. It is possible to take AI seriously and advocate for a governmental response that is both effective *and* measured. To the young researchers out there, still trying to make their intellectual fortunes: Do not let anyone tell you otherwise. Do not let anyone bully you into believing otherwise. Think for yourself.

349

20K

dod@dodgelander·8h

@dhiran_dev @claudeai they won't google is a competitor and gemini integration is kinda pushy

204

Dhiran@dhiran_dev·9h

@claudeai wait til they add this to google docs tho 👀

7.9K

Claude@claudeai·9h

Claude for Word is now in beta. Draft, edit, and revise documents directly from the sidebar. Claude preserves your formatting, and edits appear as tracked changes. Available on Team and Enterprise plans.

776

1.4K

20.3K

5.9M

dod@dodgelander·8h

@claudeai they aren't afraid of showing copilot cause they aren't competition at this point

dod@dodgelander·8h

@FoxNews @ThePrimeagen it is definitely not this guy labeling him evil for 2 hours stream everyday 🙄

158

Fox News@FoxNews·9h

BREAKING: Suspect throws Molotov cocktail at OpenAI CEO Sam Altman's home, according to company spokesperson

489

370

4.5K

1.2M

dod@dodgelander·8h

@ThePrimeagen bro this opus 4.6 model is so good it is agi already

ThePrimeagen@ThePrimeagen·1d

"The Good Guys" Anthropic at it again

Claude@claudeai

In evals, Sonnet with an Opus advisor scored 2.7 percentage points higher on SWE-bench Multilingual than Sonnet alone, while costing 11.9% less per task.

823

91.4K

dod@dodgelander·9h

@deanwball i wonder why

Dean W. Ball@deanwball·10h

If you needed a reminder of why America’s founders were deeply skeptical of democracy and the will of the masses, there is none better than the fact that the American people seem to be enthusiastically banning a wave of industrialization before it has even begun in earnest.

124

1.3K

43.8K

dod@dodgelander·9h

@tszzl why don't you join @AnthropicAI? They're so far ahead. Didn't @OpenAI promise to merge with whoever is ahead by 6 to 18 months?

dod@dodgelander·10h

@rohanpaul_ai because it is a recall machine

Rohan Paul@rohanpaul_ai·1d

AI’s most confusing feature right now is that the strongest systems are genuinely terrifying in narrow domains while still looking clumsy in everyday ones. Programming is the clearest example. Modern reasoning models are trained using reinforcement learning with verifiable rewards, or RLVR. The model generates an answer, a program checks whether it's correct, and the model updates accordingly. No human annotator needed. Just a binary signal: right or wrong. This works spectacularly in code and math because correctness is cheap to verify. Run the test suite. Check the proof. The reward is unambiguous. Here's the part most people miss. That same technique largely fails in the domains where most people actually use AI: writing, advice, search, conversation. There's no unit test for a good email. A February 2026 paper on extending verifiable rewards to broader reasoning showed exactly why technical domains leap ahead. Clear signals turn models into reliable problem solvers. (“Extending RLVR to Open-Ended Tasks via Verifiable Multiple-Choice Reformulation” (arXiv:2511.02463)) Extending this to open-ended tasks is now the field's hardest problem. Company incentives push in the same direction. The biggest commercial value is not in helping someone draft a better email. It is in automating expensive technical work, so that is where teams spend their energy and where progress compounds fastest.

Andrej Karpathy@karpathy

Judging by my tl there is a growing gap in understanding of AI capability. The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code. But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along. So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. 
It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions. TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram's reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because 2 properties: 1) these domains offer explicit reward functions that are verifiable meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.

7.4K
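
A minimal sketch of the binary, verifiable reward that the two posts above describe: a program checks the answer and reports right or wrong, with no human annotator. The function `binary_reward`, the toy "square an integer" task, and its test cases are hypothetical illustrations, not any lab's actual training code. The policy update that would consume this signal in a real RLVR pipeline is left out.

```python
from typing import Callable, Iterable, Tuple


def binary_reward(candidate: Callable[[int], int],
                  cases: Iterable[Tuple[int, int]]) -> float:
    """Return 1.0 only if the candidate passes every test case, else 0.0."""
    for arg, expected in cases:
        try:
            if candidate(arg) != expected:
                return 0.0  # one wrong output fails the whole attempt
        except Exception:
            return 0.0  # a crash counts as a failure too
    return 1.0


# Hypothetical task: "square an integer", checked by three test cases.
CASES = [(0, 0), (3, 9), (-4, 16)]


def correct_answer(x: int) -> int:
    return x * x


def wrong_answer(x: int) -> int:
    return x + x  # passes the first case by coincidence, fails the second


# One reward per sampled answer; this is the only signal the checker emits.
rewards = [binary_reward(correct_answer, CASES),
           binary_reward(wrong_answer, CASES)]
print(rewards)
```

Nothing comparable can be written for "a good email", which is the asymmetry both posts point at: the checker above is cheap and unambiguous only because the task has a testable definition of correct.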

dod retweeted

Dawid Moczadło@kannthu1·1d

I will say it again, we used GPT5.4 and Opus, and we were able to autonomously find zero-days in the Linux Kernel (in the last 3 weeks) Mythos is probably better at the task of finding potential issues in code, but imo the threshold for "scary" was reached in December or even earlier This is a great hype machine for Anthropic, especially that they plan to do IPO eoy I totally agree - this is not a new capability

Zack Korman@ZackKorman

I'm extremely unconvinced that Opus wouldn't have found that 27-year-old OpenBSD bug Mythos found if they spent $20k credits on it.

166

1.9K

392.1K

dod@dodgelander·10h

@MrsButters one is a political scientist the other a professional larper

Mrs. Butters 🥧@MrsButters·1d

551

5.5K

22.2K

230.7K

dod@dodgelander·10h

@negligible_cap Bessent is not technical, this just means Anthropic PR is working, not its model

Negligible Capital@negligible_cap·1d

*BESSENT SUMMONED WALL STREET CEOS TO DISCUSS ANTHROPIC’S MYTHOS This is crazy. Claude Mythos is apparently so good that Bessent and J. Powell summoned the bulge bank CEOs in a meeting to make sure they’re aware of how good Claude Mythos really is. Execs summoned include $C Jane Fraser, $MS Ted Pick, $BAC Brian Moynihan, $WFC Charlie Scharf, and $GS David Solomon. Jamie couldn’t make it. More specifically, the meeting was about Mythos’s offensive and defensive cyber capabilities. The selloff in $IGV likely to continue tomorrow. I bet Sama’s jealous

568

220.8K

dod@dodgelander·11h

@fchollet @ylecun @julien_c Didn't gpt 2 come out before him leaving twitter or before people trained on arc agi to get better score on arc agi Forgot which came first

François Chollet@fchollet·1d

@ylecun @julien_c You seem to spend a lot of time trolling on Twitter for someone who famously left Twitter

1.6K

96.1K

Julien Chaumond@julien_c·3d

“gpt2-large is too powerful to be publicly released” vibes

155

4.3K

320.7K
