dod

2.7K posts

@dodgelander

Joined September 2021
39 Following · 49 Followers
dod retweeted
James Campbell @jam3scampbell
anthropic roommate came back sloppy drunk at 3am last night and had a full scale crash out through tears and slurred words about how the world will never be the same glad to hear the mythos release was received well internally
English
75
94
4.6K
386.2K
dod @dodgelander
@gmiller @romanyam simple question: in anthropic blog can you find mythos as its own entry or not?
English
0
0
0
20
dod @dodgelander
@gmiller @romanyam and you do? "When Nike made Kaepernick the face of its "Just Do It" 30th anniversary campaign, critics burned shoes and called for boycotts. Instead, Nike's online sales jumped roughly 31% in the days following, and the company's stock hit an all-time high within weeks."
English
1
0
1
39
Geoffrey Miller @gmiller
And the people who think this is just 'doom marketing' have no understanding of marketing or PR, whatsoever. No industry in history has ever promoted its products by warning that 'use of our products may result in civilizational ruin and species-level extinction' -- as if that's somehow a great selling point.
English
2
0
10
346
Dr. Roman Yampolskiy
Any model considered too dangerous to release should have been considered too dangerous to develop.
English
18
24
148
4.5K
normal @JaydenMart48952
@CDS_414 @EtherionMaster Better body strength means you're just stronger And sukuna TF doesn't "impede him in anyway" so he's either equal or faster than meguna
normal tweet media (two images)
English
1
0
0
31
Space @EtherionMaster
Who's the slowest among the strongest 4 JJK characters?
Space tweet media
English
86
14
1.4K
85K
dod @dodgelander
@pmddomingos yeah people be building attack drones with that shit
English
0
0
0
189
Pedro Domingos @pmddomingos
Linux is too dangerous to release. Better give it to Microsoft and IBM to work out the kinks first.
English
14
28
493
10.7K
Jawwwn @jawwwn_
Peter Thiel says the left is “Low Testosterone”
English
1.1K
279
7.7K
1.3M
Dean W. Ball @deanwball
“Describing highly capable frontier AI models as highly capable” is not “fear mongering.” “Taking AI seriously” is not “fear-mongering.” “Acknowledging obvious, realized or soon-to-be-realized risks” is not “fear-mongering.” The stark reality is that those who have taken AI capabilities growth seriously have been basically right about most important things in the last three years; those that haven’t have been consistently confused and, what’s worse, frustrated at the world about their own confusion. You don’t have to be a mega-pessimist or a “doomer” to take AI seriously. You don’t have to advocate for stark top-down controls over AI. You don’t have to support regulatory capture. It is possible to take AI seriously and advocate for a governmental response that is both effective *and* measured. To the young researchers out there, still trying to make their intellectual fortunes: Do not let anyone tell you otherwise. Do not let anyone bully you into believing otherwise. Think for yourself.
English
17
37
349
20K
dod @dodgelander
@dhiran_dev @claudeai they won't google is a competitor and gemini integration is kinda pushy
English
0
0
0
204
Dhiran @dhiran_dev
@claudeai wait til they add this to google docs tho 👀
Dhiran tweet media
English
6
0
21
7.9K
Claude @claudeai
Claude for Word is now in beta. Draft, edit, and revise documents directly from the sidebar. Claude preserves your formatting, and edits appear as tracked changes. Available on Team and Enterprise plans.
English
776
1.4K
20.3K
5.9M
dod @dodgelander
@claudeai they aren't afraid of showing copilot cause they aren't competition at this point
dod tweet media
English
0
0
0
23
dod @dodgelander
@FoxNews @ThePrimeagen it is definitely not this guy labeling him evil for 2 hours stream everyday 🙄
English
0
0
1
158
Fox News @FoxNews
BREAKING: Suspect throws Molotov cocktail at OpenAI CEO Sam Altman's home, according to company spokesperson
Fox News tweet media
English
489
370
4.5K
1.2M
dod @dodgelander
@ThePrimeagen bro this opus 4.6 model is so good it is agi already
dod tweet media
English
0
0
0
25
Dean W. Ball @deanwball
If you needed a reminder of why America’s founders were deeply skeptical of democracy and the will of the masses, there is none better than the fact that the American people seem to be enthusiastically banning a wave of industrialization before it has even begun in earnest.
Dean W. Ball tweet media
English
94
124
1.3K
43.8K
dod @dodgelander
@tszzl why don't you join @AnthropicAI? They're so far ahead. Didn't @OpenAI promise to merge with whoever is ahead by 6 to 18 months?
English
0
0
0
2
Rohan Paul @rohanpaul_ai
AI’s most confusing feature right now is that the strongest systems are genuinely terrifying in narrow domains while still looking clumsy in everyday ones. Programming is the clearest example. Modern reasoning models are trained using reinforcement learning with verifiable rewards, or RLVR. The model generates an answer, a program checks whether it's correct, and the model updates accordingly. No human annotator needed. Just a binary signal: right or wrong. This works spectacularly in code and math because correctness is cheap to verify. Run the test suite. Check the proof. The reward is unambiguous. Here's the part most people miss. That same technique largely fails in the domains where most people actually use AI: writing, advice, search, conversation. There's no unit test for a good email. A February 2026 paper on extending verifiable rewards to broader reasoning showed exactly why technical domains leap ahead. Clear signals turn models into reliable problem solvers. (“Extending RLVR to Open-Ended Tasks via Verifiable Multiple-Choice Reformulation” (arXiv:2511.02463)) Extending this to open-ended tasks is now the field's hardest problem. Company incentives push in the same direction. The biggest commercial value is not in helping someone draft a better email. It is in automating expensive technical work, so that is where teams spend their energy and where progress compounds fastest.
Rohan Paul tweet media
Andrej Karpathy @karpathy

Judging by my tl there is a growing gap in understanding of AI capability. The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code. But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along. So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. 
It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions. TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram's reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because 2 properties: 1) these domains offer explicit reward functions that are verifiable meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.

English
9
7
45
7.4K
dod retweeted
Dawid Moczadło @kannthu1
I will say it again, we used GPT5.4 and Opus, and we were able to autonomously find zero-days in the Linux Kernel (in the last 3 weeks) Mythos is probably better at the task of finding potential issues in code, but imo the threshold for "scary" was reached in December or even earlier This is a great hype machine for Anthropic, especially that they plan to do IPO eoy I totally agree - this is not a new capability
Dawid Moczadło tweet media
Zack Korman @ZackKorman

I'm extremely unconvinced that Opus wouldn't have found that 27-year-old OpenBSD bug Mythos found if they spent $20k credits on it.

English
56
166
1.9K
392.1K
dod @dodgelander
@MrsButters one is a political scientist the other a professional larper
English
0
0
0
4
dod @dodgelander
@negligible_cap Bessent is not technical; this just means Anthropic's PR is working, not its model
English
0
0
1
37
Negligible Capital @negligible_cap
*BESSENT SUMMONED WALL STREET CEOS TO DISCUSS ANTHROPIC’S MYTHOS This is crazy. Claude Mythos is apparently so good that Bessent and J. Powell summoned the bulge bank CEOs in a meeting to make sure they’re aware of how good Claude Mythos really is. Execs summoned include $C Jane Fraser, $MS Ted Pick, $BAC Brian Moynihan, $WFC Charlie Scharf, and $GS David Solomon. Jamie couldn’t make it More specifically, the meeting was about Mythos’s offensive and defensive cyber capabilities. The selloff in $IGV likely to continue tomorrow. I bet Sama’s jealous
Negligible Capital tweet media
English
34
78
568
220.8K
dod @dodgelander
@fchollet @ylecun @julien_c Didn't gpt 2 come out before him leaving twitter or before people trained on arc agi to get better score on arc agi Forgot which came first
English
0
0
0
55
Julien Chaumond @julien_c
“gpt2-large is too powerful to be publicly released” vibes
English
70
155
4.3K
320.7K