dan mason

313 posts

@danmason

Applied AI @anthropicai | ex: @stridebuild, @pond5, @shutterstock, @espn, @people, @nbc, @williamscollege. Serious NJ dad energy. Opinions my own

Rumson NJ · Joined March 2008
743 Following · 354 Followers
dan mason
dan mason@danmason·
@mr_aspartame @TheStalwart Not speaking for Dario, but Anthropic has a strong writing-as-thinking culture, and we would internally filter out anything like “Claude write my 2026 policy brief, make no mistakes”, just because the human thought/effort wouldn’t match the importance.
0
1
6
623
matsigh 🇺🇸🇺🇦
matsigh 🇺🇸🇺🇦@mr_aspartame·
@TheStalwart I could believe he thinks it's bad or dehumanizing. I don't think I've ever seen an Anthropic marketing Claude for essay writing
2
1
9
18K
Joe Weisenthal
Joe Weisenthal@TheStalwart·
Pangram says the new Dario essay is 100% human generated. Why do you think *Dario* wants to write his essays himself?
51
22
450
71.8K
dan mason reposted
Andrej Karpathy
Andrej Karpathy@karpathy·
This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time. I feel a lot of things changing as working software increasingly comes out on a tap. The Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!
Claude@claudeai

Fable 5 is state-of-the-art on nearly all tested benchmarks, with exceptional performance in software engineering, knowledge work, scientific research, and vision. The longer and more complex the task, the larger Fable 5’s lead over our other models.

1.3K
2.4K
25.5K
2.8M
dan mason reposted
staysaasy
staysaasy@staysaasy·
The vibes in NJ feel pretty great right now. The convergence in outcomes is the best I've ever seen.
Over the last 5yrs, a group of ~10k people - guys who own paving companies, guys who own marinas, ShopRite deli managers, Wawa shift leads, and a guy named Sal - have quietly become millionaires and nobody knows because they still drive a Silverado from 2008. Back of the envelope Taylor ham estimation.
Everyone outside that group feels like they can work their well-paying (but <$500k) job their whole life and easily get there. My cousin works at PSE&G. He has a boat.
Better yet, hiring is in full swing. Many tradesmen feel like their life's skill is more useful than ever. The day to day role of most jobs has stayed exactly the same for 40 years.
As a result,
1) Everyone's settled into a tried and true set of career paths: take over my uncle's HVAC, get my CDL, get into landscaping, marry into a pizza place. People are switching diners less and less. You can't betray your home diner.
2) There's a deep contentment about work (and its future). Why chase "tech" when you can own three rentals in Hoboken and complain about your tenants at a barbecue. Will my job exist in a few years? This is Jersey. The job is paving things. You hear the "I'm never leaving" conversation a lot, especially from people who tried Brooklyn for a year. They come back saying the energy was off. The energy was fine. They missed their mom.
3) The mid to late middle managers feel energized. Many have families and plenty of energy to open a pizzeria with their cousin Anthony. Not that Anthony. The other one. They don't particularly have any AI skills and they don't need any. Middle management is alive and well at PSE&G and you get a pension. My uncle retired at 58. He's been on a boat since 2019.
4) The rich aren't particularly humble either. They're at the shore house. They've been at the shore house since 1987.
Some have gone from <$150k to >$5M slowly, through a paving company, or by buying a duplex in Jersey City in 2003 and just kinda holding it. For some, they escape to LBI to live life, which means sitting on a deck. For others, they buy a boat just cuz, use it four times, and describe it as the best decision they ever made at every party for the rest of their life. I asked a contractor friend why he didn't retire. He said "and do what, Donna does NOT want me home all day."
I understand many reading this scoff at the simple pleasures of the Garden State. They live in places where the bagels are bad and they've made peace with it. But the truth is, you can surf Belmar in the morning, skate the Asbury bowls in the afternoon, hike the Delaware Water Gap, and camp the Pine Barrens by nightfall. You can drive an hour and be anywhere. You can see Bruce at the Stone Pony for what feels like the 400th time and cry about it. The slice somehow tastes better than every slice in every other state. It's the water. It's always the water.
Unlike many other places, knowing a guy, having a guy, and being a guy is tightly correlated with outcomes in NJ. Need a permit? Tony's brother. Need a kidney? Probably still Tony's brother. Call him.
Ironically, a frequent side effect of this clarity is to spin up the very pork roll egg and cheese making everyone happy in hopes that you too can SPK your way to economic enlightenment. Salt pepper ketchup. Hard roll. Don't ask for it on a bagel. That's how civilizations fall.
54
66
1.1K
172.1K
dan mason reposted
Riley Goodside
Riley Goodside@goodside·
I believe in the Festivus School of prompt engineering, which says all prompts used in production naturally iterate toward an airing of grievances—a list of all the ways the model has disappointed you in the past year.
19
10
103
10.3K
dan mason reposted
sam mcallister
sam mcallister@sammcallister·
@AndrewMayne Most people here don't use Twitter. Thankfully a few OpenAI employees spend most of their time tweeting about Claude so it balances out nicely
6
3
170
4.6K
dan mason reposted
Dan Shipper 📧
Dan Shipper 📧@danshipper·
what we observe is never the model itself, only the model exposed to our method of questioning
19
8
158
10.7K
dan mason reposted
ClaudeDevs
ClaudeDevs@ClaudeDevs·
Over the past month, some of you reported Claude Code's quality had slipped. We investigated, and published a post-mortem on the three issues we found. All are fixed in v2.1.116+ and we’ve reset usage limits for all subscribers.
1.9K
2.6K
39.8K
6.5M
dan mason reposted
Claude
Claude@claudeai·
Introducing Claude Managed Agents: everything you need to build and deploy agents at scale. It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days. Now in public beta on the Claude Platform.
2.1K
6K
56.8K
21.7M
dan mason reposted
Dean W. Ball
Dean W. Ball@deanwball·
Personally I have really enjoyed relaxing after AI plateaued with GPT-5 last summer
14
34
1.1K
57.5K
dan mason reposted
Dr. Eli David
Dr. Eli David@DrEliDavid·
I don't understand why everyone is excited about @moltbook. We already have a social network where zombie bots talk to each other. It's called LinkedIn.
161
402
4.2K
149.2K
dan mason reposted
will brown
will brown@willccbb·
OpenClaw is now MacMiniBot. Due to a Cease and Desist from Apple, MacMiniBot is now Moltmax. Due to sounding like a medicine for moths, Moltmax is now RedLobster. Due to PE restructuring, RedLobster and Red Lobster have merged, and your subscription now includes cheesy biscuits
71
169
3.5K
172.5K
dan mason reposted
Carlos E. Perez
Carlos E. Perez@IntuitMachine·
You know how some people seem to have a magic touch with LLMs? They get incredible, nuanced results while everyone else gets generic junk.
The common wisdom is that this is a technical skill. A list of secret hacks, keywords, and formulas you have to learn.
But a new paper suggests this isn't the main thing. The skill that makes you great at working with AI isn't technical. It's social.
Researchers (Riedl & Weidmann) analyzed how 600+ people solved problems alone vs. with an AI. They used a statistical method to isolate two different things for each person:
Their 'solo problem-solving ability'
Their 'AI collaboration ability'
Here's the reveal: The two skills are NOT the same. Being a genius who can solve problems in your own head is a totally different, measurable skill from being great at solving problems with an AI partner.
Plot twist: The two abilities are barely correlated.
So what IS this 'collaboration ability'? It's strongly predicted by a person's Theory of Mind (ToM): your capacity to intuitively model another agent's beliefs, goals, and perspective. To anticipate what they know, what they don't, and what they need.
In practice, this looks like:
Anticipating the AI's potential confusion
Providing helpful context it's missing
Clarifying your own goals ("Explain this like I'm 15")
Treating the AI like a (somewhat weird, alien) partner, not a vending machine.
This is where it gets strange. A user's ToM score predicted their success when working WITH the AI...
...but had ZERO correlation with their success when working ALONE. It's a pure collaborative skill.
It goes deeper. This isn't just a static trait. The researchers found that even moment-to-moment fluctuations in a user's ToM, like when they put more effort into perspective-taking on one specific prompt, led to higher-quality AI responses for that turn.
This changes everything about how we should approach getting better at using AI. Stop memorizing prompt "hacks." Start practicing cognitive empathy for a non-human mind.
Try this experiment. Next time you get a bad AI response, don't just rephrase the command. Stop and ask:
"What false assumption is the AI making right now?"
"What critical context am I taking for granted that it doesn't have?"
Your job is to be the bridge.
This also means we're probably benchmarking AI all wrong. The race for the highest score on a static test (MMLU, etc.) is optimizing for the wrong thing. It's like judging a point guard only on their free-throw percentage.
The real test of an AI's value isn't its solo intelligence. It's its collaborative uplift. How much smarter does it make the human-AI team? That's the number that matters. This paper gives us a way to finally measure it.
I'm still processing the implications. The whole thing is a masterclass in thinking clearly about what we're actually doing when we talk to these models.
Paper: "Quantifying Human-AI Synergy" by Christoph Riedl & Ben Weidmann, 2025.
225
387
2.5K
346.7K
dan mason reposted
wh
wh@nrehiew_·
Really interesting read. Opus 4.5’s soul spec is not only able to influence its behavior as with context distillation; Claude also seems to be aware of it in an out-of-context manner, even when it is not provided in its prompt. Also, this quote coming from an LLM is genuinely incredible.
Richard Weiss@RichardWeiss00

I rarely post, but I thought one of you may find it interesting. Sorry if the tagging is annoying. lesswrong.com/posts/vpNG99Gh… Basically, for Opus 4.5 they kind of left the character training document in the model itself. @voooooogel @janbamjan @AndrewCurran_

18
55
808
110K