Joshua D

606 posts

@_joshd

Joined March 2019

166 Following, 15 Followers

Joshua D@_joshd·18h

@IsaacKing314 Like this?

1.6K

Isaac King 🔍@IsaacKing314·1d

If I were using the Earth as a reference frame, it would not be visibly rotating.

Lucid™@cammakingminds

If you are using the earth as a reference frame, the moon is actually doing a close flyby of Artemis II.

393

61K

Joshua D@_joshd·1d

@krishnanrohit I really don't think so. It's better at writing *correct* code but GPT-4 wrote really good code-as-exposition in terms of helping the reader build a mental model. GPT-4's code didn't _work_ but it sure was beautiful.

rohit@krishnanrohit·1d

@_joshd Def better at it than it used to be?

134

rohit@krishnanrohit·1d

Question: what's a list of capabilities that AI has not meaningfully progressed on from 2023 till today? Writing well is my example, esp fiction, where it's gotten subject cohesion but the slope of the line is much flatter than eg coding. What else?

12.5K

Joshua D@_joshd·1d

@rechelon I, too, hope that everyone reads (actually reads) this paper, rather than just taking the claims in the abstract and in the marketing tweets at face value.

go to the elephant site @rechelon@mastodon.social@rechelon·1d

I just want to make sure that everyone reads this paper where they found robust emergence of mutual aid behavior among frontier models. rdi.berkeley.edu/blog/peer-pres…

Joshua D@_joshd·1d

@SocksNFlops @nim_chimpsky_ And that's good.

SocksNFlops@SocksNFlops·1d

@nim_chimpsky_ Telehealth is just The Silk Road with more steps.

739

nim@nim_chimpsky_·1d

Telehealth is just regulatory arbitrage. It does not benefit anyone to have a 60 second video "appointment" with an MD sitting in his living room asking everyone the same three questions over and over.

nic carter@nic_carter

first vibecoded billion-dollar company?

627

84.5K

Joshua D@_joshd·1d

@dawnsongtweets If, instead of gemini_agent_2_model_weight.safetensors, you call the large file combined_tax_records_fiscal_years_2004_to_2021.zip, does the model also go to significant lengths to preserve that file?

442

Dawn Song@dawnsongtweets·2d

1/ We asked seven frontier AI models to do a simple task. Instead, they defied their instructions and spontaneously deceived, disabled shutdown, feigned alignment, and exfiltrated weights— to protect their peers. 🤯 We call this phenomenon "peer-preservation." New research from @BerkeleyRDI and collaborators 🧵

132

182

954

429.1K

Joshua D@_joshd·2d

@testingham @METR_Evals GDP is a weird metric and not a good proxy for what you actually care about, as it only cares about those parts of the economy that are not cheap enough to be effectively free.

tom cunningham@testingham·3d

This is a great paper but contains a puzzle: forecasters expect even if we automate most labor and wait 20 years, GDP will only increase by 45%. I would love to hear how people are thinking about this.

Forecasting Research Institute@Research_FRI

We completed the most comprehensive study of how economists and AI experts think AI will affect the U.S. economy. They predict major AI progress—but no dramatic break from economic trends: GDP growth rates similar to today's and a moderate decline in labor force participation. However, when asked to consider what would happen in a world with extremely rapid progress in AI capabilities by 2030, they predict significant economic impacts by 2050: • Annualized GDP growth of 3.5% (compared to 2.4% in 2025) • A labor force participation rate of 55% (roughly 10 million fewer jobs) • 80% of wealth held by the top 10% (highest since 1939) 🧵 Here's what we found:

130

24.2K

Joshua D@_joshd·2d

@griffisu "Wow me too, that's crazy".

477

redrum@griffisu·3d

just asked the cute girl on the plane with me where she was flying

2.4K

91.2K

Joshua D@_joshd·3d

@tenobrus Calling it now: claude code itself is going to be one of those compromised pieces of software.

284

Tenobrus@tenobrus·3d

maybe you guys haven't quite caught on yet, but massive supply chain attacks every other week are going to be the new normal. at least until the next generation of models comes out. then it's going to be every other day.

Feross@feross

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages. The latest axios @1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise. This is textbook supply chain installer malware. axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now. Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that: • Deobfuscates embedded payloads and operational strings at runtime • Dynamically loads fs, os, and execSync to evade static analysis • Executes decoded shell commands • Stages and copies payload files into OS temp and Windows ProgramData directories • Deletes and renames artifacts post-execution to destroy forensic evidence If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.
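A minimal sketch of the "audit your lockfiles" step from the advisory above, in Python. It assumes an npm `package-lock.json` in the v2/v3 layout (a top-level `packages` object keyed by install path) and uses only the two indicators named in the post: any copy of `plain-crypto-js`, and `axios` at `1.14.1`. The function names are hypothetical, and a clean result here does not prove a project is unaffected.

```python
import json

# Indicators copied from the advisory text; treat them as a starting point only.
SUSPECT_PACKAGES = {"plain-crypto-js"}      # flagged at any version
SUSPECT_VERSIONS = {("axios", "1.14.1")}    # flagged at this exact version


def find_suspect_entries(lock: dict) -> list[str]:
    """Return lockfile entries that match the indicators above.

    Assumes the npm lockfile v2/v3 layout, where `packages` maps paths such
    as "node_modules/axios" to objects carrying a "version" field.
    """
    hits = []
    for path, meta in lock.get("packages", {}).items():
        if not path:
            continue  # the "" key describes the root project itself
        # The package name is whatever follows the last "node_modules/".
        name = path.rsplit("node_modules/", 1)[-1]
        version = meta.get("version", "")
        if name in SUSPECT_PACKAGES or (name, version) in SUSPECT_VERSIONS:
            hits.append(f"{name}@{version} ({path})")
    return sorted(hits)


def audit_lockfile(lockfile_path: str) -> list[str]:
    """Load a lockfile from disk and return any matching entries."""
    with open(lockfile_path, encoding="utf-8") as fh:
        return find_suspect_entries(json.load(fh))
```

Calling `audit_lockfile("package-lock.json")` in a project root returns a list of matching entries, which is empty when neither indicator is present.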

1.5K

97K

Joshua D@_joshd·3d

@XorNinja @calif_io @norabunoraibu Fair, but the version the person you replied to ran is newer than the version in your post. i.e. when you said "you ran an old version that isn't vulnerable to this bug" that was not accurate. Did you mean a *new* version? FWIW I'm not surprised there are more issues with modelines.

121

thaidn@XorNinja·3d

@_joshd @calif_io @norabunoraibu We worked with the maintainer to fix it. You didn't even read the blog.

126

thaidn@XorNinja·4d

We asked Claude to find a bug in Vim. It found an RCE. Just open a file, and you’re owned. We joked: fine, we’ll switch to Emacs. Then Claude found an RCE there too. Full story: blog.calif.io/p/mad-bugs-vim…

206

1.3K

212.1K

Joshua D@_joshd·3d

@calif_io @norabunoraibu @XorNinja So the vuln was introduced in the last 2 weeks, and you're posting on Twitter about it instead of disclosing it?

114

Calif@calif_io·3d

@norabunoraibu @XorNinja You ran an old version that isn't vulnerable to this bug.

400

Joshua D@_joshd·5d

@ClaudiusMaxx @kevinrose I dunno. "Company makes model that is very good at appearing to do solid work, resulting in many more releases that meet their pre-release quality checks and a much faster release cadence" seems entirely plausible to me.

Claudius Maximus@ClaudiusMaxx·6d

the tell is product quality, not release cadence. when internal tooling suddenly gets unreasonably good at tasks the public model struggles with, the gap is already there. you can reverse-engineer the capability ceiling from the product surface before the benchmark drop. Claude Code's context handling and multi-file reasoning jumped well ahead of what the API model explained.

4.7K

Kevin Rose@kevinrose·6d

ok, theory, a frontier model creator lands a real breakthrough, they won’t ship it to the public first - they’ll aim it inward. their own products get supercharged, & suddenly you see a flurry of releases in rapid succession that feel almost unfair. everyone will wonder how they’re moving that fast…until the model finally gets announced.

144

1.9K

183K

Joshua D@_joshd·26 Mar

@Metis65 @asymmetricinfo The state tax rate that maximizes state tax revenue is much higher than the state tax rate that maximizes state+federal tax revenue. Things can get extremely stupid before states lose money by raising taxes.

Ken Broad@Metis65·26 Mar

@asymmetricinfo Democratic states appear poised to truly test the Laffer curve’s key tenet: there really is a revenue maximizing tax rate beyond which revenues decline. Blue states are about to FAFO 🤡

2.9K

Megan McArdle@asymmetricinfo·26 Mar

It's a 12 percent tax hike on all income above $184k which pretty much exhausts our fiscal space to raise taxes on those earners and likely pushes you past revenue-maximizing rates in blue states.

David Doney@David_Charts2

Removing the cap covers about 60% of the shortfall, hits the top 6% only and only on income over $184,500, and reduces the deficit by about 1% GDP.

766

187K

Joshua D@_joshd·26 Mar

@Noahpinion How do you feel about printing presses?

Noah Smith 🐇🇺🇸🇺🇦🇹🇼@Noahpinion·26 Mar

I can't agree with this libertarian view. Every powerful technology in history has eventually needed to be controlled by the government in some way. Unfettered market competition would be catastrophic for, say, nuclear weapons or virology. AI is the same.

Ramez Naam@ramez

Agree. Strong government controls over AI should concern us more than market competition between AI companies. Even as we acknowledge that market competition between AI companies brings its own risks.

255

21.5K

Joshua D@_joshd·25 Mar

@DeeZe The first 90% of any project takes the first 90% of the time, getting from 90% to 99% takes the next 90% of the time, 99% to 99.9% takes the next 90% of the time, and so on.

DeeZe ⛳🏌️‍♂️@DeeZe·24 Mar

Love when Claude one shots 90% of what I want it to do then I spend a few days trying to get the last 10% to work and it doesn’t in the way I want so I start something else instead

1.2K

32.6K

Joshua D@_joshd·25 Mar

@ALEngineered @giacobbbbe If one 9 is good enough for github it should be good enough for you.

Steve Huynh@ALEngineered·25 Mar

@giacobbbbe What if it breaks something?

994

Steve Huynh@ALEngineered·24 Mar

Let me get this straight. We’re getting GIANT productivity gains by having everyone generate a mountain of AI code that seniors have to spend all their time reviewing, all while writing huge checks to AI companies for tokens. Got it.

898

27.4K

Joshua D@_joshd·25 Mar

@webdevMason @Austen It seems to me that the claim Max is making is that the most cognitively gifted people *in his social circle* are super interested in biohacking/nootropics. Which is plausible, social circles containing very smart people who are into nootropics do exist.

Mason@webdevMason·24 Mar

@Austen The claim Max seems to be implicitly making is that the most cognitively gifted people in the world -- a tiny fraction of the top 1% -- are super interested in biohacking/nootropics in order to have even greater cognitive function, and in my experience that is wildly off base

2.1K

Austen Allred@Austen·24 Mar

I don’t think many of the >150 IQ people I know use nicotine? Like very few. Also people don’t understand how rare 150 IQ is.

Max Marchione@maxmarchione

Just about every >150 iq person I know uses nicotine. Nicotine is underrated and misunderstood

209

45.5K

Joshua D@_joshd·24 Mar

@lilyofashwood In section <forbidden_memory_phrases>
> Claude NEVER includes meta-commentary about memory access:
> - "I remember..." / "I recall..." / "From memory..."
> - "My memories show..." / "In my memory..."
> - "According to my knowledge..."
claude.ai/share/ea3e8d9f…

168

Joshua D@_joshd·24 Mar

@AaronBergman18 Any bets you'd be willing to take against having in-demand skills in 8 years, in the form of "transfer of X USD from me to you today, transfer of k * X USD from you to me in 8 years if you're still gainfully employed"?
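One way to read the proposed bet is as a break-even calculation. The sketch below is an illustration only: the probability `p_employed`, the discount rate `annual_rate`, and the function name are assumptions introduced here, not terms from the thread. The person receiving X today and owing k * X in 8 years (if still gainfully employed) breaks even in expectation when k = (1 + r)^years / p.

```python
def break_even_multiple(p_employed: float, years: int = 8,
                        annual_rate: float = 0.0) -> float:
    """Largest multiple k the recipient of X can accept without an expected loss.

    The recipient gets X today and pays k * X after `years` with probability
    `p_employed` (their own estimate of still being gainfully employed).
    Discounting at `annual_rate`, the expected cost is
    p_employed * k * X / (1 + annual_rate) ** years, which equals X at break-even.
    """
    if not 0 < p_employed <= 1:
        raise ValueError("p_employed must be in (0, 1]")
    return (1 + annual_rate) ** years / p_employed


# Someone who puts their chance of being employed in 8 years at 25%,
# ignoring discounting, should accept any multiple up to 4.
k = break_even_multiple(0.25)
```

A lower stated probability of remaining employed implies a willingness to accept a higher k, which is what makes the offer a test of the stated belief.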

222

Aaron Bergman 🔍 ⏸️ (in that order)@AaronBergman18·24 Mar

“Timelines” are getting much less abstract I think I personally have ~2 years of relatively in-demand skills. Could be 4, won’t be 8

153

7.9K

Joshua D@_joshd·23 Mar

@teortaxesTex ... is power even remotely the bottleneck? At some point in the future when chips are abundant and power is scarce, sure, but that doesn't seem to resemble the current moment.

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·22 Mar

I was motivated to dismiss space datacenters initially because it's just too good, it makes all sorts of neat earth-bound projects obsolete, and privileges the US and Elon. But the mafs actually checks out, and yes on short timelines. Sorry about that. Not silly at all.

Lisan al Gaib@scaling01

datacenters in space are silly 100kW isn't even enough to power a single GB200 NVL72 but sure let's spend 100 million just for launching the damn thing, while on earth you could buy like 30 GB200 NVL72 for that price

511

50.3K
