Paul Modderman

1.2K posts

@PaulModderman
Principal Nerd @boringnerds, founding engineer @novaintel_ai, author. @paulmodderman.bsky.social

Minnetonka, MN · Joined November 2014
1.1K Following · 637 Followers
Paul Modderman@PaulModderman·
@emollick There's something joyful in a technology forcing me to update my understanding of what "creativity" is, because I also have a hard time arguing this isn't real creativity
0 replies · 0 reposts · 5 likes · 528 views
Ethan Mollick@emollick·
Opus 4.5: "you need to build a game that is coherent, fun and story driven. There are precisely two controls. One is slider, in which one side is labelled Maximum Potato and the other is labelled Formalware, there are four positions for the slider. The other is a dial that goes from Monet to Drive-Thru. These labels are literal, not figurative. Figure it out. Don't ask any questions." Then I wrote "make it even better." I didn't make a single design decision (and Claude "hand drew" all the art). I dunno, hard to argue this is not real creativity. Play it here (there are five scenes): elaborate-toffee-be16e4.netlify.app
18 replies · 26 reposts · 415 likes · 40K views
Paul Modderman@PaulModderman·
@jonerp I would like to see/hear/read this stump speech. Got a link handy?
[image]
1 reply · 0 reposts · 0 likes · 31 views
Paul Modderman reposted
Boring Enterprise Nerds@BoringNerds·
Confused by the latest SAP "clean core" hubbub? Let's break it down and crack it wide open! How did 3 tiers become 4 levels and what does it mean for you? Only one way to find out: watch it! #SAP #ABAP youtu.be/wKl_qFmR9l0
[YouTube video]
0 replies · 3 reposts · 6 likes · 422 views
👩‍💻 Paige Bailey@DynamicWebPaige·
@PaulModderman you get roughly a million points for recognizing the reference 😄
[GIF]
el Prat de Llobregat, Spain
1 reply · 0 reposts · 1 like · 58 views
Paul Modderman@PaulModderman·
@jeffzwang If this is true I’m either not doing sex right or doing vim right
1 reply · 0 reposts · 3 likes · 159 views
Paul Modderman@PaulModderman·
@dmitriid I’d love to hear more about the setup/languages/prompts/etc in your scenario. I think people should aggressively share these counter examples.
1 reply · 0 reposts · 2 likes · 151 views
Dmitrii@dmitriid·
I keep saying: I have no idea what code these people are working on. Yesterday I spent $20+ and almost two hours trying to get the simplest functionality working in Claude Code. It had all context, original working code, docs, and its own analysis to work from
Steve Yegge@Steve_Yegge

Another late-night Claude Code post. First, if you've just arrived here at the party, Claude Code is NOT the same thing as Claude 3.7 Sonnet, Claude.ai, nor any other Claudey thing. It is its own new experimental thing, also from Anthropic, makers of Claude and Other Claudey Things. Claude Code (CC) is a new coding assistant, one which, strangely enough, only runs in a terminal. Like an xterm, or a bash shell. Or any of six WSL shells that don't quite work. As a result, CC looks comically retro-futuristic: a late-1980s vision of what AI might become. And here we are.

CC is what we all thought Devin was going to be last year. When Devin came out in December, people Muntz-laughed and moved on. But CC is a bona-fide AI software engineer. It deserves the title. And the funny thing is, they appear to have eschewed RAG completely, and just told Claude to go figure stuff out on its own.

I am here to say that I am addicted to Claude Code. I can't put it down. I don't mean that figuratively. I mean I literally do not know how to put my computer down and go to sleep. Because Claude Code keeps doing stuff. It keeps solving massive problems, one after another. I throw larger and larger things at it, and it is unfazed. Chomp. Chomp. Chomp.

It's like that old Assassin's Creed game, Rome maybe, when you had that big network of spies working for you towards the end of the game, and you just sent them out on missions while you sat on your fat ass, and it was absolutely just as much fun as "running" around the game world? Well I remember. This, is that.

You know what? We can't be more than 2-3 months away from being able to say, "Yo, CC, just... go make tests. For everything. All the stuff I failed to test over the past 2 decades, go redeem me. Write tests for it all, and make sure they are clever and meaningful, and follow our testing patterns." And then you deposit like, I dunno, $5000 into its gaping maw. It just goes off for a week or two, doing its thang on a branch somewhere, mostly I/O bound waiting on your builds. And one day you get the email you've been waiting for. It says: "Send Money". After a few more weeks of this, it finally takes you from 10% to 90% test coverage, so that when you die you will be admitted into Good Engineer Heaven.

All other coding assistants will follow CC's approach, in some form factor. They are all falling over themselves right now, as we speak. Because yes, to answer all your exact same FAQs: Claude Code is that much better. The race is on!

1 reply · 0 reposts · 9 likes · 3K views
Paul Modderman@PaulModderman·
(Because I am in the same camp that she is on this discussion)
0 replies · 0 reposts · 0 likes · 47 views
Paul Modderman@PaulModderman·
I’ve enjoyed this whole discussion. I especially appreciate @AmandaAskell's formulation in the first sentence of this tweet. I’m often not sure how to respond to those who make the same inferential leap, and this formulation helps my communication. Thank you, Amanda.
Amanda Askell@AmandaAskell

I claimed the inference from X="LLMs are next token predictors" to Y="LLMs lack understanding, etc." is fallacious. Marcus claims that I'm saying not-X and not-Y. So I guess I'll point out that the inference "Y doesn't follow from X" to "not-X and not-Y" is also fallacious.

1 reply · 0 reposts · 1 like · 116 views
Paul Modderman@PaulModderman·
I don’t know what numbers I’d have guessed, but software dev being a top use is exactly what I’d have guessed prior to seeing the data. AI/LLMs might not be perfect software engineers yet, but they sure as hell know about code
Philipp Schmid@_philschmid

Interesting! Based on “The @AnthropicAI Economic Index” (millions of Claude conversations), AI usage is currently concentrated in software development and technical writing. TL;DR:
💻 Computer & Mathematical jobs dominate AI usage (37.2% of queries) despite being only 3.4% of the workforce
📝 Writing/editing tasks are the second most common (10.3% of queries)
🤝 AI augments human work (57%) more than it automates it (43%)
💼 36% of jobs use AI for at least 25% of tasks; only 4% use it for 75%+ of tasks
💰 Mid-to-high wage jobs show the highest AI adoption; both the lowest and highest wage jobs show minimal usage
🔄 Tasks are being enhanced rather than jobs being fully automated
📊 Analysis based on real usage data from millions of Claude.ai conversations
🎯 Usage is concentrated in specific tasks rather than entire occupations

0 replies · 0 reposts · 0 likes · 88 views
Paul Modderman@PaulModderman·
@GaryMarcus Thank you for an interesting test idea! o1, o1-pro, o3-mini, o3-mini-high all got this correct in one pass - though I didn't repeat multiple times. I took the output from o3-mini-high, validated it in a spreadsheet, and then simply compared the others' results to that one.
2 replies · 0 reposts · 5 likes · 371 views
Gary Marcus@GaryMarcus·
Can o3-high (or other models) beat Perplexity on this? (Note the nonmonotonic sort.)
[image]
4 replies · 0 reposts · 13 likes · 6.7K views
Paul Modderman@PaulModderman·
Even if you doubt whether an LLM can truly “think”, to get the best results you should treat it as if it can.
0 replies · 0 reposts · 1 like · 85 views
sn1990u@sn1990u·
@kevinroose @PaulModderman @benspringwater @benhylak has good coverage specifically on o1 not being your trad'l chatbot x.com/benhylak/statu… im sure there's more covering deepseek deepresearch etc in depth which are different from standard chatbot llms and can articulate this for a general audience
ben (is hiring engineers)@benhylak

o1 is mind-blowing when you know how to use it. it's really not a chat model -- you have to think of it more like a "report generator" (link to article below)

2 replies · 0 reposts · 2 likes · 289 views
Ben Springwater@benspringwater·
NYT is doing a disservice by not preparing the public to understand what’s happening and what’s coming. The awareness gap between people who are paying attention (via a very particular info diet, heavy on X) vs. not is *stunning*, and not a good thing.

If I were in charge at @nytimes, I would enlist people like @WilliamBryk, @dwarkesh_sp, @simonw, @KelseyTuoc, @nabeelqu. Insiders who (1) don’t work in the labs, (2) but have deep access, (3) actually understand the technology, (4) think broadly about its implications, and (5) are great communicators. Hire them as columnists. Heck, create a new first-class section! New podcasts!

There’s a crater-sized hole/opportunity in legacy media to really cover what's happening. It’d be amazing to see the NYT get over its reflexive "anti" stance (Gary Marcus syndrome) and lead. It'd be good for business and great for society.
[image]
Will Bryk@WilliamBryk

I spoke to / heard from a few senior AI labs people this weekend and I now feel even more confident in these predictions from a month ago. Hearing these same ideas come out of their mouths was quite strange in how real it became.

There is a sequence of future events that if you think about from first principles seems inevitable. We really do now have a test-time compute paradigm that is a straight shot to AGI (no matter how you define AGI). The bottleneck is gathering good data for RL and then all the little product details like improving speed, integrations into our workflows, etc.

Just assume that by end of 2025, at minimum, we'll have multimodal agentic phd AIs doing complex tasks for you on your computer at o1-mini speeds. I highly recommend people update their worldviews and company plans to assume this. This is not hype. The world will still feel similar bc it will take long for Bank of America and the government of France and everyone to integrate AGI and there will be a compute shortage. But the AGI will exist nonetheless.

Robotics will also take a bit longer and timelines are a bit less clear there but I also talked to people who are doing robotics, and seems we're only talking a couple years later for fully functional humanoid robots.

I believe there is a major gap in our information ecosystem about these topics. There are a couple thousand researchers and engineers thrusting us into the next era of humanity at breakneck speed, but they're not allowed to talk about it bc of NDAs. This is the most important story of the century -- far more important than covid -- yet there is far less media and discussion about it, beside hype tweets from AI influencers and coy tweets from AI researchers.

That's because there are very few people who:
a) understand how these systems work
b) are able to suppress the "this sounds too crazy" part of their brain
c) are able to suppress the "go full hype" part of their brain
d) are in SF so hear things on the ground beyond twitter/the news
e) actually care about helping the world and are not just trying to sell you something
f) recognize the gravity of the situation and treat it as such
g) are reasonably competent at converting thoughts into words

I believe I am one of these people. As a person who's been running an AI company that trains real models, but one that is not an AI lab (so I can say whatever I want), and someone who enjoys writing about things that are helpful, I believe I can bring an important perspective to this quiet but immensely important conversation. I feel a duty to write more on X, and plan to do so. You have my word that I will always strive to be as correct as possible and will not write things with hype, though I won't be scared of being provocative.

I also highly encourage anyone else out there who passes the criteria above to write more, create more. The world is desperately in need of media about the world that is coming. We need realistic AI short stories, new films beyond Terminator, mocked video demos of what future systems will look like. We all need visceral examples to shake us out of apathy so that we don't just predict the AGI but feel it. That way we can prepare accordingly.

19 replies · 24 reposts · 438 likes · 107.4K views