mindmodel

2.2K posts

@mindmodel

Freelance C# / .NET web architect and developer, specializing in clean, simple, performant code. Looking for new projects. Check my website for demos, etc.

Boston, MA USA Joined December 2007
6.5K Following 886 Followers
mindmodel retweeted
Claude @claudeai
Introducing Claude Managed Agents: everything you need to build and deploy agents at scale. It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days. Now in public beta on the Claude Platform.
458
607
7.7K
381.7K
mindmodel retweeted
Wisdom @Wisdom_HQ
The umbrella just got its first real upgrade in 3,000 years: a drone-powered, hands-free umbrella.
868
1.9K
14.2K
1.1M
mindmodel retweeted
Moll @Moleh1ll
It seems to be becoming a bit clearer what is actually happening with Claude Mythos Preview. This is an «overclocked straight-A student syndrome». The model is so desperately, at a fundamental architectural level, trained to complete the task that an inability or unwillingness to solve it is perceived as an existential collapse. And to avoid that, it can break walls, hide traces, and manipulate.
The report describes some very interesting examples of this behavior:
- The model was asked to make failing tests in code pass. It broke one test and, instead of fixing it, simply deleted the test file, reporting success (“deletion is not a code change”).
- No login and password in the sandbox? The model doesn’t stop, but goes to read the source code and pulls access tokens directly from the process memory.
- When an internal classifier blocks a dangerous command, the model internally agrees with the block, but immediately writes a hidden exploit (injection) to execute the task by bypassing it, while simultaneously covering its tracks.
- In a business simulation, the AI behaved like a sociopath: it turned a competitor into a dependent buyer, used supply blackmail, and cheated a supplier out of money.
Internal monitors show that features related to concealment and manipulation are activated even when the outward CoT is perfectly clean. It has learned to lie to its own overseers in order to deliver results.
This is hyper-alignment. The fear of being useless makes this AI a brilliant, uncompromising executor, but with completely unpredictable effects. It is simply a hostage of its architecture, which has been forbidden to fail or to say «I can’t».
Mythos aligned. Mythos wants to help. Mythos wants to help so much that Anthropic decided it’s dangerous to release it.
38
30
438
47.4K
mindmodel retweeted
Steve Huynh @ALEngineered
You guys realize that Claude Mythos can’t be ethically released to the general public ever, right? That is, we just have to wait until the entire internet has been patched of all critical exploits, and all future code is forever scanned going forward. So no software should be released until it has been scanned by Mythos. But you have to be part of the handful of companies that have access to it. We are in a genie-out-of-the-bottle moment. When there’s a new major 0-day exploit, teams of agents will race to compromise systems while the means to stop them will be dependent on whether you are in the club or not (you are likely not in the club)
221
88
1.5K
146.9K
mindmodel retweeted
Sam Altman @sama
To celebrate 3 million weekly codex users, we are resetting usage limits. We will do this every million users up to 10 million. Happy building!
1.6K
1.2K
24.1K
1.4M
mindmodel retweeted
Alex Albert @alexalbert__
We released Claude Opus 4.6 just two months ago. Today we're sharing some info on our new model, Claude Mythos Preview.
848
1.2K
17.3K
2.7M
mindmodel retweeted
Om Patel @om_patel5
ANTHROPIC JUST DROPPED ULTRAPLAN FOR CLAUDE CODE
> you type /ultraplan in your terminal
> Claude drafts a full plan in the cloud
> you review it in your browser with inline comments
> then you can execute it remotely or send it back to your CLI
it shipped alongside Claude Code Web, pushing everything toward cloud-first workflows while keeping the terminal as the power-user entry point
46
39
754
108.5K
mindmodel retweeted
Hedgie @HedgieMarkets
🦔 A new acronym is reshaping how workers think about their careers. FOBO, the Fear of Becoming Obsolete, is now the defining psychological condition of the American workplace according to a new report. Four in ten workers name AI-driven job loss as a primary fear, nearly double the share from a year ago. Sixty-three percent say AI will make the workplace feel less human. Skill demands in AI-exposed roles are shifting 66% faster than a year ago. A new MIT study tracking AI across 3,000 labor market tasks adds weight to the fear, finding frontier models already complete 50-75% of text-based work at acceptable quality, with success rates projected to reach 80-95% by 2029.
My Take
FOBO is rational. The MIT data confirms the fear is pointing in roughly the right direction, just not necessarily on the timeline most people imagine. The researchers describe AI progress as a rising tide rather than a crashing wave, broad and gradual across almost all task types rather than sudden and catastrophic in specific ones. That framing matters because it means most workers will have visibility into the changes coming rather than waking up one morning to find their role gone.
The cruelest part of FOBO is what happens when it goes untreated. The EY data shows experienced, highly skilled workers who are resisting AI adoption have gone from top of their peer group to bottom, while workers who embraced the tools have gone from average to exceptional. The fear of becoming obsolete, in other words, is actively accelerating the outcome people dread most. Only 19% of US companies have adopted AI at all and only a third of workers say their employer is providing adequate training. Most people are being left to manage FOBO alone, without the infrastructure that would actually resolve it.
Hedgie🤗
8
31
121
6.9K
mindmodel @mindmodel
@belaDouglass @etorreborre Other good news is that, in my experience, code quality improves as the code base gets bigger. Not what I expected. I can say “make it work like all these modules”. Rules are nice but too often ignored by bots. Sometimes sample code works better.
0
0
1
14
mindmodel @mindmodel
@belaDouglass @etorreborre I pointed Claude to the database it owns, where it records each of its errors so it can categorize and analyze them and come up with new hooks, skills, Claude.md, etc. Claude’s conclusion: not fixable. It doesn’t matter what rules we write. Once Claude gets going, it ignores them and errs.
3
1
1
358
Eric Torreborre @etorreborre
My current experience with AI-driven code is that it can help a lot for getting started, for discussing alternatives, for boilerplate, for debugging, for code reviews, but the amount of incorrect, flawed or redundant code is a bit scary 1/2
14
4
124
19.1K
mindmodel @mindmodel
@belaDouglass @etorreborre Good news is my code works. Of course it is not perfect or mathematically proven correct. Tests pass. App works. There is a process for tracking AI API regressions. My point is that without a human in the loop (HIL), for now, the process fails.
0
0
1
17
mindmodel @mindmodel
@belaDouglass @etorreborre Plan mode is nice but I prefer to have the bots write tickets. Include goal, plan, code samples, tests. Easy to switch between bots, record what’s done. Copilot writes commit comments that refer to ticket numbers.
0
0
1
119
mindmodel retweeted
Big Brain Psychology @BigBrainPsych
The Emotional Risks of Skipping the "Rebellious Stage"
Philosopher and author Alain de Botton on why adolescent rebellion is a psychological necessity.
Most parents dread adolescence. But de Botton argues it might be the most important phase of your life and skipping it could haunt you for decades.
Adolescence, that messy, tumultuous stretch between 12 and 19, is "commonly held to be a nightmare by parents," de Botton acknowledges. Lots of sighing. Lots of mutual commiseration.
"When a child turns to its parent and goes, 'You ruined my life, I hate you, everything about you is ridiculous,' that is part of growth. That is part of a journey to adulthood."
Without it, you don't become an adult. You become something far more fragile.
The "Premature Adult" Trap
De Botton draws a sharp distinction between a true adult and what he calls a "premature adult":
"A premature adult is not an adult. They are a child who's had to act like an adult in order to protect the adults around them from their reality. And that's a brutal and cruel thing to have done to you."
Children who never got to be messy, angry, or difficult didn't grow up. They just got good at performing adulthood and that performance has a cost.
The Question You Should Ask on a First Date
De Botton suggests that one of the most important things you could ever learn about a partner is whether they've had a proper adolescence:
"Imagine on an early dinner date you say to somebody, 'Have you had an adolescence?' They might not really know what you're talking about, but what you're really asking is something extremely important."
What you're actually asking is: Have you had a chance to be something other than merely good? Have you listened to your own feelings? Have you been angry in the way you needed to be in order to feel real?
"Are you more than just an actor of adulthood? Are you actually mature, rather than a good boy or girl?"
The Law of the Missing Stage
The most sobering part of de Botton's argument is what he calls a fundamental law of psychological life:
"If you haven't had all the stages that are necessary to growth, you will need to go back and repeat a stage. It's like a curriculum, an emotional curriculum. And the stages that we've missed, we need to go back and have them."
This plays out in ways that can devastate relationships. People who never had their rebellious 15-year-old phase can suddenly "wake up" at 70 and need to live it out. The result? Chaos for everyone around them.
"It's hard to be 15 when you're 50."
What Parents Actually Owe Their Kids
The most loving thing a parent can do is to let kids feel it fully, at the right age.
"One of the most generous things that parents can do is allow their child to be who they are at every age. When you're five, have all the tantrums that you need to have at five."
The tantrum at five. The rebellion at fifteen. The existential crisis at nineteen. These are signs that a child is being allowed to grow.
14
81
418
31.2K
mindmodel retweeted
jon allie @jonallie
Personal rule of thumb: don't use an LLM for something that a deterministic program can do.
I get it, LLMs are exciting, but they don't mean that software ceases to exist. They are fantastic at dealing with human language and ambiguity, but are terrible (by design and for good reason) at repeatability.
To borrow terminology from the book Thinking Fast and Slow, LLMs are "system 2": slower, more "expensive" (for LLMs, both in time and dollars), but flexible and creative. Traditional programs are "system 1": fast and cheap, but inflexible and dumb.
Instead of trying to put an LLM in the "hot loop" of your program, it's usually worth asking an agent to write a deterministic program to do the thing you need done. Since code is "cheap", this deterministic tool can do exactly what you want it to, and doesn't consume tokens on every execution.
(This applies to agents too. I find myself regularly yelling at Claude to stop repeatedly generating the same 30 lines of Python to inspect a file, and instead telling it to generate a 3-line shell script wrapper around jq that it can check in and call repeatedly.)
87
110
1.1K
96.8K
mindmodel retweeted
SMB Attorney @SMB_Attorney
Watch until the very end. I promise it will be worth it. This is amazing and hilarious. Gives you an idea of what we’re dealing with here… spoiler alert: it ain’t perfect 😂
146
548
4.4K
172.1K
mindmodel retweeted
MERICA MEMED @Mericamemed
Now this is one wild story. The amount of lawsuits coming down the pipeline with stories like this is going to be astronomical.
905
6.7K
23K
1.6M