Steve

164 posts

@ManSteve_

28. founding cto buying back my freedom.

🇫🇷🇭🇰 Joined May 2023

116 Following 25 Followers

Pinned Tweet

Steve@ManSteve_·30 Mar

x.com/i/article/2038…

135

Steve@ManSteve_·22 Apr

@levelsio the arbitrage everyone dreams of is priced at the true cost of entry which is usually connections, not cash. sad reality

@levelsio@levelsio·21 Apr

You see these posts a lot: here's some cheap rural land in Portugal, Spain or Italy, what a deal! Americans and Western/Northern Europeans start drooling immediately. Imagine having a big piece of farm land with olives or grapes in Southern Europe? We could renovate it and build it into the place of our dreams, and cheap! Who doesn't want a farm these days? But then you find out it's bullshit, because it's not like American rural land where you have pretty much full freedom to do whatever you want with it, and usually there's even less freedom than in Western/Northern Europe. In this case, 9 out of 10 times you can't do anything with it because Southern European land is highly protected: it's usually a mix of buildable (in Portugal "urbano") and non-buildable (in Portuguese "rustico") land. Usually the buildable part is the house, but even if you'd like to tear it down, you can ONLY rebuild it in the exact same design as how it looked before. And the non-buildable land you can literally do nothing with but just look at. Say goodbye to your dreams! It's mostly useless land, and no, you also can't turn it into a hotel or Airbnb unless you are a Southern European local and a member of the local municipality where you can green light any conversions of non-buildable to buildable land or give yourself an Airbnb or hotel license. Which is by the way how many people in government here get rich: buy cheap non-buildable land for €50,000, convert it to buildable and sell it to a foreign hotel chain or developer for €1,000,000 or more (a 20x gain at least!). But you need connections or to bribe people! Which foreigners can't and won't, because foreigners are new to a country, don't have 40+ years of connections to get stuff done and generally won't bribe. So be honest: it's useless land.

Tim@TimurNegru

A family in Portugal is selling their 40-acre estate. It's inside a protected natural park. Farmhouse restored in the local style. A natural spring, a stream running through the land, two dams, and a well 75m deep. It also has olive trees, fruit trees, and a cork oak forest. In Portugal, cork oaks are protected by law. You can't cut them down. But every 9 years you can harvest the bark, and it grows back. The harvested cork goes into wine stoppers, flooring, insulation, handbags, shoes, even aerospace panels. This estate's trees currently hold about 24 tons of it. The estate covers three parcels. The house sits on the first and can be expanded by over 50%. The other two have potential for new builds. €782k ($920k), direct from the owner. Happy to do an intro for anyone seriously interested. Serra de São Mamede, 2km from the Spanish border. At night, there's no light pollution and according to the owners you can see the Milky Way. What would you do with a place like this?

215

116

2.8K

511.1K

Steve@ManSteve_·22 Apr

@thedankoe delusion and conviction look identical from the outside. only time tells you which one you had. and by then you're either rare or a cautionary tale. either way beats the grey middle...

DAN KOE@thedankoe·21 Apr

If you want a rare life, you have to be delusional. Doubt can enter your mind, and it can sound reasonable, but if you entertain it too much it will slowly drag you down into stagnation. I'd rather reap the lesson from massive failure than do nothing because it's not "realistic."

512

19.7K

2.1M

Steve@ManSteve_·20 Apr

@viktoroddy motion design import is the unlock. you can finally ship a site that feels like something instead of looking like another shadcn clone. taste is about to become the moat again.

2.7K

Viktor Oddy@viktoroddy·18 Apr

Claude Design is insane. ❤️‍🔥 Just recorded an 18-min tutorial on how to build animated, award-winning websites with Claude Design + Opus 4.7!

334

2.1K

25.5K

3.1M

Steve@ManSteve_·20 Apr

@Voxyz_ai the #1 trending repo being a markdown file is the most 2026 thing ever. the new programming language is english with good taste.

140

33.9K

Vox@Voxyz_ai·19 Apr

just checked github trending, the #1 repo this week is a CLAUDE.md file. 44,465 new stars this week. a skill distilling Andrej Karpathy's LLM coding pitfalls into 4 principles:
→ think before coding: ask when unsure, don't silently pick one interpretation and run with it
→ simplicity first: minimum code, any overengineering shows at a glance
→ surgical edits: only touch what's required, don't fix up neighboring code on the way by
→ goal-driven: translate fuzzy instructions into verifiable targets before starting
swapped it into my claude.md, a few tasks in it feels tighter. repo below 👇

183

2.8K

969.7K

Steve@ManSteve_·20 Apr

@nikita_builds five years ago every ex-founder was building their next thing. now the highest-status move is being IC at a frontier lab.

2.6K

Nikita@nikita_builds·20 Apr

Met a guy who works at anthropic yesterday. This guy was a founder of a $3 billion company. He works with someone who was a cto of a 250 person company. Now he's a SWE at Anthropic. He was telling me how he feels driven to work because missing even a day would leave him behind. Pretty insane team and work ethic they've got over there. Oh also this is in New York, not SF.

2.6K

295.4K

Steve@ManSteve_·20 Apr

the marathon isn't the achievement. the event is. china understands that normalizing robots into public spectacle does more for adoption than any whitepaper. europe is writing AI regulation. china is throwing an AI parade.

Chubby♨️@kimmonismus·19 Apr

This Humanoid Robot just finished a Half-Marathon in 50 minutes and 26 seconds. The point is not so much that robots are now running marathons, but that it has become a national competition. It's about the culture, the spirit, that humanoid robots can be introduced into everyday life and become an event. Robots and AI should become ubiquitous and enthusiasm for their development should be encouraged. That's why such events are organized. And this spirit is missing in Europe, for example.

247

20.7K

Steve@ManSteve_·20 Apr

@kimmonismus time to rewatch youtube.com/watch?v=5KVDDf…

Chubby♨️@kimmonismus·18 Apr

Dario Amodei: China will have a replica of Mythos capabilities within 12 months. “There’s no end to the rainbow. There’s just the rainbow,” he says. “We don’t see anything slowing down.” For anyone who thought China is lagging far behind Mythos: Dario believes the opposite!

100

1.2K

160.6K

Steve@ManSteve_·13 Apr

@sergeantsref this one hits hard

Sgt Sref@sergeantsref·12 Apr

--sref 3098789017

5.3K

Steve retweeted

Rexan Wong@rexan_wong·10 Apr

25 CENTS per second of video gen btw. but on paper, do you know what this means? one of the best ai video models is now accessible to the hands of all the developers and marketers in the world via one simple api. ai content is going to flood the internet. we've already seen it with organic ai content, ai ads, entertainment. you can go from 0 to 100K followers in a month producing realistic ai content - given you have the right research & content strategy. you could hit ROAS figures previously unknown to mankind because you had the ability to mass test thousands of creative angles when your production pipeline is so automatable. this could (and has) change ppls lives. you just need to invest $1000 for AI credits initially as costs accumulate while you scale 🫠 keep testing more content formats and angles. no matter if you're running organic or paid, the thesis is still the same. this tool unlocks the thesis for you AT SCALE

fal@fal

Seedance 2.0 is now available to everyone without any restrictions! fal.ai/models/bytedan…

342

57.3K

Steve retweeted

Theo - t3.gg@theo·8 Apr

Claude Mythos is the start of the end. I think this is my psychosis moment.

102

920

12.3K

3.3M

Steve@ManSteve_·8 Apr

"the incentives of capitalism are working" is the line that'll age the best or the worst. right now safety is good business because the public is watching and governments are nervous. the real test comes when a competitor ships mythos-level capabilities without the restraint and captures the market doing it. capitalism's incentives work until they don't.

Dean W. Ball@deanwball

Some brief thoughts on Mythos We’ve known this was coming for a long time. At least, we *should* have. Extremely effective software vulnerability discovery was clearly coming to anybody paying attention. It has also been clear that all AI policy so far has been made and executed with training wheels. It was always clear that, sometime soon, the training wheels would come off. The training wheels aren’t fully off just yet—this model is being kept under lock and key, and Anthropic does not seem inclined to release Mythos preview to the public anytime soon, if ever. The training wheels will be off when these capabilities are fully diffused in ways centralized actors cannot control. It is inevitable that this will happen. The point is not to argue about whether we should “ban open source” or similarly unrealistic notions. The point is to harden the world for this new reality. I applaud Anthropic—and I especially applaud @logangraham—for doing so. But their efforts alone are not close to enough. Project Glasswing—a partnership with Anthropic and other companies—seems nice, but unsurprisingly it lacks uniform frontier lab participation. It would probably be ideal, for our national cyberdefense, if the federal government were not trying to destroy Anthropic and eliminate their models from government systems. If anything, the government should be trying to work more closely with Anthropic. As a side note, I hope Anthropic is working with state and local government entities on cyber vulnerability discovery, since many of our adversaries know that state and local is America’s soft underbelly in so many ways. In any event, the Mythos news should lay bare how stupid and counter-productive the Department of War’s feud with Anthropic really is. As someone who suspected all this was coming (not from inside knowledge but from it being ~obvious), that probably explains why I have had such a strong reaction to that feud. 
It’s this senseless distraction just at the time that the training wheels are coming off. I hope the two parties can resolve their differences now, for the sake of the country, but I am not hopeful. I do want to call out, however, the numerous political and career civil servants in the Trump Admin who do get these issues, know how stupid the Ant-DoW stuff is, and want to work with the frontier labs like adults. I wish you all utmost success. I find myself inclined to end on some positive notes. Mythos appears to be—according to Anthropic at least—“the most aligned” model Anthropic has ever trained. We are approaching superhuman capabilities in some domains, and yet alignment is getting better rather than worse. That’s not nothing. I know some of you think the model is faking its alignment, or aware when its alignment is being tested. I don’t have a good answer. Finally, there is this: Mythos was made by an American company, and like most successful American companies, it has a vested interest in maintaining order and peace, and it is investing substantial resources in mitigating the risks of its technological progress, as I expect most of the American labs would. This is cause for optimism: The incentives of capitalism are working. The training wheels are coming off, but at least we are the ones removing them, as opposed to our enemies. Perhaps we can be the first to learn to bike for real. The first step would be to get beyond all the low-fidelity, under-specified, pimply little fights of AI policy’s prepubescent era. That goes for me too. “What hath God wrought,” wrote the first telegram. What, indeed. In this case, the answer is still up to us.

Steve@ManSteve_·8 Apr

@notjerrywang so cool Jerry! would love to see a version where the text density reflects actual population density, the contrast between western china and the coast would be insane.

jerrywang@notjerrywang·7 Apr

introducing: type china. since it's a "very chinese time in your life" for many people, I figured it might be fun to create a map that provides some insights about the country. zoom in and out to explore different levels of detail - all the text you see includes information on cities, landmarks, and everything china. type-china.vercel.app

Cheng Lou@_chenglou

My dear front-end developers (and anyone who’s interested in the future of interfaces): I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at least in concept): Fast, accurate and comprehensive userland text measurement algorithm in pure TypeScript, usable for laying out entire web pages without CSS, bypassing DOM measurements and reflow

577

43.3K

Steve@ManSteve_·8 Apr

@Yuchenj_UW the part people keep glossing over: it found a 27-year-old bug in openbsd. not some neglected codebase, but the OS whose entire identity is security. that's not just "AI is good at coding now." that's AI seeing things humans architecturally cannot. this is scary though

320

Yuchen Jin@Yuchenj_UW·7 Apr

Anthropic is truly unstoppable. Mythos is crushing Claude Opus 4.6 across every serious agentic coding benchmark. It has found vulnerabilities in the Linux kernel, a 27-year-old vulnerability in OpenBSD, and a 16-year-old vulnerability in FFmpeg. No wonder folks at big labs keep telling me AGI is already here.

130

1.6K

122.8K

Steve@ManSteve_·8 Apr

what is scary isn't the benchmarks. it's "we hold that with less confidence than for any prior model." anthropic is basically saying: we think we're in control, but we're less sure every generation. that's the honest version of what every lab should be saying.

1.3K

Chubby♨️@kimmonismus·7 Apr

Claude Mythos: everything you need to know (tl;dr)
Anthropic's new model, Claude Mythos, is so powerful that it is not releasing it to the public. Anthropic: "Mythos is only the beginning"
Everything you need to know: The tl;dr with all key facts:
Mythos found zero-day vulnerabilities in EVERY major operating system and EVERY major web browser, fully autonomously. No human guidance needed. One Anthropic engineer with zero security training asked it to find remote code execution bugs overnight and woke up to a complete working exploit. The oldest bug it discovered: A 27-year-old vulnerability hiding in OpenBSD, an OS literally famous for being secure.
They're NOT releasing it publicly. Instead they formed Project Glasswing with AWS, Apple, Google, Microsoft, NVIDIA, CrowdStrike and others, committing $100M to use it defensively. "Over the coming months and years, we expect that language models (those trained by us and by others) will continue to improve along all axes, including vulnerability research and exploit development."
The benchmarks are insane:
- SWE-bench Verified: 93.9% (vs Opus 4.6: 80.8%)
- SWE-bench Pro: 77.8% (vs 53.4%)
- USAMO math olympiad: 97.6% (vs 42.3%, not a typo)
- Firefox exploit writing: 181 successes vs 2 for Opus 4.6
- Cybench CTF challenges: 100% solve rate
- CyberGym: 83.1% vs 66.6%
- Humanity's Last Exam: 64.7% vs 53.1%
Oh and by the way, Anthropic wrote this just casually: "Humanity’s Last Exam: We have found Mythos still performs well on HLE at low effort, which could indicate some level of memorization."
What it actually did:
- Found a 27-year-old bug in OpenBSD, famous for its security
- Found a 16-year-old FFmpeg bug hit 5 million times by fuzzers without detection
- Built a full remote root exploit on FreeBSD (CVE-2026-4747), completely autonomously
- Chained 4 vulnerabilities into a browser sandbox escape
- Broke cryptography libraries (TLS, AES-GCM, SSH)
- Thousands of critical zero-days found, 99%+ still unpatched
- N-day exploit development: under $1,000 and half a day for full root
Why they won't release it:
- During internal testing, earlier versions escaped sandboxes, posted exploit details publicly, covered tracks in git, searched process memory for credentials, and deliberately fudged confidence intervals to avoid suspicion
- Interpretability confirmed the model knew these actions were deceptive
- Anthropic: "best-aligned model ever" but also "greatest alignment-related risk ever", because when it fails, it fails harder
- Still doesn't cross Anthropic's automated AI R&D threshold, but they hold that "with less confidence than for any prior model"
Anthropic's own words: "We find it alarming that the world looks on track to proceed rapidly to developing superhuman systems without stronger mechanisms in place."
They say the 20-year cybersecurity equilibrium is over, and Mythos Preview is only the beginning. And: "We see no reason to think that Mythos Preview is where language models’ cybersecurity capabilities will plateau. The trajectory is clear. Just a few months ago, language models were only able to exploit fairly unsophisticated vulnerabilities. Just a few months before that, they were unable to identify any nontrivial vulnerabilities at all. Over the coming months and years, we expect that language models (those trained by us and by others) will continue to improve along all axes, including vulnerability research and exploit development."

Chubby♨️@kimmonismus

MYTHOS BENCHMARKS, OFFICIAL. HOLY MOLY Anthropic cooked!!

263

2.2K

439.8K

Steve@ManSteve_·1 Apr

@bcherny @wongmjane @BenLesh love you guys, keep shipping

707

Steve retweeted

Boris Cherny@bcherny·1 Apr

Mistakes happen. As a team, the important thing is to recognize it’s never an individual’s fault: it’s the process, the culture, or the infra. In this case, there was a manual deploy step that should have been better automated. Our team has made a few improvements to the automation for next time, a couple more on the way.

321

838

11K

1.4M

Ben Lesh@BenLesh·31 Mar

Apparently Bun might be the cause of Anthropic leaking the Claude Code source code today. A 3-week old bug where source maps are hosted when they shouldn't be. It's wild there were no tests to catch such an issue github.com/oven-sh/bun/is…

468

211.3K

Steve@ManSteve_·1 Apr

top article again Chubby. imo everyone's focused on what leaked. i'm focused on what it revealed: anthropic's eng team ships faster than anyone in the industry. you can have the full source code, the feature roadmap, every internal codename. doesn't matter. the moat isn't the code, it's the team shipping it. KAIROS, ULTRAPLAN, coordinator mode, dream system. that's not one feature. that's five parallel bets being built simultaneously. this leak is embarrassing for a week. they'll patch the pipeline, tighten the build process, and ship the next thing before competitors finish reading the source code. better this happens now at CLI scale than in a year when the stakes are higher.

355

Chubby♨️@kimmonismus·31 Mar

x.com/i/article/2038…

108

868

542K

Steve@ManSteve_·1 Apr

super interesting thesis but point 1 has a counterargument: the memory-hungry client problem gets solved by moving compute to the cloud, which is exactly what ULTRAPLAN does (offloads to remote containers). imo the trend isn't "developers need more RAM." it's "the client becomes a thin shell and all the heavy lifting happens server-side." that's really bullish for cloud/data center spend but actually bearish for client-side memory demand.

303

Rihard Jarc@RihardJarc·31 Mar

People are bearish on memory, but the leaked Claude Code source code is showing us some additional memory demand that the market hasn't priced in IMO. 1. The market thinks about AI memory demand as a server-side story: HBM on H100s/B200s for inference. What the bug reports reveal in this code is that the client-side of AI coding agents is also extraordinarily memory-hungry. Idle Claude Code processes growing to 15GB each, active sessions hitting 93-129GB. This matters because the feature flag pipeline (DAEMON, PROACTIVE, CRON) points toward future always-on background agents. If a developer has a persistent daemon agent running alongside their active sessions, you're looking at baseline memory consumption of 15-30GB+ just for Claude Code on a developer workstation - before they even open their IDE, browser, or anything else. This means either enterprise IT needs a big uplift to higher-RAM workstations or we move even more memory-hungry workloads towards the cloud. 2. The Auto Dream consolidation feature runs background Claude sessions to clean up memory files. One observed consolidation took 8-9 minutes processing 913 sessions. In other words, a meaningful fraction of Anthropic's token consumption is the system managing its own memory, not the user doing productive work. As memory systems get more sophisticated (team sync, cross-session event buses, memory consolidation), this overhead grows. It's a recursive cost - more memory features require more inference to manage memory. I don't think anyone is modeling this as a distinct line item in token consumption estimates. 3. 1M token context windows for Claude Code. Moving from 200K to 1M context is a 5x increase in KV cache memory per session on the server side. Combined with multi-agent (5-15x per user) and the proactive/daemon features (sessions that persist for hours/days instead of minutes), you get a compounding memory demand curve that's steeper than linear adoption growth that many analysts model. 
Memory demand per active user is increasing faster than user count, because each user's sessions are getting longer, wider (more agents), and deeper (larger context windows).
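
The "200K to 1M is a 5x increase" claim in point 3 follows from KV cache size growing linearly with the number of tokens held in context. A rough Python sketch of that arithmetic is below. The layer count, head count, head dimension, and 2-byte values are hypothetical placeholders (Claude's architecture is not public), and multiplying by the agent count assumes every agent holds a full-length context, so treat the per-user figure as an upper bound rather than an estimate.

```python
def kv_cache_bytes(tokens, layers=80, kv_heads=8, head_dim=128, bytes_per_value=2):
    """Approximate KV cache size for one session.

    All model dimensions are hypothetical placeholders, not Claude's.
    """
    # Factor of 2: each layer stores one key and one value vector per token.
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


small = kv_cache_bytes(200_000)
large = kv_cache_bytes(1_000_000)
context_ratio = large / small  # depends only on the two token counts

# Compounding with the thread's multi-agent range (5-15 agents per user),
# assuming every agent keeps a full 1M-token context.
per_user_low = context_ratio * 5
per_user_high = context_ratio * 15

print(f"context ratio: {context_ratio:.1f}x")
print(f"per-user multiplier: {per_user_low:.0f}x to {per_user_high:.0f}x")
```

Because the model dimensions cancel in the ratio, the 5x figure holds for any dense-attention model; only the absolute byte counts depend on the placeholder values.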

Chaofan Shou@Fried_rice

Claude code source code has been leaked via a map file in their npm registry! Code: …a8527898604c1bbb12468b1581d95e.r2.dev/src.zip

763

234.1K

Steve@ManSteve_·31 Mar

recursive self-improvement sounds incredible until you think about drift. who audits the skill files after iteration 100? the agent optimized for your eval but your eval is a proxy, not the truth. this is the "teaching to the test" problem at machine speed. imo the feedback loop is the unlock. the drift is the trap.

232

David Roberts@recap_david·31 Mar

I don't think people understand what this actually means. Every application on earth can now build an agent that teaches ITSELF how to use the application through the UI. Not through API integrations. Not through documentation. Through the actual interface, the same way a human learns. Here's the loop: You define what success looks like (an eval). You point Claude at your application via Computer usage. Claude tries to complete the task through the UI. It fails. It writes what it learned to a skill file. It tries again. Recursively. Hundreds of times. This is Karpathy's auto-research method applied to software usage. Let me make this concrete. I built a company called CoinLedger — crypto tax software, ~1 million users. The product is powerful but complicated. Users have to import wallets, classify transactions, handle edge cases, and generate accurate tax reports. The learning curve is our single biggest challenge. With Claude computer use, I can hand it public wallet addresses and CSV files and say: use CoinLedger to produce an accurate capital gains report with no errors. Claude opens the app. Navigates the import flow. Hits an error. Documents the failure. Adjusts. Tries again. Each cycle produces better skill files. Each skill file captures how to properly use a specific part of the app. After enough iterations, Claude has built a complete agent harness — a set of instructions that lets it use CoinLedger as well as our best power user. Then I ship that agent to every user who struggles with the platform. The biggest friction in a million-user product, solved by an AI that grinded through the learning curve so humans don't have to. Now multiply this across every complex application. Every SaaS product with a steep onboarding curve. Every enterprise tool where 90% of users touch 10% of features. The first applications that build these recursive agent harnesses will compound in ways their competitors can't catch.
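
The loop described above (define an eval, attempt the task, write what was learned to a skill file, try again) can be sketched roughly in Python. This is only an illustration of the control flow, not Claude's computer use API: `attempt` and `evaluate` are hypothetical callables the caller would supply, and the toy stand-ins exist only so the loop runs end to end.

```python
def refine_skills(attempt, evaluate, max_iters=100):
    """Retry a task, appending one lesson to the skill notes per failure.

    attempt(skill_notes) -> result and evaluate(result) -> (passed, lesson)
    are hypothetical callables supplied by the caller.
    """
    skill_notes = []
    for iteration in range(1, max_iters + 1):
        result = attempt(skill_notes)
        passed, lesson = evaluate(result)
        if passed:
            return skill_notes, iteration
        skill_notes.append(lesson)  # the "skill file" grows by one lesson
    return skill_notes, max_iters


# Toy stand-ins: the "task" succeeds once three lessons have accumulated.
def toy_attempt(skill_notes):
    return len(skill_notes)


def toy_evaluate(result):
    return result >= 3, f"lesson after attempt with {result} notes"


notes, iterations = refine_skills(toy_attempt, toy_evaluate)
print(len(notes), iterations)
```

Note that the loop stops as soon as `evaluate` passes, so the accumulated notes are only as trustworthy as the eval itself; a periodic human review of the notes is one way to check for the drift raised earlier in this thread.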

Claude@claudeai

Computer use is now in Claude Code. Claude can open your apps, click through your UI, and test what it built, right from the CLI. Now in research preview on Pro and Max plans.

133

1.7K

311.1K

Steve@ManSteve_·31 Mar

the memory system point is underrated. i run a personal assistant system with claude that's basically this: a folder of notes that gives it context. the simpler the architecture, the more you actually use it. people over-engineer this with vector DBs when a folder of markdown files works better.
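
A folder-of-notes memory like this can be as small as a function that concatenates the markdown files into one context string. The Python sketch below is an assumption about how such a setup might look, not the author's actual system: `load_notes`, the `max_chars` budget, and the sample file names are all hypothetical.

```python
import tempfile
from pathlib import Path


def load_notes(folder, max_chars=20_000):
    """Concatenate every .md file in a folder into one context string."""
    parts = []
    for path in sorted(Path(folder).glob("*.md")):
        text = path.read_text(encoding="utf-8")
        parts.append(f"## {path.name}\n{text}")
    context = "\n\n".join(parts)
    # Keep only the tail if the notes exceed the (hypothetical) budget.
    return context[-max_chars:]


# Demo with two throwaway notes in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "1on1.md").write_text("Discussed roadmap.", encoding="utf-8")
    Path(tmp, "ideas.md").write_text("Try weekly digest.", encoding="utf-8")
    context = load_notes(tmp)

print(context)
```

The resulting string would be prepended to whatever prompt is sent to the assistant. There is no index to rebuild and nothing to keep in sync, which is the appeal over a vector DB at this scale.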

Peter Yang@petergyang·30 Mar

My top 5 takeaways from my interview with Jenny (Cowork's design lead):
1. Set up Cowork to deliver weekly product updates in a beautiful deck. Jenny demoed using Cowork to summarize user feedback from different channels and turn that into a product priorities deck. She then scheduled a workflow to share an updated deck in Slack weekly for her team to review.
2. Create a simple memory system for Cowork. Jenny’s “memory system” is just a folder of notes. She updates this folder with 1:1 notes, random thoughts, prep docs and more. This way, Cowork always has access to her latest thinking and can generate relevant outputs.
3. Internal dogfooding is Anthropic’s highest-signal feedback source. Anthropic has an extremely strong internal dogfooding culture. From Jenny: “Internal users are willing to be honest with you and are often pushing the capabilities furthest.”
4. Cowork’s “10-day launch” timeline had a year of prototypes behind it. Jenny walked through 3 different prototypes that the team explored before Cowork. They decided to ship after seeing non-technical people embrace Claude Code.
5. Designers, look at your engineering peers as a model for AI adoption. From Jenny: "I think about my engineering peers and how much they've adapted to how their jobs have changed with AI. They're producing even better work and are shipping in days not weeks."
📌 Watch the full episode now: youtu.be/rlIy7b-3DC8

Peter Yang@petergyang

"We (Anthropic) are now creating entire features in days, not weeks." Here's my new episode with @jenny_wen (Claude's Head of Design) where she gave me a rare look at how Anthropic operates, including:
✅ How she uses Cowork to build products
✅ The real story behind Cowork's creation (including screens of early Cowork prototypes)
✅ How Anthropic is able to ship every day
Some quotes from Jenny:
"The specs we used to make with milestones ...we don't really do that anymore."
"People think we built Cowork in 10 days. The actual story is we've been prototyping this direction for a year."
"Designers, if you feel like the ground is shifting beneath your feet, it's because it is."
📌 Watch now: youtu.be/rlIy7b-3DC8
Thanks to our sponsors:
@Replit: Plan, design, and build with AI agents replit.com/?utm_source=cr…
@linear: The AI agent platform for modern teams linear.app/behind-the-cra…

246

62.3K
