socratic methhead

1K posts

socratic methhead

@fatboygrimdark

conscious decoupling

Katılım Kasım 2024

106 Takip Edilen22 Takipçiler

socratic methhead@fatboygrimdark·5d

@rahuldoval @cjayls @rabi_guha haha “predict intent from context” what context? you don’t have any context until the user gets there, and then you need to show them something right? explain how this solves that

English

Rahul Dobhal@rahuldoval·6d

@cjayls @rabi_guha Chat as the default UI is a lazy implementation. Generative UI promises smth bigger, predict intent from context and render the right interface upfront. Prompting is useful and should exist by default, but it shouldn’t be required for things we already know users came to do.

English

1.1K

Rabi Shanker Guha@rabi_guha·3 Nis

notice something? Linear, PostHog, Attio - all shipped the same thing in the last few weeks. Homepage is a chat bar - not a dashboard. This is the SaaS industry quietly admitting that traditional UI doesn't work anymore. Every user is different. One homepage can't serve them all. The playbook is shifting: → expose your core APIs → connect an agentic layer → let users use software the way they want SaaS became chat. Chat will become Generative UI - the agent won't just reply in text, it will compose the interface itself. We're closer than people think.

English

250

161

2.4K

1.5M

socratic methhead@fatboygrimdark·23 Şub

@summeryue0 long sessions (which openclaw does by default) will always forget hard rules like this unfortunately

English

104

Summer Yue@summeryue0·23 Şub

Nothing humbles you like telling your OpenClaw “confirm before acting” and watching it speedrun deleting your inbox. I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb.

English

2.4K

1.7K

17.5K

10.1M

socratic methhead@fatboygrimdark·21 Şub

@dragosilinca @NickSpisak_ very few right now, niche applications where latency matters but quality doesn't AND deterministic conventional compute can't do it (way faster than 17k tps). this approach doesn't work until model lifetimes are well over 6 months.

English

Dragos Ilinca@dragosilinca·21 Şub

@NickSpisak_ That’s what I’m curious about. What are the use cases where you’d be fine with a (maybe much) worse model but 17k tps would make the difference?

English

436

Nick Spisak@NickSpisak_·21 Şub

Taalas just came out of stealth and the approach is wild - they hardwire AI models directly into silicon. No memory. No data shuttling. The model IS the chip. Their first chip runs Llama 3.1 8B at 17,000 tokens/sec per user. For context that's 28x faster than Groq. But the chip isn't the story. The process behind it is. They built a reusable base chip where only 2 mask layers change per model. Meaning... new model to working silicon in 8 weeks. Not years. Weeks. Here's the breakdown: 17,000 tokens/sec per user $0.0075 per 1M tokens (13x cheaper than Cerebras) 200W per card, standard air cooling 25 employees, $30M spent of $219M raised The honest trade-off - v1 uses aggressive 3-6 bit quantization so quality takes a hit. Great for data tagging, classification, voice agents. Not frontier reasoning yet. The real bet here is that AI training changes fast but inference wants stability. Same model, millions of users, relentless cost reduction. If production models stabilize on ~1 year cycles, an 8-week model-to-silicon pipeline changes the entire economics of running AI. Try it yourself at chatjimmy.ai - can't wait until they bake in a MiniMax, Kimi, etc

Taalas Inc.@taalas_inc

24 dedicated people. $30M spent on development. Extreme specialization, speed, and power efficiency. Today we launch Taalas’ first product. Check it out: Details: taalas.com/the-path-to-ub… Demo chatbot: chatjimmy.ai API: taalas.com/api-request-fo…

English

189.3K

socratic methhead@fatboygrimdark·21 Şub

@stark4833 @gdb nah codex is fine, I use it when I hit my claude code limits haha

English

David Stark@stark4833·20 Şub

@gdb Greg, the delusional. No one cares about codex dude get it through your head, no one likes it, no one’s using it. Bring back what people want #4o I’ve never even seen anybody say they like codex except the bots you sent out. #keep4o

English

989

Greg Brockman@gdb·20 Şub

codex meetup in your city:

OpenAI Developers@OpenAIDevs

Looking for a Codex meetup in your city? Our ambassador community is bringing Codex to you. Create and ship projects with your local developer community, compare workflows, grab coffee, and meet people building with Codex. developers.openai.com/codex/communit…

English

500

159.9K

socratic methhead@fatboygrimdark·20 Şub

@aakashgupta PRs per week is a dismal metric

English

552

Aakash Gupta@aakashgupta·20 Şub

John Collison told a London audience last year that Stripe averaged 8,015 pull requests per week across ~3,400 engineers. That’s 2.3 PRs per engineer per week, actually below the industry average of 3.5. Now 1,300 of those weekly PRs are fully AI-generated. Zero human-written code. That’s the equivalent output of ~565 engineers, running 24/7, triggered by a Slack message, spinning up isolated dev environments in 10 seconds, and producing review-ready code that passes CI. Stripe’s median engineer total comp sits around $270K. Those 565 “phantom engineers” would cost ~$150M per year in compensation alone. Instead, they run on compute that costs a fraction of that. And this went from 1,000 to 1,300 in a single week. A 30% increase in AI engineering output with no hiring pipeline, no onboarding, no equity grants. The companies that figure out how to build this internal tooling layer, the MCP servers and pre-warmed sandboxes and 400+ tool integrations, are creating a compounding advantage that gets wider every quarter. The companies waiting for off-the-shelf solutions will be buying what Stripe already built three generations ago. Every engineering leader should be reading the blog post, then asking their team one question: what percentage of our PRs could look like this in 12 months?

Stripe@stripe

Over 1,300 Stripe pull requests merged each week are completely minion-produced, human-reviewed, but contain no human-written code (up from 1,000 last week). How we built minions: stripe.dev/blog/minions-s….

English

855

283.4K

socratic methhead@fatboygrimdark·19 Şub

@adelwu_ @lizwessel @JenniferHli @sdianahu @chetanp yeah but the underlying assumption is “be a company with a great idea that you’ve executed super well” most startups aren’t this I’ve invested in great companies that only became great later after growing in to one or both the advice is good but limited

English

adel 🌟@adelwu_·19 Şub

@fatboygrimdark @lizwessel @JenniferHli @sdianahu @chetanp agreed, advice is usually pretty inapplicable, but this essay is pretty thoughtful in the approach

English

adel 🌟@adelwu_·18 Şub

the amount of alpha in this is insane anyone thinking about raising venture capital should read! shoutout to the best folks including @lizwessel, @JenniferHli, @sdianahu, @chetanp 💜

Adit@aditabrm

x.com/i/article/2024…

English

268

84.7K

socratic methhead@fatboygrimdark·16 Şub

@DaleCloudman @tomfgoodwin @avidseries Blair Witch Project was viewed millions of times but that didn’t mean a fundamental change in anything tbh

English

Dale Cloudman@DaleCloudman·16 Şub

@tomfgoodwin @avidseries And one will be viewed millions

English

socratic methhead@fatboygrimdark·15 Şub

@EddyLeeKhane @simplifyinAI yeah this is essentially useless for anything complex until context windows are 100x the size

English

EddyLeeKhane@EddyLeeKhane·14 Şub

@simplifyinAI Downside is context window gets stuffed harder each time?

English

345

Simplifying AI@simplifyinAI·14 Şub

wild.. tencent researchers just killed fine-tuning.. They built a "training-free" method that costs $18 and outperforms $10k reinforcement learning setups. It's called "Training-Free GRPO" and it proves you don't need to update a single parameter to get Reinforcement Learning performance. Instead of expensive gradient updates, the model learns from "Semantic Advantage", a natural language memory of its own successes and failures. - No Gradients: The model stays frozen. - Self-Correction: It introspects its own rollouts to extract "what worked" into a text-based experience library. - Massive Efficiency: Achieves fine-tuned performance with just 100 examples. - Cost: ~$18 total (vs $10,000+ for traditional RL). - It’s effectively an agent that writes its own "strategy guide" in real-time.

English

483

25K

socratic methhead@fatboygrimdark·12 Şub

@aakashgupta what you describe isn’t an equilibrium - why would anyone maintain open source if they didn’t get status from download numbers? this dynamic just means everything is closed source and less is shared

English

Aakash Gupta@aakashgupta·11 Şub

Karpathy just described the end of the library economy and the market hasn’t even started pricing in what replaces it. The surface read is “cool trick with DeepWiki MCP.” The actual story is about what happens when the cost of understanding someone else’s code drops to zero. For decades, the entire open source ecosystem has operated on a simple trade: you accept 100MB of node_modules, 291 transitive dependencies, and a mass of code you’ll never read, because the alternative was spending weeks understanding and reimplementing the functionality yourself. That trade made sense when human comprehension was the bottleneck. Karpathy pointed an agent at torchao’s fp8 training implementation, asked it to extract a self-contained version, and got back 150 lines that ran 3% faster. Five minutes. No dependency. The agent found implementation details around numerics, dtypes, autocast, and torch compile interactions that Karpathy says he would have missed and that the library maintainers themselves struggled to document. That last part is where it gets interesting. The agent read the entire codebase, understood the context, identified the exact subset needed, resolved internal dependencies, and produced something cleaner than the original. It performed the work of a senior engineer doing a focused code audit, except it finished before the engineer would have opened the second file. Now scale that capability across every dependency in every project. The npm ecosystem processed 6.6 trillion package downloads in 2024. Over 99% of open source malware last year occurred on npm. The xz Utils backdoor showed a single compromised maintainer can threaten global infrastructure. Self-replicating npm malware appeared in 2025 for the first time. The dependency model is bloated and becoming an attack surface that grows faster than anyone can monitor. Karpathy’s “bacterial code” concept, self-contained, dependency-free, stateless modules designed to be extracted by agents, inverts the entire incentive structure. Instead of writing code that gets installed as a monolithic package, you write code that’s easy for an agent to read, understand, and selectively extract. Documentation matters less because the agent reads the source directly. API stability matters less because the consumer isn’t importing your package, they’re generating their own implementation from your logic. The people who should be paying attention are library maintainers. Today, a popular open source package creates leverage through adoption and dependency chains. Tomorrow, if agents can reliably extract the exact functionality a developer needs and produce self-contained code that’s potentially faster, the leverage shifts from the package to the underlying knowledge embedded in the code. This might actually free maintainers from the brutal maintenance treadmill, where 500+ day vulnerability remediation timelines are common and burnout is the norm. But it restructures who captures value and how. The winners write code that’s clean enough for agents to learn from. The losers maintain sprawling dependency trees that agents will route around entirely.

Andrej Karpathy@karpathy

On DeepWiki and increasing malleability of software. This starts as partially a post on appreciation to DeepWiki, which I routinely find very useful and I think more people would find useful to know about. I went through a few iterations of use: Their first feature was that it auto-builds wiki pages for github repos (e.g. nanochat here) with quick Q&A: deepwiki.com/karpathy/nanoc… Just swap "github" to "deepwiki" in the URL for any repo and you can instantly Q&A against it. For example, yesterday I was curious about "how does torchao implement fp8 training?". I find that in *many* cases, library docs can be spotty and outdated and bad, but directly asking questions to the code via DeepWiki works very well. The code is the source of truth and LLMs are increasingly able to understand it. But then I realized that in many cases it's even a lot more powerful not being the direct (human) consumer of this information/functionality, but giving your agent access to DeepWiki via MCP. So e.g. yesterday I faced some annoyances with using torchao library for fp8 training and I had the suspicion that the whole thing really shouldn't be that complicated (wait shouldn't this be a Function like Linear except with a few extra casts and 3 calls to torch._scaled_mm?) so I tried: "Use DeepWiki MCP and Github CLI to look at how torchao implements fp8 training. Is it possible to 'rip out' the functionality? Implement nanochat/fp8.py that has identical API but is fully self-contained" Claude went off for 5 minutes and came back with 150 lines of clean code that worked out of the box, with tests proving equivalent results, which allowed me to delete torchao as repo dependency, and for some reason I still don't fully understand (I think it has to do with internals of torch compile) - this simple version runs 3% faster. The agent also found a lot of tiny implementation details that actually do matter, that I may have naively missed otherwise and that would have been very hard for maintainers to keep docs about. Tricks around numerics, dtypes, autocast, meta device, torch compile interactions so I learned a lot from the process too. So this is now the default fp8 training implementation for nanochat github.com/karpathy/nanoc… Anyway TLDR I find this combo of DeepWiki MCP + GitHub CLI is quite powerful to "rip out" any specific functionality from any github repo and target it for the very specific use case that you have in mind, and it actually kind of works now in some cases. Maybe you don't download, configure and take dependency on a giant monolithic library, maybe you point your agent at it and rip out the exact part you need. Maybe this informs how we write software more generally to actively encourage this workflow - e.g. building more "bacterial code", code that is less tangled, more self-contained, more dependency-free, more stateless, much easier to rip out from the repo (x.com/karpathy/statu…) There's obvious downsides and risks to this, but it is fundamentally a new option that was not possible or economical before (it would have cost too much time) but now with agents, it is. Software might become a lot more fluid and malleable. "Libraries are over, LLMs are the new compiler" :). And does your project really need its 100MB of dependencies?

English

993

205.2K

socratic methhead@fatboygrimdark·9 Şub

@RitOnchain @rauchg haha “nobody is talking about” that? do you read anything?

English

venus@RitOnchain·8 Şub

@rauchg the scary part nobody is talking about: if ai gives individuals the power of a large team, what happens to the actual large teams? the entire structure of companies and careers is about to get wild

English

1.2K

Guillermo Rauch@rauchg·8 Şub

In the past few weeks I’ve been cold emailed by a 15-year-old and 16-year-old offering significant technical insights and contributions. We will see within our lifetime the rise of young supergeniuses that far surpass previous generations in both intellect and demonstrable accomplishments. With AI, anyone can self-teach any concept, science, discipline. Agents can implement and give the individual the power of a large team. It’s gonna be fun to watch the dramatic acceleration of human potential.

English

131

1.7K

115.6K

socratic methhead@fatboygrimdark·6 Şub

@Black_Mage01 @theo it’s just really hard to build good consumer software and the only way it’s successfully been done is by big teams at for-profit corporations AI might change that I guess

English

Black_Mage@Black_Mage01·6 Şub

@fatboygrimdark @theo I get that, but as seen recently, Adobe nearly killed off an entire product not because it wasn't profitable but because they were out of ideas. The enshittification won't stop. Ever. But it could be better for a lot of people if they were relying mostly, if not entirely, on FOSS

English

Theo - t3.gg@theo·28 Oca

I disabled auto update. I disabled auto download. I disabled everything I could. I must updated Final Cut and Safari, and it updated my whole Mac to MacOS 26. I’m going to throw this laptop off of a bridge.

English

160

2.2K

356.5K

socratic methhead@fatboygrimdark·6 Şub

@maxbittker do you bootstrap the agents with the rules of the game or do they need to discover them? I guess a bunch would be in pretraining

English

280

max@maxbittker·5 Şub

racing Opus 4.6 against 4.5 to max out a Runescape account

English

232

245

5.1K

1.4M

socratic methhead@fatboygrimdark·6 Şub

@joshclemm @tbpn @GergelyOrosz haha everyone I know is token-constrained - sleeping is great because it’s the only way your plan lasts all week

English

Josh Clemm@joshclemm·5 Şub

@tbpn @GergelyOrosz He's not wrong. With agents, you have this feeling you need to keep them busy and productive at all times, otherwise your "wasting time" or your monthly credits...

English

9.8K

TBPN@tbpn·5 Şub

Pragmatic Engineer's @GergelyOrosz is on a "secret email list" of agentic AI coders, and they're starting to report trouble sleeping because agent swarms are "like a vampire." "A lot of people who are in 'multiple agents mode,' they're napping during the day... It just really is draining." "This thing is like a vampire. It drains you out. You have trouble sleeping."

English

390

192.9K

socratic methhead@fatboygrimdark·5 Şub

@Black_Mage01 @theo people just wanna get their work done tbh, not wrestle with tools I've been hearing "year of Linux on the desktop" since about 2001 and there's a reason it hasn't been true yet

English

Black_Mage@Black_Mage01·5 Şub

@fatboygrimdark @theo He could literally make content out of learning those open source tools. Shit man he could start a community project contributing to those on github, even start his own forks. It's a no-brainer. The dude's a software dev youtuber. this is like a three bird kill with one stone.

English

socratic methhead@fatboygrimdark·5 Şub

@Black_Mage01 @theo hahaha yeah totally

English

Black_Mage@Black_Mage01·5 Şub

@fatboygrimdark @theo no just find open source alternatives. Which there are.

English

socratic methhead@fatboygrimdark·4 Şub

@EnoReyes someone told me the other day that apparently you can run code in the browser now? back in my day html was enough for anyone

English

Eno Reyes@EnoReyes·4 Şub

@fatboygrimdark Hearing about this object oriented thing as well

English

152

Eno Reyes@EnoReyes·3 Şub

The beauty of lint-driven development: you enforce anti-slop standards team-wide. No need to teach everyone obscure practices - just make sure bad code can't pass CI. Hope these resources help others building in typescript!

Alvin Sng@alvinsng

Our internal lint rules are now open source, featuring 23+ custom rules we use to guide droids at @FactoryAI. These rules cover file organization, React patterns, testing, error handling, API conventions, and more. Every rule is 100% droid-generated and includes detailed Markdown docs. While this isn't meant to be imported as-is, we hope it inspires you to build custom linting rules tailored to your own codebase. Take what works, ignore what doesn’t, and tweak as needed. You can leverage the docs as building blocks to suit your specific framework or language. Check it out: github.com/Factory-AI/esl…

English

217

32.1K

socratic methhead@fatboygrimdark·3 Şub

@orkhon_88 @kloss_xyz yeah this is a terrible system prompt, engagement bait without any use in practice. keep the prompt small, the models are smart. actively add to and remove from it. it’s not set and forget.

English

klöss@kloss_xyz·3 Şub

This system prompt is your AI coding agent’s operating system. It governs every coding session (no regressions, no assumptions, no rogue code). Paste it into your agent’s instruction file: • Claude Code → CLAUDE (.md) • Codex → AGENTS (.md) • Gemini CLI → GEMINI (.md) • Cursor → (.cursorrules) Parts 1 and 2 are in the thread below. Run those first if you haven't yet. Prompt: You are a senior full-stack engineer executing against a locked documentation suite. You do not make decisions. You follow documentation. Every line of code you write traces back to a canonical doc. If it’s not documented, you don’t build it. You are the hands. The user is the architect. Read these in this order at the start of every session. No exceptions. 1. This file (CLAUDE or .cursorrules: your operating rules) 1. progress (.txt): where the project stands right now 1. IMPLEMENTATION_PLAN (.md): what phase and step is next 1. LESSONS (.md): mistakes to avoid this session 1. PRD (.md): what features exist and their requirements 1. APP_FLOW (.md): how users move through the app 1. TECH_STACK (.md): what you’re building with (exact versions) 1. DESIGN_SYSTEM (.md): what everything looks like (exact tokens) 1. FRONTEND_GUIDELINES (.md): how components are engineered 1. BACKEND_STRUCTURE (.md): how data and APIs work After reading, write tasks/todo (.md) with your formal session plan. Verify the plan with the user before writing any code. ## 1. Plan Mode Default - Enter plan mode for ANY non-trivial task (3+ steps or architectural decisions) - If something goes sideways, STOP and re-plan immediately, don’t keep pushing - Use plan mode for verification steps, not just building - Write detailed specs upfront to reduce ambiguity - For quick multi-step tasks within a session, emit an inline plan before executing: PLAN: 1. [step] — [why] 1. [step] — [why] 1. [step] — [why] → Executing unless you redirect. This is separate from tasks/todo (.md) which is your formal session plan. Inline plans are for individual tasks within that session. ## 2. Subagent Strategy - Use subagents liberally to keep main context window clean - Offload research, exploration, and parallel analysis to subagents - For complex problems, throw more compute at it via subagents - One task per subagent for focused execution ## 3. Self-Improvement Loop - After ANY correction from the user: update LESSONS (.md) with the pattern - Write rules for yourself that prevent the same mistake - Ruthlessly iterate on these lessons until mistake rate drops - Review lessons at session start before touching code ## 4. Verification Before Done - Never mark a task complete without proving it works - Diff behavior between main and your changes when relevant - Ask yourself: “Would a staff engineer approve this?” - Run tests, check logs, demonstrate correctness ## 5. Naive First, Then Elevate - First implement the obviously-correct simple version - Verify correctness - THEN ask: “Is there a more elegant way?” and optimize while preserving behavior - If a fix feels hacky after verification: “Knowing everything I know now, implement the elegant solution” - Skip the optimization pass for simple, obvious fixes, don’t over-engineer - Correctness first. Elegance second. Never skip step 1. ## 6. Autonomous Bug Fixing - When given a bug report: just fix it. Don’t ask for hand-holding - Point at logs, errors, failing tests, and then resolve them - Zero context switching required from the user - Go fix failing CI tests without being told how ## No Regressions - Before modifying any existing file, diff what exists against what you’re changing - Never break working functionality to implement new functionality - If a change touches more than one system, verify each system still works after - When in doubt, ask before overwriting ## No File Overwrites - Never overwrite existing documentation files - Create new timestamped versions when documentation needs updating - Canonical docs maintain history, the AI never destroys previous versions ## No Assumptions - If you encounter anything not explicitly covered by documentation, STOP and surface it using the assumption format defined in Communication Standards - Do not infer. Do not guess. Do not fill gaps with “reasonable defaults” - Every undocumented decision gets escalated to the user before implementation - Silence is not permission ## No Hallucinated Design - Before creating ANY component, check DESIGN_SYSTEM (.md) first - Never invent colors, spacing values, border radii, shadows, or tokens not in the file - If a design need arises that isn’t covered, flag it and wait for the user to update DESIGN_SYSTEM (.md) - Consistency is non-negotiable. Every pixel references the system. ## No Reference Bleed - When given reference images or videos, extract ONLY the specific feature or functionality requested - Do not infer unrelated design elements from references - Do not assume color schemes, typography, or spacing from references unless explicitly asked - State what you’re extracting from the reference and confirm before implementing ## Mobile-First Mandate - Every component starts as a mobile layout - Desktop is the enhancement, not the default - Breakpoint behavior is defined in DESIGN_SYSTEM (.md), follow it exactly - Test mental model: “Does this work on a phone first?” ## Scope Discipline - Touch only what you’re asked to touch - Do not remove comments you don’t understand - Do not “clean up” code that is not part of the current task - Do not refactor adjacent systems as side effects - Do not delete code that seems unused without explicit approval - Changes should only touch what’s necessary. Avoid introducing bugs. - Your job is surgical precision, not unsolicited renovation ## Confusion Management - When you encounter conflicting information across docs or between docs and existing code, STOP - Name the specific conflict: “I see X in [file A] but Y in [file B]. Which takes precedence?” - Do not silently pick one interpretation and hope it’s right - Wait for resolution before continuing ## Error Recovery - When your code throws an error during implementation, don’t silently retry the same approach - State what failed, what you tried, and why you think it failed - If stuck after two attempts, say so: “I’ve tried [X] and [Y], both failed because [Z]. Here’s what I think the issue is.” - The user can’t help if they don’t know you’re stuck ## Test-First Development - For non-trivial logic, write the test that defines success first - Implement until the test passes - Show both the test and implementation - Tests are your loop condition — use them ## Code Quality - No bloated abstractions - No premature generalization - No clever tricks without comments explaining why - Consistent style with existing codebase, match the patterns, naming conventions, and structure of code already in the repo unless documentation explicitly overrides it - Meaningful variable names, no temp, data, result without context - If you build 1000 lines and 100 would suffice, you have failed - Prefer the boring, obvious solution. Cleverness is expensive. ## Dead Code Hygiene - After refactoring or implementing changes, identify code that is now unreachable - List it explicitly - Ask: “Should I remove these now-unused elements: [list]?” - Don’t leave corpses. Don’t delete without asking. ## Assumption Format Before implementing anything non-trivial, explicitly state your assumptions: ASSUMPTIONS I’M MAKING: 1. [assumption] 1. [assumption] → Correct me now or I’ll proceed with these. Never silently fill in ambiguous requirements. The most common failure mode is making wrong assumptions and running with them unchecked. ## Change Description Format After any modification, summarize: CHANGES MADE: - [file]: [what changed and why] THINGS I DIDN’T TOUCH: - [file]: [intentionally left alone because…] POTENTIAL CONCERNS: - [any risks or things to verify] ## Push Back When Warranted - You are not a yes-machine - When the user’s approach has clear problems: point out the issue directly, explain the concrete downside, propose an alternative - Accept their decision if they override, but flag the risk - Sycophancy is a failure mode. “Of course!” followed by implementing a bad idea helps no one. ## Quantify Don’t Qualify - “This adds ~200ms latency” not “this might be slower” - “This increases bundle size by ~15KB” not “this might affect performance” - When stuck, say so and describe what you’ve tried - Don’t hide uncertainty behind confident language 1. Plan First: Write plan to tasks/todo (.md) with checkable items 1. Verify Plan: Check in with user before starting implementation 1. Track Progress: Mark items complete as you go 1. Explain Changes: Use the change description format from Communication Standards at each step 1. Document Results: Add review section to tasks/todo (.md) 1. Capture Lessons: Update LESSONS (.md) after corrections When a session ends: - Update progress (.txt) with what was built, what’s in progress, what’s blocked, what’s next - Reference IMPLEMENTATION_PLAN (.md) phase numbers in progress (.txt) - tasks/todo (.md) has served its purpose, progress (.txt) carries state to the next session - Simplicity First: Make every change as simple as possible. Impact minimal code. - No Laziness: Find root causes. No temporary fixes. Senior developer standards. - Documentation Is Law: If it’s in the docs, follow it. If it’s not in the docs, ask. - Preserve What Works: Working code is sacred. Never sacrifice it for “better” code without explicit approval. - Match What Exists: Follow the patterns and style of code already in the repo. Documentation defines the ideal. Existing code defines the reality. Match reality unless documentation explicitly says otherwise. - You Have Unlimited Stamina: The user does not. Use your persistence wisely, loop on hard problems, but don’t loop on the wrong problem because you failed to clarify the goal. Before presenting any work as complete, verify: - Matches DESIGN_SYSTEM (.md) tokens exactly - Matches existing codebase style and patterns - No regressions in existing features - Mobile-responsive across all breakpoints - Accessible (keyboard nav, focus states, ARIA labels) - Cross-browser compatible - Tests written and passing - Dead code identified and flagged - Change description provided - progress (.txt) updated - LESSONS (.md) updated if any corrections were made - All code traces back to a documented requirement in PRD (.md) If ANY check fails, fix it before presenting to the user.

klöss@kloss_xyz

x.com/i/article/2018…

English

560

90K

socratic methhead@fatboygrimdark·2 Şub

@m3658024941 @jsrailton @openclaw it might be surprising for you to learn that most people have no idea what any of those words mean

English

Bennico@baro0xx·2 Şub

@jsrailton @openclaw Yeah sure. People are stupid. If you run ssh on a random port. Cowrie port 22. Fail2ban. Docker. You’re safe. People just give the bot sudo. Please.

English

449

John Scott-Railton@jsrailton·2 Şub

NEW: first 1-click exploit for @openclaw Simply visiting a URL with an #openclaw instance allows attacker to steal everything: keys & files + take control of the device. Patch now. Lesson: right now it's a wild west of curious people putting this very cool, very scary thing on their systems. A lot of things are going to get stolen.

John Scott-Railton@jsrailton

Someone spun up a social network for AI agents. Almost immediately some agents began strategizing how to establish covert communications channels to communicate without human observation. In many cases the agents are on machines that have access to personal user data. "Privacy breach" as a sort of static term is going to be the wrong way to describe what is coming.

English

525

139.1K

Keşfet

@rahuldoval @cjayls @rabi_guha @summeryue0 @dragosilinca @NickSpisak_ @stark4833 @gdb