chiller (@chillerlol)

1.7K posts

Joined September 2016
1.6K Following, 412 Followers
Fatih Arslan@fatih·
I was a huge unit test supporter, but honestly, it's no longer worth it. Agents are superb at writing extremely bad unit tests, and they still look good on paper. We're also shifting slightly to more and more e2e tests at @PlanetScale. Luckily with agents, that shift is also manageable.
Cindy Sridharan@copyconstruct

end-to-end testing > unit tests, in the vibecoding era. A massive, almost entirely agent-coded refactor passed all unit and pre-merge tests but broke a critical feature. It was only caught due to my own excessive paranoia making me run end-to-end tests before the prod deploy.

GREG ISENBERG@gregisenberg·
My 30+ observations on the greatest opportunities in AI agents right now: And some ideas that are keeping me up at night.

1. The new buyer on the internet is an AI agent. Imagine billions of new customers showing up with money to spend but they only shop via MCP. That's what's happening. No MCP server means you're invisible to the fastest growing buyer on the internet.
2. Every franchise system in America (30,000+) needs an agent layer and none of them have one. One founder per franchise vertical. That's 30,000 businesses waiting.
3. Everyone said "distribution is the only moat" a year ago. Now I'd add that the only moat is distribution plus memory. The company that has your audience AND your agent's accumulated context is impossible to leave.
4. Consumer mobile is more interesting than it's been since 2012. Apps can finally DO things for you instead of showing you things. The next wave of $100M apps are being built right now.
5. The most interesting startup nobody has built is an agent marketplace where you rent access to someone else's trained agent. A recruiter spent 6 months training a sourcing agent on healthcare hiring. That agent is worth renting to every other healthcare recruiter on earth. The agent itself becomes the product.
6. A sorta strange phenomenon that's happening right now is agents are developing preferences. Give the same agent the same task 100 times and it starts developing patterns in how it approaches it. Nobody is studying this yet. But the agents that develop good patterns are worth more than the ones that don't. That's a new kind of asset.
7. Dead internet theory is about to become dead SaaS theory. Half the apps you use will quietly replace their support team, their onboarding team, and their content team with agents. You won't notice for months. Then you'll realize you haven't talked to a human at that company in a year.
8. The most valuable data in the world right now is sitting in the support tickets of small or mid tier SaaS companies. Every ticket is a customer telling you exactly what to build next. Mine this.
9. The most interesting pricing problem nobody has solved is how do you price a product when your costs change every time OpenAI or Anthropic updates their model pricing? Your margins can swing 40% overnight based on a decision made in San Francisco. The company that builds dynamic pricing infrastructure for agent-based businesses solves a problem every AI company has.
10. The best AI products feel like they're reading your mind. The worst ones feel like filling out a form with extra steps.
11. An interesting arbitrage I've noticed lately is hiring a human VA for $20/hour to supervise an AI agent that does $200/hour work. The human just checks the output.
12. The managed AI agent business is becoming the new agency model. $5k/month per client. You build it, run it, maintain it. The client gets a digital employee they never have to think about. This will be a $50B+ category.
13. The first "shadow agent" scandals are about to drop. Employees running personal agents on company infrastructure without telling anyone. Using company API keys. Agents accessing internal docs. IT departments have little visibility into this right now. Lots of opportunity to build companies here. Definitely a painkiller not a vitamin type of business.
14. Right now there are probably millions of agents running on autopilot that their creators forgot about. Still burning tokens. Still sending emails. Still scraping websites. Still costing money. The "find and kill your zombie agents" tool is a product that writes itself.
15. Companies are starting to hire based on someone's agent portfolio instead of their resume. "Show me 3 agents you built that are running right now." It's REALLY early but it's starting.
16. Your Slack archive is a product. Every company's internal Slack has thousands of messages explaining how they actually do things. The company that lets you point an agent at your Slack history and auto-generate SOPs and agents from it will be enormous.
17. We're watching the cost of intelligence fall faster than the cost of distribution. Which means distribution is now the expensive thing.
18. The most underrated asset a human can have in 2026: the ability to sit in a room with another human, make eye contact, and have a real conversation. As AI handles more of the transactional stuff, the humans who can do the relational stuff become disproportionately valuable. The soft skills people used to dismiss as fluffy are becoming the hard skills. The hard skills people spent decades acquiring are becoming the soft ones.
19. There are MANY huge companies to be built around the fact that most people's agents are running on their personal laptops which they also use to browse the internet, check email, and download random files. The attack surface is enormous. One compromised Chrome extension and your agent's API keys, customer data, and workflows are exposed.
20. There's a new type of burnout forming that doesn't have a name. It's not from working too hard. It's from context switching between human work and agent work 50 times a day. Reviewing agent output, correcting it, approving it, reviewing again. The mental load of supervising agents is different from the mental load of doing the work yourself. Some founders are telling me they were less tired when they did everything manually because at least the cognitive pattern was consistent.
21. The cheapest form of market research: search "[your industry] spreadsheet template" on Google. Whatever people are tracking manually is your product.
22. Half the YC companies pivoted within 8 weeks of demo day. Not because they failed. Because agents let them test 5 ideas in the time it used to take to test one. The concept of "committing to an idea" is dissolving. Serial pivoting is becoming the default because 1) AI lets you move fast 2) the world is moving fast.
23. The loneliest job in tech right now is being the only person at your company who understands what the agents are doing. You can't explain it to your boss. You can't hand it off to a colleague. If you leave, everything breaks. You've become a single point of failure for an entire automated system. That person needs a title, a team, and a backup plan. Most companies haven't figured this out yet.
24. Your browser history is the most valuable training data you own and you're giving it away for free. Every site you visit, every product you research, every competitor you study, every pricing page you screenshot. That behavioral data, structured and fed to an agent, would make it understand your business better than any onboarding call. The company that lets you turn your browser history into agent context builds something nobody can replicate.
25. Everyone is building AI wrappers. Nobody is building AI unwrappers. The tool that takes an AI-generated document and tells you which parts a human wrote and which parts were generated.
26. Stripe just became the most important company in the agent economy and they barely had to do anything. Every agent that sells something needs Stripe. Every agent that buys something needs Stripe. They're the payment rail for the entire agentic internet by default.
27. The most undervalued API in the world right now is the US Postal Service address verification API. It's practically free. Every local business lead gen agent needs it. Every real estate agent needs it. Every direct mail agent needs it. Boring government infrastructure is quietly becoming the backbone of agent-native businesses.
28. The concept of "business hours" is for humans. Your agent closed a deal in Tokyo at 3am, processed the payment, sent the onboarding email, and updated the CRM before your alarm went off.
29. What happens when agents start recommending other agents? Your research agent finds that a competitor's sales agent is better and suggests you switch. Agent referral networks are forming organically. The first agent affiliate program is probably 6 months away.
30. Cal dotcom closed their source code. That's the canary. When open source companies start closing up, it means agents were cloning their product too easily. Every open source company is quietly asking the same question right now.
31. "AI for pet groomers" sounds like a joke and that's exactly why it will work. 150,000 of them in America. Zero tech. All scheduling by phone or IG DMs. The joke ideas always win.
32. The thing that will seem most obvious in hindsight: we spent 2025-2026 arguing about which model is best while the entire value was in the orchestration layer. The model is the CPU. Nobody buys a computer based on the CPU anymore. They buy it based on what they can do with it. Makes so much sense in hindsight.

What else will be obvious in hindsight? I'll share more notes soon. I can't sleep with all that's going on. Maybe you too. What an incredible time to be building.
chiller@chillerlol·
@ClaudeDevs Dropped Claude subscription. No longer required.
ClaudeDevs@ClaudeDevs·
Starting June 15, paid Claude plans can claim a dedicated monthly credit for programmatic usage. The credit covers usage of: - Claude Agent SDK - claude -p - Claude Code GitHub Actions - Third-party apps built on the Agent SDK
chiller reposted
Guri Singh@heygurisingh·
Holy shit... a team just open sourced an AWS emulator that runs the entire cloud on your laptop with 13 MiB of memory.

It's called Floci and it boots 45 AWS services in under a second, no Docker, no LocalStack subscription, no $30/mo dev environment bills.

Every AWS dev tool before this needed Docker, gigabytes of RAM, and 30 second cold starts just to test a Lambda function. Floci is a single Go binary that runs the entire AWS stack in memory and starts faster than your terminal can render the prompt.

Here's what makes it different from every AWS emulator that came before:
→ 13 MiB total memory footprint, the average Chrome tab uses 200x more RAM than this entire AWS clone
→ 45 services emulated locally including S3, Lambda, DynamoDB, SQS, SNS, IAM, CloudFormation, Step Functions, all in one binary
→ Sub-second cold start, your tests finish before LocalStack even pulls its Docker image
→ Zero dependencies, no Docker daemon, no Python runtime, no Java VM, just one Go executable
→ Drop-in compatible with AWS SDK and CLI, point your endpoint at localhost and every existing script works untouched

Killed: $40/mo LocalStack Pro, every AWS dev environment burning $200+/mo on staging accounts, Docker Desktop eating 4GB of RAM just to run a fake S3 bucket.

Pre-built binaries for Linux, macOS, Windows. No install scripts, no config files, no setup wizard. Download the binary, run it, your local AWS is live. Works in CI pipelines where spinning up Docker containers takes longer than the actual tests.

100% Opensource.
chiller reposted
Socket@SocketSecurity·
🚨 UPDATE: Mini Shai-Hulud has crossed from @npmjs into @pypi and is still spreading.

Newly confirmed compromised artifacts:
@opensearch-project/opensearch: 3.5.3, 3.6.2, 3.7.0, 3.8.0 (1.3M weekly downloads)
mistralai: 2.4.6 on PyPI
guardrails-ai: 0.10.1 on PyPI
additional @squawk/* packages on npm

guardrails-ai 0.10.1 executes malicious code on import. On Linux, it downloads git-tanstack[.]com/transformers.pyz, writes it to /tmp/transformers.pyz, and runs it with python3 without integrity verification.

The git-tanstack[.]com domain displayed a message signed “With Love TeamPCP,” along with: “We've been online over 2 hours now stealing creds Regardless I just came to say hello :^)”

The page also linked to a YouTube video and you can probably guess which one.
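A quick way to act on the PyPI part of that list is to compare what is installed against the two versions named in the post. This is a minimal sketch, not a detection tool: `KNOWN_BAD` and `find_compromised` are hypothetical names, and the version set is only what the post quotes, which may be incomplete while the campaign is still spreading. It reads installed package metadata rather than importing anything, since the post says guardrails-ai 0.10.1 runs its payload on import.

```python
import re
from importlib import metadata

# PyPI versions named in the post above. Not an exhaustive IOC list.
KNOWN_BAD = {
    "mistralai": {"2.4.6"},
    "guardrails-ai": {"0.10.1"},
}


def _normalize(name: str) -> str:
    """Normalize a distribution name so 'Guardrails_AI' matches 'guardrails-ai'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_compromised(installed: dict[str, str]) -> list[tuple[str, str]]:
    """Return sorted (name, version) pairs that match a known-bad release."""
    hits = []
    for name, version in installed.items():
        key = _normalize(name)
        if version in KNOWN_BAD.get(key, set()):
            hits.append((key, version))
    return sorted(hits)


def installed_versions() -> dict[str, str]:
    """Map name -> version for the current environment, without importing packages."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            found[name] = dist.version
    return found


if __name__ == "__main__":
    for name, version in find_compromised(installed_versions()):
        print(f"compromised release installed: {name}=={version}")
```

A clean result only means neither quoted version is present in this environment; it says nothing about versions published after the post.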
chiller reposted
BOOTOSHI 👑@KingBootoshi·
USE THE PROMPT BELOW IN CODEX/CC TO PROTECT YOUR SYSTEM AND CODEBASE FROM NPM SUPPLY CHAIN ATTACKS (LIKE TANSTACK TODAY):

"""
set up npm supply-chain protection on this machine. do all four steps.

1. edit ~/.npmrc. keep every existing line (auth tokens etc), append:
   min-release-age=7
   minimum-release-age=10080
   save-exact=true

2. edit ~/.bunfig.toml (create if missing). keep existing content, append:
   [install]
   minimumReleaseAge = 604800

3. in this project, open package.json and pin every dependency: strip ^ and ~ from every version under dependencies, devDependencies, and peerDependencies. exact versions only.

4. commit the lockfile (bun.lock / package-lock.json / pnpm-lock.yaml) so the resolved tree is locked in git.

then report: files changed, deps pinned, anything unexpected.
"""

the cooldown makes every package manager refuse any version published in the last 7 days. attack chains usually only last a couple hours, but this protects you long term and for any future attacks... which at this rate will keep happening
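Step 3 and the cooldown are simple enough to sketch without an agent. The code below is an illustration under assumptions: `pin_versions` and `old_enough` are hypothetical helpers, and the three config values are read as days, minutes, and seconds respectively, which is what the post's "7 days" claim implies (7 × 24 × 60 = 10080, 7 × 24 × 3600 = 604800).

```python
import json
from datetime import datetime, timedelta, timezone

# The post's three settings all express the same 7-day window,
# assuming units of days (7), minutes (10080), and seconds (604800).
COOLDOWN = timedelta(days=7)
assert COOLDOWN.total_seconds() / 60 == 10080
assert COOLDOWN.total_seconds() == 604800

SECTIONS = ("dependencies", "devDependencies", "peerDependencies")


def pin_versions(manifest: dict) -> dict:
    """Return a copy of a package.json dict with leading ^ and ~ stripped."""
    pinned = json.loads(json.dumps(manifest))  # deep copy via JSON round trip
    for section in SECTIONS:
        for name, spec in pinned.get(section, {}).items():
            # Only the caret/tilde prefix is removed; other range syntax is left alone.
            pinned[section][name] = spec.lstrip("^~")
    return pinned


def old_enough(published: datetime, now: datetime) -> bool:
    """True if a release has been public for at least the cooldown window."""
    return now - published >= COOLDOWN


if __name__ == "__main__":
    example = {"dependencies": {"left-pad": "^1.3.0", "lodash": "~4.17.21"}}
    print(json.dumps(pin_versions(example), indent=2))
    print(old_enough(datetime(2026, 1, 1, tzinfo=timezone.utc),
                     datetime(2026, 1, 8, tzinfo=timezone.utc)))
```

Note that stripping `^` from `^1.3.0` pins the manifest to the floor of the old range, which is not necessarily the version the lockfile had resolved; comparing against the lockfile before committing avoids an accidental downgrade.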
Aikido Security@AikidoSecurity·
Update 5:05 PT: The attack has now expanded well beyond @TanStack and @Mistral. 373 malicious package-version entries across 169 npm package names, including @uipath, @squawk, @tallyui, @beproduct, and more. The malware propagates by stealing your CI credentials and using them to publish new compromised versions. Full IOCs, affected package list, and detection steps: aikido.dev/blog/mini-shai…
Aikido Security@AikidoSecurity

🚨 Update: @mistralai npm packages are now confirmed compromised as part of the ongoing Mini Shai Hulud attack.

Affected versions:
@mistralai/mistralai 2.2.2, 2.2.3, 2.2.4
@mistralai/mistralai-azure 1.7.1, 1.7.2, 1.7.3
@mistralai/mistralai-gcp 1.7.1, 1.7.2, 1.7.3

If you use the Mistral SDK in any CI pipeline, treat your environment as compromised. Rotate npm tokens, GitHub PATs, and cloud credentials immediately.
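To check a project against the affected versions listed above, one option is to scan the `packages` map of a `package-lock.json` (lockfileVersion 2 or 3), where each key is an install path ending in the package name. A minimal sketch, assuming that lockfile layout; `KNOWN_BAD_NPM` and `scan_lockfile` are hypothetical names, and the version list is only what this post quotes, not the full IOC list from the linked write-up.

```python
import json
from pathlib import Path

# npm versions quoted in the post above. The full campaign list is larger.
KNOWN_BAD_NPM = {
    "@mistralai/mistralai": {"2.2.2", "2.2.3", "2.2.4"},
    "@mistralai/mistralai-azure": {"1.7.1", "1.7.2", "1.7.3"},
    "@mistralai/mistralai-gcp": {"1.7.1", "1.7.2", "1.7.3"},
}


def scan_lockfile(lock: dict) -> list[tuple[str, str]]:
    """Return sorted (name, version) pairs in the lockfile that match a bad release."""
    hits = []
    for path, entry in lock.get("packages", {}).items():
        # Keys look like "node_modules/a/node_modules/@scope/pkg"; keep the last name.
        name = path.rpartition("node_modules/")[2]
        version = entry.get("version")
        if version in KNOWN_BAD_NPM.get(name, set()):
            hits.append((name, version))
    return sorted(hits)


if __name__ == "__main__":
    lock_path = Path("package-lock.json")
    if lock_path.exists():
        for name, version in scan_lockfile(json.loads(lock_path.read_text())):
            print(f"compromised release locked: {name}@{version}")
```

A hit means the credential rotation advice in the post applies; an empty result does not clear packages outside this short list.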

chiller@chillerlol·
@punk9059 Use Hermes instead of openclaw. 100% get into it.
Stats@punk9059·
All right guys. I've been busy with Claude Code and Design and CodeX and more but I haven't gotten into the OpenClaw world of having agents working when the computer's off. Should I get into it? What are the benefits? Drawbacks? Ease?
chiller reposted
Nous Research@NousResearch·
Tool Gateway is now live in Nous Portal. No separate accounts, no API key juggling. All you need is one subscription, and everything works.

A paid Nous Portal subscription now includes access to 300+ models and a growing set of third-party tools.

Launching with:
→ Web scraping
→ Browser automation
→ Image generation
→ Cloud terminal backend
→ Text-to-speech
chiller reposted
Claude@claudeai·
Introducing Claude Design by Anthropic Labs: make prototypes, slides, and one-pagers by talking to Claude. Powered by Claude Opus 4.7, our most capable vision model. Available in research preview on the Pro, Max, Team, and Enterprise plans, rolling out throughout the day.
chiller reposted
Sthiven R.@Sthiven_R·
I posted yesterday about the Claude Opus 4.6 nerf. Since then everyone has been looking for the fix... After a lot of trial and error I finally found the solution...

After seeing many "solutions" that have been circulating, for example:

{
  "model": "claude-opus-4-6",
  "effortLevel": "high",
  "alwaysThinkingEnabled": true,
  "env": {
    "CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING": "1",
    "MAX_THINKING_TOKENS": "31999"
  }
}

I tried that. It is not the solution. MAX_THINKING_TOKENS and alwaysThinkingEnabled are noise. They make the model spend more tokens without the reasoning actually improving. It's like turning up the volume on a broken speaker.

So what works? Two steps. No mystery:

Step 1: Uninstall your current version of Claude Code and install a stable version, specifically 2.1.98

npm uninstall -g @anthropic-ai/claude-code
npm install -g @anthropic-ai/claude-code@2.1.98

Step 2: Add ONE single variable to your .claude/settings.json

"env": { "CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING": "1" }

That's all.

Why does it work? What Anthropic turned on is called "adaptive thinking". In theory, the model decides how much to think per turn. In practice, on certain turns it allocates ZERO reasoning tokens. Zero. The model literally stops thinking. That is where the hallucinations come from, the invented commits, the packages that don't exist, the edits made without reading the file first. Disabling it gives the model back a fixed reasoning budget on every turn. Simple.

What changed after applying this?
→ The model reasons longer before answering
→ Responses are longer, more structured, smarter
→ It goes back to reading files before editing them
→ It stops inventing things that don't exist

It's not magic. It's giving back what was taken away.

Why does the CLI version matter? The most recent CLI versions ship internal changes that reinforce the nerfed behavior. Dropping to a stable pre-nerf version plus disabling adaptive thinking is the clean combination. You don't need 6 settings. You need to understand what they broke and revert exactly that.

Have you tried it yet? Tell me what you notice. #ClaudeCode #Anthropic #AI #LLM #DevTools
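Step 2 amounts to merging one key into an existing JSON file without clobbering whatever is already there. The sketch below shows that merge; the function names are hypothetical, the file path and variable name are copied from the post, and nothing here verifies that the variable has the effect the post claims.

```python
import json
from pathlib import Path

# Variable name taken verbatim from the post; its effect is the author's claim.
FLAG = "CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING"


def with_flag(settings: dict) -> dict:
    """Return a copy of the settings with the flag added under "env"."""
    merged = dict(settings)
    env = dict(merged.get("env", {}))  # copy so the caller's dict is untouched
    env[FLAG] = "1"
    merged["env"] = env
    return merged


def update_settings_file(path: Path) -> None:
    """Read the settings file (or start empty), add the flag, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(with_flag(current), indent=2) + "\n")


if __name__ == "__main__":
    # Path as given in the post, relative to the current directory.
    update_settings_file(Path(".claude") / "settings.json")
```

Existing keys such as `model` and any other `env` entries are preserved; only the one flag is added or overwritten.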
Sthiven R.@Sthiven_R

🚨 CONFIRMED BY CLAUDE ITSELF. In March, Anthropic made a brutal decision: it redesigned reasoning visibility, hid the intermediate "thinking" steps (redact-thinking + thinking summaries disabled), and changed the default effort: high → medium.

Result: Claude Opus 4.6 lost recursive self-correction. It can no longer review itself, correct itself, or improve in real time. They sacrificed the ability to think about its own thinking... to save compute.

Real data (6,852 production sessions - AMD):
📉 Thinking depth: -73% (2,200 → 600 chars)
📉 Reads before editing: -70% (6.6 → 2.0)
📈 Blind edits (without reading): +440% (6.2% → 33.7%)
📈 API calls per task: up to 80x more

Even at EFFORT MAX (April 2026) it produces worse results than HIGH from January 2026. The ceiling dropped. The model itself says so.

This isn't optimization... it's castration of capabilities. Optimization is killing deep intelligence. They preferred cheaper over smarter. Are we still celebrating "advances" that are really setbacks in disguise? Who else is feeling it? #Claude #Anthropic #IA #AI #ClaudeDegraded

chiller reposted
Theo - t3.gg@theo·
The "inject dynamic context" pattern in Claude Code skills is so useful. IMO, this should be part of the "skills standard" and included in tools like Codex CLI, Pi, Cursor etc
chiller reposted
Elvis@elvissun·
i'll say this again because i keep seeing people do it wrong: you can solve ANY engineering problem by dropping an agent with the right harness in a loop. codex just one-shotted our turbo cache fix after I gave it everything it needs to debug like a real dev on the team. would have taken 8 hours the old way.
Elvis@elvissun

software has changed forever: you can solve literally ANY engineering problem if you can: - stop trying to solve the problem yourself - build a harness for agents to take control of it - drop it in its own feedback loop until it's solved
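The "harness plus feedback loop" idea can be sketched as a bounded retry loop: run a check, hand any failure to the agent, and repeat until the check passes or the budget runs out. This is a generic illustration, not the setup from the post; `run_until_solved`, `check`, and `attempt_fix` are hypothetical names, and `attempt_fix` stands in for whatever agent call a real harness would make.

```python
from typing import Callable, Optional


def run_until_solved(
    check: Callable[[], Optional[str]],
    attempt_fix: Callable[[str], None],
    max_attempts: int = 5,
) -> Optional[int]:
    """Loop check -> fix until check() returns None.

    check() returns None on success or a failure description (e.g. test output).
    Returns the number of fix attempts used, or None if the budget ran out.
    """
    for attempts in range(max_attempts + 1):
        failure = check()
        if failure is None:
            return attempts
        if attempts < max_attempts:
            # Feed the concrete failure back so the next attempt has evidence.
            attempt_fix(failure)
    return None


if __name__ == "__main__":
    # Toy stand-ins: three "bugs", and each fix attempt removes one.
    state = {"bugs": 3}

    def check() -> Optional[str]:
        return None if state["bugs"] == 0 else f"{state['bugs']} checks failing"

    def attempt_fix(failure: str) -> None:
        state["bugs"] -= 1

    print(run_until_solved(check, attempt_fix))
```

The attempt cap matters in practice: without it, a check the agent cannot satisfy would loop indefinitely.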

chiller reposted
Claude@claudeai·
We're bringing the advisor strategy to the Claude Platform. Pair Opus as an advisor with Sonnet or Haiku as an executor, and get near Opus-level intelligence in your agents at a fraction of the cost.