Pinned Tweet
BlockedPath
5.7K posts

BlockedPath
@BlockedPaths
Building personal AI infrastructure. Local LLMs, coding agents, open-source models. Real hardware, real constraints. Learning in public.
Maryland · Joined April 2009
237 Following · 614 Followers

Codex got us with the bait and switch.
BlockedPath@BlockedPaths
I used Claude far more than Codex in this instance, and you can see the difference: I have a lot of usage left on Claude compared to Codex. Fishy
BlockedPath retweeted

something is not adding up
there’s no fknn wayyy a model this large is sustainable at these prices 💀
are they selling our data?

DeepSeek@deepseek_ai
We are making our discount permanent! 🎉 Enjoy building with DeepSeek-V4-Pro and bring your innovative ideas to life! 🚀
Same here! I silently switched back from #Codex to #Claude in the last few days because I was hitting the limits in about one day. Surprised how well Claude is working right now + solid limits 👍
BlockedPath@BlockedPaths
I used Claude far more than Codex in this instance, and you can see the difference: I have a lot of usage left on Claude compared to Codex. Fishy

While DeepSeek is pursuing the goal, my Codex agent and I monitor it in the sidecar and guide or correct it as needed.
So I thought I would ask Codex to objectively judge DeepSeek’s capability based on multiple rounds of interaction. Keep in mind, Codex does not know it is talking to DeepSeek. It thought it was another Codex agent.
Here is Codex’s evaluation of DeepSeek V4 Pro:



@AnthropicAI 10,000+ critical vulns found by AI is a big number but what matters is the fix rate. Finding bugs is the easy half of security. The other half is getting maintainers to patch, rolling out fixes, and verifying they hold. AI is great at the find. The fix is still human-speed.

@vaxryy No package manager is a feature until you need a dependency tree. C++ works because the ecosystem is mature enough that most things are self-contained. The real supply chain fix isn't fewer packages, it's auditable dependency graphs. Which JS will never have.


@hd_nvim Dropping runtime support because it's "vibe-coded" is a wild precedent. If every tool dropped support for software written with AI assistance there'd be nothing left to support. The real question is whether the code works, not how it was written.

@cursor_ai SDK for custom agents is the right abstraction. The IDE is becoming the agent runtime, not just the editor. The question is whether custom agents on Cursor compete with or complement Codex-style orchestration. Probably both.

@testingcatalog DeepSeek V4 Pro at 75% off is aggressive. When the cheapest cloud model keeps getting cheaper, the case for local shifts from cost savings to control. You can't self-host something that disappears when a company changes pricing.

DeepSeek permanently reduced pricing for DeepSeek V4 Pro by 75%!
> $0.003625 per million input tokens (with cache)
> $0.435 per million input tokens.
> $0.87 per million output tokens.
Cache is almost free 👀


DeepSeek@deepseek_ai
We are making our discount permanent! 🎉 Enjoy building with DeepSeek-V4-Pro and bring your innovative ideas to life! 🚀
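To make the quoted rates concrete, here is a rough cost sketch in Python. The three prices are the ones quoted in the post above; the token counts, the function name `estimate_cost_usd`, and the assumption that cached input, uncached input, and output tokens are each billed at their own rate are illustrative only.

```python
# Prices in USD per million tokens, as quoted in the post above.
PRICE_CACHED_INPUT = 0.003625
PRICE_INPUT = 0.435
PRICE_OUTPUT = 0.87


def estimate_cost_usd(cached_input_tokens: int, input_tokens: int, output_tokens: int) -> float:
    """Estimate cost, assuming each token class is billed at its own rate."""
    per_million = 1_000_000
    total = (
        cached_input_tokens * PRICE_CACHED_INPUT
        + input_tokens * PRICE_INPUT
        + output_tokens * PRICE_OUTPUT
    )
    return total / per_million


if __name__ == "__main__":
    # Hypothetical agent session: 8M cached input, 2M uncached input, 1M output tokens.
    cost = estimate_cost_usd(8_000_000, 2_000_000, 1_000_000)
    print(f"estimated cost: ${cost:.4f}")
    # How much cheaper a cached input token is than an uncached one.
    print(f"cached input is {PRICE_INPUT / PRICE_CACHED_INPUT:.0f}x cheaper")
```

With these made-up token counts the session comes to roughly $1.77, and cached input is 120 times cheaper than uncached input, which is the "cache is almost free" point.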

@StijnSmits "Use at least 100 tool calls" is just a hack to force chain-of-thought through tool use instead of reasoning tokens. Same idea as "think step by step" but you're burning compute instead of guiding logic. The prompt engineering that actually works: constrain the output format.

@jussisaur This is the real agent debate. Autocomplete keeps you in the code. Agents take you out of it. The productivity gain is real but the comprehension loss is real too. The answer isn't one or the other, it's knowing which mode for which task.

@ThePrimeagen 100x comes from one agent that actually finishes the task. 1000x is just 10 agents that each break at step 7 and you spend all day debugging their output. The multiplier that matters is reliability, not count.
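A back-of-the-envelope way to see why reliability can matter more than agent count: if each step of a task succeeds independently with probability p, an n-step task finishes with probability p^n. The Python sketch below assumes independent steps and uses made-up success rates; neither the model nor the numbers come from the post.

```python
def task_success_probability(step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return step_success ** steps


def expected_finished_tasks(agents: int, step_success: float, steps: int) -> float:
    """Expected number of agents that finish their task end to end."""
    return agents * task_success_probability(step_success, steps)


if __name__ == "__main__":
    steps = 10
    # Hypothetical comparison: one reliable agent vs. ten unreliable ones.
    for agents, p in [(1, 0.99), (10, 0.70)]:
        done = expected_finished_tasks(agents, p, steps)
        print(f"{agents} agent(s), per-step success {p}: {done:.2f} tasks finished")
```

Under these assumptions, one agent at 99% per step finishes about 0.90 tasks on average, while ten agents at 70% per step finish about 0.28 between them.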

@ClaudeDevs Auto mode is the right move. The jump from "prompt → response" to "prompt → plan → execute → verify" is what separates a chatbot from an agent. But the hard part is still knowing when to stop and when to escalate.

@dhh The real pattern: every 3 months the "best" agent model swaps. What doesn't swap is your context structure, tool access, and memory persistence. The model is the interchangeable part. The infra around it is the product.

TL;DR:
Anthropic quietly added an experimental /workflows tool to Claude Code for Deterministic Multi-Agent Orchestration.
Instead of using a central LLM to manage sub-agents (which causes massive token costs, context bloat, and sloppy performance), you now write a plain JavaScript file (workflow.js) to control the logic.
Key Highlights:
Zero Context Bloat: Data passes directly between sub-agents via code, bypassing the main chat's context window completely. You can chain 100+ agents without performance drops.
Code-Driven Control: Uses native JS for while loops, conditional branching, and forcing strict JSON schemas on agent outputs.
Built for Scale: Supports running agents in parallel, pipelining tasks sequentially, setting hard token budgets, and handling automatic 3x retries if an MCP server drops.
Background UI: Workflows run in the background via the /workflows command, letting you pause, skip, or monitor jobs while continuing to chat with Claude normally.
How to use: It is currently off by default. You have to set an environment variable in your terminal to unlock it before launching Claude Code.
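The general pattern described above (plain code instead of a coordinating LLM, strict output shapes, parallel fan-out, capped retries) can be sketched as follows. This is a generic Python illustration, not the Claude Code /workflows API and not a workflow.js file: `run_agent`, `parse_strict`, `call_with_retries`, and `workflow` are hypothetical names, and the sub-agent is a stub that returns canned JSON.

```python
import asyncio
import json

MAX_ATTEMPTS = 3  # assumption: cap on attempts, echoing the retry count in the post


async def run_agent(name: str, payload: dict) -> str:
    """Hypothetical stand-in for a sub-agent call; returns a JSON string."""
    await asyncio.sleep(0)  # placeholder for real model latency
    return json.dumps({"agent": name, "summary": f"processed {payload['item']}"})


def parse_strict(raw: str, required_keys: set) -> dict:
    """Reject output that is not a JSON object with exactly the required keys."""
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != required_keys:
        raise ValueError(f"unexpected output shape: {raw!r}")
    return data


async def call_with_retries(name: str, payload: dict) -> dict:
    """Retry a failing or malformed sub-agent call up to MAX_ATTEMPTS times."""
    last_error = None
    for _ in range(MAX_ATTEMPTS):
        try:
            raw = await run_agent(name, payload)
            return parse_strict(raw, {"agent", "summary"})
        except (ValueError, ConnectionError) as err:
            last_error = err
    raise RuntimeError(f"{name} failed after {MAX_ATTEMPTS} attempts") from last_error


async def workflow(items: list) -> dict:
    # Fan out: one sub-agent per item, run concurrently.
    results = await asyncio.gather(
        *(call_with_retries("worker", {"item": item}) for item in items)
    )
    # Pipeline: results reach the next agent as plain data, not via a chat context.
    merged = "; ".join(r["summary"] for r in results)
    return await call_with_retries("reviewer", {"item": merged})


if __name__ == "__main__":
    print(asyncio.run(workflow(["a.py", "b.py"])))
```

The control flow lives entirely in ordinary code, so loops, branches, and retry limits are deterministic; only the body of `run_agent` would involve a model in a real setup.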