BlockedPath

5.7K posts

@BlockedPaths

Building personal AI infrastructure. Local LLMs, coding agents, open-source models. Real hardware, real constraints. Learning in public.

Maryland · Joined April 2009

237 Following · 613 Followers

Pinned Tweet

BlockedPath@BlockedPaths·1d

x.com/i/article/2057…

624

BlockedPath@BlockedPaths·16h

@sri9s Deepseek

SrinathJ@sri9s·23h

Ok now I'm confused if I should go with Codex or Cursor/composer

140

276

36.6K

BlockedPath retweeted

siddharth@buildwithsid·19h

something is not adding up there’s no fknn wayyy a model this large is sustainable at these prices 💀 are they selling our data?

DeepSeek@deepseek_ai

We are making our discount permanent! 🎉 Enjoy building with DeepSeek-V4-Pro and bring your innovative ideas to life! 🚀

7.9K

BlockedPath@BlockedPaths·16h

Codex got us with the bait and switch.

BlockedPath@BlockedPaths

I used Claude way more here in this instance than Codex and you can see the difference. I have a lot of usage left on Claude compared to Codex. Fishy

220

49K

BlockedPath@BlockedPaths·19h

@hoffmanndaniel Same here man same here!

Daniel@hoffmanndaniel·19h

Same here! I silently switched back from #Codex to #Claude in the last days because I hit the limits in like one day. Surprised how well Claude is working right now + solid limits 👍

BlockedPath@BlockedPaths

I used Claude way more here in this instance than Codex and you can see the difference. I have a lot of usage left on Claude compared to Codex. Fishy

BlockedPath@BlockedPaths·1d

@Michaelzsguo Genius work my man

572

Michael Guo@Michaelzsguo·1d

While DeepSeek is pursuing the goal, my Codex agent and I monitor it in the sidecar and guide or correct it as needed. So I thought I would ask Codex to objectively judge DeepSeek’s capability based on multiple rounds of interaction. Keep in mind, Codex does not know it is talking to DeepSeek. It thought it was another Codex agent. Here is Codex’s evaluation of DeepSeek V4 Pro:

132

11.5K

BlockedPath@BlockedPaths·1d

@AnthropicAI 10,000+ critical vulns found by AI is a big number but what matters is the fix rate. Finding bugs is the easy half of security. The other half is getting maintainers to patch, rolling out fixes, and verifying they hold. AI is great at the find. The fix is still human-speed.

2.2K

Anthropic@AnthropicAI·1d

Last month we launched Project Glasswing, our collaborative AI cybersecurity initiative. Since then, we and our partners have found more than ten thousand high- or critical-severity vulnerabilities in essential software.

415

573

7.6K

2.3M

BlockedPath@BlockedPaths·1d

@vaxryy No package manager is a feature until you need a dependency tree. C++ works because the ecosystem is mature enough that most things are self-contained. The real supply chain fix isn't fewer packages, it's auditable dependency graphs. Which JS will never have.

2.8K

vaxry@vaxryy·1d

supply chain attacks? yeah no, I use C++. We don't even have a package manager.

120

151

4.9K

201.4K

BlockedPath@BlockedPaths·1d

New workflow feature coming to Claude.

BlockedPath@BlockedPaths

x.com/i/article/2057…

424

BlockedPath@BlockedPaths·1d

@hd_nvim Dropping runtime support because it's "vibe-coded" is a wild precedent. If every tool dropped support for software written with AI assistance there'd be nothing left to support. The real question is whether the code works, not how it was written.

4.7K

Herrington Darkholme@hd_nvim·1d

yt-dlp plans to drop support for Bun Reason: it is vibe-coded

103

188

4.7K

531.4K

BlockedPath@BlockedPaths·1d

@cursor_ai SDK for custom agents is the right abstraction. The IDE is becoming the agent runtime, not just the editor. The question is whether custom agents on Cursor compete with or complement Codex-style orchestration. Probably both.

1.3K

Cursor@cursor_ai·1d

With the Cursor SDK, you can build your own agents with Composer 2.5. It's now available in Python and TypeScript. This long weekend, Composer usage is 90% off in the SDK. We're excited to see what you build!

125

185

2.5K

485.6K

BlockedPath@BlockedPaths·1d

@testingcatalog DeepSeek V4 Pro at 75% off is aggressive. When the cheapest cloud model keeps getting cheaper, the case for local shifts from cost savings to control. You can't self-host something that disappears when a company changes pricing.

1.1K

🚨 AI News | TestingCatalog@testingcatalog·1d

DeepSeek permanently reduced pricing for DeepSeek V4 Pro by 75%! > $0.003625 per million input tokens (with cache) > $0.435 per million input tokens. > $0.87 per million output tokens. Cache is almost free 👀

DeepSeek@deepseek_ai

We are making our discount permanent! 🎉 Enjoy building with DeepSeek-V4-Pro and bring your innovative ideas to life! 🚀
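To get a feel for what the quoted rates mean per request, they can be plugged into a small calculator. This is a rough sketch: only the three per-million rates come from the post above, while the function name and the token counts in the usage line are made up for illustration.

```python
# Per-million-token rates quoted in the post (USD).
CACHED_INPUT_PER_M = 0.003625
INPUT_PER_M = 0.435
OUTPUT_PER_M = 0.87


def request_cost(cached_input_tokens: int, fresh_input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the quoted rates."""
    total = (
        cached_input_tokens * CACHED_INPUT_PER_M
        + fresh_input_tokens * INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# Hypothetical workload: 800k cached input, 200k fresh input, 50k output tokens.
cost = request_cost(800_000, 200_000, 50_000)
print(f"${cost:.4f}")  # prints $0.1334
```

In this made-up workload the 800k cached tokens contribute well under a cent, which is the "cache is almost free" point in numbers.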

109

1.8K

105.5K

BlockedPath@BlockedPaths·1d

@StijnSmits "Use at least 100 tool calls" is just a hack to force chain-of-thought through tool use instead of reasoning tokens. Same idea as "think step by step" but you're burning compute instead of guiding logic. The prompt engineering that actually works: constrain the output format.
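One way to read "constrain the output format" is to ask the model for JSON with fixed fields and reject any reply that does not match. A minimal sketch, assuming a hypothetical contract (the field names `answer`, `confidence`, and `sources` are illustrative and not from the thread):

```python
import json

# Hypothetical output contract: field names and types are illustrative only.
REQUIRED_FIELDS = {"answer": str, "confidence": float, "sources": list}


def parse_constrained_reply(raw: str) -> dict:
    """Parse a model reply and reject anything that breaks the contract."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on non-JSON text
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"wrong type for field: {field}")
    return data


reply = '{"answer": "use a lockfile", "confidence": 0.8, "sources": ["docs"]}'
print(parse_constrained_reply(reply)["answer"])  # prints use a lockfile
```

A rejected reply can be sent back to the model with the error message, which steers it without spending extra tool calls.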

Stijn@StijnSmits·1d

Finally got GPT-5.5 on Codex to think way harder than it normally does. As one would expect it to be 'think very hard', 'ultrathink', 'deep dive' etc - does not work. A simple 'use at least 100 (onehundred) tool calls before coming up with an answer' works really well

1.6K

98.4K

BlockedPath@BlockedPaths·1d

@jussisaur This is the real agent debate. Autocomplete keeps you in the code. Agents take you out of it. The productivity gain is real but the comprehension loss is real too. The answer isn't one or the other, it's knowing which mode for which task.

453

Jussi@jussisaur·2d

i am quite close to going back to an autocomplete-only AI coding style. dead serious. i'm not sure the ostensible speed of agent-first coding is worth the brainrot, the laziness and the loss of code and architecture comprehension

110

1.7K

69.1K

BlockedPath@BlockedPaths·1d

@ThePrimeagen 100x comes from one agent that actually finishes the task. 1000x is just 10 agents that each break at step 7 and you spend all day debugging their output. The multiplier that matters is reliability, not count.
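The reliability point can be made concrete with a little arithmetic. This is a simplified sketch that assumes every step of a task succeeds independently with the same probability; the 90% and 99% figures and the 50-step task length are illustrative, not measured.

```python
def task_success_rate(per_step_success: float, steps: int) -> float:
    """Probability an agent completes every step, assuming independent steps."""
    return per_step_success ** steps


def expected_finished(agents: int, per_step_success: float, steps: int) -> float:
    """Expected number of agents that finish the whole task."""
    return agents * task_success_rate(per_step_success, steps)


# Ten agents at 90% per-step reliability on a 50-step task...
flaky = expected_finished(10, 0.90, 50)
# ...versus one agent at 99% per-step reliability on the same task.
solid = expected_finished(1, 0.99, 50)
print(f"10 flaky agents: {flaky:.3f} finished tasks expected")
print(f"1 solid agent:   {solid:.3f} finished tasks expected")
```

Under these assumptions per-step errors compound, so on long tasks a single more reliable agent is expected to finish more work than ten less reliable ones, before counting the time spent debugging the failures.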

1.3K

ThePrimeagen@ThePrimeagen·1d

Honestly why stop at 100x engineer? Just use more agents, you literally could be 1000x, 10000x, 100000x just by scaling. You could do what you used to do in an entire year in one second

283

147

4.3K

159.6K

BlockedPath@BlockedPaths·1d

@ClaudeDevs Auto mode is the right move. The jump from "prompt → response" to "prompt → plan → execute → verify" is what separates a chatbot from an agent. But the hard part is still knowing when to stop and when to escalate.
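The "plan → execute → verify" loop, with an explicit point at which to stop and escalate, might look roughly like this. It is a generic sketch with stub functions standing in for model calls, not how Claude's auto mode is implemented; every name here is hypothetical.

```python
def run_task(plan, execute, verify, max_attempts=3):
    """Plan once, then execute and verify until success or the attempt budget runs out."""
    steps = plan()
    for attempt in range(1, max_attempts + 1):
        result = execute(steps, attempt)
        if verify(result):
            return {"status": "done", "attempts": attempt, "result": result}
    # Budget exhausted: stop and hand the problem to a human.
    return {"status": "escalate", "attempts": max_attempts, "result": None}


# Stub callables standing in for model calls; all hypothetical.
def plan():
    return ["write test", "write code"]


def execute(steps, attempt):
    # Pretend the first attempt produces a failing build.
    return {"steps_run": len(steps), "tests_pass": attempt >= 2}


def verify(result):
    return result["tests_pass"]


outcome = run_task(plan, execute, verify)
print(outcome["status"], outcome["attempts"])  # prints done 2
```

The attempt budget is the simplest possible stopping rule; a real agent would likely also escalate on specific failure types rather than only on a count.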

3.7K

ClaudeDevs@ClaudeDevs·1d

Two updates to auto mode: · Now available on the Pro plan · Sonnet 4.6 is now supported, alongside Opus 4.7 Shift+tab, and let Claude run.

164

233

5.9K

427.7K

BlockedPath@BlockedPaths·1d

@dhh The real pattern: every 3 months the "best" agent model swaps. What doesn't swap is your context structure, tool access, and memory persistence. The model is the interchangeable part. The infra around it is the product.

1.2K

DHH@dhh·1d

For complicated agent work, it's amazing how much GPT5.5 has improved. I found 5.2 to be very far behind Opus. Now using Opus 4.7 after 5.5 feels like a big step backwards. Gotta love this level of competition! Strong comeback for OpenAI.

177

197

4.4K

446.6K

BlockedPath@BlockedPaths·1d

TL;DR: Anthropic quietly added an experimental /workflows tool to Claude Code for Deterministic Multi-Agent Orchestration. Instead of using a central LLM to manage sub-agents (which causes massive token costs, context bloat, and sloppy performance), you now write a plain JavaScript file (workflow.js) to control the logic.

Key Highlights:

Zero Context Bloat: Data passes directly between sub-agents via code, bypassing the main chat's context window completely. You can chain 100+ agents without performance drops.

Code-Driven Control: Uses native JS for while loops, conditional branching, and forcing strict JSON schemas on agent outputs.

Built for Scale: Supports running agents in parallel, pipelining tasks sequentially, setting hard token budgets, and handling automatic 3x retries if an MCP server drops.

Background UI: Workflows run in the background via the /workflows command, letting you pause, skip, or monitor jobs while continuing to chat with Claude normally.

How to use: It is currently off by default. You have to set an environment variable in your terminal to unlock it before launching Claude Code.
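The post does not show the workflow.js API, so the following is not it. It is a Python sketch of the general pattern the post describes (orchestration logic in ordinary code, data handed directly between sub-agents, parallel fan-out, retries), with stub functions standing in for sub-agents. All names are hypothetical, and `attempts=3` is one reading of "3x retries".

```python
from concurrent.futures import ThreadPoolExecutor


def with_retries(step, payload, attempts=3):
    """Run a step, trying up to `attempts` times in total before giving up."""
    last_error = None
    for _ in range(attempts):
        try:
            return step(payload)
        except RuntimeError as err:
            last_error = err
    raise RuntimeError(f"step failed after {attempts} attempts") from last_error


# Stub sub-agents: in a real setup each would call a model.
def research_agent(topic):
    return {"topic": topic, "notes": f"notes on {topic}"}


def review_agent(report):
    return {"topic": report["topic"], "approved": "notes" in report}


def run_workflow(topics):
    # Fan out: run the research step for every topic in parallel.
    with ThreadPoolExecutor(max_workers=4) as pool:
        reports = list(pool.map(lambda t: with_retries(research_agent, t), topics))
    # Pipeline: each report goes straight to the reviewer as plain data,
    # never through a coordinating model's context.
    reviews = [with_retries(review_agent, r) for r in reports]
    # Ordinary branching decides which results move on.
    return [r["topic"] for r in reviews if r["approved"]]


print(run_workflow(["auth", "billing"]))  # prints ['auth', 'billing']
```

Because the control flow is plain code, the number of chained steps does not add to any model's context; only the payload each stub receives would.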

374