BlockedPath

5.7K posts


@BlockedPaths

Building personal AI infrastructure. Local LLMs, coding agents, open-source models. Real hardware, real constraints. Learning in public.

Maryland · Joined April 2009
237 Following · 613 Followers
SrinathJ
SrinathJ@sri9s·
Ok now I'm confused if I should go with Codex or Cursor/composer
140
1
276
36.6K
Michael Guo
Michael Guo@Michaelzsguo·
While DeepSeek is pursuing the goal, my Codex agent and I monitor it in the sidecar and guide or correct it as needed. So I thought I would ask Codex to objectively judge DeepSeek’s capability based on multiple rounds of interaction. Keep in mind, Codex does not know it is talking to DeepSeek. It thought it was another Codex agent. Here is Codex’s evaluation of DeepSeek V4 Pro:
10
6
132
11.5K
BlockedPath
BlockedPath@BlockedPaths·
@AnthropicAI 10,000+ critical vulns found by AI is a big number but what matters is the fix rate. Finding bugs is the easy half of security. The other half is getting maintainers to patch, rolling out fixes, and verifying they hold. AI is great at the find. The fix is still human-speed.
0
0
3
2.2K
Anthropic
Anthropic@AnthropicAI·
Last month we launched Project Glasswing, our collaborative AI cybersecurity initiative. Since then, we and our partners have found more than ten thousand high- or critical-severity vulnerabilities in essential software.
415
573
7.6K
2.3M
BlockedPath
BlockedPath@BlockedPaths·
@vaxryy No package manager is a feature until you need a dependency tree. C++ works because the ecosystem is mature enough that most things are self-contained. The real supply chain fix isn't fewer packages, it's auditable dependency graphs. Which JS will never have.
1
0
27
2.8K
vaxry
vaxry@vaxryy·
supply chain attacks? yeah no, I use C++. We don't even have a package manager.
120
151
4.9K
201.4K
BlockedPath
BlockedPath@BlockedPaths·
@hd_nvim Dropping runtime support because it's "vibe-coded" is a wild precedent. If every tool dropped support for software written with AI assistance there'd be nothing left to support. The real question is whether the code works, not how it was written.
3
0
38
4.7K
Herrington Darkholme
yt-dlp plans to drop support for Bun Reason: it is vibe-coded
103
188
4.7K
531.4K
BlockedPath
BlockedPath@BlockedPaths·
@cursor_ai SDK for custom agents is the right abstraction. The IDE is becoming the agent runtime, not just the editor. The question is whether custom agents on Cursor compete with or complement Codex-style orchestration. Probably both.
0
0
3
1.3K
Cursor
Cursor@cursor_ai·
With the Cursor SDK, you can build your own agents with Composer 2.5. It's now available in Python and TypeScript. This long weekend, Composer usage is 90% off in the SDK. We're excited to see what you build!
125
185
2.5K
485.6K
BlockedPath
BlockedPath@BlockedPaths·
@testingcatalog DeepSeek V4 Pro at 75% off is aggressive. When the cheapest cloud model keeps getting cheaper, the case for local shifts from cost savings to control. You can't self-host something that disappears when a company changes pricing.
0
0
0
1.1K
BlockedPath
BlockedPath@BlockedPaths·
@StijnSmits "Use at least 100 tool calls" is just a hack to force chain-of-thought through tool use instead of reasoning tokens. Same idea as "think step by step" but you're burning compute instead of guiding logic. The prompt engineering that actually works: constrain the output format.
1
0
1
52
Stijn
Stijn@StijnSmits·
Finally got GPT-5.5 on Codex to think way harder than it normally does. As one would expect it to be 'think very hard', 'ultrathink', 'deep dive' etc - does not work. A simple 'use at least 100 (onehundred) tool calls before coming up with an answer' works really well
60
57
1.6K
98.4K
BlockedPath
BlockedPath@BlockedPaths·
@jussisaur This is the real agent debate. Autocomplete keeps you in the code. Agents take you out of it. The productivity gain is real but the comprehension loss is real too. The answer isn't one or the other, it's knowing which mode for which task.
0
0
1
453
Jussi
Jussi@jussisaur·
i am quite close to going back to an autocomplete-only AI coding style. dead serious. i'm not sure the ostensible speed of agent-first coding is worth the brainrot, the laziness and the loss of code and architecture comprehension
110
82
1.7K
69.1K
BlockedPath
BlockedPath@BlockedPaths·
@ThePrimeagen 100x comes from one agent that actually finishes the task. 1000x is just 10 agents that each break at step 7 and you spend all day debugging their output. The multiplier that matters is reliability, not count.
2
0
2
1.3K
ThePrimeagen
ThePrimeagen@ThePrimeagen·
Honestly why stop at 100x engineer? Just use more agents, you literally could be 1000x, 10000x, 100000x just by scaling. You could do what you used to do in an entire year in one second
283
147
4.3K
159.6K
BlockedPath
BlockedPath@BlockedPaths·
@ClaudeDevs Auto mode is the right move. The jump from "prompt → response" to "prompt → plan → execute → verify" is what separates a chatbot from an agent. But the hard part is still knowing when to stop and when to escalate.
0
0
5
3.7K
ClaudeDevs
ClaudeDevs@ClaudeDevs·
Two updates to auto mode:
· Now available on the Pro plan
· Sonnet 4.6 is now supported, alongside Opus 4.7
Shift+tab, and let Claude run.
164
233
5.9K
427.7K
BlockedPath
BlockedPath@BlockedPaths·
@dhh The real pattern: every 3 months the "best" agent model swaps. What doesn't swap is your context structure, tool access, and memory persistence. The model is the interchangeable part. The infra around it is the product.
0
0
3
1.2K
DHH
DHH@dhh·
For complicated agent work, it's amazing how much GPT5.5 has improved. I found 5.2 to be very far behind Opus. Now using Opus 4.7 after 5.5 feels like a big step backwards. Gotta love this level of competition! Strong comeback for OpenAI.
177
197
4.4K
446.6K
BlockedPath
BlockedPath@BlockedPaths·
TL;DR: Anthropic quietly added an experimental /workflows tool to Claude Code for Deterministic Multi-Agent Orchestration. Instead of using a central LLM to manage sub-agents (which causes massive token costs, context bloat, and sloppy performance), you now write a plain JavaScript file (workflow.js) to control the logic.

Key Highlights:
· Zero Context Bloat: Data passes directly between sub-agents via code, bypassing the main chat's context window completely. You can chain 100+ agents without performance drops.
· Code-Driven Control: Uses native JS for while loops, conditional branching, and forcing strict JSON schemas on agent outputs.
· Built for Scale: Supports running agents in parallel, pipelining tasks sequentially, setting hard token budgets, and handling automatic 3x retries if an MCP server drops.
· Background UI: Workflows run in the background via the /workflows command, letting you pause, skip, or monitor jobs while continuing to chat with Claude normally.

How to use: It is currently off by default. You have to set an environment variable in your terminal to unlock it before launching Claude Code.
3
1
8
374
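
A rough sketch of the code-driven pattern described in the post above: parallel sub-agents, a sequential hand-off through plain variables, a shape check on each output, and up to three attempts per call. This is not Claude Code's actual workflow API, which the post does not show. `runAgent`, `validate`, `withRetry`, and `workflow` are hypothetical names invented for illustration, and `runAgent` is a local stub rather than a real model call.

```javascript
// Hypothetical stand-in for a sub-agent call. NOT Claude Code's real API;
// it just echoes its input so the sketch runs on its own.
async function runAgent(name, input) {
  return { agent: name, summary: `${name} processed: ${input}` };
}

// Minimal shape check standing in for "strict JSON schema" enforcement.
function validate(output, requiredKeys) {
  for (const key of requiredKeys) {
    if (!(key in output)) throw new Error(`missing key: ${key}`);
  }
  return output;
}

// Try fn up to `attempts` times, rethrowing the last error if all fail.
async function withRetry(fn, attempts = 3) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Control flow lives in ordinary code, not in a coordinating model.
async function workflow(topic, agent = runAgent) {
  const keys = ["agent", "summary"];

  // Parallel stage: both sub-agents run concurrently.
  const [research, audit] = await Promise.all(
    ["researcher", "auditor"].map((name) =>
      withRetry(async () => validate(await agent(name, topic), keys))
    )
  );

  // Sequential stage: results are passed along as plain values,
  // so no shared chat context is involved.
  const merged = `${research.summary} | ${audit.summary}`;
  return withRetry(async () => validate(await agent("writer", merged), keys));
}
```

Calling `workflow("auth module")` returns a promise for the final writer output. Swapping the `agent` argument for a real client is the only change needed to point the same control flow at an actual backend.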
BlockedPath
BlockedPath@BlockedPaths·
CLAUDE_CODE_WORKFLOWS=1 try it out
2
1
9
2.4K
BlockedPath reposted
BlockedPath
BlockedPath@BlockedPaths·
New feature I added to the iPhone usage widget!
0
2
5
638