

Tim✨
@timyangnet
Co-Founder Westar Labs | 🛠️ $STC & AI Explorer | Ex-Chief Architect Weibo (NASDAQ:WB) What we hear is opinion; what we see is perspective. Because this exists, that exists; because this arises, that arises.


Have you ever watched an AI agent run short of context... not because it lacks intelligence, but because it wastes thousands of tokens just reading code?

Run cat or grep a couple of times on a medium-sized file and, suddenly, the context window is half full of information the agent never needed. Do it again on the next question and the problem multiplies. In real codebases, this becomes the silent bottleneck that limits what agents can accomplish.

[pluck] is here to change that equation. It is a code retrieval engine written in Rust, MCP-native (Model Context Protocol), designed specifically so AI agents can read and navigate code intelligently. The measured result: 84% to 88% fewer tokens on typical code reads, while maintaining (or even improving) the ability to understand what really matters.

Instead of serving whole files or arbitrary fragments, pluck chunks at the AST level with Tree-sitter. It understands functions, classes, logical blocks, and signatures. It then combines two worlds:
- Advanced keyword search (fielded BM25F)
- Semantic ranking with static embeddings (no inference at runtime)

The system uses a two-stage cascade + RRF (reciprocal rank fusion) so that natural-language queries and symbol searches work equally well. Everything is indexed locally in a persistent daemon that responds in 0.07 ms (warm p50).

And there is a powerful extra layer: per-session deduplication. If a chunk was already shown in an earlier interaction, pluck replaces it with a lightweight placeholder. That adds another 23% in savings in multi-turn conversations.

pluck does not stop at "search". It offers a rich set of operations designed for agents:
- read → returns a smart outline (signatures + inline helper bodies). Typical savings of 85-88% on large files.
- symbol, peek, impact, and deps → navigate call graphs, imports, and dependencies without having to reconstruct them yourself.
- digest → compresses CI and test logs while keeping the key errors (71% fewer tokens).
- plan → suggests the next 3-5 exploration steps the agent should take.

Most importantly, there is always the --raw mode, which returns exactly what cat or grep would, byte for byte. You never lose the original capability. It is a smart replacement, not a limitation.

pluck is not "yet another search tool". It is infrastructure built for the era of coding agents. The more efficiently agents retrieve relevant context, the further they can go before hitting token limits. REPO 👇
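For readers unfamiliar with the RRF fusion step mentioned in the pluck post, here is a minimal Python sketch of reciprocal rank fusion over two ranked lists, one from keyword search and one from semantic ranking. This is not pluck's code (pluck is written in Rust and its internals are not shown in the post); the function name `rrf_fuse`, the chunk identifiers, and the constant `k=60` (a commonly used default for RRF) are all assumptions for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of chunk ids with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) to a chunk's score, with rank
    starting at 1. k=60 is an assumed default, not a value taken from pluck.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is stable.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


# Hypothetical results for one query from the two retrieval paths.
keyword_hits = ["parse_config", "load_file", "main"]
semantic_hits = ["load_file", "read_settings", "parse_config"]

fused = rrf_fuse([keyword_hits, semantic_hits])
print(fused)
```

Chunks that rank well in both lists rise to the top, which is why fusion of this kind helps symbol-style and natural-language queries share one result list.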

Anthropic is pushing HTML artifacts as the future of AI workflows. What they're not telling you: a markdown report costs ~800 tokens. The same content in styled HTML costs 2,500-4,000. That's 3-5x more tokens burned on divs and CSS instead of reasoning and depth. More tokens spent per task means more API calls. More API calls means more revenue. The incentive is right there. I steelmanned every major argument for HTML-first workflows and pressure-tested what holds up. One out of five survived.

Most agentic CLIs are built in TypeScript. Here’s why that’s a mistake (and you should use Go instead):

We'll use Michael @maximilien as an example… He built his Weave CLI with Go. He's also the former CTO at IBM and former Chairperson of the NodeJS Foundation. So this is not a “TypeScript is bad” take... He knows the ecosystem deeply.

But when he started building Weave CLI, an open-source tool for production RAG across 11 vector databases, the constraints were different. He needed something that could run anywhere. And this is where Go shines. It has no:
• npm install
• Python virtual envs
• uv issues
• JVM setup
• Broken package registries
• On-prem network restrictions

Just download the binary, make it executable, and run it.

Weave CLI has to:
• Spin up vector databases
• Ingest documents
• Run RAG agents
• Compare embeddings
• Benchmark configurations
• Monitor traces & experiments with Opik by @Cometml

For this kind of infrastructure tooling, installation friction is product friction. If users can’t run it easily, they won’t use it.

But there's a deeper lesson in this: Don’t pick your stack based on the herd. Pick it based on what the system needs. For frontend-heavy agent apps, TypeScript may be the right choice. For infra-heavy CLIs and TUIs that need to run anywhere, Go is hard to beat.

Full Weave CLI case study in Decoding AI Magazine: decodingai.com/p/ship-rag-wit…

Thinky's secret plan:
1: Increase Human<->AI bandwidth
2: Raise ceiling of human+AI intelligence
3: Help humans continue as main-characters in the new world

We are at Step 1. Interaction Models are great real-time collaborative tools for humans. Here's a preview:


Yesterday @coinbase experienced a multi-hour service disruption affecting trading, exchange access, and balance updates. Here's our initial read from Coinbase engineering on what happened, how we recovered, and what we're addressing.

At approximately 23:50 UTC on 2026-05-07, our monitoring detected cascading quote failures from internal services that triggered multiple Sev1 incidents, which engineering immediately began investigating. Customer-facing impacts included spot trading, Prime, International and derivative exchanges.

Root cause: a thermal event (cooling system failure) inside a subset of racks within a single building in AWS us-east-1. We run a primary replica of our exchange infrastructure in a single zone, consistent with industry standards to reduce latency. To prepare for failures like this, we maintain a distributed standby, but during this incident, failures in the primary zone that were designed to be isolated were not, extending the duration of our outage.

The failure cascaded down two paths:
1. Multiple hardware components beneath our exchange’s matching engine failed, requiring recovery and failover
2. Distributed Kafka clusters that manage messaging across Coinbase systems failed to remain available, also requiring partition failovers to new hardware brokers with many TiBs of data

After we isolated the incident, automated tooling drained ~10 Kubernetes clusters' worth of related workloads out of the affected zone to stabilize internal services. Most services were back to normal within ~30 minutes of diagnosis. The two things we couldn't automatically drain: the exchange (dedicated hardware and storage) and Kafka (managed service that was designed to be resilient to this, with unique problems).

The exchange matching engine is the core system responsible for processing orders and maintaining order books. It is a distributed cluster and requires quorum to safely elect a leader and continue processing trading activity. During the incident, infrastructure-level constraints in the affected datacenter left only a subset of nodes healthy, preventing the cluster from reaching quorum. As a result, trading across Retail, Advanced, and Institutional exchanges was blocked. Recovery required our oncall and engineering teams to execute our disaster recovery plan, restore quorum safely, and validate system health under constrained infrastructure conditions. The team built, tested, deployed, and validated the fix while continuing to manage the broader incident.

Kafka recovery was a much larger scale operation. Our primary managed Kafka partitions process many terabytes of data daily and are designed with resiliency guarantees for uninterrupted operation during a datacenter failure just like this. In this case, those guarantees failed and required manual recovery. We again relied on disaster recovery procedures to recover stuck partitions onto new hardware (brokers), which enabled us to safely bring cross-service messaging back online across Coinbase. During the lag, customers saw delayed balance streams, which resolved automatically once replication caught up. No data lost.

Once the engine came back up as part of our standard runbooks, we re-opened markets carefully: all products to cancel-only mode first, audited product states, then moved all markets to auction mode, before restoring trading on Coinbase Exchange.

What went right: the team. Incident response across the company came together within minutes, followed well-rehearsed playbooks and used secure automation tooling to recover all services. We have a strong, senior team at Coinbase that worked through rare failure modes to recover all services.

To our customers: losing access to your account, even temporarily, is unacceptable. We know that. We're sorry, and we’ll publish a full root cause analysis in the coming weeks 🙏
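To make the quorum point in the Coinbase post concrete, here is a small Python sketch of a majority-quorum check. The post does not say which consensus protocol or cluster size the matching engine uses, so the strict-majority rule (typical of Raft-style systems), the function name `has_quorum`, and the five-node example are all assumptions.

```python
def has_quorum(healthy_nodes: int, cluster_size: int) -> bool:
    """Return True if a strict majority of the cluster is healthy.

    Assumes a simple majority quorum; the real matching engine's rule
    is not described in the post.
    """
    if cluster_size <= 0 or not 0 <= healthy_nodes <= cluster_size:
        raise ValueError("need 0 <= healthy_nodes <= cluster_size and cluster_size > 0")
    return healthy_nodes > cluster_size // 2


# Hypothetical five-node cluster: losing three nodes blocks leader election,
# while losing two still leaves a majority.
print(has_quorum(2, 5))
print(has_quorum(3, 5))
```

Under this rule a cluster keeps operating only while more than half of its nodes are healthy, which matches the post's description of a healthy subset that was too small to elect a leader.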






The use of the phrase “not just __, it’s a __” (staple of AI-generated text) has risen sharply in SEC company filings:

Polymarket prices are highly accurate in predicting future events. The source of that accuracy is less obvious. In a new working paper, we find it is not the “wisdom of crowds,” but a small minority of informed traders. Fewer than 3% of accounts appear to drive price discovery; most perform no better than chance. The majority generates most of the volume but little of the information, effectively funding the informed minority. Check the paper here: papers.ssrn.com/sol3/papers.cf…




