
Taiwan's MediaTek says it supports both TSMC and Intel's advanced packaging technologies reut.rs/4nS9DQM
Ashish Tuli
242 posts

@ashishtuli
Chips. AI. F1. https://t.co/6dAKvFPVYz All views my own.


The speed-of-light optimization for Qwen3.5 on the TokenSpeed inference engine is a significant milestone, achieving a record-breaking 580 tokens per second (tps) for agentic workloads on NVIDIA GPUs.
In the PyTorch Foundation's latest community blog post, you can learn all about the complete design, implementation, and optimization of Qwen3.5 models in the TokenSpeed inference framework and see for yourself how this work is improving performance 👉 bit.ly/4uGUvIS
This achievement was a joint effort between the @Alibaba_Qwen inference team, @lightseekorg Foundation TokenSpeed team, @NVIDIAAI, and the Mooncake team, with special contributions from @tri_dao for FlashAttention-4 (FA4) optimization. @KVCache_AI

⚙️ Behind the build of self-improving tax agents with Codex
We co-built Tax AI with @ThriveHoldings around tax prep workflows so when reviewers fix any errors, Codex can trace the failure, improve the system, and test the change before it ships. openai.com/index/building…

PDOOM ALERT 🚨: ~48% of e2e LLM latency is prefill, ~52% is decode.
Prefill itself breaks into 2 ops:
🟠 Prefill extend (cache write): ingests new context/files, writes fresh KV tokens
🟠 Cache read: reuses existing KV cache from prior turns
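A minimal sketch of how one turn's prompt could be split between the two prefill ops, assuming a simple token-level prefix cache. The function name `split_prefill` and the token IDs are hypothetical, and real inference engines typically match cached prefixes at block granularity rather than per token.

```python
from typing import Sequence, Tuple


def split_prefill(cached: Sequence[int], prompt: Sequence[int]) -> Tuple[int, int]:
    """Return (cache_read_tokens, prefill_extend_tokens) for one turn.

    cached: token IDs whose KV entries already exist from prior turns.
    prompt: token IDs of the full prompt for the current turn.
    """
    shared = 0
    # Only the longest common prefix can be served from the existing KV cache.
    for old, new in zip(cached, prompt):
        if old != new:
            break
        shared += 1
    # Everything after the shared prefix needs fresh KV entries (cache write).
    return shared, len(prompt) - shared


# Hypothetical two-turn conversation: turn 2 repeats turn 1 and appends 3 tokens.
turn1 = [101, 7, 8, 9]
turn2 = turn1 + [42, 43, 44]
print(split_prefill([], turn1))     # (0, 4): nothing cached yet, all prefill extend
print(split_prefill(turn1, turn2))  # (4, 3): 4 tokens cache read, 3 prefill extend
```

In this toy model a long agentic session spends most of its prompt in cache read and only the newly appended context in prefill extend.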



MICROSOFT OPEN-SOURCED A GOVERNANCE LAYER FOR YOUR AI AGENTS
and it's exactly what agentic ai has been missing
here's what agent governance toolkit does:
▫️ intercepts every tool call in deterministic code before it hits the wire
   denied actions aren't unlikely, they're structurally impossible
▫️ yaml policy engine lets you allow, deny, or require human approval per action
▫️ zero-trust identity via spiffe/did/mtls
   no more 5 agents sharing one api key
▫️ 4-level execution sandbox with privilege rings so agents can't escape their scope
▫️ tamper-evident merkle audit logs for compliance and incident response
▫️ covers all 10/10 owasp agentic top 10 risks
▫️ works with langchain, crewai, autogen, openai agents sdk, semantic kernel, and more
one pip install...any framework...python, typescript, go, rust, .net all supported
because "please follow the rules" in a system prompt is not a guardrail...it's a suggestion
github.com/microsoft/agen…
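To make the first two bullets and the audit-log bullet concrete, here is a rough sketch of a deterministic policy gate in front of tool calls. This is not the toolkit's API: `POLICY`, `guarded_call`, `AuditLog`, and the tool names are all hypothetical, a plain dict stands in for the yaml policy file, and a simple SHA-256 hash chain stands in for a full merkle tree.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List

# Hypothetical policy table (stand-in for a yaml file; not the toolkit's schema).
POLICY: Dict[str, str] = {
    "read_file": "allow",
    "send_email": "require_approval",
    "delete_repo": "deny",
}


class PolicyDenied(Exception):
    """Raised when a tool call is blocked before it executes."""


class AuditLog:
    """Hash-chained log: each entry commits to the previous entry's hash."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    def append(self, record: Dict[str, Any]) -> None:
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        payload = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev + payload).encode()).hexdigest()
        self.entries.append({"record": record, "prev": prev, "hash": digest})

    def verify(self) -> bool:
        """Recompute the chain; any edited record breaks every later hash."""
        prev = "0" * 64
        for entry in self.entries:
            payload = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


def guarded_call(
    tool: str,
    fn: Callable[..., Any],
    args: Dict[str, Any],
    log: AuditLog,
    approver: Callable[[str, Dict[str, Any]], bool] = lambda tool, args: False,
) -> Any:
    """Check policy in plain code, log the decision, then run the tool or refuse."""
    decision = POLICY.get(tool, "deny")  # unknown tools are denied by default
    if decision == "require_approval":
        decision = "allow" if approver(tool, args) else "deny"
    log.append({"tool": tool, "args": args, "decision": decision})
    if decision != "allow":
        raise PolicyDenied(tool)
    return fn(**args)
```

Because the check runs in ordinary code before `fn` is invoked, a denied tool never executes regardless of what the model was prompted to do, which is the distinction the post draws between a guardrail and a suggestion.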


Copy and paste this into your Codex:
“Look through my recent Codex sessions and identify repeated workflows or repeated asks. For anything I keep doing manually, suggest:
1. a skill if it is a reusable workflow
2. a custom subagent if it is a bounded role or investigation task
Focus on practical things like CI failures, PR reviews, changelogs, docs updates, release prep, debugging, and test triage. Create the useful ones only. Keep them simple.”



Agentic workloads are quietly rewriting inference economics. We pulled data from 432k real coding agent requests at SemiAnalysis and the median one isn't 32k, isn't 64k, but 96k input tokens. For context, that's more than the entire text of The Great Gatsby being shoved into the model before you've even typed your question. (1/3)🧵



Today, we announced more than $10B in investment across Taiwan’s ecosystem to scale advanced packaging and accelerate next-gen AI infrastructure, from 6th Gen EPYC CPUs codenamed “Venice” to our Helios rack-scale platform including Instinct MI450X GPUs, with multi-gigawatt deployments beginning in 2H 2026. Additionally, AMD and TSMC have hit another major production milestone, with Venice EPYC CPUs ramping on TSMC 2nm technology in Taiwan with future plans to ramp production at TSMC’s Arizona Fab. More on the news: bit.ly/4tJrUkR


