Kan Yuenyong
@sikkha
41.4K posts

I work as a geopolitical strategist. In case I am banned from Twitter, find me at https://t.co/BcZiwy2b6t . RT ≠ Endorsement

Tokyo, Japan · Joined November 2007
5.1K Following · 1.9K Followers

Pinned Tweet
Kan Yuenyong @sikkha
Friends! An update on our #AIapp: we've been impacted by recent changes to Twitter's API rate limits. Unfortunately, we can no longer provide real-time trend snapshots every six hours for research. We're seeking alternatives & remain committed to #DataDemocracy. More updates soon!
Kan Yuenyong tweet media
Replies 0 · Reposts 1 · Likes 0 · Views 5.9K

Kan Yuenyong retweeted
Matt Dancho (Business Science)
This guy built an entire AI data science team in Python, then open-sourced it (100% free). It automates data science workflows with AI, including data loading, cleaning, exploratory analysis, and feature engineering. And it tracks each step in a 100% reproducible pipeline.
00:00 Project Overview
01:32 Diving into the AI Data Science Workflow and Data Loading
02:10 Data Wrangling and Cleaning
03:33 Data Visualization Insights & Plotting
04:08 Feature Engineering
05:00 Live 1-Hour Workshop
05:44 AI Data Science Team Python Library
🔗 AI Data Science Team on GitHub (Give it a Star): github.com/business-scien…
🔗 Join My Next Live 1-Hour Agentic AI Workshop (Free): learn.business-science.io/ai-register
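The "100% reproducible pipeline" idea can be sketched independently of the library. This is not the AI Data Science Team API, just a minimal stand-in: each step records its name and parameters in an ordered manifest, so the same run can be replayed later.

```python
# Minimal sketch (invented, not the library's actual API): track each
# data-prep step in an ordered log so the pipeline can be replayed.
import json

class TrackedPipeline:
    def __init__(self):
        self.steps = []  # (name, params) recorded in execution order

    def run(self, name, fn, data, **params):
        self.steps.append({"step": name, "params": params})
        return fn(data, **params)

    def manifest(self):
        # A JSON manifest is enough to re-run the same steps later.
        return json.dumps(self.steps, indent=2)

pipe = TrackedPipeline()
rows = [{"x": 1}, {"x": None}, {"x": 3}]
# Cleaning step: drop rows with missing values.
rows = pipe.run("clean", lambda d: [r for r in d if r["x"] is not None], rows)
# Feature step: add a derived column, with its parameter logged.
rows = pipe.run("feature",
                lambda d, k: [{**r, "x2": r["x"] * k} for r in d],
                rows, k=2)
print(pipe.manifest())
```

The manifest captures both the order of steps and their parameters, which is the minimum needed for reproducibility.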
Replies 2 · Reposts 30 · Likes 204 · Views 13.1K

Kan Yuenyong retweeted
Vaishnavi @_vmlops
This might be the wildest AI engineering breakdown on the internet right now 🤯 After the Anthropic leak… someone turned the ENTIRE Claude Code system into a readable playbook. 👉 claude-code-from-source.com
We're talking:
• 500K+ lines of real production AI agent logic
• Broken down into 18 chapters you can actually learn from
• Multi-agent systems, tool pipelines, memory, orchestration… all exposed
This isn't theory. This is how a top-tier AI coding agent actually works under the hood.
Key ideas you'll steal instantly:
→ Agent loops with async execution
→ Multi-agent "teams" coordinating tasks
→ File-based memory (no DB 🤯)
→ Context compression tricks
→ Tool execution pipelines at scale
Basically… instead of guessing how to build AI agents, you now have a blueprint from a real system used by thousands of devs. claude-code-from-source.com
Crazy part? The whole thing was analyzed + rewritten in HOURS using AI agents. claude-code-from-source.com
If you're building:
• AI agents
• Dev tools
• LLM products
• or learning MLOps
This is not optional. This is a cheat code.
Vaishnavi tweet media
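The "file-based memory (no DB)" bullet is a simple pattern to sketch. The class and file layout below are invented for illustration, not taken from Claude Code: each memory is an appended JSON line, and "recall" is just re-reading the file.

```python
# Hypothetical sketch of file-based agent memory (no real project's API):
# append-only JSONL file instead of a database.
import json, os, tempfile

class FileMemory:
    def __init__(self, path):
        self.path = path

    def remember(self, note):
        # Append one JSON object per line; no schema, no server.
        with open(self.path, "a") as f:
            f.write(json.dumps({"note": note}) + "\n")

    def recall(self):
        # Recall is a plain file read, so state survives restarts.
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            return [json.loads(line)["note"] for line in f]

mem = FileMemory(os.path.join(tempfile.mkdtemp(), "memory.jsonl"))
mem.remember("user prefers concise answers")
mem.remember("project uses TypeScript")
print(mem.recall())
```

The appeal of the pattern is that the memory is human-readable, diffable, and trivially portable, at the cost of linear-scan retrieval.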
Replies 10 · Reposts 72 · Likes 335 · Views 23.5K

Kan Yuenyong retweeted
Aakash Gupta @aakashgupta
Zuckerberg paid $14.3 billion for a 28-year-old who had never trained a frontier model. Nine months later, that bet just shipped.

The benchmark table tells you exactly what kind of lab Wang built. Muse Spark leads or ties Opus 4.6 and GPT 5.4 on multimodal perception, health queries, and visual reasoning: MedXpertQA, SimpleVQA, ScreenSpot Pro, CharXiv. These are all data-quality-sensitive benchmarks where training set curation determines the ceiling.

Where it gets destroyed: ARC AGI 2 (42.5 vs 76.5 Gemini), Terminal-Bench (59.0 vs 75.1 GPT 5.4), GDPval office tasks (1444 vs 1672 GPT 5.4). Coding and abstract reasoning: the exact categories where architecture innovation and RL scaling matter more than data.

This is a data labeling CEO's model. The fingerprints are all over the results. Wang spent seven years learning which benchmarks respond to better data and which ones require something else entirely. Muse Spark maxed out the first category and exposed the gap in the second.

The $14.3B question was always whether the guy who built the best data pipeline in AI could build the best model. The answer so far: he built the best model at the things data pipelines solve, and a mediocre one at everything else.

The move nobody's pricing: Meta said larger models are already in development, private API today, open-source future versions. Wang called this "step one." If the next model closes the coding and reasoning gap, Meta goes from also-ran to three-horse race. If it doesn't, they spent $14.3 billion to build a very good medical chatbot for 3 billion users.

Both outcomes are interesting. Only one justifies the stock moving 9%.
Alexandr Wang@alexandr_wang

1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵

Replies 88 · Reposts 223 · Likes 2.5K · Views 934K

Kan Yuenyong retweeted
elvis @omarsar0
NEW paper from Microsoft. Every agent benchmark has the same hidden problem: how do you know the agent actually succeeded?

Microsoft researchers introduce the Universal Verifier, which distills lessons learned from building best-in-class verifiers for web tasks. It's built on four principles: non-overlapping rubrics, separate process vs. outcome rewards, distinguishing controllable from uncontrollable failures, and divide-and-conquer context management across full screenshot trajectories.

It reduces false positive rates to near zero, down from 45%+ (WebVoyager) and 22%+ (WebJudge). Without reliable verifiers, both benchmarks and training data are corrupted.

One interesting finding is that an auto-research agent reached 70% of expert verifier quality in 5% of the time, but couldn't discover the structural design decisions that drove the biggest gains. Human expertise and automated optimization play complementary roles.

Paper: arxiv.org/abs/2604.06240
Learn to build effective AI agents in our academy: academy.dair.ai
elvis tweet media
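A minimal sketch of two of the four principles, with invented rubric items and trajectory fields (nothing here is from the paper's code): rubric checks that don't overlap, an outcome verdict kept separate from a process reward, and uncontrollable failures (site errors) excluded from the process score.

```python
# Illustrative sketch only, not Microsoft's implementation.
# Non-overlapping rubric: each aspect of success is checked exactly once.
RUBRIC = {
    "reached_confirmation_page": lambda t: t["final_url"].endswith("/confirm"),
    "correct_item_in_cart":      lambda t: t["cart"] == ["blue mug"],
}

def verify(trajectory):
    # Outcome reward: did the final state satisfy every rubric item?
    outcome = {name: check(trajectory) for name, check in RUBRIC.items()}
    # Process reward: score only controllable steps, so the agent is not
    # penalized for failures it could not influence (e.g. site outages).
    controllable = [s for s in trajectory["steps"] if not s.get("site_error")]
    process = sum(s["valid_action"] for s in controllable) / max(len(controllable), 1)
    return {"outcome_pass": all(outcome.values()), "process_reward": process}

traj = {
    "final_url": "https://shop.example/confirm",
    "cart": ["blue mug"],
    "steps": [{"valid_action": True}, {"valid_action": True},
              {"valid_action": False, "site_error": True}],
}
print(verify(traj))
```

Keeping the outcome verdict and the process reward separate is what lets a trainer reward good behavior even on runs the environment doomed.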
Replies 7 · Reposts 15 · Likes 131 · Views 12.1K

Kan Yuenyong retweeted
Rohan Paul @rohanpaul_ai
New Stanford paper argues that, under equal reasoning budgets, one LLM usually solves multi-hop problems better than many coordinated ones.

The core point is almost embarrassingly simple. A single agent keeps the whole problem in one internal chain of thought, while a multi-agent system has to slice that chain into messages, summaries, and handoffs. Every handoff is a compression step. And once reasoning is compressed, some information is easier to drop than to recover, which is why the paper leans on the Data Processing Inequality as a formal explanation rather than just an empirical hunch.

The experiments back that up across Qwen, DeepSeek, and Gemini on FRAMES and MuSiQue: when thinking-token budgets are matched, single-agent systems usually match or beat sequential, debate, role-based, and ensemble setups.

Here's the part most people miss. Many celebrated multi-agent gains may not be architectural gains at all. They often come from spending more test-time compute, surfacing more visible reasoning, or benefiting from evaluation quirks that make the pipeline look smarter than it is.

The paper is especially sharp when it looks for the boundary case instead of pretending the rule is universal. When the single agent's effective context is degraded by masking, substitution, or misleading distractors, multi-agent pipelines become more competitive and sometimes win, not because message passing is magical, but because structure can partially stabilize corrupted reasoning.

That is a much narrower and more useful claim than "more agents is better." It suggests the real trade-off is not single versus multi so much as latent reasoning versus external coordination, with context quality and compute accounting deciding which side looks stronger. For multi-hop reasoning, the default should now be clear: start with one strong model, and treat extra agents as a repair strategy, not an upgrade.

Paper: arxiv.org/abs/2604.02460
Paper title: "Single-Agent LLMs Outperform Multi-Agent Systems on Multi-Hop Reasoning Under Equal Thinking Token Budgets"
Rohan Paul tweet media
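For reference, the Data Processing Inequality the tweet leans on can be stated in one line: if a downstream agent only sees the handoff message, never the upstream agent's full reasoning, the three form a Markov chain, and information about the original problem can only shrink.

```latex
% X = upstream agent's full reasoning state,
% Y = the compressed handoff message,
% Z = anything the downstream agent computes from Y.
X \rightarrow Y \rightarrow Z
\quad\Longrightarrow\quad
I(X;Z) \,\le\, I(X;Y)
```

No post-processing of the handoff Y can recover information that the summary already dropped, which is exactly the formal version of "every handoff is a compression step."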
Replies 7 · Reposts 10 · Likes 76 · Views 4.5K

Kan Yuenyong retweeted
Goldman Sachs @GoldmanSachs
Markets expected central banks to hike rates as oil prices rose, but history suggests that rates could come down before the end of the year as higher prices start to weigh on growth, according to Goldman Sachs Research. Sign up for our weekly newsletter, Briefings, for more insights: click.gs.com/gx5d
Goldman Sachs tweet media
Replies 11 · Reposts 210 · Likes 769 · Views 55.3K

Kan Yuenyong retweeted
The Year of the Graph
Semantica v0.4.0: Temporal Intelligence and Agentic AI

Semantica is a framework for building context graphs and decision intelligence systems with explainability and provenance. It transforms AI agents from black boxes into trustworthy, auditable systems by providing structured knowledge representation, complete decision tracking, and end-to-end lineage. Perfect for high-stakes domains where every answer must be traceable: healthcare, finance, legal, cybersecurity, and government.

🕐 Temporal Intelligence
Bi-temporal model baked into the core: valid time + transaction time on everything. Query your graph at any point in history. Full Allen interval algebra, deterministic, zero LLM calls. Extract temporal metadata from text with calibrated confidence scores.

🤖 Agentic AI with Agno
Graph-backed persistent agent memory. Multi-hop GraphRAG for agent knowledge retrieval. Decision-intelligence & KG pipeline toolkits agents can call natively. Shared memory across entire agent teams, role-scoped.

🧠 Datalog-Style Reasoning
Recursive Horn clause rules: ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y). Handles the transitivity and self-joins that loop forward/backward engines indefinitely. O(1) delta index: re-evaluates only what changed.

🔌 Novita AI Provider
create_provider("novita"): OpenAI-compatible, plug-and-play. Default model deepseek/deepseek-v3.2; configure via NOVITA_API_KEY.

🔍 Knowledge Explorer API
20+ REST endpoints: graph traversal, analytics, decisions, SPARQL export. SKOS vocabulary browsing + RDF/OWL import. WebSocket real-time progress; launch with one command.

✅ Ontology & Validation
SHACL shape generation: zero hand-authoring, three quality tiers. Cross-ontology alignment + structured diff with breaking-change classification.

🗂️ Named Graph Support
Correct FROM / FROM NAMED clause handling in query execution. Graph URI percent-encoding, default_graph_uri alias.

⚡ Under the Hood
O(N) → O(limit) pagination: no more 502s on large graphs. Full RLock thread safety across all graph paths. 6 CodeQL security findings resolved. 886 tests passing, 0 failures.

By Mohd Kaif. Semantica v0.4.0 Release Notes: github.com/Hawksight-AI/s…
#SemanticWeb #TemporalReasoning #MultiAgentSystems #KnowledgeEngineering #OpenSource
--
📩 The Year of the Graph Spring 2026 newsletter issue is out! Beyond Context Graphs: How Ontology, Semantics, and Knowledge Graphs Define Context 👇 yearofthegraph.xyz/newsletter/202…
All things #KnowledgeGraph, #GraphDB, Graph #Analytics / #DataScience / #AI and #SemTech. Subscribe and follow to be in the know. Reach out if you'd like to be featured.
The Year of the Graph tweet media
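The recursive rule in the release notes can be evaluated with a naive bottom-up fixpoint, sketched here in plain Python. This is an illustration, not Semantica's engine; a real engine would use the semi-naive delta indexing the notes mention instead of re-joining everything each round.

```python
# Naive fixpoint evaluation of the recursive Horn clause
#   ancestor(X,Y) :- parent(X,Y).
#   ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
parent = {("ann", "bob"), ("bob", "cat"), ("cat", "dee")}

def ancestors(parent_facts):
    anc = set(parent_facts)  # base case: every parent is an ancestor
    while True:
        # Recursive case: join parent facts with current ancestor facts.
        new = {(x, y)
               for (x, z) in parent_facts
               for (z2, y) in anc
               if z == z2}
        if new <= anc:   # fixpoint reached: nothing new was derived
            return anc
        anc |= new

print(sorted(ancestors(parent)))
```

Because derivation is monotone over a finite domain, the loop always terminates, which is how Datalog avoids the infinite recursion that trips up naive forward/backward chaining on transitive rules.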
Replies 1 · Reposts 11 · Likes 47 · Views 1.6K

Kan Yuenyong retweeted
Carl Zha @CarlZha
Taiwan's main opposition party KMT chairwoman Cheng Li-wun arrived in Shanghai, the first visit by a KMT leader to mainland China in a decade. She is scheduled to travel to Beijing to meet with President Xi Jinping.
Replies 177 · Reposts 1.4K · Likes 8.1K · Views 413.4K

Kan Yuenyong retweeted
alphaXiv @askalphaxiv
Introducing GLM-5.1 for understanding research papers 🚀 Highlight any section of a paper to ask questions and “@” other papers for quick context, comparisons, and benchmark references
Replies 3 · Reposts 52 · Likes 336 · Views 27.2K

Kan Yuenyong retweeted
Roan @RohOnChain
This 1-hour MIT lecture by Jim Simons (Quant King) will teach you more about quantitative trading than most people learn in their entire careers on Wall Street. Bookmark this & watch, no matter what. It's the most productive start you can give your week. Then read the article below.
Roan@RohOnChain

x.com/i/article/2037…

Replies 34 · Reposts 945 · Likes 4.5K · Views 761.2K

Kan Yuenyong retweeted
How To AI @HowToAI_
🚨 Someone just open-sourced a tool that converts PDFs to Markdown at 100 pages per second. It's called OpenDataLoader. It runs entirely on CPU and handles complex layouts, tables, and nested structures like a senior dev. 100% free.
How To AI tweet media
Replies 36 · Reposts 334 · Likes 2.6K · Views 146.9K

Kan Yuenyong retweeted
Charly Wargnier @DataChaz
🚨 @karpathy literally ditched traditional RAG for an autonomous Obsidian file system.

Instead of writing code, he dumps raw AI research into a local folder and lets an LLM convert it into an interconnected markdown wiki. He rarely edits the text manually. By relying purely on dynamically updated index files, the system navigates to the exact context it needs natively, without relying on flawed vector embeddings.

Because the LLM fully understands the file structure, it executes advanced autonomous workflows:
→ Operates a custom vibe-coded local search engine
→ Renders complex charts and formatted markdown slides
→ Continuously compounds a 400,000-word knowledge base

The most fascinating mechanic is the self-healing loop. He triggers background health checks where the LLM natively spots structural gaps, scrapes the internet for missing data, and cleans up the articles. This feels like the absolute blueprint for managing complex technical data 🔥

btw, he also plans to fine-tune a local model directly on the wiki so the research is baked into the neural weights rather than relying on limited context windows 👀
Charly Wargnier tweet media
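The index-file mechanic is easy to sketch. Everything below (folder layout, file names) is invented for illustration, not Karpathy's actual setup: a root index.md lists every note with a one-line summary, so an LLM can read just the index and then open only the files it needs, instead of searching over vector embeddings.

```python
# Hypothetical sketch: regenerate a root index.md that lists every note
# with its first line as a summary. Navigation = read index, open file.
import os, tempfile

def build_index(wiki_dir):
    entries = []
    for name in sorted(os.listdir(wiki_dir)):
        if name.endswith(".md") and name != "index.md":
            with open(os.path.join(wiki_dir, name)) as f:
                entries.append(f"- {name}: {f.readline().strip()}")
    with open(os.path.join(wiki_dir, "index.md"), "w") as f:
        f.write("\n".join(entries))

wiki = tempfile.mkdtemp()
for fname, first_line in [("moe.md", "# Mixture-of-experts notes"),
                          ("rlhf.md", "# RLHF notes")]:
    with open(os.path.join(wiki, fname), "w") as f:
        f.write(first_line + "\n...")

build_index(wiki)
with open(os.path.join(wiki, "index.md")) as f:
    index_text = f.read()
print(index_text)
```

Rebuilding the index after every change is what makes the structure "dynamically updated": the model's map of the wiki is always one cheap file read away.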
Replies 35 · Reposts 186 · Likes 915 · Views 54.6K

Kan Yuenyong retweeted
Matt Dancho (Business Science)
🚨 BREAKING: Microsoft launches a free Python library that converts ANY document to Markdown Introducing Markitdown. Let me explain. 🧵
Matt Dancho (Business Science) tweet media
Replies 22 · Reposts 269 · Likes 2K · Views 242.2K

Kan Yuenyong retweeted
The White House @WhiteHouse
OFFICIAL STATEMENT OF IRAN:
The White House tweet media
Replies 5.7K · Reposts 10.1K · Likes 41.2K · Views 3.5M

Kan Yuenyong retweeted
elvis @omarsar0
NEW paper on multi-agents from Stanford. More agents, better results, right? Not so fast.

This paper challenges a core assumption in the multi-agent hype by controlling for what most studies don't: total computation. It compares single-agent and multi-agent LLM architectures on multi-hop reasoning under matched thinking-token budgets across different models.

The finding is clear: single-agent systems are more information-efficient when reasoning tokens are held constant. The authors also identify significant artifacts in API-based budget control that may artificially inflate multi-agent advantages.

Why does it matter? Many reported multi-agent gains disappear once you account for unequal computation. Before building a multi-agent system, check whether a single agent with the same token budget would do the job. This paper gives you the framework to make that call.

Paper: arxiv.org/abs/2604.02460
Learn to build effective AI agents in our academy: academy.dair.ai
elvis tweet media
Replies 21 · Reposts 50 · Likes 275 · Views 19.5K

Kan Yuenyong retweeted
Seyed Abbas Araghchi
Statement on behalf of the Supreme National Security Council of the Islamic Republic of Iran:
Seyed Abbas Araghchi tweet media
Replies 6.6K · Reposts 31.7K · Likes 107.9K · Views 7.2M

Kan Yuenyong retweeted
Nina Schick @NinaDSchick
Claude Mythos. Ten trillion parameters: the first model in this weight class. Estimated training cost: ten billion dollars.

On the hardest coding test in the industry (SWE-bench) it scores 94%. It found a security flaw in a system that had been running for 27 years, one that every human engineer and every automated check had missed. It found another bug that had survived five million test runs over 16 years. (It did so overnight.)

It is so capable in cybersecurity that Anthropic will not release it to the public; instead it is launching Project Glasswing, along with 100M in compute credits, to help secure software. Only twelve partners currently have access: Amazon, Cisco, Apple, Google, Microsoft, NVIDIA, JPMorgan Chase, Crowdstrike, Palo Alto, AWS, The Linux Foundation, Broadcom. (I'm sure the Pentagon is on the line?)

This is not a product launch: it is a controlled deployment of a system too powerful to distribute freely. Tell me this isn't (very expensive) AGI?
Anthropic@AnthropicAI

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing

Replies 584 · Reposts 905 · Likes 11.3K · Views 1.9M