lamb
@lamb356_
Building things | just here to help https://t.co/gjhtVUURbB


MiMo-V2-Pro, Omni, and TTS are out: our first full-stack model family built truly for the Agent era.

I call this a quiet ambush, not because we planned it, but because the shift from the Chat to the Agent paradigm happened so fast that even we barely believed it. Somewhere in between was a process that was thrilling, painful, and fascinating all at once.

The 1T base model started training months ago. The original goal was long-context reasoning efficiency. Hybrid Attention carries real innovation without overreaching, and it turns out to be exactly the right foundation for the Agent era. 1M context window. MTP inference for ultra-low latency and cost. These architectural decisions weren't trendy; they were a structural advantage we built before we needed it.

What changed everything was experiencing a complex agentic scaffold, what I'd call orchestrated Context, for the first time. I was shocked on day one. I tried to convince the team to use it. That didn't work. So I gave a hard mandate: anyone on the MiMo Team with fewer than 100 conversations by tomorrow can quit. It worked. Once the team's imagination was ignited by what agentic systems could do, that imagination converted directly into research velocity.

People ask why we move so fast. I saw it firsthand building DeepSeek R1. My honest summary:
- Backbone and Infra research has long cycles. You need strategic conviction a year before it pays off.
- Posttrain agility is a different muscle: product intuition driving evaluation, compressed iteration cycles, paradigm shifts caught early.
- And the constant: curiosity, sharp technical instinct, decisive execution, full commitment, and something that's easy to underestimate: a genuine love for the world you're building for.

We will open-source when the models are stable enough to deserve it.

From Beijing, very late, not quite awake.
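The latency claim above rests on MTP (multi-token prediction) used as self-speculative decoding: the model drafts several future tokens in one pass, then a single verification pass keeps the longest prefix the full model agrees with. A minimal greedy-verification sketch, with a toy stand-in for the model (all names and the `verify_fn` interface are illustrative assumptions, not MiMo's implementation):

```python
def speculative_step(draft_tokens, verify_fn):
    """Accept the longest prefix of drafted tokens that the full model
    would also have produced greedily at each position."""
    accepted = []
    for tok in draft_tokens:
        if verify_fn(accepted, tok):  # does the full model agree here?
            accepted.append(tok)
        else:
            break                     # first disagreement ends the step
    return accepted

# Toy "full model": it would greedily emit this fixed target sequence.
target = [3, 1, 4, 1, 5]

def verify_fn(prefix, tok):
    return target[len(prefix)] == tok

# Draft head proposed 4 tokens; the first 3 match, so 3 are accepted in
# one verification pass instead of 3 sequential decode steps.
print(speculative_step([3, 1, 4, 9], verify_fn))  # → [3, 1, 4]
```

The win is that acceptance of k tokens costs one forward pass of the full model instead of k, which is where the "ultra-low latency and cost" framing comes from.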

Tether AI breakthrough

The Tether AI team just released a new version of QVAC Fabric that includes the world's first cross-platform BitNet LoRA framework, enabling billion-parameter AI training and inference on consumer GPUs and smartphones.

Background

Microsoft's BitNet uses a 1-bit-class architecture to dramatically compress models. Traditional LLMs operate on full-precision computation, where weights are stored as high-resolution floating-point numbers. BitNet's innovation is to shrink these weights into a tiny ternary range of only -1, 0, and 1, significantly reducing memory usage and computation. LoRA is a parameter-efficient fine-tuning technique that reduces the number of trainable parameters by up to ninety-nine percent. Together they slash memory and compute requirements. Yet BitNet has mostly been limited to CPU or NVIDIA CUDA backends, and lacked support for LoRA fine-tuning.

Enter QVAC Fabric: the unlock

With QVAC Fabric LLM, BitNet LoRA fine-tuning and inference work cross-platform for the first time, across GPU vendors and operating systems, using Vulkan and Metal backends. That means support for AMD, Intel, Apple Metal, and mobile GPUs. And for the first time ever, BitNet inference runs efficiently on smartphones using mobile GPUs. On flagship devices, GPU inference is 2 to 11 times faster than CPU while using up to 90% less memory than full-precision models.

The biggest unlock: QVAC Fabric LLM supports BitNet LoRA fine-tuning on heterogeneous GPUs. Our team demonstrated this by fine-tuning models of up to 3.8 billion parameters on flagship phones such as the Pixel 9, S25, and iPhone 16, and up to 13-billion-parameter models on the iPhone 16.

GitHub repositories:
github.com/tetherto/qvac-… : general QVAC Fabric codebase
github.com/tetherto/qvac-… : QVAC Fabric's BitNet knowledge base, architecture docs, and pre-built binaries

What does it mean?

What used to require dedicated GPUs now runs on consumer hardware.
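A minimal sketch of the two ideas the post combines, using numpy (this is an illustration of the general techniques, not Tether's or Microsoft's actual code): BitNet-style "absmean" ternary quantization maps weights to {-1, 0, +1} times a per-tensor scale, and LoRA adds a small trainable low-rank delta on top of the frozen quantized base.

```python
import numpy as np

def ternary_quantize(w: np.ndarray):
    """Quantize weights to {-1, 0, 1} with a per-tensor absmean scale
    (BitNet b1.58-style; exact scheme here is an assumption)."""
    scale = np.mean(np.abs(w)) + 1e-8
    q = np.clip(np.round(w / scale), -1, 1)
    return q.astype(np.int8), scale  # int8 storage stands in for packed 2-bit

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8)).astype(np.float32)
q, scale = ternary_quantize(w)

# LoRA on top: only A and B (rank r=2) would be trained during fine-tuning;
# the ternary base stays frozen, so optimizer state is tiny.
r = 2
A = rng.normal(scale=0.01, size=(8, r))
B = np.zeros((r, 8))                  # zero-init so the delta starts at 0

x = rng.normal(size=(1, 8))
y = x @ (q * scale) + (x @ A) @ B     # dequantized base + low-rank update
```

Training then updates only `A` and `B` (16 + 16 values here) instead of all 64 base weights, which is why the combination fits on phone-class memory budgets.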
This breakthrough is the first real-world signal of local, private AI that can truly serve the people. And this is just the beginning. In the coming months and years, Tether will relentlessly continue to invest significant resources and capital in researching and developing open-source intelligence that can scale and evolve on local devices, providing maximum utility and privacy to its users. The era of Stable Intelligence has just begun. Free as in freedom.

GPT-5.4 (High) has now cleared 90% on this benchmark at a cost of just $0.37/task. That's a 32x efficiency improvement in the last three months, or 12,000x since December 2024.
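A quick sanity check on what those multipliers imply as compounded monthly rates (the elapsed time is an assumption; the post gives no "now" date, so ~15 months since December 2024 is used for illustration):

```python
months_since_dec_2024 = 15    # assumed elapsed time; not stated in the post
total_gain = 12000            # claimed overall cost-efficiency factor
recent_gain = 32              # claimed gain over just the last 3 months

# Implied average monthly multiplier over the whole period
avg_monthly = total_gain ** (1 / months_since_dec_2024)
print(f"avg monthly gain: {avg_monthly:.2f}x")        # ~1.87x per month

# Implied monthly multiplier in the last quarter alone
recent_monthly = recent_gain ** (1 / 3)
print(f"recent monthly gain: {recent_monthly:.2f}x")  # ~3.17x per month
```

Under that assumed timeline, the recent quarter's pace is well above the long-run average, i.e. the claimed cost curve is accelerating rather than flattening.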

Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: rethinking depth-wise aggregation.

Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.

🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.

🔗 Full report: github.com/MoonshotAI/Att…
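A minimal numpy sketch of the core idea as described in the post (shapes, names, and the single shared query projection are illustrative assumptions, not Moonshot's implementation): instead of the fixed sum h_l = h_{l-1} + f(h_{l-1}), each layer forms input-dependent attention weights over all preceding hidden states and mixes them before applying its transform.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_residual(history, Wq):
    """history: list of (d,) hidden states from layers 0..l-1.
    Returns an input-dependent weighted mix of all past states,
    replacing the fixed uniform accumulation of a plain residual."""
    H = np.stack(history)                  # (l, d) stack of past layers
    q = history[-1] @ Wq                   # query from the current state
    scores = H @ q / np.sqrt(H.shape[1])   # (l,) affinity to each past layer
    w = softmax(scores)                    # learned, input-dependent weights
    return w @ H                           # selective retrieval, not a sum

rng = np.random.default_rng(0)
d = 16
Wq = rng.normal(scale=0.1, size=(d, d))    # would be learned in training

h = [rng.normal(size=d)]                   # embedding output
for _ in range(4):                         # 4 "layers"; tanh stands in for f
    agg = attention_residual(h, Wq)
    h.append(agg + np.tanh(agg))
```

Because the mixing weights are a softmax rather than an ever-growing sum, the hidden state's scale stays bounded with depth, which is the "mitigating dilution and hidden-state growth" point; Block AttnRes would additionally restrict `history` to compressed per-block summaries to keep the (l, d) stack cheap.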

POKÉMON GO PLAYERS TRAINED 30 BILLION IMAGE AI MAP Niantic says photos and scans collected through Pokémon Go and its AR apps have produced a massive dataset of more than 30 billion real-world images. The company is now using that data to power visual navigation for delivery robots, letting them identify exact locations on city streets without relying on GPS. Source: NewsForce

We’ve trained a multimodal AI model to turn routine pathology slides into spatial proteomics, with the potential to reduce time and cost while expanding access to cancer care.

it's time to drop three new #opensource robotic hands! this time with tactile sensors! Tweak them, 3D print them, and use them in your robotics and physical AI research! Here are some wild examples ↓↓↓


Today we're opening up MMT: everyone now has access to professional-grade tools for FREE.

- Market Profile / TPO
- Custom Session TPO
- Hyperliquid MBO Profile
- Aggregated Heatmaps
- HD Heatmaps
- Liquidation Heatmap
- Hyperliquid Liquidation Heatmap
- Hyperliquid Stop Loss Heatmap
- Hyperliquid Take Profit Heatmap
- Aggregated Footprints
- Filtered Footprints
- Dual Cluster Modes
- Bucketed Trade Size Groups
- Aggregated Indicators
- Aggregated CVD
- Aggregated Open Interest
- Net Longs/Shorts Indicator
- VWAP Suite
- Volume Bubbles
- Custom Scripting
- Community Indicators
- Aggregated Orderbooks
- Aggregated DOM
- Orderbook Imbalances
- Orderbook Depth Overlay
- 1s Time-Frames
- Custom Time-Frames

Available now, for everyone – only in MMT

There is only one form of capital formation I'm aware of that can scale with AI: @MetaDAOProject. AI will cause an explosion in long-tail companies. More long tail means more need for programmatic, bolted-in accountability. The friction of the Delaware C-corp cannot keep up.





