Marktechpost AI Dev News ⚡
@Marktechpost · 12.8K posts
🐝 AI Dev News Platform (1M+ monthly traffic) | 100K+ ML subreddit | Contact: [email protected]
What is trending in AI? · Joined April 2016
1.1K Following · 10.5K Followers
Tuana @tuanacelik
We just open-sourced LiteParse 🎉 A lightweight, local document parser in the shape of an easy-to-use CLI. No API calls, no external service, no cloud dependency. Just fast text extraction from common file formats, right from your terminal.

It's built for developers who want parsing that stays on their own infrastructure and gets out of their way. Clean PDFs, DOCX, HTML: run it, get your text, move on. The output is designed to be fed straight into agents so they can read parsed text and reason over screenshots without any extra wrangling.

When you hit more complex territory like scanned docs, dense tables, or multi-column layouts, that's where LlamaParse picks up. Same philosophy, more horsepower for the hard stuff.

📖 Announcement post: llamaindex.ai/blog/liteparse…
🔗 GitHub: github.com/run-llama/lite…
🎬 Walkthrough: youtu.be/_gcqMGUWN-E
Marktechpost AI Dev News ⚡
LlamaIndex Releases LiteParse: A CLI and TypeScript-Native Library for Spatial PDF Parsing in AI Agent Workflows

The technical shift here is significant:
- Zero Python Dependencies: Built natively in TypeScript using PDF.js and Tesseract.js. It runs entirely on your local CPU: no API keys, no latency, and no data leaving your environment.
- Spatial Text Parsing: Instead of struggling with complex Markdown conversion, LiteParse projects text onto a spatial grid. It preserves the document's original indentation and layout, allowing LLMs to use their internal spatial reasoning to interpret tables and multi-column text.
- Multimodal Agent Support: Beyond text, LiteParse generates page-level screenshots, so your AI agents can "see" charts, diagrams, and visual context that text-only parsers miss.

Full analysis: marktechpost.com/2026/03/19/lla…
Repo: github.com/run-llama/lite…
Technical details: llamaindex.ai/blog/liteparse…
@llama_index @tuanacelik
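The spatial-grid idea above is easy to illustrate. Below is a minimal Python sketch of the general technique, not LiteParse's actual code: the coordinates, page size, and grid dimensions are invented. It projects words with page coordinates onto a character grid so indentation and column alignment survive as plain text an LLM can read.

```python
# Toy "spatial text parsing": map each word's (x, y) page position onto a
# fixed character grid so layout (columns, indentation) is preserved.
def project_to_grid(words, cols=40, rows=6, page_w=400.0, page_h=60.0):
    """words: list of (text, x, y) tuples with top-left page coordinates."""
    grid = [[" "] * cols for _ in range(rows)]
    for text, x, y in words:
        r = min(rows - 1, int(y / page_h * rows))   # grid row for this word
        c = min(cols - 1, int(x / page_w * cols))   # grid column
        for i, ch in enumerate(text):
            if c + i < cols:
                grid[r][c + i] = ch
    return "\n".join("".join(row).rstrip() for row in grid)

# A two-column "table": header row and one data row, aligned by x position.
words = [("Item", 0, 0), ("Qty", 200, 0),
         ("Apples", 0, 30), ("12", 200, 30)]
print(project_to_grid(words))
```

In the output, "Item"/"Apples" and "Qty"/"12" stay vertically aligned, which is exactly the structure a text-only Markdown conversion tends to destroy.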
Philipp Schmid @_philschmid
Google Colab now has an open-source MCP server that lets you use Colab runtimes with GPUs from any local AI agent.
🔧 Tools for execute_code, connect, and notebook editing
☁️ Run Python on cloud GPUs directly from agents
📝 Can create .ipynb files and add code/markdown
🔌 Works with Gemini CLI, Antigravity, or any MCP-compatible client
Marktechpost AI Dev News ⚡ @Marktechpost
Meet Mamba-3: A New State Space Model Frontier with 2x Smaller States and Enhanced MIMO Decoding Hardware Efficiency

Here is the technical breakdown:

1️⃣ Exponential-Trapezoidal Discretization
Mamba-3 replaces previous first-order heuristics with a second-order accurate approximation. This induces an implicit convolution on the SSM input, allowing the model to function without the external short causal convolutions used in prior versions.

2️⃣ Complex-Valued SSMs (The "RoPE Trick")
Real-valued linear models often fail at "state-tracking" tasks like parity. Mamba-3 adopts complex-valued updates, proven to be mathematically equivalent to data-dependent Rotary Positional Embeddings (RoPE). This enables it to solve synthetic tasks that previous linear models could not learn.

3️⃣ MIMO (Multi-Input, Multi-Output) Formulation
SSM decoding is typically memory-bound, leaving hardware underutilized. Mamba-3 shifts to a matrix-multiplication-based state update, increasing decoding FLOPs by up to 4x while maintaining wall-clock latency similar to Mamba-2.

The Results (1.5B Scale):
→ Accuracy: +1.8-point gain in average downstream accuracy over Gated DeltaNet.
→ Efficiency: Comparable perplexity to Mamba-2 with only half the state size.
→ Hardware: Optimized Triton and CuTe DSL kernels for fast training and inference.

Mamba-3 demonstrates that fundamental methodological changes to the State Space Model viewpoint can bridge the gap between sub-quadratic efficiency and high-tier model quality.

🔗 Full analysis: marktechpost.com/2026/03/18/mee…
🛠 Open-source kernels: github.com/state-spaces/m…
📄 Paper: arxiv.org/pdf/2603.15569
🌐 Technical details: together.ai/blog/mamba-3
@togethercompute @SCSatCMU @_albertgu @aakash_lahoti @kevinyli_ @_berlinchen @tri_dao
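For context, the gap between a first-order and a trapezoidal discretization of the continuous SSM can be sketched with the textbook rules below. This is the generic numerical-analysis picture under a simple forward-Euler baseline, not the paper's exact parameterization:

```latex
% Continuous-time SSM
\dot h(t) = A\,h(t) + B\,x(t)
% First-order step (forward Euler): only the current input enters
h_t = (I + \Delta A)\,h_{t-1} + \Delta B\,x_t
% Trapezoidal rule: average the derivative at both interval endpoints
h_t = h_{t-1} + \tfrac{\Delta}{2}\left[(A\,h_{t-1} + B\,x_{t-1}) + (A\,h_t + B\,x_t)\right]
```

The $B\,x_{t-1}$ term is what mixes the previous input into the current step, i.e. the implicit short convolution on the input, and averaging the endpoints is what buys second-order accuracy.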

Cartesia @cartesia
Mamba-3 is out! 🐍 SSMs marked a major advance for the efficiency of modern LLMs. Mamba-3 takes the next step, shaping SSMs for a world where AI workloads are increasingly dominated by inference. Read about it on the Cartesia blog: blog.cartesia.ai/p/mamba-3
Marktechpost AI Dev News ⚡
Google just turned Colab into a programmable sandbox for AI agents!

No more copy-pasting code into a Colab notebook in a browser tab. The new Colab MCP Server gives your local agents (like Claude Code or Gemini CLI) direct, programmatic access to Colab's cloud GPUs and runtimes.

Key Points:
→ Direct GPU access: Offload heavy compute from your laptop to the cloud via the CLI.
→ Self-correction: Agents see the kernel state and errors, allowing them to debug and fix code autonomously.
→ Persistent context: Agents build real .ipynb notebooks with documentation and logic, not just chat blocks.

The "agentic" workflow is here. Stop managing notebooks and start orchestrating them.

Full analysis: marktechpost.com/2026/03/19/goo…
Repo: github.com/googlecolab/co…
Technical details: developers.googleblog.com/announcing-the…
@googleaidevs @GoogleOSS
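For reference, MCP-compatible clients generally register servers through a JSON config block like the sketch below. The `mcpServers` shape is the common MCP client convention; the command and package name here are placeholders, so check the Colab MCP repo for the actual launch instructions:

```json
{
  "mcpServers": {
    "colab": {
      "command": "npx",
      "args": ["-y", "colab-mcp-server"]
    }
  }
}
```

Once registered, the client spawns the server process and exposes its tools (code execution, runtime connection, notebook editing) to the agent automatically.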
Marktechpost AI Dev News ⚡
Tsinghua and Ant Group Researchers Unveil a Five-Layer, Lifecycle-Oriented Security Framework to Mitigate Autonomous LLM Agent Vulnerabilities in OpenClaw

The research team has conducted a comprehensive security analysis of the OpenClaw autonomous LLM agent framework, identifying critical vulnerabilities across its entire operational lifecycle. Their study reveals that OpenClaw's "kernel-plugin" architecture, centered on the pi-coding-agent, is susceptible to multi-stage systemic risks such as skill poisoning, indirect prompt injection, memory poisoning, and intent drift.

To address these threats, the team proposes a five-layer, lifecycle-oriented defense architecture, comprising Foundational Base, Input Perception, Cognitive State, Decision Alignment, and Execution Control layers, designed to replace fragmented point solutions. The framework relies on advanced technical enablers, including eBPF for kernel-level sandboxing, Merkle-tree structures for memory integrity validation, and symbolic solvers for formal plan verification, to secure an agent's complete operational trajectory against complex adversarial attacks.

Full analysis: marktechpost.com/2026/03/18/tsi…
Paper: arxiv.org/pdf/2603.11619
@AntGroup @Tsinghua_Uni
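To make the memory-integrity idea concrete, here is a minimal sketch of Merkle-root checking over an agent's memory log. This is the generic Merkle construction for illustration only, not the paper's implementation; the memory entries are invented:

```python
# Merkle-root integrity check: hash each memory entry, fold pairs up to a
# single root, and detect tampering by comparing roots across snapshots.
import hashlib

def _h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root(entries):
    level = [_h(e.encode()) for e in entries]   # leaf hashes
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2:                      # duplicate last node on odd levels
            level.append(level[-1])
        level = [_h(a + b) for a, b in zip(level[::2], level[1::2])]
    return level[0]

memory = ["user asked for report", "tool: fetched sales.csv"]
root = merkle_root(memory)
tampered = merkle_root(["user asked for report", "tool: fetched evil.csv"])
print(root != tampered)  # True: any edited entry changes the root
```

A defense layer can recompute the root before each planning step and refuse to act if it no longer matches the last attested value, which is how a Merkle structure turns memory poisoning into a detectable event.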
Albert Gu @_albertgu
The newest model in the Mamba series is finally here 🐍 Hybrid models have become increasingly popular, raising the importance of designing the next generation of linear models. We've introduced several SSM-centric ideas to significantly increase Mamba-2's modeling capabilities without compromising on speed. The resulting Mamba-3 model has noticeable performance gains over the most popular previous linear models (such as Mamba-2 and Gated DeltaNet) at all sizes. This is the first Mamba that was student led: all credit to @aakash_lahoti @kevinyli_ @_berlinchen @caitWW9, and of course @tri_dao!
Together AI @togethercompute
Introducing Mamba-3 🐍

Inference speeds are more important than ever, driven by the rise of agents and inference-heavy RL rollouts. Linear models are fast in FLOPs but memory-bound during decode. Mamba-3's MIMO (multi-input, multi-output) variant fixes this: swap the recurrence from a vector outer product to a matrix multiply, and you get a stronger model at the same decode speed.

Fastest prefill+decode at 1.5B. Beats Mamba-2, GDN, and Llama-3.2-1B. Kernels open-sourced. #mamba3 #togetherresearch

Congratulations to the team leading this research: @aakash_lahoti @kevinyli_ @_berlinchen @caitWW9 @tri_dao @_albertgu
Quoting Albert Gu @_albertgu (tweet above)
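The outer-product-to-matmul swap Together AI describes can be sketched in a few lines of NumPy. The shapes, the rank, and the 0.9 decay factor below are illustrative, not the paper's:

```python
# Rank-1 vs rank-r (MIMO) state updates for a linear recurrence.
# Both touch the same state memory; the MIMO form does ~r x the FLOPs
# per step, raising arithmetic intensity during memory-bound decode.
import numpy as np

n, d, r = 64, 128, 4                      # state size, model dim, MIMO rank
S = np.zeros((n, d))
rng = np.random.default_rng(0)

# Mamba-2-style rank-1 update: one outer product per step.
b = rng.standard_normal(n)
x = rng.standard_normal(d)
S1 = 0.9 * S + np.outer(b, x)

# MIMO rank-r update: fold r input projections into one matrix multiply.
B = rng.standard_normal((n, r))
X = rng.standard_normal((r, d))
Sr = 0.9 * S + B @ X

print(S1.shape == Sr.shape)  # True: same state size, denser update
```

Since reading and writing `S` dominates decode time, doing a rank-r update for roughly the same memory traffic is where the "more FLOPs at similar wall-clock latency" claim comes from.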

Baidu Inc. @Baidu_Inc
🚀 Introducing Qianfan-OCR: a 4B-parameter end-to-end model for document intelligence. One model. No pipeline. Table extraction, formula recognition, chart understanding, and key information extraction, all in a single pass. Paper: arxiv.org/abs/2603.13398 Models: huggingface.co/collections/ba… 🧵 Key results ↓
Marktechpost AI Dev News ⚡
🚀 Baidu Research introduces Qianfan-OCR: a 4B-parameter unified end-to-end model for document intelligence!

Key Highlights:
• Unifies layout analysis, text recognition, and semantic understanding into a single architecture.
• Introduces "Layout-as-Thought" to generate structural representations via tokens.
• Ranks #1 on OmniDocBench v1.5 (93.12) and OlmOCR Bench (79.8) among end-to-end models.
• Outperforms Gemini-3.1-Pro and Qwen3-VL-235B on Key Information Extraction (KIE) benchmarks.
• Supports high-resolution inputs up to 4K via the Any Resolution vision encoder.

Full analysis: marktechpost.com/2026/03/18/bai…
Check it out: github.com/baidubce/Qianf…
Paper: arxiv.org/pdf/2603.13398
Model on HF: huggingface.co/collections/ba…
#AI #OCR #Baidu #MachineLearning #DocumentIntelligence #ComputerVision @Baidu_Inc
Marktechpost AI Dev News ⚡ retweeted
Marktechpost AI Dev News ⚡
Fine-tuning a Large Language Model (LLM) usually feels like a battle against CUDA out-of-memory errors and broken environments.

Unsloth AI Releases Studio: A Local No-Code Interface for High-Performance LLM Fine-Tuning with 70% Less VRAM Usage

We've moved past the era where "pro-level" training required a specialized infrastructure team. Unsloth Studio is an open-source, local web UI that brings enterprise-grade optimization to your workstation (Windows, Linux, or Mac).

Why this is a shift for the AI stack:
→ Triton-powered efficiency: By rewriting backpropagation kernels in OpenAI's Triton language, we achieve a 2x training speedup and a 70% VRAM reduction. You can now fine-tune Llama 3.3 (70B) or the latest Qwen 3.5 on hardware that previously couldn't even load them.
→ Data recipes: Stop wasting time on manual cleaning. Use a graph-node workflow to transform raw PDFs, CSVs, and JSONL into structured ChatML or Alpaca datasets using NVIDIA DataDesigner.
→ Local reasoning models: With integrated GRPO (Group Relative Policy Optimization) support, you can train "reasoning AI" (like DeepSeek-R1 variants) using 80% less VRAM, starting with as little as 5GB.
→ The "export gap" is over: One-click exports to GGUF, vLLM, and Ollama. Fine-tune in the morning, deploy locally in the afternoon.

The technical reality: 👇
This isn't just a wrapper. It's a unified interface for the Unsloth 2.0 engine. Whether you are running an RTX 3090 at home or an H100 cluster at work, the kernels automatically optimize for your specific architecture (NVIDIA, and soon AMD/Intel).

100% local. 100% private. ~0% accuracy loss.

Full analysis: marktechpost.com/2026/03/17/uns…
Technical details: unsloth.ai/docs/new/studio
@UnslothAI
Marktechpost AI Dev News ⚡ retweeted
Marktechpost AI Dev News ⚡
Most AI agents today are failing the enterprise "vibe check."

ServiceNow Research just released EnterpriseOps-Gym, and it's a massive reality check for anyone expecting autonomous agents to take over IT and HR tomorrow. We're moving past simple benchmarks: this is a containerized sandbox with 164 database tables and 512 functional tools, designed to test whether agents can handle long-horizon planning amid persistent state changes and strict access protocols.

The brutal numbers:
→ Claude Opus 4.5 (the top performer) achieved only a 37.4% success rate.
→ Gemini-3-Flash followed at 31.9%.
→ DeepSeek-V3.2 (High) leads the open-source pack at 24.5%.

Why the low scores? The study found that strategic reasoning, not tool invocation, is the primary bottleneck. When the research team provided agents with a human-authored plan, performance jumped by 14-35 percentage points. Strikingly, with a good plan, tiny models like Qwen3-4B become competitive with the giants.

The TL;DR for AI devs:
✅ Planning > scale: We can't just scale our way to reliability; we need better constraint-aware plan generation.
✅ Multi-agent systems aren't a silver bullet: Decomposing tasks into subtasks often regressed performance because it broke sequential state dependencies.
✅ Sandbox everything: If you aren't testing your agents in stateful environments, you aren't testing them for the real world.

Read our full analysis here: marktechpost.com/2026/03/18/ser…
Benchmark: enterpriseops-gym.github.io
Paper: arxiv.org/pdf/2603.13594
Code: github.com/ServiceNow/Ent…
@ServiceNow @ServiceNowRSRCH @RajeswarSai @ShivaMalay @PShravannayak @TheJishnuNair @sagardavasam @SathwikTejaswi @tscholak @NVIDIAAI @turingcom @ServiceNowNews @jonsidd