Rohan Paul@rohanpaul_ai
New @Microsoft paper teaches LLMs to organize reasoning into concurrent subtasks for faster, more accurate answers.
It shows 28% lower wait time than typical parallel thinking while also boosting math accuracy.
The big deal is simple: it turns coordination into a skill the model learns, so it decides when to split work, when to wait, and when to merge.
The usual single chain wastes time because each step blocks the next.
Fixed parallel plans also waste time because they cannot adapt to each query.
The fix is an organizer that writes simple Fork and Join tags to start and merge worker thoughts.
Workers chase sub-queries in parallel while the organizer keeps thinking and only pauses to Join.
All control lives in plain text, so the base model stays unchanged.
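Since the control signal is just tags in the text stream, the runtime only has to parse them and dispatch workers. A minimal sketch of that loop, assuming a hypothetical tag syntax and a stub worker call (the paper's exact format may differ):

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tag syntax; the paper's actual Fork/Join format may differ.
FORK = re.compile(r"<Fork id=(\w+)>(.*?)</Fork>", re.S)

def run_worker(subquery: str) -> str:
    # Placeholder for a worker LLM call on one sub-query.
    return f"answer({subquery})"

def step(organizer_text: str) -> dict:
    """Launch every forked sub-query concurrently; a later Join tag
    would block only until the matching worker result is ready."""
    forks = FORK.findall(organizer_text)
    with ThreadPoolExecutor() as pool:
        futures = {fid: pool.submit(run_worker, q) for fid, q in forks}
        return {fid: f.result() for fid, f in futures.items()}

print(step("<Fork id=a>factor 91</Fork> <Fork id=b>sum digits</Fork>"))
```

Because the organizer keeps generating between Fork and Join, workers overlap with its own thinking instead of blocking it.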
Training happens in 2 stages: first, supervised traces teach the tag format.
Then reinforcement learning rewards correct final answers, clean format, and real concurrency.
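The three reward terms can be pictured as a simple shaped sum. This is a toy sketch under assumed weights, not the paper's actual reward function:

```python
def reward(correct: bool, well_formed: bool,
           seq_time: float, par_time: float) -> float:
    """Toy reward combining the three terms described above:
    correct final answer, clean tag format, and real concurrency.
    Weights and functional form are assumptions for illustration."""
    r = 1.0 if correct else 0.0          # correct final answer dominates
    r += 0.1 if well_formed else 0.0     # clean Fork/Join format
    # Concurrency bonus: fraction of time saved vs. a serial trace.
    r += 0.2 * (1 - par_time / seq_time)
    return r

print(round(reward(True, True, seq_time=10, par_time=7), 2))  # 1.16
```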
Speed is measured by the critical path through the Fork-Join graph, which matches true waiting.
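The critical-path metric is just the longest dependency chain through the Fork-Join DAG: parallel branches cost only as much as the slowest one. A small sketch with made-up node durations:

```python
def critical_path(duration: dict, deps: dict) -> float:
    """Longest path (total wait) through a Fork-Join DAG.
    duration: node -> time cost; deps: node -> prerequisite nodes."""
    memo = {}
    def finish(n):
        if n not in memo:
            memo[n] = duration[n] + max(
                (finish(d) for d in deps.get(n, [])), default=0)
        return memo[n]
    return max(finish(n) for n in duration)

# Organizer forks w1 and w2, then joins: the wait is the slower
# branch, not the sum, which is why parallel traces finish sooner.
dur = {"org": 1, "w1": 5, "w2": 3, "join": 1}
dep = {"w1": ["org"], "w2": ["org"], "join": ["w1", "w2"]}
print(critical_path(dur, dep))  # 7, vs 10 if run sequentially
```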
Across countdown puzzles, math questions, and Sudoku, the learned policy runs faster and fails less.
The big idea is to learn organization itself rather than hard-code a script.
----
Paper – arxiv.org/abs/2510.26658
Paper Title: "The Era of Agentic Organization: Learning to Organize with Language Models"