Benjamin Muller

190 posts

@ben_mlr

Research in AI. Focusing on scaling language models multi-modally & multilingually @AIatMeta

NYC · Joined April 2016
2.1K Following · 965 Followers
Pinned Tweet
Benjamin Muller @ben_mlr
So many exciting releases from FAIR @AIatMeta. Super happy to see Spirit LM now open-sourced. Spirit LM unlocks expressive speech generation through interleaving speech-text training and phonetic (HuBERT) + pitch + style-specific tokenization. Available here:
Weights: ai.meta.com/resources/mode…
Code: github.com/facebookresear…
Paper: arxiv.org/abs/2402.05755
(soon to be presented at #EMNLP2024)
AI at Meta @AIatMeta

Open science is how we continue to push technology forward and today at Meta FAIR we're sharing eight new AI research artifacts including new models, datasets and code to inspire innovation in the community. More in the video from @jpineau1. This work is another important step towards our goal of achieving Advanced Machine Intelligence (AMI).
What we're releasing:
• Meta Spirit LM: An open source language model for seamless speech and text integration.
• Meta Segment Anything Model 2.1: An updated checkpoint with improved results on visually similar objects, small objects and occlusion handling. Plus a new developer suite to make it easier for developers to build with SAM 2.
• Layer Skip: Inference code and fine-tuned checkpoints demonstrating a new method for enhancing LLM performance.
• SALSA: New code to enable researchers to benchmark AI-based attacks in support of validating security for post-quantum cryptography.
• Meta Lingua: A lightweight and self-contained codebase designed to train language models at scale.
• Meta Open Materials: New open source models and the largest dataset of its kind to accelerate AI-driven discovery of new inorganic materials.
• MEXMA: A new research paper and code for our novel pre-trained cross-lingual sentence encoder with coverage across 80 languages.
• Self-Taught Evaluator: A new method for generating synthetic preference data to train reward models without relying on human annotations.
Access to state-of-the-art AI creates opportunities for everyone. We're excited to share this work and look forward to seeing the community innovation that results from it.
Details and access to everything released by FAIR today ➡️ go.fb.me/hgtkel

1 reply · 1 repost · 14 likes · 2K views
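
A minimal sketch of the interleaved speech-text sequence construction described in the pinned tweet above, assuming hypothetical modality markers and a precomputed alignment; `interleave_speech_text`, `TEXT_MARKER`, and `SPEECH_MARKER` are illustrative names, not the actual Spirit LM tokenizer from the linked repo:

```python
# Hypothetical modality markers; Spirit LM's real special tokens live in its repo.
TEXT_MARKER = "[TEXT]"
SPEECH_MARKER = "[SPEECH]"

def interleave_speech_text(text_tokens, speech_tokens, spans):
    """Build one training sequence that alternates aligned text and speech spans.

    `spans` is a list of (modality, start, end) triples over the aligned utterance.
    Speech tokens are assumed to already combine phonetic (HuBERT), pitch, and
    style units into single symbols.
    """
    sequence = []
    for modality, start, end in spans:
        if modality == "text":
            sequence.append(TEXT_MARKER)
            sequence.extend(text_tokens[start:end])
        else:
            sequence.append(SPEECH_MARKER)
            sequence.extend(speech_tokens[start:end])
    return sequence

# Toy usage: alternate word-level text spans with their speech-unit counterparts.
text = ["the", "cat", "sat"]
speech = ["hu_12", "hu_87", "pitch_3", "hu_45", "style_1", "hu_9"]
spans = [("text", 0, 2), ("speech", 0, 3), ("text", 2, 3), ("speech", 3, 6)]
print(interleave_speech_text(text, speech, spans))
```
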
Benjamin Muller retweeted
Zhibin Gou @zebgou
If Gemini-3 proved continued scaling of pretraining, DeepSeek-V3.2-Speciale proves scaling RL with large context. We spent a year pushing DeepSeek-V3 to its limits. The lesson: post-training bottlenecks are solved by refining methods and data, not just by waiting for a better base.
DeepSeek @deepseek_ai

🚀 Launching DeepSeek-V3.2 & DeepSeek-V3.2-Speciale — Reasoning-first models built for agents!
🔹 DeepSeek-V3.2: Official successor to V3.2-Exp. Now live on App, Web & API.
🔹 DeepSeek-V3.2-Speciale: Pushing the boundaries of reasoning capabilities. API-only for now.
📄 Tech report: huggingface.co/deepseek-ai/De… 1/n

61 replies · 162 reposts · 1.9K likes · 236.3K views
Benjamin Muller retweeted
Yen-Ju Lu @Yen_Ju_Lu
🚀 Introducing the Latent Speech-Text Transformer (LST) — a speech-text model that organizes speech tokens into latent patches for better text→speech transfer, enabling steeper scaling laws and more efficient multimodal training ⚡️ Paper 📄 arxiv.org/pdf/2510.06195
7 replies · 16 reposts · 35 likes · 9.3K views
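
A rough sketch of the latent-patch idea from the LST tweet above, assuming a fixed patch size and mean pooling as the aggregator; the actual patching scheme in the paper and the `pool_speech_tokens_into_patches` helper are assumptions, not the released implementation:

```python
import torch

def pool_speech_tokens_into_patches(speech_embeddings: torch.Tensor, patch_size: int = 4) -> torch.Tensor:
    """Group consecutive speech-token embeddings into latent patches by mean pooling.

    speech_embeddings: (seq_len, dim) tensor of speech-token embeddings.
    Returns a (num_patches, dim) tensor that a larger latent transformer would consume,
    so the expensive model sees a shorter, more text-like sequence.
    """
    seq_len, dim = speech_embeddings.shape
    # Pad so the sequence length is a multiple of patch_size.
    pad = (-seq_len) % patch_size
    if pad:
        speech_embeddings = torch.cat(
            [speech_embeddings, speech_embeddings.new_zeros(pad, dim)], dim=0
        )
    return speech_embeddings.view(-1, patch_size, dim).mean(dim=1)

# Toy usage: 10 speech tokens with 8-dim embeddings -> 3 patches.
print(pool_speech_tokens_into_patches(torch.randn(10, 8), patch_size=4).shape)
```
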
Benjamin Muller retweeted
Aymeric Zhuo @aymericzzz
Introducing @CodeWordsAI, the fastest way to go from idea to automation, simply by chatting with AI. No more drag-and-drop and configuration. Save time by doing less. Available today for free, for everyone. The Cursor moment for automation is here.
7 replies · 14 reposts · 48 likes · 3K views
Benjamin Muller retweeted
Percy Liang @percyliang
We ran Llama 4 Maverick through some HELM benchmarks. It is 1st on HELM capabilities (MMLU-Pro, GPQA, IFEval, WildBench, Omni-MATH), but… crfm.stanford.edu/helm/capabilit…
5 replies · 15 reposts · 141 likes · 29.4K views
Benjamin Muller retweeted
AI at Meta @AIatMeta
Today is the start of a new era of natively multimodal AI innovation. Today, we're introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality.
Llama 4 Scout
• 17B-active-parameter model with 16 experts.
• Industry-leading context window of 10M tokens.
• Outperforms Gemma 3, Gemini 2.0 Flash-Lite and Mistral 3.1 across a broad range of widely accepted benchmarks.
Llama 4 Maverick
• 17B-active-parameter model with 128 experts.
• Best-in-class image grounding with the ability to align user prompts with relevant visual concepts and anchor model responses to regions in the image.
• Outperforms GPT-4o and Gemini 2.0 Flash across a broad range of widely accepted benchmarks.
• Achieves comparable results to DeepSeek v3 on reasoning and coding — at half the active parameters.
• Unparalleled performance-to-cost ratio with a chat version scoring an ELO of 1417 on LMArena.
These models are our best yet thanks to distillation from Llama 4 Behemoth, our most powerful model yet. Llama 4 Behemoth is still in training and is currently seeing results that outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks. We're excited to share more details about it even while it's still in flight.
Read more about the first Llama 4 models, including training and benchmarks ➡️ go.fb.me/gmjohs
Download Llama 4 ➡️ go.fb.me/bwwhe9
830 replies · 2.4K reposts · 12.8K likes · 3.7M views
Benjamin Muller retweeted
Jason Weston @jaseweston
🚨 Diverse Preference Optimization (DivPO) 🚨 SOTA LLMs have model collapse🫠: they can't generate diverse creative writing or synthetic data 🎨 DivPO trains for both high reward & diversity, vastly improving variety with similar quality. Paper 📝: arxiv.org/abs/2501.18101 🧵below
1 reply · 76 reposts · 340 likes · 45.3K views
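
A toy sketch, in the spirit of the DivPO tweet above, of building preference pairs that reward both quality and diversity; the specific selection rule, threshold, and `select_preference_pair` helper are illustrative assumptions rather than the paper's exact criterion:

```python
def select_preference_pair(responses, rewards, diversities, reward_threshold):
    """Pick a (chosen, rejected) pair that favors both quality and diversity.

    responses:   list of candidate generations for one prompt
    rewards:     reward-model scores, one per response
    diversities: diversity scores, one per response (e.g. distance to the other candidates)
    Illustrative rule: chosen = most diverse response whose reward clears the threshold,
    rejected = least diverse response below the threshold.
    """
    pool = list(zip(responses, rewards, diversities))
    high = [item for item in pool if item[1] >= reward_threshold]
    low = [item for item in pool if item[1] < reward_threshold]
    if not high or not low:
        return None  # skip prompts where the candidate pool is one-sided
    chosen = max(high, key=lambda item: item[2])[0]
    rejected = min(low, key=lambda item: item[2])[0]
    return chosen, rejected  # feed into a standard DPO-style loss

# Toy usage with made-up scores.
pair = select_preference_pair(
    responses=["story A", "story B", "story C"],
    rewards=[0.9, 0.8, 0.2],
    diversities=[0.1, 0.7, 0.5],
    reward_threshold=0.5,
)
print(pair)  # ('story B', 'story C')
```
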
Benjamin Muller retweeted
Gargi Ghosh @gargighosh
We released new research: the Byte Latent Transformer (BLT). BLT encodes bytes into dynamic patches using lightweight local models and processes them with a large latent transformer. Think of it as a transformer sandwich!
AI at Meta @AIatMeta

New from Meta FAIR — Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness. Paper ➡️ go.fb.me/w23lmz

11 replies · 79 reposts · 660 likes · 70.7K views
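
A toy sketch of the dynamic byte patching idea described above, where a placeholder `surprise` score stands in for the small local byte model that BLT actually uses to decide patch boundaries:

```python
def dynamic_byte_patches(data: bytes, surprise, threshold: float, max_patch_len: int = 16):
    """Split a byte stream into variable-length patches.

    `surprise(prefix, next_byte)` stands in for the uncertainty of a small local
    byte model; a new patch starts whenever the score exceeds `threshold`
    (or the current patch hits max_patch_len). A large latent transformer would
    then operate on patch representations instead of individual bytes.
    """
    patches, current = [], bytearray()
    for i, b in enumerate(data):
        if current and (surprise(data[:i], b) > threshold or len(current) >= max_patch_len):
            patches.append(bytes(current))
            current = bytearray()
        current.append(b)
    if current:
        patches.append(bytes(current))
    return patches

def fake_surprise(prefix, next_byte):
    # Toy stand-in: spike right after a space so patches roughly align with words.
    return 1.0 if prefix.endswith(b" ") else 0.0

print(dynamic_byte_patches(b"patches scale better than tokens", fake_surprise, threshold=0.5))
```
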
Benjamin Muller retweeted
AI at Meta @AIatMeta
New from Meta FAIR — Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness. Paper ➡️ go.fb.me/w23lmz
27 replies · 186 reposts · 1K likes · 203.1K views
Benjamin Muller @ben_mlr
Congrats @aymericzzz and team on being live! Very exciting vision to build entire software products with just a prompt
Aymeric Zhuo @aymericzzz

Excited to share more about our background, vision and where we're headed at @agemoai with @r1ddhi at @BusinessInsider
Our vision is to enable anyone to create software – from an idea to fully deployed software. The critical path to achieve it requires building AI systems that can reason about software at a fundamental level.
Language models are not the sole solution, but are part of the solution. Since inception, our research focus at agemo has been on leveraging neurosymbolic methods to build a reasoning system for software. With the first implementation and training of this system done, we have been iterating on a platform for non-coders to generate software.
We are fortunate to be backed by @FlyVC and @firstminutecap alongside pioneers in the field at @GoogleDeepMind and @Meta. Read more about our story at businessinsider.com/agemo-exits-st…

0 replies · 2 reposts · 4 likes · 380 views
Benjamin Muller retweeted
Xiang Yue @xiangyue96
🌍 I've always had a dream of making AI accessible to everyone, regardless of location or language. However, current open MLLMs often respond in English, even to non-English queries!
🚀 Introducing Pangea: A Fully Open Multilingual Multimodal LLM supporting 39 languages! 🌐✨
neulab.github.io/Pangea/
arxiv.org/pdf/2410.16153
The Pangea family includes three major components:
🔥 Pangea-7B: A state-of-the-art multilingual multimodal LLM capable of 39 languages! Not only does it excel in multilingual scenarios, but it also matches or surpasses English-centric models like Llama 3.2, Molmo, and LlavaOneVision in English performance.
📝 PangeaIns: A 6M multilingual multimodal instruction tuning dataset across 39 languages. 🗂️ With 40% English instructions and 60% multilingual instructions, it spans various domains, including 1M culturally-relevant images sourced from LAION-Multi. 🎨
🏆 PangeaBench: A comprehensive evaluation benchmark featuring 14 datasets in 47 languages. Evaluation can be tricky, so we carefully curated existing benchmarks and introduced two new datasets: xChatBench (human-annotated wild queries with fine-grained evaluation criteria) and xMMMU (a meticulously machine-translated version of MMMU).
🙌 This is a joint leading effort with @yueqi_song. Also kudos to the amazing team @AkariAsai, @seungonekim, @Jeande_d, @simi_97k, @anjali_ruban, @lintangsutawika, @Sathya8NR, @gneubig for their hard work!
Check out more results and insights from our training in the thread below. 👇
7 replies · 76 reposts · 373 likes · 91.2K views
Benjamin Muller retweeted
Ruochen Zhang @ruochenz_
This is super cool! Excited to see people using interpretability techniques for better transfer learning on multilingual tasks!
Benjamin Muller @ben_mlr

Recent LLMs (e.g. Llama 3 🦙) are increasingly good at Math. However, this progress is reserved for languages with large amounts of task-specific instruct-tuning data. In this work @AIatMeta (led by @LucasBandarkar), we introduce a new model merging technique called **Layer Swapping** and find that combining Math and Language-Specific experts improves the performance of Llama 3 for specific languages (e.g. #Bengali) on Math queries. Arxiv link and details in the thread below 🧵

1 reply · 1 repost · 7 likes · 845 views
Benjamin Muller @ben_mlr
Thanks Tu :). Interesting! That's definitely relevant to our work, thanks for sharing! In our case, we found that merging/swapping language-specific expert layers and English Math expert layers transfers effectively at test time to target languages. Doing it in a parameter-efficient setting degraded test performance in our setup, but there is definitely more work to be done there.
1 reply · 0 reposts · 1 like · 203 views
Tu Vu @tuvllms
Great work, @ben_mlr and @LucasBandarkar! We tried a relevant idea in our zero-shot cross-lingual transfer paper (arxiv.org/abs/2205.12647) back in 2022 and it worked reasonably well. Basically, we trained soft prompts separately for languages and tasks, and then composed these vectors to generalize across new language-task combinations.
1 reply · 0 reposts · 6 likes · 553 views
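
A minimal sketch of the soft-prompt composition Tu Vu describes above, assuming plain concatenation as the composition operator; the paper may combine the vectors differently, and `compose_soft_prompts` is an illustrative name:

```python
import torch

def compose_soft_prompts(language_prompt: torch.Tensor, task_prompt: torch.Tensor,
                         input_embeddings: torch.Tensor) -> torch.Tensor:
    """Prepend separately trained language and task soft prompts to the input.

    language_prompt:  (L_lang, dim) vectors trained on target-language data
    task_prompt:      (L_task, dim) vectors trained on English task data
    input_embeddings: (seq_len, dim) token embeddings of the query
    Composition here is plain concatenation along the sequence dimension.
    """
    return torch.cat([language_prompt, task_prompt, input_embeddings], dim=0)

# Toy usage: 8 language-prompt vectors + 8 task-prompt vectors + a 20-token query.
dim = 16
composed = compose_soft_prompts(torch.randn(8, dim), torch.randn(8, dim), torch.randn(20, dim))
print(composed.shape)  # torch.Size([36, 16])
```
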
Benjamin Muller @ben_mlr
Recent LLMs (e.g. Llama 3 🦙) are increasingly good at Math. However, this progress is reserved for languages with large amounts of task-specific instruct-tuning data. In this work @AIatMeta (led by @LucasBandarkar), we introduce a new model merging technique called **Layer Swapping** and find that combining Math and Language-Specific experts improves the performance of Llama 3 for specific languages (e.g. #Bengali) on Math queries. Arxiv link and details in the thread below 🧵
Lucas Bandarkar @LucasBandarkar

Cross-lingual transfer can be as easy as swapping model layers between LLMs! 🔀 Our model merging method can compose math and language skills by swapping the top & bottom layers from an SFT'd target-language expert into a math expert without retraining. arxiv.org/pdf/2410.01335 🧵: [1/3]

2 replies · 7 reposts · 28 likes · 5.6K views
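
A minimal sketch of the layer-swapping merge described in this thread, assuming Llama-style parameter names and a simple swap of the bottom and top transformer layers; the paper's exact layer choices and any additional averaging are not reproduced here:

```python
import re

def layer_swap(math_expert_sd: dict, language_expert_sd: dict,
               num_layers: int, k_bottom: int, k_top: int) -> dict:
    """Build a merged state dict by swapping layers between two fine-tuned experts.

    Keeps the math expert's weights everywhere except the bottom `k_bottom` and
    top `k_top` transformer layers, which are taken from the language expert.
    Assumes Llama-style parameter names like "model.layers.{i}...."; no retraining involved.
    """
    merged = dict(math_expert_sd)
    for name, tensor in language_expert_sd.items():
        match = re.match(r"model\.layers\.(\d+)\.", name)
        if match is None:
            continue
        layer = int(match.group(1))
        if layer < k_bottom or layer >= num_layers - k_top:
            merged[name] = tensor
    return merged

# Toy usage with fake single-value "weights" for a 4-layer model.
math_sd = {f"model.layers.{i}.w": f"math_{i}" for i in range(4)}
lang_sd = {f"model.layers.{i}.w": f"lang_{i}" for i in range(4)}
print(layer_swap(math_sd, lang_sd, num_layers=4, k_bottom=1, k_top=1))
# bottom and top layers come from the language expert, middle layers from the math expert
```
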