Or Tal

80 posts

@Or__Tal

PhD candidate @HebrewU; Research Assistant @MetaAI (FAIR)

Joined March 2022
208 Following · 156 Followers
Pinned Tweet
Or Tal
Or Tal@Or__Tal·
Which modeling paradigm should you choose for text-to-music generation? We ran a head-to-head comparison to find out. Same data, same architecture: AR (autoregressive) vs. FM (flow matching). 👇 If you care about fidelity, speed, control, or editing, see this thread. 🔗huggingface.co/spaces/ortal16… 📄arxiv.org/abs/2506.08570 1/6
1
11
41
1.7K
Or Tal retweeted
Eliahu Horwitz
Eliahu Horwitz@EliahuHorwitz·
Excited to share this has now been accepted at #NeurIPS2025 as a position paper (<6% acceptance)!🎉 We advocate for systematically studying entire model populations via weight-space learning, and argue that this requires charting them in a Model Atlas. @NeurIPSConf #NeurIPS 🧵👇
Eliahu Horwitz@EliahuHorwitz

🚨 New paper alert! 🚨 Millions of neural networks now populate public repositories like Hugging Face 🤗, but most lack documentation. So, we decided to build an Atlas 🗺️ Project: horwitz.ai/model-atlas Demo: huggingface.co/spaces/Eliahu/… 🧵👇🏻 Here's what we found:

0
21
64
3.9K
Or Tal retweeted
Heli Ben-Hamu
Heli Ben-Hamu@helibenhamu·
Excited to share our work Set Block Decoding! A new paradigm combining next-token-prediction and masked (or discrete diffusion) models, allowing parallel decoding without any architectural changes and with exact KV cache. Arguably one of the simplest ways to accelerate LLMs!
5
24
115
25.7K
Jesse Engel
Jesse Engel@jesseengel·
New VST/AU Plugin! 🚨 Play with Lyria RealTime directly from inside your favorite DAW with “The Infinite Crate” 🎧🎶 Like other Lyria RT demos, you can mix together text prompts and other controls to steer the model in real-time. But now with a VST plugin you can feed audio directly into your DAW for sampling, live performance, or even a practice partner to jam with. 💾 Get it here: g.co/magenta/infini…
14
33
188
59.7K
Or Tal
Or Tal@Or__Tal·
@__Rafail__ @NadavHarTuv I believe it should. If the audio is very long, you may need to process it in chunks, but that's also true of all other audio representation models.
0
0
1
22
נדב הר-טוב
נדב הר-טוב@NadavHarTuv·
🚨 New paper alert! PAST: phonetic-acoustic speech tokenizer – just got accepted to Interspeech 2025 🎉 It learns phonetic + acoustic tokens jointly, with no SSL babysitter or external vocoder. 🔗pages.cs.huji.ac.il/adiyoss-lab/PA… 👇 If you’re into speech LMs, keep reading!
3
33
164
12.8K
Or Tal
Or Tal@Or__Tal·
@__Rafail__ @NadavHarTuv Nope, PAST has ~180M params, and the streamable version has ~125M params. This should run on a standard GPU. For speech LM training we used two A100 GPUs, but it could be done with fewer.
1
0
1
40
Raphael
Raphael@__Rafail__·
@NadavHarTuv This is all great, but what is the resource consumption of such a solution? Do I need to buy an H200 to run it?
1
0
0
42
Or Tal retweeted
Gallil Maimon
Gallil Maimon@GallilMaimon·
Many modern SpeechLMs are trained with Speech-Text interleaving. How does this impact scaling trends? In our new paper, we train several dozen SLMs, and show - quite a lot! So there is room for optimism 😊 Key insights, code, models, full paper 👇🏻
4
19
74
5.5K
Or Tal retweeted
Gallil Maimon
Gallil Maimon@GallilMaimon·
🎉Thrilled that our paper on "scaling analysis of interleaved speech-text LMs" was accepted to #CoLM2025 It gives room for optimism when scaling SpeechLMs *right* - with large TextLMs (in place of more data), interleaving, and synth training data💪
1
4
29
1.5K
Or Tal
Or Tal@Or__Tal·
💣Introducing PAST: a speech tokenizer that jointly models phonetics and acoustics (no SSL involved). PAST demonstrates great reconstruction as well as semantic capabilities, as measured by ABX and sWUGGY. 🤗 huggingface.co/slprl/PAST Check out Nadav's post👇@NadavHarTuv @adiyossLC
נדב הר-טוב@NadavHarTuv

🚨 New paper alert! PAST: phonetic-acoustic speech tokenizer – just got accepted to Interspeech 2025 🎉 It learns phonetic + acoustic tokens jointly, with no SSL babysitter or external vocoder. 🔗pages.cs.huji.ac.il/adiyoss-lab/PA… 👇 If you’re into speech LMs, keep reading!

0
0
9
368
Or Tal retweeted
Gallil Maimon
Gallil Maimon@GallilMaimon·
🎵💬 If you are interested in Audio Tokenisers, you should check out our new work! We empirically analysed existing tokenisers from every angle: reconstruction, downstream, LMs and more. Grab yourself a ☕/🍺 and sit down for a read!
1
25
103
5.8K
Or Tal retweeted
Niv Eckhaus
Niv Eckhaus@niveckhaus·
🚨 New Paper: "Time to Talk"! 🕵️ We built an LLM agent that doesn't just decide WHAT to say, but also WHEN to say it! Introducing "Time to Talk" - LLM agents for asynchronous group communication, tested in real Mafia games with human players. 🌐niveck.github.io/Time-to-Talk 🧵1/7
3
12
57
6.2K
Or Tal
Or Tal@Or__Tal·
What if training steps are capped at 500k? FM reaches near-topline quality with small batches. It’s compute-efficient and forgiving. AR needs larger batch sizes to recover performance. It benefits more from large-scale training. See📉 below by model duration + batch size: 6/6
1
0
2
129
Or Tal retweeted
Felix Kreuk
Felix Kreuk@FelixKreuk·
We’ve been exploring the trade-offs between Autoregressive and Flow-Matching models for music generation. We share our findings in this latest paper led by @Or__Tal. Many interesting take-aways and practical advice on training generative models for music! 🎶🧠
Or Tal@Or__Tal

Which modeling paradigm should you choose for text-to-music generation? We ran a head-to-head comparison to find out. Same data, same architecture: AR (autoregressive) vs. FM (flow matching). 👇 If you care about fidelity, speed, control, or editing, see this thread. 🔗huggingface.co/spaces/ortal16… 📄arxiv.org/abs/2506.08570 1/6

1
1
11
601