Pedro Sarmento
@umpedronosapato

AI & Music Data Scientist @moises_ai | prev. @c4dm

Joined June 2014 · 2.3K posts
1.9K Following · 2.8K Followers
Pedro Sarmento retweeted
Chris Donahue @chrisdonahuey
Vibe coding is cool but have you tried vibe patching? Pure Vibes = Pure Data + MCP. Describe your sound, watch the patch appear 🌊 🔊👇
Pedro Sarmento retweeted
dadabots @dadabots
SCIENCE PAPER DROPPED. Big ups @zacknovack 🥳 & the @harmonai_org team. This paper explores an inexpensive method to add custom control (e.g. pitch) to a pretrained audio diffusion model, without retraining the model: arxiv.org/abs/2603.04366
Pedro Sarmento retweeted
Sander Dieleman @sedielem
Some really great insights here about the differences between masked and uniform-state discrete diffusion. Both continuous diffusion and uniform-state discrete diffusion for modelling categorical data seem to be making a bit of a comeback recently. Entropy is all you need🙃
Dimitri von Rütte @dvruette

there, I said it. diffusion LLMs are the future! I'll be back in a couple of years to collect my "I told you so" award.
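The contrast between the two corruption processes under discussion can be made concrete with a toy sketch (illustrative only, not any specific paper's formulation; `VOCAB`, `MASK`, and the per-token corruption probability `t` are assumptions):

```python
import random

random.seed(0)
VOCAB = list(range(10))   # toy vocabulary of 10 token ids
MASK = 10                 # extra [MASK] id used only by masked diffusion

def masked_corrupt(tokens, t):
    # masked (absorbing-state) diffusion: each token becomes [MASK] with prob. t
    return [MASK if random.random() < t else tok for tok in tokens]

def uniform_corrupt(tokens, t):
    # uniform-state diffusion: each token is resampled uniformly with prob. t
    return [random.choice(VOCAB) if random.random() < t else tok for tok in tokens]

seq = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
print(masked_corrupt(seq, 0.5))
print(uniform_corrupt(seq, 0.5))
```

At t = 1 the masked chain collapses every position to [MASK], while the uniform chain yields i.i.d. random tokens; a masked model always knows which positions are unknown, whereas a uniform-state model must also learn which surviving tokens to distrust.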

Ash @Ashvala
Coming soon to @nonety_pe: draw to engrave the music. Built on top of everyone's favorite drawing tool, @tldraw, and a simple 300k-parameter CNN.
Pedro Sarmento retweeted
Jordi Pons @jordiponsdotme
The ACE-Step 1.5 paper can be confusing. In this post I share its main ideas: artintech.substack.com/p/ace-step-15-…
1. DIFFUSION MODEL: supports multiple tasks.
2. LANGUAGE MODEL: reprompting & semantic token generation.
3. DATA PREPARATION: 27M songs.
4. OPEN WEIGHTS: supports LoRAs.
Pedro Sarmento retweeted
Yi-Hsuan Yang @affige_yang
Yes! This ATTM Grand Challenge brings the fair-play & affordability we've been longing for in TTM research: from-scratch training on fixed academic data, prioritizing novel algorithmic or system design. We provide a MeanAudio baseline to get started easily. Join us! 🚀 #ICME2026
Hao-Wen (Herman) Dong 董皓文 @hermanhwdong

📢 Happy to announce the ICME 2026 Grand Challenge on Academic Text-to-Music Generation!
- Official launch: Feb 10
- Registration deadline: Mar 20
- Submission deadline: April 23
Co-organizing with @affige_yang @HungyiLee2 @Lonian6 & Fang-Chih Hsieh ntu-musicailab.github.io/ICME26-ATTM-Gr…

Pedro Sarmento retweeted
机器之心 JIQIZHIXIN @jiqizhixin
New paradigm from Kaiming He's team: Drifting Models! With this approach, you can generate a perfect image in a single step. The team trains a "drifting field" that smoothly moves samples toward equilibrium with the real data distribution. The result? A one-step generator that sets a new SOTA on ImageNet 256x256, beating complex multi-step models.
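The "drifting field" idea can be pictured with a 1-D toy (a purely illustrative sketch, not the paper's method: the paper *trains* the field, while here it is hand-crafted from two known data modes so that samples visibly drift toward the data distribution):

```python
import math
import random

random.seed(0)
MODES = [-2.0, 2.0]     # toy "data distribution": two point modes
SIGMA2 = 0.25           # kernel width of the hand-crafted drift field
ETA, STEPS = 0.2, 500   # Euler step size and number of drift steps

def drift(x):
    # Pull x toward a proximity-weighted average of the data modes.
    # (A trained model would predict this field without seeing the modes.)
    w = [math.exp(-(x - m) ** 2 / (2 * SIGMA2)) for m in MODES]
    target = sum(wi * m for wi, m in zip(w, MODES)) / sum(w)
    return target - x

samples = [random.uniform(-4, 4) for _ in range(20)]
for _ in range(STEPS):
    samples = [x + ETA * drift(x) for x in samples]
# samples now sit at equilibrium on the data modes
```

A one-step generator, as described in the tweet, would amortize this whole iterative trajectory into a single network evaluation.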
Pedro Sarmento retweeted
Yoshua Bengio @Yoshua_Bengio
Today we’re releasing the International AI Safety Report 2026: the most comprehensive evidence-based assessment of AI capabilities, emerging risks, and safety measures to date. 🧵 (1/17)
Pedro Sarmento retweeted
Stefan Lattner @deeplearnmusic
🔊 New Paper Alert — Training-Free Inference-Time Timbre Transfer

Check out our latest #ICASSP paper on timbre transfer in music audio! 🎶
➡️ Diffusion Timbre Transfer Via Mutual Information Guided Inpainting; Ching Ho Lee, Javier Nistal, Stefan Lattner, Marco Pasini & George Fazekas
📜 Paper: arxiv.org/pdf/2601.01294
🥁 Audio Demos: anon-audio-demo-25.github.io/audio_demo/

In this work, we rethink timbre transfer as an inference-time editing problem — and show that you don't need to retrain or fine-tune heavy models to change the instrumental color of a piece while preserving its musical structure.

🎯 What's New?
Instead of training separate models or adding control modules for each instrument:
✅ We start from a pre-trained latent diffusion model and steer it on the fly using two simple controls:
• Mutual-Information guided noise injection: add noise only in latent channels most informative of timbre.
• Early-step clamping: "lock in" melody and rhythm by restoring structure-dominant channels during denoising.
This lightweight, training-free procedure lets you control timbre without sacrificing the original melody, harmony or rhythm — and works with text or audio conditioning (e.g., CLAP).

✨ Why It Matters
🎵 Practical music production tools for re-orchestration and sound design
🛠️ Efficient editing with no added model training
🔍 A framework that could extend beyond timbre to other label-driven audio edits
📌 Compatible with strong diffusion backbones and generative audio models
@SonyCSLMusic @SonyCSLParis
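The two controls can be pictured as an inference-time loop (a minimal illustrative sketch, not the paper's implementation: `denoise_step` stands in for one update of a pretrained latent diffusion model, and `timbre_channels` assumes a mutual-information probe has already flagged which latent channels carry timbre):

```python
import random

random.seed(0)
C, T_STEPS = 8, 50
timbre_channels = {0, 1, 2}   # assumption: channels an MI probe flagged as timbre-dominant
CLAMP_UNTIL = 15              # illustrative "early-step clamping" horizon

source = [random.gauss(0, 1) for _ in range(C)]   # stand-in for the source latents

def denoise_step(z, step):
    # stand-in for one reverse-diffusion update of the pretrained model
    return [v * 0.98 for v in z]

# MI-guided noise injection: perturb only the timbre-dominant channels
z = [v + random.gauss(0, 1) if c in timbre_channels else v
     for c, v in enumerate(source)]

for step in range(T_STEPS):
    z = denoise_step(z, step)
    if step < CLAMP_UNTIL:
        # early-step clamping: restore structure-dominant channels to the source
        z = [source[c] if c not in timbre_channels else z[c]
             for c in range(C)]
```

Structure-dominant channels are pinned to the source latents for the first `CLAMP_UNTIL` steps, so melody and rhythm survive, while the noised timbre channels are free to move (in the real system, under text or audio conditioning such as CLAP).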
Pedro Sarmento retweeted
Hope Rugo @hoperugo
Very cool to see this large dataset. Walking is good! Bummer about swimming though! My preferred exercise. I figure variability might be greater with swimming! @OncoAlert
Eric Topol @EricTopol

The @BMJMedicine was supposed to post this paper 2 hours ago but has failed to do so. It is being covered by other means: medicalxpress.com/news/2026-01-p… Someday the link will be active! dx.doi.org/10.1136/bmjmed…

Pedro Sarmento retweeted
arXiv Sound @ArxivSound
Carlos Hernandez-Olivan, Hendrik Vincent Koops, Hao Hao Tan, Elio Quinton, "Single-step Controllable Music Bandwidth Extension With Flow Matching," arxiv.org/abs/2601.14356
arXiv Sound @ArxivSound
Xinhao Mei, Gael Le Lan, Haohe Liu, Zhaoheng Ni, Varun Nagaraja, Yang Liu, Yangyang Shi, Vikas Chandra, "SLAP: Scalable Language-Audio Pretraining with Variable-Duration Audio and Multi-Objective Training," arxiv.org/abs/2601.12594
Pedro Sarmento retweeted
dadabots @dadabots
Coditany of Timeness --- In 2017 we made the 1st fully neural synthesized album. Ever. We put it on Bandcamp. 1M+ people listened. 100+ articles were written about it. It was research. It was art. It was anti-human black metal. Today Bandcamp BANS ai 🚫 reddit.com/r/BandCamp/com…
Pedro Sarmento retweeted
arXiv Sound @ArxivSound
Simon Rouard, Manu Orsini, Axel Roebel, Neil Zeghidour, Alexandre Défossez, "Continuous Audio Language Models," arxiv.org/abs/2509.06926
Pedro Sarmento retweeted
Jordi Pons @jordiponsdotme
AI music doesn’t have to be ‘slop’. It can be Interactive! Over the break, I wrote a post defining Interactive AI Music, a term that brings together my favorite artistic projects in AI and music. artintech.substack.com/p/interactive-…