Javier Nistal

146 posts

@latentspaces

Researcher @SonyCSLMusic. Former PhD @TelecomParis_ | @MIDASconsoles | Intern @JukedeckRandD, @SoundCloud. Exploring deep generative models for sound and music

Paris · Joined August 2020
562 Following · 1.1K Followers
Pinned Tweet
Javier Nistal @latentspaces
Very proud of this accomplishment :)
Stefan Lattner @deeplearnmusic

🥳Excited to share our latest work: "Diff-A-Riff"! 🥁 A Latent Diffusion Model that generates instrumental accompaniments for any musical input, specifically tailored for music producers! It's faster, lighter, and produces superior audio quality. Control via text/audio references. 48kHz sample rate, (pseudo) stereo, ~3Gb memory, takes 6 seconds to generate 90 seconds of music. Trained on a single GPU.

📜arxiv.org/pdf/2406.08384
🎶sonycslparis.github.io/diffariff-comp…

🎸 "Diff-A-Riff" adapts to any musical input, following the artist's unique style.
🎛️ Optional controls via text prompts, audio references, interpolation slider, pseudo-stereo width and loop intensity.
🎚️ It produces state-of-the-art audio quality indistinguishable from real data by human raters and operates at unprecedented speed.
🧠 "Diff-A-Riff" is smaller and more efficient than previous models thanks to its Consistency Autoencoder, making it accessible and practical for various applications.

Big shoutout to my outer space colleagues: Javier Nistal, the Machine in "machine learning" 🚄, Marco Pasini, the neural net whisperer 🤫, Cyran Aouameur, the troubleshootah 🛠️, Maarten Grachten, aka MaartenGPT 🤖.

#Teamwork #AI #MusicTech #Innovation @latentspaces @marco_ppasini @cyranaouameur @SonyCSLMusic

Javier Nistal @latentspaces
Listening test alert 🚨 we need you! 😊 Super simple music denoising test: listen to a few piano clips and rate their closeness to a reference. 15 minutes for you, tons of help for us :) Headphones recommended. Computer needed. lnkd.in/d6t8UzaC Thanks 🫶
Javier Nistal retweeted
Stefan Lattner @deeplearnmusic
🎶 New ISMIR 2025 paper! "Estimating Musical Surprisal from Audio in Autoregressive Diffusion Model Noise Spaces" by Mathias Rose Bjare (JKU), Stefan Lattner (Sony CSL), and Gerhard Widmer (JKU/LIT AI Lab).

We explore how surprisal (the unexpectedness of musical events) can be modeled directly from audio using autoregressive diffusion models (ADMs).

💡 What we did:
- Compared surprisal from diffusion models vs. Generative Infinite-Vocabulary Transformers (GIVT).
- Evaluated across tasks: monophonic pitch surprisal (expectation) & segment boundary detection (structural surprise).
- Tested surprisal at different noise levels in diffusion processes to see how musical features emerge at multiple granularities.

🔥 Key take-aways:
- Diffusion surprisal beats GIVT in modeling pitch expectation & boundary detection.
- Mid-level noise surprisal captures pitch-level expectations while suppressing timbre-related variation.
- Surprisal curves align with human-like musical segmentation, showing potential as proxies for perceptual surprise.

Why it matters: Understanding musical surprisal links computational modeling with human perception and cognition, with applications in AI composition, real-time music interaction, and brain-music studies.

📜 Paper: arxiv.org/abs/2508.05306
💻 Code: github.com/SonyCSLParis/a…

#DiffusionModels #Surprisal #MIR @ISMIRConf @SonyCSLParis @SonyCSLMusic
Javier Nistal retweeted
Alain Riou @howariou
❌ We don’t need no negative samples ❌ We don’t need no large batches ❌ No modality gap in the classroom Very happy to introduce SLAP, our latest brick in the wall of multimodal SSL 🎶🧠 Joint work with King @Juj_guinot, accepted at #ISMIR2025! 🇰🇷 1/7
Javier Nistal retweeted
Tom Baker @TeeJayBaker
Hey! Our paper, 🌸 “LiLAC: A Lightweight Latent ControlNet for Musical Audio Generation” got accepted at ISMIR 2025! It presents a novel control paradigm for audio diffusion models with greatly reduced parameters, encouraging users to train individual, modular models.
Javier Nistal retweeted
SonyCSL(Paris)_Music Team @SonyCSLMusic
🎭💻 We’re participating in #CultTech Residency — a new program supporting artists and creative teams exploring performance & technology Experience the benefits of working with our AI-powered tool #DiffARiff Feel free to share the news 🔽 Open call: culttech.at/residence-open…
Javier Nistal @latentspaces
@dadabots I agree that it feels hacky. But imagine it in real time, with many conditioning signals, each with its own CFG... that should be extremely fun :D
dadabots @dadabots
CFG is just a hack to get conditioning to work. It comes with side effects like over-saturation. If you train your model right you don’t need it. But I still like to crank it as an effect, especially in genres like tearout
Javier Nistal retweeted
Stefan Lattner @deeplearnmusic
🌟 New @ieeeICASSP Paper Announcement 🌟 We introduce "Music2Latent2: Audio Compression with Summary Embeddings and Autoregressive Decoding", a novel autoencoder achieving higher fidelity at extreme compression rates using consistency models & summary embeddings. 🧵
Javier Nistal retweeted
Stefan Lattner @deeplearnmusic
😃Accepted @ieeeICASSP papers of @SonyCSLMusic:

- Accompaniment Prompt Adherence: A Measure for Evaluating Music Accompaniment Systems (M. Grachten, J. Nistal)
- Estimating Musical Surprisal in Audio (M. Bjare, G. Cantisani, S. Lattner and G. Widmer)
- Hybrid Losses for Hierarchical Embedding Learning (H. Tian, S. Lattner, B. McFee, C. Saitis)
- Music2Latent2: Audio Compression with Summary Embeddings and Autoregressive Decoding (M. Pasini, S. Lattner, G. Fazekas)
- Zero-shot Musical Stem Retrieval with Joint-Embedding Predictive Architectures (A. Riou, S. Lattner, A. Gagneré, G. Hadjeres, S. Lattner, G. Peeters)

Congrats to the authors! @latentspaces @howariou @GiorgiaCanti @tiianhk @marco_ppasini @GeoffroyPeeters @gaetan_hadjeres @SonyCSLParis
Javier Nistal retweeted
SonyCSL(Paris)_Music Team @SonyCSLMusic
💡If you missed it, we released our new AI-tool, #DrumGAN, which allows you to generate flexible drum sounds with ease. Here is a full blogpost for a detailed look at it, plus a free download of the 1st #AI-drum kits made by #Twenty9 during our collaboration ➡️🎁 tinyurl.com/blpodr
John Vinyard @john_c_vinyard
@deeplearnmusic @latentspaces @SonyCSLMusic Very cool @deeplearnmusic! I'm very curious about how the model was exported to JavaScript so that it's usable in the browser? I'm working on a model that decomposes musical audio into "event vectors" (blog.cochlea.xyz/v4blogpost.htm…) and would love to support in-browser exploration!
Javier Nistal retweeted
Stefan Lattner @deeplearnmusic
🥳 New publication announcement! Marco Pasini solved the problem of error accumulation in continuous autoregressive models (CAMs), making it possible to generate sequences without the need for prior tokenization. Say goodbye to RVQ codecs (use music2latent 😉). @SonyCSLMusic
Marco Pasini @marco_ppasini

✨ Train language models directly on continuous data - without tokenization ✨ We propose an easy way to train GPT-style autoregressive models on continuous data, without error accumulation. We test it on audio 🔊, but this method can easily work with other modalities 🎆 👇🧵

Javier Nistal @latentspaces
🌟 Exciting news! The YouTube channel @aisearchio just released a review of #DiffARiff, and it's already making waves—10k views in less than 12 hours and more than 100 comments with fantastic feedback! 🎉 Thanks so much for the incredible exposure!
⚡AI Search⚡ @aisearchio

🎶Current AI music generators like Udio & Suno lack control—they spit out entire songs for you, which isn’t always ideal. But now, there's a game changer for music producers! An AI that generates individual stems for any instrument. Here's a sneak peek! youtu.be/dQJZxPyI5l8

Javier Nistal retweeted
Stefan Lattner @deeplearnmusic
🤩 It's exciting to see how Diff-A-Riff makes it easy to build up a song based on some input audio! 👇
SonyCSL(Paris)_Music Team @SonyCSLMusic

🎉Continue to discover #DiffARiff capabilities. New #HipHop demo by @CarliNistal showcasing how #DiffARiff can seamlessly integrate into your workflow, offering flexibility for musical exploration. Using #textprompts & #referenceaudio, Carli guided the AI to generate this new demo.
