
🥳 Excited to share our latest work: "Diff-A-Riff"! 🥁 A latent diffusion model that generates instrumental accompaniments for any musical input, tailored for music producers. It's faster, lighter, and delivers superior audio quality.

Controllable via text or audio references. 48 kHz sample rate, (pseudo-)stereo, ~3 GB of memory, and it generates 90 seconds of music in 6 seconds. Trained on a single GPU.

📜 arxiv.org/pdf/2406.08384
🎶 sonycslparis.github.io/diffariff-comp…

🎸 "Diff-A-Riff" adapts to any musical input, following the artist's unique style.
🎛️ Optional controls: text prompts, audio references, an interpolation slider, pseudo-stereo width, and loop intensity.
🎚️ It produces state-of-the-art audio quality, indistinguishable from real data by human raters, and runs at unprecedented speed.
🧠 "Diff-A-Riff" is smaller and more efficient than previous models thanks to its Consistency Autoencoder, making it accessible and practical for a wide range of applications.

Big shoutout to my outer space colleagues: Javier Nistal, the Machine in "machine learning" 🚄, Marco Pasini, the neural net whisperer 🤫, Cyran Aouameur, the troubleshootah 🛠️, Maarten Grachten, aka MaartenGPT 🤖.

#Teamwork #AI #MusicTech #Innovation @latentspaces @marco_ppasini @cyranaouameur @SonyCSLMusic
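For the curious, here's what those speed numbers work out to — a quick back-of-the-envelope check using only the figures quoted above (the variable names are mine, not from the paper):

```python
# Back-of-the-envelope check of the quoted generation speed.
# All numbers come from the announcement; this is just arithmetic,
# not the model itself.
SAMPLE_RATE_HZ = 48_000   # 48 kHz output, as stated
GEN_TIME_S = 6            # seconds of compute per clip
AUDIO_LEN_S = 90          # seconds of audio produced per clip

realtime_factor = AUDIO_LEN_S / GEN_TIME_S
samples_per_clip = SAMPLE_RATE_HZ * AUDIO_LEN_S

print(f"~{realtime_factor:.0f}x faster than real time")       # ~15x
print(f"{samples_per_clip:,} samples per 90 s (mono) clip")   # 4,320,000
```

In other words, at roughly 15× real time, the model synthesizes over four million audio samples per channel in those 6 seconds.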










