Bernardo Torres

36 posts

@torres_be_

PhD Student @TelecomParis, @tp_adasp. Previously intern at @researchdeezer @SonyCSLMusic. AI/ML for audio and music signal processing and synthesis.

Paris, France · Joined January 2023

139 Following · 67 Followers

Bernardo Torres @torres_be_ · Oct 28

@stark_reborn @AssafShocher Actually it's not really the same approach since the encoder/decoder don't follow the linearizer architecture they propose. But I would be really interested in simply trying to replace the encoder/decoder with their linearizer and a similar compression ratio to see if it works!

Bernardo Torres @torres_be_ · Oct 28

@stark_reborn @AssafShocher Hello! I might be a little late to the party, but this is exactly what was done in the following paper (for pre-training): arxiv.org/abs/2510.23530 Check it out!

Assaf Shocher @AssafShocher · Oct 14

They tell you neural nets are non-linear. What does "linear" even mean?! Linearity is only defined given two vector spaces, X → Y. What if we could find a different pair of spaces where NNs ARE linear? 🤯 We do it and use it for many apps, such as one-step diffusion! 🧵

562

45K

Bernardo Torres reposted

Alain Riou @howariou · Oct 27

We just released the full training code, as well as our best pretrained model! 🎉 Feel free to use our SOTA checkpoint in your own project with 3 lines of code, or to retrain on your own data using our Lightning+Hydra+Dora codebase ⚡️🐍 🌐 github.com/sony/sampleid

Alain Riou @howariou

Eminem sampled Aerosmith, 50 Cent sampled Nina Simone, everybody sampled Chic... Many great songs sampled existing ones! Detecting this is the topic of our latest paper with @serrjoa at @SonyAI Barcelona 😎 tl;dr: multi-track dataset + few tricks = +18% boost over SOTA 🚀 1/N

1.6K

Bernardo Torres @torres_be_ · Sep 24

+ if you work on MIR and use the CQT for tasks that require frame-wise/time-varying estimation, consider using the Variable-Q Transform (VQT) instead. The impact of the huge CQT kernels at low frequencies is often underestimated, and they might be hurting your performance.
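
To put rough numbers on the kernel-size point, here is a minimal sketch. It assumes the common formulation in which the kernel for bin k has bandwidth alpha * f_k + gamma, with gamma = 0 giving the CQT and gamma > 0 giving a VQT. The sample rate, lowest frequency, and gamma value are illustrative assumptions, not values from the post.

```python
import numpy as np

def kernel_lengths(freqs_hz, sr, bins_per_octave=12, gamma=0.0):
    """Approximate kernel length (in samples) for each center frequency.

    Assumed model: bandwidth_k = alpha * f_k + gamma, length_k = sr / bandwidth_k.
    gamma = 0 reduces to the constant-Q case (length_k = Q * sr / f_k).
    """
    alpha = 2.0 ** (1.0 / bins_per_octave) - 1.0
    bandwidth_hz = alpha * np.asarray(freqs_hz, dtype=float) + gamma
    return sr / bandwidth_hz

sr = 22050                                      # hypothetical sample rate
freqs = 32.70 * 2.0 ** (np.arange(84) / 12.0)   # 7 octaves starting near C1
cqt_len = kernel_lengths(freqs, sr)             # constant-Q kernels
vqt_len = kernel_lengths(freqs, sr, gamma=20.0) # hypothetical gamma of 20 Hz

print(f"lowest bin, CQT kernel: {cqt_len[0] / sr * 1000:.0f} ms")
print(f"lowest bin, VQT kernel: {vqt_len[0] / sr * 1000:.0f} ms")
```

Under these assumptions the lowest CQT kernel spans roughly half a second, which is long compared with typical frame rates for pitch or onset estimation, while the VQT kernel at the same bin is an order of magnitude shorter.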

Bernardo Torres @torres_be_ · Sep 24

The PESTO extension paper was published in TISMIR! Here's the link to the publication: transactions.ismir.net/articles/10.53… It features several improvements to the model, larger cross dataset evaluation, and a real-time implementation.

Alain Riou @howariou

Estremamente felice di annunciare that our new PESTO recipe has been published in the famous TISMIR cookbook 👨‍🍳🤌 With chef @torres_be_, we revisited this traditional sauce invented at ISMIR 2023 in Milan with some Brazilian flavours 🇮🇹🇧🇷 transactions.ismir.net/articles/10.53…

154

Bernardo Torres reposted

Zineb Lahrichi @zineb_lahrichi · Apr 3

🔉New paper out! Recent audio codecs typically learn compression and quantization jointly, limiting the choice of quantization layers, non-differentiable by definition. What if we used powerful neural quantizers like Qinco2 and trained them offline? arxiv.org/abs/2503.19597

4.4K

Bernardo Torres @torres_be_ · Feb 26

@RomyBeaute Amazing! Great work :)

Romy Beauté @RomyBeaute · Feb 26

My first PhD paper is finally out as a preprint! Mapping of Subjective Accounts into Interpreted Clusters (MOSAIC): Topic Modelling and LLM applied to Stroboscopic Phenomenology 🔗 arxiv.org/abs/2502.18318

1.5K

Bernardo Torres @torres_be_ · Feb 20

We’ve added fresh ingredients and cooking tricks to make the best, lightest, and fastest neural pitch estimator even better! 🌿🔥 Shoutout to all collaborators and to the amazing chef & SSL titan, @howariou

Alain Riou @howariou

PESTO 2.0 è rilasciato! 🥳🥳🥳 With Brazilian chef @torres_be_ (and others), we revisit this traditional italian sauce, invented in Milan at @ISMIRConf 2023 🇮🇹 And you can taste it in REAL-TIME at home (~5 ms latency) ⏱️ 1/6

392

Bernardo Torres reposted

Alain Riou @howariou · Feb 19

2.2K

Bernardo Torres reposted

Stefan Lattner @deeplearnmusic · Feb 13

🌟My keynote at the @c4dm workshop about "Models of Musical Signals: Representation, Learning & Generation" is now on YouTube, giving an overview on developments in self-supervised learning for audio since 2020, low-level representation learning, audio (stem) generation and much more 🧵👇 youtube.com/watch?v=ixHfBP… @SonyCSLMusic @SonyCSLParis

2.7K

Bernardo Torres @torres_be_ · Jan 9

@howariou 👀

Alain Riou @howariou · Jan 9

Wow! I just realized PESTO has 195 ⭐️ on github! 🤩 It would be 𝗿𝗲𝗮𝗹ly cool if I find 𝘁𝗶𝗺𝗲 to push a new feature when it hits 200... 👀 github.com/SonyCSLParis/p…

261

Bernardo Torres reposted

Victor Letzelter @VLetzelter · Jun 26

Interested in ill-posed learning tasks, uncertainty prediction, conditional density estimation or multi-head deep neural networks? In our new paper, accepted at #ICML24, we tackle these challenges by exploring the Winner-Takes-All (WTA) training scheme. [1/n]

5.6K

Bernardo Torres @torres_be_ · May 12

Signals with a high degree of autocorrelation (such as pitched signals) make the training of convnets on raw audio unstable. For Gaussian initialization, the greater the input’s autocorrelation, the greater the variance of the output. Huge relief seeing these kinds of papers <3
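
A small numerical sketch of this effect, under one assumed reading of the claim: "variance of the output" is taken to mean the spread of the output energy across random Gaussian filter initializations. The signal length, filter length, and number of trials are arbitrary choices, and circular convolution is used for simplicity. For a Gaussian filter w with variance sigma^2 per tap, the energy is a quadratic form w^T R w in the input's autocorrelation matrix R, whose variance is 2 * sigma^4 * ||R||_F^2, so strongly autocorrelated inputs should give a larger spread.

```python
import numpy as np

rng = np.random.default_rng(0)
n, taps, trials = 4096, 64, 2000  # arbitrary sizes for the illustration

def energy_spread(x):
    """Relative spread (std / mean) of output energy over random Gaussian filters."""
    x = x / np.linalg.norm(x)                      # unit-energy input
    X = np.fft.rfft(x)
    # One Gaussian-initialized FIR filter per trial, scaled so mean energy is ~1.
    w = rng.normal(0.0, 1.0 / np.sqrt(taps), size=(trials, taps))
    W = np.fft.rfft(w, n=n, axis=1)                # zero-pad filters to length n
    y = np.fft.irfft(W * X, n=n, axis=1)           # circular convolution w * x
    energy = np.sum(y ** 2, axis=1)
    return energy.std() / energy.mean()

t = np.arange(n)
sine = np.sin(2 * np.pi * 64 * t / n)   # pitched, highly autocorrelated input
noise = rng.standard_normal(n)          # white, weakly autocorrelated input

spread_sine = energy_spread(sine)
spread_noise = energy_spread(noise)
print(f"pitched input: {spread_sine:.2f}, white noise: {spread_noise:.2f}")
```

Both inputs have the same expected output energy, but the spread across initializations is several times larger for the sinusoid, which is the kind of instability the quoted paper describes.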

Vincent Lostanlen @lostanlen

Training convnets on waveforms is hard—far harder than on magnitude spectrograms. "Instabilities in Convnets for Raw Audio" approaches this phenomenon from the perspective of sensitivity to initialization. IEEE Signal Processing Letters vol. 31 preprint: hal.science/hal-04528116

2.6K

Bernardo Torres @torres_be_ · Apr 19

So many amazing people in this photo!

Peeters Geoffroy @GeoffroyPeeters

@adasp group from @telecomparis @ieeeICASSP 2024, Seoul, Korea

274

Bernardo Torres @torres_be_ · Apr 13

Great stuff!

YCY @yoyolicoris

arxiv.org/abs/2404.07970 The paper preprint is out! We'll make an official announcement soon but feel free to have a look first! :)

200

Bernardo Torres @torres_be_ · Feb 8

@92HsChoi Hello! We struggled to make it work out of the box for more realistic data in a DDSP-like setting. The loss provides the gradient to move in frequency, but having to estimate the number of harmonics + F0 creates many local minima/instabilities. Getting past that is work in progress!

최형석 (Hyeong-Seok Choi) @92HsChoi · Feb 8

@torres_be_ Looks interesting! Have you tried it on a real-world dataset? (E.g., speech / singing voice)

174

Bernardo Torres @torres_be_ · Feb 7

Happy to share some work accepted to ICASSP :) We experiment with a loss function inspired by optimal transport to compare spectra. We test it on a synthesis-based frequency localization (and F0 estimation) toy task using a harmonic synthesizer. Paper: arxiv.org/abs/2312.14507

Peeters Geoffroy @GeoffroyPeeters

10 papers from the Audio group @tp_adasp of Télécom-Paris will be presented at @icassp2014 @ieeeICASSP. --- Paper #1: Bernardo Torres et al. "Unsupervised Harmonic Parameter Estimation Using Differentiable DSP and Spectral Optimal Transport"

4.9K

Bernardo Torres @torres_be_ · Feb 8

The SOT loss is based on the p-Wasserstein distance, which has a closed-form solution in 1D. First we compute the cumulative function of both spectra. Then, we invert it to get the quantiles. We add up the differences (raised to power p) and that’s it!
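
The three steps described here can be sketched in plain NumPy as follows. This is an illustration only, not the paper's implementation: the function name, the normalization of each spectrum to unit sum, and the number of quantile levels are assumptions, and a training loss would need an autodiff framework rather than NumPy.

```python
import numpy as np

def wasserstein_1d(spec_a, spec_b, freqs, p=2, n_levels=1000):
    """Approximate W_p^p between two magnitude spectra on a shared frequency axis.

    Hypothetical helper: spectra are treated as distributions over `freqs`.
    """
    # Normalize so each spectrum sums to one (assumed preprocessing).
    a = np.asarray(spec_a, dtype=float) / np.sum(spec_a)
    b = np.asarray(spec_b, dtype=float) / np.sum(spec_b)
    # Step 1: cumulative function of both spectra.
    cdf_a, cdf_b = np.cumsum(a), np.cumsum(b)
    # Step 2: invert the cumulative functions to get quantiles, i.e. the
    # first frequency whose cumulative mass reaches each level in (0, 1).
    levels = (np.arange(n_levels) + 0.5) / n_levels
    last = len(freqs) - 1
    q_a = freqs[np.minimum(np.searchsorted(cdf_a, levels, side="left"), last)]
    q_b = freqs[np.minimum(np.searchsorted(cdf_b, levels, side="left"), last)]
    # Step 3: average the quantile differences raised to the power p.
    return np.mean(np.abs(q_a - q_b) ** p)

freqs = np.arange(64, dtype=float)   # toy frequency axis, one unit per bin
spec_a = np.zeros(64)
spec_a[10] = 1.0                     # single partial at bin 10
spec_b = np.zeros(64)
spec_b[30] = 1.0                     # single partial at bin 30

print(wasserstein_1d(spec_a, spec_b, freqs, p=1))  # distance grows with the shift
```

For two single-bin spectra every quantile of one sits at bin 10 and every quantile of the other at bin 30, so the result depends only on how far apart the two partials are.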

144

Bernardo Torres @torres_be_ · Feb 7

We call this loss Spectral Optimal Transport (SOT). Compared to L1 and L2 spectral losses on sinusoidal signals, SOT has a very nice convex curve leading to the right oscillator frequency. Its gradient also does not vanish when the frequency difference is high.
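
A toy comparison of the two behaviours, as a sketch under stated assumptions: Gaussian-shaped peaks stand in for the spectrum of a windowed sinusoid, and the transport distance uses p = 1, for which the 1D Wasserstein distance equals the summed absolute difference between the two cumulative functions. The grid size, peak width, and offsets are arbitrary.

```python
import numpy as np

freqs = np.arange(512, dtype=float)   # toy frequency grid, unit bin spacing

def peak(f0, width=2.0):
    """Unit-mass Gaussian bump standing in for a sinusoid's spectrum."""
    s = np.exp(-0.5 * ((freqs - f0) / width) ** 2)
    return s / s.sum()

def l2_loss(a, b):
    """Squared L2 distance between two spectra."""
    return np.sum((a - b) ** 2)

def w1_loss(a, b):
    """1-Wasserstein distance via the difference of cumulative functions."""
    return np.sum(np.abs(np.cumsum(a) - np.cumsum(b)))

target = peak(100.0)
offsets = [50, 100, 200]              # frequency error of the estimate, in bins
l2 = [l2_loss(peak(100.0 + d), target) for d in offsets]
w1 = [w1_loss(peak(100.0 + d), target) for d in offsets]

for d, l2_val, w1_val in zip(offsets, l2, w1):
    print(f"offset {d:3d}: L2 = {l2_val:.4f}, W1 = {w1_val:.1f}")
```

Once the peaks no longer overlap, the L2 value stops changing with the offset, so it carries no gradient toward the target frequency, while the transport distance keeps growing in proportion to the offset.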

223

Bernardo Torres reposted

Stefan Lattner @deeplearnmusic · Feb 5

🥳 We present our #ICASSP2024 paper: A diffusion model that generates production-quality (bass) audio stems to any audio input. 😎🎸 According to our experience, that's more useful to artists than generating full mixes. 🙃 📜Paper: arxiv.org/abs/2402.01412 🎶Demo: tinyurl.com/bass-acc-demo by @marco_ppasini 👈💪@SonyCSLMusic #MusicAI

112

20.3K
