Bernardo Torres

36 posts

@torres_be_

PhD Student @TelecomParis, @tp_adasp. Previously intern at @researchdeezer @SonyCSLMusic. AI/ML for audio and music signal processing and synthesis.

Paris, France · Joined January 2023
139 Following · 67 Followers

Bernardo Torres @torres_be_
@stark_reborn @AssafShocher Actually it's not really the same approach since the encoder/decoder don't follow the linearizer architecture they propose. But I would be really interested in simply trying to replace the encoder/decoder with their linearizer and a similar compression ratio to see if it works!

Assaf Shocher @AssafShocher
They tell you neural nets are non-linear. What does "linear" even mean?! Linearity is only defined given two vector spaces, X → Y. What if we could find a different pair of spaces where NNs ARE linear? 🤯 We do it and use it for many apps, such as one-step diffusion! 🧵

Bernardo Torres retweeted
Alain Riou @howariou
We just released the full training code, as well as our best pretrained model! 🎉 Feel free to use our SOTA checkpoint in your own project with 3 lines of code, or to retrain on your own data using our Lightning+Hydra+Dora codebase ⚡️🐍 🌐 github.com/sony/sampleid
Alain Riou @howariou

Eminem sampled Aerosmith, 50 Cent sampled Nina Simone, everybody sampled Chic... Many great songs sampled existing ones! Detecting this is the topic of our latest paper with @serrjoa at @SonyAI Barcelona 😎 tl;dr: multi-track dataset + few tricks = +18% boost over SOTA 🚀 1/N

Bernardo Torres @torres_be_
+ If you work on MIR and use the CQT for tasks that require frame-wise/time-varying estimation, consider using the Variable-Q Transform (VQT) instead. The huge CQT kernels at low frequencies are often underestimated, and they might be hurting your performance.
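
To make the point about low-frequency kernel sizes concrete, here is a rough sketch in Python. It assumes a simplified model in which each bin has bandwidth f_k / Q + gamma and the kernel length is the sampling rate divided by that bandwidth, so gamma = 0 gives the constant-Q case. The function name `kernel_lengths`, the sampling rate, and the value gamma = 10 Hz are illustrative choices, not settings taken from the post or from any particular library.

```python
import math

def kernel_lengths(sr, f_min, n_bins, bins_per_octave=12, gamma=0.0):
    """Approximate kernel length (in samples) for each log-spaced bin.

    Assumed model: bandwidth_k = f_k / Q + gamma, length_k = sr / bandwidth_k.
    gamma = 0 is the constant-Q case; gamma > 0 widens the low bins only.
    """
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    lengths = []
    for k in range(n_bins):
        f_k = f_min * 2.0 ** (k / bins_per_octave)
        bandwidth = f_k / q + gamma
        lengths.append(math.ceil(sr / bandwidth))
    return lengths

sr = 22050  # hypothetical sampling rate
cqt = kernel_lengths(sr, f_min=32.7, n_bins=84)              # constant-Q
vqt = kernel_lengths(sr, f_min=32.7, n_bins=84, gamma=10.0)  # variable-Q

print("lowest bin:  CQT %.3f s, VQT %.3f s" % (cqt[0] / sr, vqt[0] / sr))
print("highest bin: CQT %.4f s, VQT %.4f s" % (cqt[-1] / sr, vqt[-1] / sr))
```

Under this model the lowest bins shrink by a large factor while the highest bins barely change, which is why the swap mostly affects time resolution at low frequencies.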

Bernardo Torres @torres_be_
The PESTO extension paper was published in TISMIR! Here's the link to the publication: transactions.ismir.net/articles/10.53… It features several improvements to the model, larger cross dataset evaluation, and a real-time implementation.
Alain Riou @howariou

Extremely happy to announce that our new PESTO recipe has been published in the famous TISMIR cookbook 👨‍🍳🤌 With chef @torres_be_, we revisited this traditional sauce invented at ISMIR 2023 in Milan with some Brazilian flavours 🇮🇹🇧🇷 transactions.ismir.net/articles/10.53…

Bernardo Torres retweeted
Zineb Lahrichi @zineb_lahrichi
🔉New paper out! Recent audio codecs typically learn compression and quantization jointly, which limits the choice of quantization layers, since these are non-differentiable by definition. What if we used powerful neural quantizers like Qinco2 and trained them offline? arxiv.org/abs/2503.19597

Romy Beauté @RomyBeaute
My first PhD paper is finally out as a preprint! Mapping of Subjective Accounts into Interpreted Clusters (MOSAIC): Topic Modelling and LLM applied to Stroboscopic Phenomenology 🔗 arxiv.org/abs/2502.18318

Bernardo Torres @torres_be_
We’ve added fresh ingredients and cooking tricks to make the best, lightest, and fastest neural pitch estimator even better! 🌿🔥 Shoutout to all collaborators and to the amazing chef & SSL titan, @howariou
Alain Riou @howariou

PESTO 2.0 is released! 🥳🥳🥳 With Brazilian chef @torres_be_ (and others), we revisit this traditional Italian sauce, invented in Milan at @ISMIRConf 2023 🇮🇹 And you can taste it in REAL-TIME at home (~5 ms latency) ⏱️ 1/6

Bernardo Torres retweeted
Alain Riou @howariou
PESTO 2.0 is released! 🥳🥳🥳 With Brazilian chef @torres_be_ (and others), we revisit this traditional Italian sauce, invented in Milan at @ISMIRConf 2023 🇮🇹 And you can taste it in REAL-TIME at home (~5 ms latency) ⏱️ 1/6

Bernardo Torres retweeted
Stefan Lattner @deeplearnmusic
🌟My keynote at the @c4dm workshop about "Models of Musical Signals: Representation, Learning & Generation" is now on YouTube, giving an overview of developments in self-supervised learning for audio since 2020, low-level representation learning, audio (stem) generation and much more 🧵👇 youtube.com/watch?v=ixHfBP… @SonyCSLMusic @SonyCSLParis

Alain Riou @howariou
Wow! I just realized PESTO has 195 ⭐️ on github! 🤩 It would be 𝗿𝗲𝗮𝗹ly cool if I find 𝘁𝗶𝗺𝗲 to push a new feature when it hits 200... 👀 github.com/SonyCSLParis/p…

Bernardo Torres retweeted
Victor Letzelter @VLetzelter
Interested in ill-posed learning tasks, uncertainty prediction, conditional density estimation or multi-head deep neural networks? In our new paper, accepted at #ICML24, we tackle these challenges by exploring the Winner-Takes-All (WTA) training scheme. [1/n]

Bernardo Torres @torres_be_
Signals with a high degree of autocorrelation (such as pitched signals) make the training of convnets on raw audio unstable. For Gaussian initialization, the greater the input’s autocorrelation, the greater the variance of the output. Huge relief seeing these kinds of papers <3
Vincent Lostanlen @lostanlen

Training convnets on waveforms is hard—far harder than on magnitude spectrograms. "Instabilities in Convnets for Raw Audio" approaches this phenomenon from the perspective of sensitivity to initialization. IEEE Signal Processing Letters vol. 31 preprint: hal.science/hal-04528116
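
A small numerical experiment can illustrate the claim about Gaussian initialization. The sketch below is not taken from the paper. It reads "variance of the output" as the spread of the output energy across random initializations, which is an assumption, and it compares white noise with a pure sinusoid as a stand-in for a highly autocorrelated input. The kernel size, signal length, frequency, and helper names are all illustrative.

```python
import math
import random
import statistics

def output_energy(signal, kernel):
    """Energy of the 'valid' sliding dot product of signal and kernel."""
    k = len(kernel)
    energy = 0.0
    for t in range(len(signal) - k + 1):
        y = sum(kernel[j] * signal[t + j] for j in range(k))
        energy += y * y
    return energy

def energy_spread(signal, kernel_size=16, n_inits=200, seed=0):
    """Variance of the output energy over Gaussian kernels, divided by mean^2."""
    rng = random.Random(seed)
    energies = []
    for _ in range(n_inits):
        kernel = [rng.gauss(0.0, 1.0) for _ in range(kernel_size)]
        energies.append(output_energy(signal, kernel))
    mean = statistics.fmean(energies)
    return statistics.pvariance(energies) / mean ** 2

rng = random.Random(1)
n = 256
white_noise = [rng.gauss(0.0, 1.0) for _ in range(n)]       # no autocorrelation
sine = [math.sin(2 * math.pi * 0.05 * t) for t in range(n)]  # strongly autocorrelated

spread_noise = energy_spread(white_noise)
spread_sine = energy_spread(sine)
print("relative energy variance, white noise:", round(spread_noise, 3))
print("relative energy variance, sinusoid:   ", round(spread_sine, 3))
```

For the sinusoid, the output energy depends almost entirely on the random filter's gain at a single frequency, so it fluctuates far more from one initialization to the next than it does for white noise, where the gain is averaged over the whole spectrum.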

Bernardo Torres @torres_be_
@92HsChoi Hello! We struggled to make it work out of the box for more realistic data in a DDSP-like setting. The loss provides the gradient to move in frequency, but having to estimate the number of harmonics + F0 creates many local minima/instabilities. Getting past that is work in progress!

Bernardo Torres @torres_be_
Happy to share some work accepted to ICASSP :) We experiment with a loss function inspired by optimal transport to compare spectra. We test it on a synthesis-based frequency localization (and F0 estimation) toy task using a harmonic synthesizer. Paper: arxiv.org/abs/2312.14507
Peeters Geoffroy @GeoffroyPeeters

10 papers from the Audio group @tp_adasp of Télécom-Paris will be presented at @icassp2014 @ieeeICASSP. --- Paper #1: Bernardo Torres et al. "Unsupervised Harmonic Parameter Estimation Using Differentiable DSP and Spectral Optimal Transport"

Bernardo Torres @torres_be_
The SOT loss is based on the p-Wasserstein distance, which has a closed-form solution in 1D. First we compute the cumulative function of both spectra. Then, we invert it to get the quantiles. We add up the differences (raised to power p) and that’s it!
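
The three steps described above (cumulative function, quantiles, sum of differences raised to the power p) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the name `spectral_ot_loss` is hypothetical, both spectra are assumed non-negative with non-zero sum and defined on the same frequency grid, and the code is not differentiable.

```python
from bisect import bisect_left
from itertools import accumulate

def spectral_ot_loss(spec_a, spec_b, freqs, p=2):
    """p-Wasserstein cost (W_p^p) between two spectra on a shared frequency grid."""
    def cdf(spec):
        total = sum(spec)
        c = [v / total for v in accumulate(spec)]
        c[-1] = 1.0  # guard against rounding in the last bin
        return c

    # Step 1: cumulative function of both normalized spectra.
    cdf_a, cdf_b = cdf(spec_a), cdf(spec_b)

    # Step 2: invert the CDFs. Quantiles only change at CDF values,
    # so it is enough to evaluate them at the merged set of levels.
    levels = sorted(set(cdf_a) | set(cdf_b))

    # Step 3: add up the quantile differences raised to the power p,
    # weighted by the probability mass between consecutive levels.
    loss, prev = 0.0, 0.0
    for level in levels:
        q_a = freqs[bisect_left(cdf_a, level)]
        q_b = freqs[bisect_left(cdf_b, level)]
        loss += (level - prev) * abs(q_a - q_b) ** p
        prev = level
    return loss

freqs = [0.0, 100.0, 200.0, 300.0, 400.0]
peak_at_100 = [0.0, 1.0, 0.0, 0.0, 0.0]
peak_at_300 = [0.0, 0.0, 0.0, 1.0, 0.0]
print(spectral_ot_loss(peak_at_100, peak_at_300, freqs, p=1))  # → 200.0
print(spectral_ot_loss(peak_at_100, peak_at_300, freqs, p=2))  # → 40000.0
```

For two single-peak spectra the cost is simply the frequency gap raised to the power p, which matches the intuition of moving all spectral mass from one frequency to the other.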

Bernardo Torres @torres_be_
We call this loss Spectral Optimal Transport (SOT). Compared to L1 and L2 spectral losses on sinusoidal signals, SOT has a very nice convex curve leading to the right oscillator frequency. Its gradient also does not vanish when the frequency difference is high.
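
A toy comparison shows why the gradient behaviour differs. The sketch below is an illustration under assumptions, not the paper's experiment: spectra are idealized triangular peaks on a bin grid, the L2 loss is a plain squared difference, and the transport cost is the 1-Wasserstein distance computed as the area between normalized cumulative sums. All function names and sizes are hypothetical.

```python
from itertools import accumulate

def peak_spectrum(n_bins, center, width=2):
    """Toy magnitude spectrum: a narrow triangular peak around bin `center`."""
    return [max(0.0, 1.0 - abs(k - center) / width) for k in range(n_bins)]

def l2_loss(a, b):
    """Plain squared-error loss between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def w1_loss(a, b):
    """1-Wasserstein distance in bins: area between the normalized CDFs."""
    total_a, total_b = sum(a), sum(b)
    cdf_a = [v / total_a for v in accumulate(a)]
    cdf_b = [v / total_b for v in accumulate(b)]
    return sum(abs(x - y) for x, y in zip(cdf_a, cdf_b))

n_bins = 64
target = peak_spectrum(n_bins, center=10)
results = {}
for center in (12, 20, 40):
    estimate = peak_spectrum(n_bins, center)
    results[center] = (l2_loss(estimate, target), w1_loss(estimate, target))
    print(center, results[center])
```

Once the two peaks no longer overlap, the L2 value stays the same however far apart they are, so it carries no information about which direction to move. The transport cost keeps growing with the distance between the peaks.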