FaRo

10K posts

@faroit

@[email protected] Audio-AI researcher at @audioshakeai (Before: @inria, @FraunhoferIIS / @uniFAU). All in 17.68% of grey

Montpellier, France · Joined April 2008

717 Following · 1K Followers

Pinned Tweet

FaRo@faroit·1 Oct

Happy to see that DeMask, built with asteroid, won first place in the @PyTorch summer #hackathon! The model lets you enhance muffled speech when wearing a face mask. Demo: youtu.be/QLf10Uqu8Yk 👋 to our team: @mnlpariente @michelolzam @_jonashaag and Samuel Cornell

FaRo retweeted

AudioShake@AudioShakeAI·6 Dec

DJs making music magic, create yours with AudioShake stems. djay Pro from @algoriddim now includes AudioShake’s tech to isolate vocals, instruments, and drums at the highest quality available to the industry. On mobile. 🎧📷#djtools #beatmaking

FaRo retweeted

Ondřej Cífka@cifkao·30 Nov

Our paper on lyrics transcription evaluation is on arXiv with updated and extended results (including Whisper v3)! 📄 arxiv.org/abs/2311.13987 Also, the benchmark is now on @paperswithcode: 🏆 paperswithcode.com/dataset/jam-alt 👏 @faroit @h_schreiber @Luke_Miner @cnst_ant @AudioShakeAI

AudioShake@AudioShakeAI

This month at #ISMIR2023, AudioShake’s Research team presented a new benchmark for automatic lyric transcription systems– one that accounts for the nuances of music. You can read more on their new paper on AudioShake: audioshake.ai/post/new-bench…

FaRo retweeted

Ondřej Cífka@cifkao·8 Nov

We just released Jam-ALT, a formatting- and punctuation-aware automatic lyrics transcription benchmark (based on the JamendoLyrics dataset) that follows music industry guidelines. 🧵 🔎 audioshake.github.io/jam-alt/ 🤗 huggingface.co/datasets/audio… 🧑‍💻 github.com/audioshake/alt…

FaRo@faroit·29 Sep

@ISMIRConf it's 3 PM and it's not open :-/

ISMIR Conference@ISMIRConf·29 Sep

📢 LATE-BREAKING DEMO REOPENS 📢 today at 3 p.m. (CEST) and will remain open until the originally announced deadline. We can guarantee acceptance of a limited number of papers (venue capacity will be confirmed later) and apply a priority to papers submitted early. #ISMIR2023

FaRo@faroit·27 Sep

@zhaojw1998 @gkspearow closing this after a couple of hours doesn't sound like a good review process to me

Jingwei Zhao@zhaojw1998·27 Sep

@gkspearow I believe that is the most likely reason. Since there are only 15 entries to be accepted, I guess it's reasonable that all spots are filled quite early.

Jingwei Zhao@zhaojw1998·26 Sep

ISMIR LBD just started and acceptance is rolling-based. CMT was still open this morning but closed just now, before I finished my draft. Never expected such competitiveness🥲 But anyway, it seems many people have great progress to share. Looking forward to connecting at #ISMIR2023 in Milan :-)

FaRo@faroit·27 Sep

@zhaojw1998 same here for us. Very unfortunate and not really fair as it can't be automated. Why not let more papers be submitted and review them by quality instead of submission time?

FaRo@faroit·27 Jun

@serrjoa Btw. Awesome paper!

Joan Serrà@serrjoa·27 Jun

Want to convert from mono to stereo? 🔊➡️🔊🔊 In our latest work, we posit that upmixing mono to stereo is a great avenue for generative modeling, and that parametric stereo coding can facilitate things. Paper: arxiv.org/abs/2306.14647

FaRo@faroit·27 Jun

@serrjoa Out of curiosity, is that parametric stereo robust enough to extract spatial parameters from a mixture and apply them to the sources?

FaRo@faroit·28 Feb

@JOSS_TheOJ The main author is github.com/neillu23 and paper submission can be found here: github.com/openjournals/j…

FaRo@faroit·28 Feb

I am looking for reviewers for a @JOSS_TheOJ submission with expertise in speech and Python. The software under review is a new speech enhancement module of github.com/espnet/espnet (hence it's so difficult to get reviewers without conflicts of interest). Any pointer is helpful!

FaRo@faroit·27 Feb

@ParcolletT @julien_c Montpellier!

Julien Chaumond@julien_c·25 Feb

Hugging Face Lyon office

FaRo retweeted

Yusong Wu@wuyusongwys·5 Feb

Join us for a special edition of our Mila Music + AI Reading Group from February 8th to 22nd! We're excited to host 5 teams from the 2022 AI Song Contest, an international contest where musicians and scientists collaborate to explore human-ai co-creativity.

FaRo retweeted

Loreto Parisi@loretoparisi·3 Feb

#Pytorch implementation of MusicLM, new SOTA model for music generation using attention networks plus embeddings from MuLan, a text-audio contrastive learned model github.com/lucidrains/mus…

FaRo@faroit·4 Feb

@DrJimFan @WilliamLamkin The reason for that is conference deadlines, though

Jim Fan@DrJimFan·2 Feb

@WilliamLamkin Same, I’m surprised by the momentum in audio this year. 4 models in one week is insane even by modern AI’s pace.

Jim Fan@DrJimFan·2 Feb

Music & sound effect industry has not fully understood the size of the storm about to hit. There’re not just one, or two, but FOUR audio models in the past week *alone* If 2022 is the year of pixels for generative AI, then 2023 is the year of sound waves. Deep dive with me: 🧵

FaRo retweeted

AIcrowd@aicrowdHQ·3 Feb

🎻 The SDX23 challenge introduces a new formulation of audio source separation: cinematic sound separation. The task is to separate a movie's audio into three tracks: dialogue, sound effects & music. 📕 Give it a try using the starter kit. aicrowd.com/challenges/sou…

FaRo retweeted

Haohe Liu@LiuHaohe·3 Feb

#AudioLDM, the text-to-audio model, is now available on HuggingFace and GitHub to play with! We will add more functionality and further improve the model performance in the near future. Share the interesting samples you generate! github.com/haoheliu/Audio… huggingface.co/spaces/haoheli…

FaRo@faroit·2 Feb

@naotokui_en @csteinmetz1 Training is just a very small aspect of it. Lawyers will first go after the obvious things: the startups that can generate new Taylor Swift songs.

Nao Tokui@naotokui_en·1 Feb

@csteinmetz1 It can be problematic because it’s completely legal to train AI models on copyrighted material in some countries (including Japan)

Christian Steinmetz@csteinmetz1·31 Jan

No one would dare train their music generation diffusion model on copyrighted music so we have nothing to worry about... 🫠

Eric Wallace@Eric_Wallace_

Models such as Stable Diffusion are trained on copyrighted, trademarked, private, and sensitive images. Yet, our new paper shows that diffusion models memorize images from their training data and emit them at generation time. Paper: arxiv.org/abs/2301.13188 👇[1/9]

FaRo@faroit·2 Feb

I guess developers are thrilled!

Developers@XDevelopers

Starting February 9, we will no longer support free access to the Twitter API, both v2 and v1.1. A paid basic tier will be available instead 🧵

FaRo@faroit·2 Feb

@JonathanLeRoux @ethanmanilow @r4b1tt ^

FaRo@faroit·2 Feb

@JonathanLeRoux @ethanmanilow For the challenge we use the mean across songs. But many papers also (rightfully) report the median across songs, as outliers can be dramatic.
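
The aggregation point in this reply can be illustrated with a small sketch. The per-song SDR values below are invented purely for illustration (they are not challenge results), and the helper name `aggregate_sdr` is hypothetical. It only shows how one badly separated song pulls the mean down while the median barely moves.

```python
from statistics import mean, median


def aggregate_sdr(per_song_sdr_db):
    """Return (mean, median) of a list of per-song SDR values in dB."""
    return mean(per_song_sdr_db), median(per_song_sdr_db)


# Hypothetical per-song scores in dB; the last song is a dramatic outlier.
scores = [6.0, 7.0, 8.0, 7.0, -18.0]

mean_sdr, median_sdr = aggregate_sdr(scores)
print(f"mean SDR:   {mean_sdr:.1f} dB")
print(f"median SDR: {median_sdr:.1f} dB")
```

With these made-up numbers the single outlier costs the mean several dB, which is why the two aggregates can rank systems differently.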

FaRo@faroit·31 Jan

Hey music separation researchers. We added a new definition of the SDR metric when we launched the last @sounddemix. To make it less confusing for future papers, we want to rename the metric. Please vote
