thecollabagepatch

3.2K posts

@thepatch_kev

i build a lot of stuff with open source music models https://t.co/17IbjChnzf

Seattle, WA Joined August 2023
264 Following 306 Followers
Pinned Tweet
thecollabagepatch
thecollabagepatch@thepatch_kev·
using bpm, key, a locked seed and multiple LoRAs makes stable audio 3 feel like an instrument
thecollabagepatch@thepatch_kev

some ai music models are actually made with musicians in mind. stable audio 3 is a great example of that. grateful to @zqevans and friends for the opportunity to get a jumpstart on integrating this into the DAW. it's going to be a very fun summer. if you want to play with it in the pre-release of gary4juce v4...⬇️

1
1
3
67
thecollabagepatch
thecollabagepatch@thepatch_kev·
@guilhermeotina yeah and when you serve many different model architectures, each one surfaces updates in its own fun ways, so the code is a different shape for each
0
0
0
9
Guilherme O'Tina
Guilherme O'Tina@guilhermeotina·
the root issue is that ML progress is fundamentally non-linear. the model might spend 90% of inference on the last 10% of tokens. you can't report a meaningful percentage because the time-per-step varies wildly with cache state, batch composition, and model depth. best you can do is "running" vs "done" with maybe a token counter, and that's unsatisfying for users
1
0
1
20
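
One way to picture the "running" vs "done" plus token counter idea from this reply is the sketch below. It is a hypothetical illustration, not code from gary4juce or from any model's API: `ProgressEvent` and `track_tokens` are names invented here, and the token stream is assumed to be any Python iterable.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


@dataclass
class ProgressEvent:
    """Hypothetical progress message: no percentage, just a state and a count."""
    state: str   # "running" or "done"
    tokens: int  # number of tokens produced so far


def track_tokens(
    token_stream: Iterable[T],
    on_update: Callable[[ProgressEvent], None],
) -> Iterator[T]:
    """Pass tokens through unchanged while reporting progress to a callback."""
    count = 0
    for token in token_stream:
        count += 1
        # Report only what is actually known: still running, N tokens so far.
        on_update(ProgressEvent("running", count))
        yield token
    # The stream is exhausted, so the final event carries the total count.
    on_update(ProgressEvent("done", count))
```

A UI layer could pass its own callback as `on_update`, for example one that pushes each event onto a queue the interface polls. The sketch deliberately avoids a percentage, since the total number of steps is assumed to be unknown in advance.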
thecollabagepatch
thecollabagepatch@thepatch_kev·
the hardest part of implementing ML interfaces for me is always surfacing progress updates to the UI properly
1
0
3
267
thecollabagepatch retweeted
Zachary Novack
Zachary Novack@zacknovack·
Can we transform offline audio diffusion into real-time streaming interactive instruments? Yes! Presenting Live Music Diffusion Models: a new paradigm for taking your favorite open models into live performance, right on your own laptop! 🎵🎵 🧵
8
28
160
11.9K
thecollabagepatch
thecollabagepatch@thepatch_kev·
in the name of research we got an arduino uno q (4gb of ram) to run stable-audio-3-small-sfx using a swap file, after 3 minutes, it actually worked lol
2
2
22
3.2K
lyra bubbles
lyra bubbles@_lyraaaa_·
built a fun little audio inpainting UI for stable audio 3. paint any number of arbitrary masks, down to a single latent long, and it will happily inpaint them. also supports audio2audio and text2audio. code soon!
2
7
42
2.9K
eschatolocation
eschatolocation@eschatolocation·
stable audio 3 is actually good?? tons of artifacts but the outputs are way less slop than pretty much any other model ive tested. extremely fast on my local setup too
3
0
4
178
thecollabagepatch
thecollabagepatch@thepatch_kev·
@RoyalCities yes i think we'll see small is beautiful on gpu for instant gens of varying lengths! my head is swimming with use cases rn but have rly only played with medium-arc and the lora blending so far
0
0
1
10
JD | RoyalCities
JD | RoyalCities@RoyalCities·
@thepatch_kev oh shoot yeah that's cool! Ahh I need to get on that. Even just to do some A/B tests. The small model seems to list cpu inference. I wonder if it can also take gpus and improve the fast speed already.... I was also going to look at that small arm model too from a few months ago.
1
0
1
27
thecollabagepatch
thecollabagepatch@thepatch_kev·
i did notice in gradio there's a padding default of 6 seconds. at first i didn't understand why, but it became apparent that if i bump that to, say, 20, it kinda guarantees i could continue a piece of music infinitely, so that it's never "ending" the song in the generation i get back. so i added that to my api as well and just need to expose it to UIs
1
0
1
10
JD | RoyalCities
JD | RoyalCities@RoyalCities·
Haven't had a chance to install since I've basically been knee deep in metadata but this model seems to just use the total end seconds and doesn't pad with silence so that should help massively as well. I'll dig into the small model at some point. Have a larger foundation upgrade in the works. If it pans out then I'll look at seeing what those small models can do. 3 seconds for minutes of audio is absurd. Opens the doors for a ton of stuff.
1
0
2
37
thecollabagepatch
thecollabagepatch@thepatch_kev·
i will be furiously adding features and LoRAs for us all to blend, and of course, a local backend will be added to gary4local this week for both pc and mac users github.com/betweentwomidn…
0
1
4
413
thecollabagepatch
thecollabagepatch@thepatch_kev·
@RoyalCities ok i retract my previous statement about not wanting foundation to do drums if you get the one shots working... my max4live senses are tingling
1
0
1
12
JD | RoyalCities
JD | RoyalCities@RoyalCities·
But yeah I promise I'm still working haha. Just have had my head down and I hate pre-announcing stuff unless I am SURE it will work....but I do think this one will....
1
0
2
50
JD | RoyalCities
JD | RoyalCities@RoyalCities·
Been MIA so a quick Foundation-1 update: I’ve had to rebuild a bunch of stuff for one-shot support. Having a model generate usable one shots while preserving consistent timbre/scale behavior is an unsolved problem but I think I have a way to do it. The upside is it’s evolved into something else I’ve wanted to exist for a long time.... If this works, the upgrade is going to unlock some really cool stuff.
2
0
5
143
thecollabagepatch
thecollabagepatch@thepatch_kev·
the world is drowning in markdown files
0
0
3
51