thecollabagepatch

thecollabagepatch@thepatch_kev·41m

can run it locally inside gary now with your own loras lmk if anything's broken plz github.com/betweentwomidn…

thecollabagepatch@thepatch_kev·41m

using bpm, key, a locked seed and multiple LoRa's makes stable audio 3 feel like an instrument

some ai music models are actually made with musicians in mind stable audio 3 is a great example of that. grateful to @zqevans and friends for the opportunity to get a jumpstart on integrating this into the DAW it's going to be a very fun summer. if you want to play with it in the pre-release of gary4juce v4...⬇️

thecollabagepatch@thepatch_kev·1d

@guilhermeotina yeah and when you serve many different model architectures, each one surfaces updates in its own fun ways, so the code is a different shape for each

Guilherme O'Tina@guilhermeotina·1d

the root issue is that ML progress is fundamentally non-linear. the model might spend 90% of inference on the last 10% of tokens. you cant report a meaningful percentage because the time-per-step varies wildly with cache state, batch composition, and model depth. best you can do is "running" vs "done" with maybe a token counter, and thats unsatisfying for users

thecollabagepatch@thepatch_kev·1d

the hardest part of implementing ML interfaces for me is always surfacing progress updates to the UI properly

thecollabagepatch retweeted

Zachary Novack@zacknovack·3d

Can we transform offline audio diffusion into real-time streaming interactive instruments? Yes! Presenting Live Music Diffusion Models: a new paradigm for taking your favorite open models into live performance, right on your own laptop! 🎵🎵 🧵

thecollabagepatch@thepatch_kev·3d

@wilderthanrogue

GIF

Undermused@wilderthanrogue·4d

@thepatch_kev

GIF

thecollabagepatch@thepatch_kev·4d

in the name of research we got an arduino uno q (4gb of ram) to run stable-audio-3-small-sfx using a swap file, after 3 minutes, it actually worked lol

thecollabagepatch@thepatch_kev·4d

@jordiponsdotme it absolutely nailed the specifics of the prompt lol

Jordi Pons@jordiponsdotme·4d

exactly the type of use I was expecting of small-sfx gpu hours well invested (wait till the end)

thecollabagepatch@thepatch_kev·4d

@yikesawjeez

GIF

yikes (:D/acc)@yikesawjeez·4d

go engage wif my frens post cos i am very proud of himb

thecollabagepatch@thepatch_kev·5d

@_lyraaaa_ it's so purdy

lyra bubbles@_lyraaaa_·5d

built a fun little audio inpainting UI for stable audio 3 paint any number of arbitrary masks down to a single latent long and it will happily inpaint them. also supports audio2audio and text2audio code soon!

thecollabagepatch@thepatch_kev·5d

@Proaitrends @zqevans thanks! still figuring out how to train and use the loras properly

Ifeanyichukwu@Proaitrends·5d

@thepatch_kev @zqevans Sounds super cool

thecollabagepatch@thepatch_kev·5d

some ai music models are actually made with musicians in mind stable audio 3 is a great example of that. grateful to @zqevans and friends for the opportunity to get a jumpstart on integrating this into the DAW it's going to be a very fun summer. if you want to play with it in the pre-release of gary4juce v4...⬇️

Stability AI@StabilityAI

Meet Stable Audio 3.0, the open-weight model family built for artistic experimentation. This is our open invitation to experiment with generative audio. We believe the best innovations are still waiting to be built.

The 4-1-1 on 3.0:

📣 You own your outputs, and can distribute and commercialize them under the Stability AI Community License (up to $1 million in revenue).
🎵 New and improved capabilities include variable-length generation up to six minutes, and full song composition on portable devices, no GPU required.
✅ Trained on a fully licensed dataset.
🎨 You can customize the models on your own library with support for LoRa training, which we’ve documented for the first time.

More on the models 👇

thecollabagepatch@thepatch_kev·5d

@jasondesante @zqevans lol too much? maybe i get a lil excited with da vinci resolve

thecollabagepatch@thepatch_kev·5d

@eschatolocation yeah i'm having a lot of fun with it. the gen speed makes it so great for flow state inside the DAW

eschatolocation@eschatolocation·5d

stable audio 3 is actually good?? tons of artifacts but the outputs are way less slop than pretty much any other model ive tested. extremely fast on my local setup too

thecollabagepatch@thepatch_kev·5d

@RoyalCities yes i think we'll see small is beautiful on gpu for instant gens of varying lengths! my head is swimming with use cases rn but have rly only played with medium-arc and the lora blending so far

JD | RoyalCities@RoyalCities·5d

@thepatch_kev oh shoot yeah that's cool! Ahh I need to get on that. Even just to do some A/B tests. The small model seems to list cpu inference. I wonder if it can also take gpus and improve the fast speed already.... I was also going to look at that small arm model too from a few months ago.

Jordi Pons@jordiponsdotme

JD | RoyalCities@RoyalCities·5d

Definitely going to dig into this. In-Painting and editing is a MASSIVE W for open source models.

Stable Audio 3, explained in 5 figures. It’s a family of open-weight models for generating instrumental music and sound effects. The models are fast, support editing, and are trained on licensed and Creative Commons audio. 👾 artintech.substack.com/p/stable-audio… 🏋️‍♂️github.com/Stability-AI/s…

thecollabagepatch@thepatch_kev·5d

i did notice in gradio there's a padding default of 6 seconds. at first i didn't understand why, but it became apparent that if i bump it to, say, 20, it kinda guarantees i can continue a piece of music infinitely, because the generation i get back is never "ending" the song. so i added that to my api as well and just need to expose it to UIs

JD | RoyalCities@RoyalCities·5d

Haven't had a chance to install since I've basically been knee deep in metadata but this model seems to just use the total end seconds and doesn't pad with silence so that should help massively as well. I'll dig into the small model at some point. Have a larger foundation upgrade in the works. If it pans out then I'll look at seeing what those small models can do. 3 seconds for minutes of audio is absurd. Opens the doors for a ton of stuff.

thecollabagepatch@thepatch_kev·5d

i will be furiously adding features and LoRas for us all to blend, and of course, a local backend will be added to gary4local this week for both pc and mac users github.com/betweentwomidn…

thecollabagepatch@thepatch_kev·6d

@RoyalCities ok i retract my previous statement about not wanting foundation to do drums if you get the one shots working... my max4live senses are tingling

JD | RoyalCities@RoyalCities·6d

But yeah I promise I'm still working haha. Just have had my head down and I hate pre-announcing stuff unless I am SURE it will work....but I do think this one will....

JD | RoyalCities@RoyalCities·6d

Been MIA so a quick Foundation-1 update: I’ve had to rebuild a bunch of stuff for one-shot support. Having a model generate usable one shots while preserving consistent timbre/scale behavior is an unsolved problem but I think I have a way to do it. The upside is it’s evolved into something else I’ve wanted to exist for a long time.... If this works, the upgrade is going to unlock some really cool stuff.

thecollabagepatch@thepatch_kev·16 May

the world is drowning in markdown files
