MClem
@mclemcrew

12.3K posts

Lights and sounds are kinda my thing ✨

Chesterton, IN · Joined March 2016
1.2K Following · 711 Followers
MClem retweeted
Chris Donahue (@chrisdonahuey)
Vibe coding is cool but have you tried vibe patching? Pure Vibes = Pure Data + MCP. Describe your sound, watch the patch appear 🌊 🔊👇
MClem retweeted
RoyalCities (@RoyalCities)
Foundation-1 is available here huggingface.co/RoyalCities/Fo… Typical inference: • ~7 GB VRAM • ~7–8 seconds per sample on an RTX 3090 • runs entirely local Have fun!
MClem retweeted
RoyalCities (@RoyalCities)
After months of work, today I’m releasing Foundation-1. A SOTA text-to-sample model built specifically for music production workflows. It may also be the most advanced AI sample generator currently available - open or closed. • ~7 GB VRAM • Entirely local • 100% free 😁
MClem (@mclemcrew)
If you're interested in joining, please fill out when2meet.com/?35539832-6cezP and send me your email so I can add you to the list 😄
MClem (@mclemcrew)
I'm starting up a MusicAI Reading Group that'll meet once a week on Zoom for 45 minutes to discuss the latest in music, signal processing, co-creativity, and AI! We'll be going through academic papers in more depth and testing out some demos for making music and exploring this space!
MClem retweeted
Julia Turc (@juliarturc)
Diffusion models clicked for me when I started seeing them through the lens of particle motion. I built this interactive playground where you too can clickety-clack to understand how drift, noise, and other hyperparams control diffusion. I hereby submit this as penance for the sin of YouTube edu-tainment 😇 Link in the first comment.
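The particle-motion framing can be made concrete with a few lines of code. Below is a minimal sketch (not taken from the linked playground) of an Euler-Maruyama simulation of the SDE dx = -θ·x·dt + σ·dW, where `drift_strength` (θ) and `noise_scale` (σ) play the role of the hyperparameters the tweet mentions: drift pulls particles toward a target, noise spreads them out.

```python
import math
import random

def diffuse(particles, drift_strength, noise_scale, steps=100, dt=0.01, seed=0):
    """Euler-Maruyama simulation of dx = -drift_strength * x * dt + noise_scale * dW.

    The drift term pulls every particle toward 0; the noise term scatters
    them. The balance between the two is what diffusion-model SDE
    hyperparameters control.
    """
    rng = random.Random(seed)
    xs = list(particles)
    for _ in range(steps):
        xs = [
            x - drift_strength * x * dt + noise_scale * math.sqrt(dt) * rng.gauss(0, 1)
            for x in xs
        ]
    return xs

# Strong drift, no noise: particles collapse toward 0.
print(diffuse([5.0, -5.0], drift_strength=4.0, noise_scale=0.0))
# Same drift plus noise: particles hover scattered around 0 instead.
print(diffuse([5.0, -5.0], drift_strength=4.0, noise_scale=1.0))
```

Raising `noise_scale` widens the stationary spread of the particles; raising `drift_strength` tightens it, which is exactly the trade-off an interactive playground lets you feel by dragging sliders.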
MClem retweeted
Chris Donahue (@chrisdonahuey)
Check out our new work! SotA lossless compression of CD-quality audio w/ autoregressive models of raw waveforms (albeit w/ compute costs far exceeding FLAC) Raw waveform modeling may be irrelevant for generation, but still lots of potential for compression!
Phillip Long (@p1long_)

Can LMs losslessly-compress CD-quality audio? Presenting Trilobyte: tractable LM-based lossless compression of full-fidelity audio, achieving 18% improvement over FLAC! w/ @zacknovack @chrisdonahuey 🧵
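The core idea behind LM-based lossless compression is that an entropy coder driven by a predictive model spends about -log2 p(symbol) bits per symbol, so the model's cross-entropy on the signal *is* the compressed size: a better predictor means a smaller file. The sketch below is a toy illustration of that principle (an add-one frequency model standing in for a real language model; it is not Trilobyte's actual method), comparing the ideal code length against an 8-bits-per-byte baseline.

```python
import math
from collections import Counter

def ideal_bits(data, model_probs):
    """Ideal code length (bits) when entropy-coding `data` using per-symbol
    probabilities from `model_probs(context) -> {symbol: prob}`.

    An arithmetic coder gets within ~2 bits of this total for the whole
    stream, so model cross-entropy directly measures compressed size.
    """
    bits, context = 0.0, []
    for symbol in data:
        p = model_probs(context).get(symbol, 1e-12)
        bits += -math.log2(p)
        context.append(symbol)
    return bits

def adaptive_model(context, alphabet=range(256)):
    """Toy 'language model': Laplace-smoothed frequency counts over the context."""
    counts = Counter(context)
    total = len(context) + len(alphabet)
    return {s: (counts[s] + 1) / total for s in alphabet}

data = b"abababababababab" * 8
uniform = 8 * len(data)                      # 8 bits/byte baseline
modeled = ideal_bits(data, adaptive_model)
print(f"uniform: {uniform} bits, modeled: {modeled:.0f} bits")
```

Swapping the toy frequency model for a neural language model over audio bytes is, at a high level, how such systems beat FLAC: the coder is the same, only the predictor gets smarter.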

MClem retweeted
Jesse Engel (@jesseengel)
Today, we're open sourcing the code behind "The Infinite Crate," our VST/AU plugin that lets you play with Lyria RealTime directly inside your favorite DAW! 🎧🎶 Since we released The Infinite Crate last year, it’s been used by some of our favorite artists - including a wonderful showcase with @daitomanabe in Tokyo - and being featured as a top new music tool at NAMM 2026. We’re now fully open sourcing the plugin for developers to fork, modify, and make their own under the permissive Apache 2.0 license. 💾 Get it here: github.com/magenta/the-in…
MClem retweeted
Zachary Novack (@zacknovack)
Our latest paper (and 2nd from my time at @harmonai_org) is out and accepted at #ICASSP2026 ! Diffusion guidance has always been tricky to wrangle for creative control tasks, but we've broken ground on making it a lot simpler and faster!
Jordi Pons (@jordiponsdotme)

Our latest paper, LatCHs (Latent-Control Heads), is out today! Really happy we'll get to present it at ICASSP’26 in Barcelona :) > Guidance-based controls > Selective TFG > Latent-Control Heads (LatCHs) 🗄️ arXiv: arxiv.org/abs/2603.04366 👾 blogpost: artintech.substack.com/p/latchs-expla…
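For background on why diffusion guidance is "tricky to wrangle": the standard classifier-free guidance formula (general background, not the LatCHs mechanism described in the paper) steers each denoising step by extrapolating from the unconditional prediction toward the conditional one, with a single scale controlling the strength of the control signal.

```python
def guided_prediction(eps_uncond, eps_cond, scale):
    """Classifier-free-style guidance: push the prediction further along
    the direction that conditioning added.

    scale=0 recovers the unconditional prediction, scale=1 the plain
    conditional one; larger values trade diversity for stronger adherence
    to the control signal (and can destabilize sampling, hence the
    wrangling).
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

# scale=2 doubles the conditioning direction relative to the unconditional base.
print(guided_prediction([0.0, 0.0], [1.0, -1.0], scale=2.0))  # -> [2.0, -2.0]
```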

MClem retweeted
Kenneth Marino (@Kenneth_Marino)
It’s been less than a year since I started my lab (SPARK Lab) at @UUtah and we already have a ton of new stuff that I can’t wait to talk about soon. Stay tuned for more. I’ll start today by sharing that our updated Computer Use Survey blog has been accepted to ICLR Blogposts 2026. Collaboration with my student @aplycaebous and Utah colleague @anmarasovic.
MClem retweeted
Christian Steinmetz (@csteinmetz1)
The talks tonight at the Boston AI Music Meetup made me think about where this field is going. Here’s what I learned:

Jatin Chowdhury is an expert in getting machine learning models to run in real-time audio systems. He laid out the very real challenges of taking models out of research environments and putting them into actual audio applications. There are many inference solutions out there, such as ONNX, TFLite, and libtorch, but none of them really treat audio as a first-class citizen. Real-time audio has strict latency constraints, tiny buffers, and highly variable consumer hardware. You can't just import your model and hope for the best.

That is what motivated Jatin to build RTNeural, an open-source C++ library designed specifically for audio neural networks. It enables building production-level audio plugins with neural networks that actually run in real time. He shared several real plugins built using this technology. If I were working on neural nets inside audio plugins, this is where I would start.

That said, he was clear about the remaining challenges. Scaling to larger models is still difficult. Consumer hardware varies widely. Communication between CPU and GPU adds overhead. Meanwhile, models are getting heavier and more capable. There is a real tension here: musicians do not want latency to stand in the way of their creativity, but AI models keep pushing toward greater compute requirements. The research field has challenges to address here.

Ethan Manilow, a researcher at Google DeepMind, zoomed out completely. His talk traced how music technology has shaped and reshaped our relationship with sound. He focused on how Edison’s phonograph transformed music from an ephemeral experience into something that could be replayed and transported. But it also made people uncomfortable: early listeners were unsettled by hearing a human voice without a human present. Generative music may be producing a similar reaction today. When we hear the inflection of a generated vocalist, there can be that same uncanny feeling. History gives us perspective, but not a prescription for the future.

Ethan’s broader point was that no technology is guaranteed to succeed. History is often told as a linear progression, but many technologies faded while others persisted. We are still in the process of determining what role AI will ultimately play in music.

The most exciting part of AI and music today is that there is still so much to figure out. Even the best audio models are still lacking in some capabilities, and their latency is far beyond what many musicians desire. If you want to take part in more discussions like this, join us at our next event in March. Details are on our meetup page.
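The "strict latency constraints, tiny buffers" point is easy to make concrete. A minimal sketch (the buffer sizes and 48 kHz sample rate below are typical DAW settings I'm assuming, not figures from the talk): the processing budget per callback is just the buffer duration, and everything, including any neural inference, must fit inside it.

```python
def buffer_latency_ms(buffer_size, sample_rate):
    """Latency (ms) contributed by one audio buffer: all processing for
    `buffer_size` samples must finish inside this window, or the host
    drops audio (glitches)."""
    return 1000.0 * buffer_size / sample_rate

for buf in (64, 128, 256, 512):
    print(f"{buf:>4} samples @ 48 kHz -> {buffer_latency_ms(buf, 48000):.2f} ms budget")
```

At a common 128-sample buffer the budget is under 3 ms, which is why generic inference runtimes tuned for throughput rather than worst-case per-call latency struggle in this setting.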
Christian Steinmetz (@csteinmetz1)

The winter storm won't stop us. The next Boston AI Music Meetup is now happening this Wed, Feb 25th at MIT Media Lab. Looking forward to talks from two awesome speakers: Jatin Chowdhury and Ethan Manilow (@ethanmanilow). RSVP at the link in the comments, we have limited spots.

MClem retweeted
Scott H. Hawley (@drscotthawley)
🎉 "Flow Where You Want" accepted to ICLR 2026's Blogpost Track! I craved technical feedback (teach me!), yet critiques were purely re. scope & novelty (it's a tutorial 🤷‍♂️), not technical merit, and they appreciated the clear presentation. Will post revised version next month.
Scott H. Hawley (@drscotthawley)

🛶 New blog tutorial: "Flow Where You Want" Want to steer pretrained flow models without retraining? I spent months simplifying guidance methods: intuitive visuals, accessible math, & runnable code -- it's a Colab! 🔗 below. (btw that's my kayak)

MClem retweeted
Justin Salamon (@justin_salamon)
This is big. SOTA audio reasoning. SOTA video reasoning. SOTA audio captioning. SOTA sound event detection. Better than Gemini. Better than Qwen. TAC: Timestamped Audio Captioning 📑 paper: lnkd.in/getEz5xU 🌐 website with more demos: lnkd.in/gdw5TTuS
MClem retweeted
Sarah Drasner (@sarah_edo)
💥 I made a new drawing in my AI series, this time about Vector Databases, ANN, and HNSW. I hope it's useful! It might be good to look at the Transformers one in the series before this one, for additional background.
MClem retweeted
Hugging Models (@HuggingModels)
Qwen3.5 is here 🚀 397B params, just 17B active. Native multimodal agents for coding, reasoning, GUI + video. 200+ languages. Open weights. Real scale. The next frontier is open. 🔗 huggingface.co/Qwen/Qwen3.5-3…
MClem retweeted
Karan🧋 (@kmeanskaran)
"we used to import tensorFlow to train neural networks."
Karan🧋 tweet media