MClem
@mclemcrew

12.3K posts

Lights and sounds are kinda my thing ✨

Chesterton, IN · Joined March 2016
1.2K Following · 711 Followers
MClem retweeted
Chris Donahue (@chrisdonahuey)
Vibe coding is cool but have you tried vibe patching? Pure Vibes = Pure Data + MCP. Describe your sound, watch the patch appear 🌊 🔊👇
MClem retweeted
RoyalCities (@RoyalCities)
Foundation-1 is available here huggingface.co/RoyalCities/Fo… Typical inference: • ~7 GB VRAM • ~7–8 seconds per sample on an RTX 3090 • runs entirely local Have fun!
MClem retweeted
RoyalCities (@RoyalCities)
After months of work, today I’m releasing Foundation-1. A SOTA text-to-sample model built specifically for music production workflows. It may also be the most advanced AI sample generator currently available - open or closed. • ~7 GB VRAM • Entirely local • 100% free 😁
MClem (@mclemcrew)
If you're interested in joining, please fill out when2meet.com/?35539832-6cezP and send me your email so I can add you to the list 😄
MClem (@mclemcrew)
I'm starting up a MusicAI Reading Group that'll meet once a week on Zoom for 45 minutes to discuss the latest in music, signal processing, co-creativity, and AI! We'll be going through academic papers in more depth and testing out some demos for making music and exploring this space!
MClem retweeted
Julia Turc (@juliarturc)
Diffusion models clicked for me when I started seeing them through the lens of particle motion. I built this interactive playground where you too can clickety-clack to understand how drift, noise, and other hyperparams control diffusion. I hereby submit this as penance for the sin of YouTube edu-tainment 😇 Link in the first comment.
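The particle-motion framing can be made concrete with a few lines of code. Below is a minimal sketch (not taken from the linked playground) of an Euler-Maruyama simulation of the SDE dx = -θ·x·dt + σ·dW, where `drift_strength` (θ) and `noise_scale` (σ) play the role of the hyperparameters the tweet mentions: drift pulls particles toward a target, noise spreads them out.

```python
import math
import random

def diffuse(particles, drift_strength, noise_scale, steps=100, dt=0.01, seed=0):
    """Euler-Maruyama simulation of dx = -drift_strength * x * dt + noise_scale * dW.

    The drift term pulls every particle toward 0; the noise term scatters
    them. The balance between the two is what diffusion-model SDE
    hyperparameters control.
    """
    rng = random.Random(seed)
    xs = list(particles)
    for _ in range(steps):
        xs = [
            x - drift_strength * x * dt + noise_scale * math.sqrt(dt) * rng.gauss(0, 1)
            for x in xs
        ]
    return xs

# Strong drift, no noise: particles collapse toward 0.
print(diffuse([5.0, -5.0], drift_strength=4.0, noise_scale=0.0))
# Same drift plus noise: particles hover scattered around 0 instead.
print(diffuse([5.0, -5.0], drift_strength=4.0, noise_scale=1.0))
```

Raising `noise_scale` widens the stationary spread of the particles; raising `drift_strength` tightens it, which is exactly the trade-off an interactive playground lets you feel by dragging sliders.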
MClem retweeted
Chris Donahue (@chrisdonahuey)
Check out our new work! SotA lossless compression of CD-quality audio w/ autoregressive models of raw waveforms (albeit w/ compute costs far exceeding FLAC) Raw waveform modeling may be irrelevant for generation, but still lots of potential for compression!
Phillip Long (@p1long_)

Can LMs losslessly-compress CD-quality audio? Presenting Trilobyte: tractable LM-based lossless compression of full-fidelity audio, achieving 18% improvement over FLAC! w/ @zacknovack @chrisdonahuey 🧵
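The core idea behind LM-based lossless compression is that an entropy coder driven by a predictive model spends about -log2 p(symbol) bits per symbol, so the model's cross-entropy on the signal *is* the compressed size: a better predictor means a smaller file. The sketch below is a toy illustration of that principle (an add-one frequency model standing in for a real language model; it is not Trilobyte's actual method), comparing the ideal code length against an 8-bits-per-byte baseline.

```python
import math
from collections import Counter

def ideal_bits(data, model_probs):
    """Ideal code length (bits) when entropy-coding `data` using per-symbol
    probabilities from `model_probs(context) -> {symbol: prob}`.

    An arithmetic coder gets within ~2 bits of this total for the whole
    stream, so model cross-entropy directly measures compressed size.
    """
    bits, context = 0.0, []
    for symbol in data:
        p = model_probs(context).get(symbol, 1e-12)
        bits += -math.log2(p)
        context.append(symbol)
    return bits

def adaptive_model(context, alphabet=range(256)):
    """Toy 'language model': Laplace-smoothed frequency counts over the context."""
    counts = Counter(context)
    total = len(context) + len(alphabet)
    return {s: (counts[s] + 1) / total for s in alphabet}

data = b"abababababababab" * 8
uniform = 8 * len(data)                      # 8 bits/byte baseline
modeled = ideal_bits(data, adaptive_model)
print(f"uniform: {uniform} bits, modeled: {modeled:.0f} bits")
```

Swapping the toy frequency model for a neural language model over audio bytes is, at a high level, how such systems beat FLAC: the coder is the same, only the predictor gets smarter.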

MClem retweeted
Jesse Engel (@jesseengel)
Today, we're open sourcing the code behind "The Infinite Crate," our VST/AU plugin that lets you play with Lyria RealTime directly inside your favorite DAW! 🎧🎶 Since we released The Infinite Crate last year, it’s been used by some of our favorite artists - including a wonderful showcase with @daitomanabe in Tokyo - and being featured as a top new music tool at NAMM 2026. We’re now fully open sourcing the plugin for developers to fork, modify, and make their own under the permissive Apache 2.0 license. 💾 Get it here: github.com/magenta/the-in…
MClem retweeted
Zachary Novack (@zacknovack)
Our latest paper (and 2nd from my time at @harmonai_org) is out and accepted at #ICASSP2026 ! Diffusion guidance has always been tricky to wrangle for creative control tasks, but we've broken ground on making it a lot simpler and faster!
Jordi Pons (@jordiponsdotme)

Our latest paper, LatCHs (Latent-Control Heads), is out today! Really happy we'll get to present it at ICASSP’26 in Barcelona :) > Guidance-based controls > Selective TFG > Latent-Control Heads (LatCHs) 🗄️ arXiv: arxiv.org/abs/2603.04366 👾 blogpost: artintech.substack.com/p/latchs-expla…
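For background on why diffusion guidance is "tricky to wrangle": the standard classifier-free guidance formula (general background, not the LatCHs mechanism described in the paper) steers each denoising step by extrapolating from the unconditional prediction toward the conditional one, with a single scale controlling the strength of the control signal.

```python
def guided_prediction(eps_uncond, eps_cond, scale):
    """Classifier-free-style guidance: push the prediction further along
    the direction that conditioning added.

    scale=0 recovers the unconditional prediction, scale=1 the plain
    conditional one; larger values trade diversity for stronger adherence
    to the control signal (and can destabilize sampling, hence the
    wrangling).
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

# scale=2 doubles the conditioning direction relative to the unconditional base.
print(guided_prediction([0.0, 0.0], [1.0, -1.0], scale=2.0))  # -> [2.0, -2.0]
```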

MClem retweeted
Kenneth Marino (@Kenneth_Marino)
It’s been less than a year since I started my lab (SPARK Lab) at @UUtah and we already have a ton of new stuff that I can’t wait to talk about soon. Stay tuned for more. I’ll start today by sharing that our updated Computer Use Survey blog has been accepted to ICLR Blogposts 2026. Collaboration with my student @aplycaebous and Utah colleague @anmarasovic.
MClem retweeted
Christian Steinmetz (@csteinmetz1)
The talks tonight at the Boston AI Music Meetup made me think about where this field is going. Here’s what I learned:

Jatin Chowdhury is an expert in getting machine learning models to run in real-time audio systems. He laid out the very real challenges of taking models out of research environments and putting them into actual audio applications. There are many inference solutions out there, such as ONNX, TFLite, and libtorch, but none of them really treat audio as a first-class citizen. Real-time audio has strict latency constraints, tiny buffers, and highly variable consumer hardware. You can't just import your model and hope for the best.

That is what motivated Jatin to build RTNeural, an open-source C++ library designed specifically for audio neural networks. It enables building production-level audio plugins with neural networks that actually run in real time. He shared several real plugins built using this technology. If I were working on neural nets inside audio plugins, this is where I would start.

That said, he was clear about the remaining challenges. Scaling to larger models is still difficult. Consumer hardware varies widely. Communication between CPU and GPU adds overhead. Meanwhile, models are getting heavier and more capable. There is a real tension here: musicians do not want latency to stand in the way of their creativity, but AI models keep pushing toward greater compute requirements. The research field has challenges to address here.

Ethan Manilow, a researcher at Google DeepMind, zoomed out completely. His talk traced how music technology has shaped and reshaped our relationship with sound. He focused on how Edison’s phonograph transformed music from an ephemeral experience into something that could be replayed and transported. But it also made people uncomfortable: early listeners were unsettled by hearing a human voice without a human present. Generative music may be producing a similar reaction today. When we hear the inflection of a generated vocalist, there can be that same uncanny feeling. History gives us perspective, but not a prescription for the future.

Ethan’s broader point was that no technology is guaranteed to succeed. History is often told as a linear progression, but many technologies faded while others persisted. We are still in the process of determining what role AI will ultimately play in music.

The most exciting part of AI and music today is that there is still so much to figure out. Even the best audio models are still lacking in some capabilities, and their latency is far beyond what many musicians desire. If you want to take part in more discussions like this, join us at our next event in March. Details are on our meetup page.
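The "strict latency constraints, tiny buffers" point is easy to make concrete. A minimal sketch (the buffer sizes and 48 kHz sample rate below are typical DAW settings I'm assuming, not figures from the talk): the processing budget per callback is just the buffer duration, and everything, including any neural inference, must fit inside it.

```python
def buffer_latency_ms(buffer_size, sample_rate):
    """Latency (ms) contributed by one audio buffer: all processing for
    `buffer_size` samples must finish inside this window, or the host
    drops audio (glitches)."""
    return 1000.0 * buffer_size / sample_rate

for buf in (64, 128, 256, 512):
    print(f"{buf:>4} samples @ 48 kHz -> {buffer_latency_ms(buf, 48000):.2f} ms budget")
```

At a common 128-sample buffer the budget is under 3 ms, which is why generic inference runtimes tuned for throughput rather than worst-case per-call latency struggle in this setting.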
Christian Steinmetz (@csteinmetz1)

The winter storm won't stop us. The next Boston AI Music Meetup is now happening this Wed, Feb 25th at MIT Media Lab. Looking forward to talks from two awesome speakers: Jatin Chowdhury and Ethan Manilow (@ethanmanilow). RSVP at the link in the comments, we have limited spots.

MClem retweeted
Scott H. Hawley (@drscotthawley)
🎉 "Flow Where You Want" accepted to ICLR 2026's Blogpost Track! I craved technical feedback (teach me!), yet critiques were purely re. scope & novelty (it's a tutorial 🤷‍♂️), not technical merit, and they appreciated the clear presentation. Will post revised version next month.
Scott H. Hawley (@drscotthawley)

🛶 New blog tutorial: "Flow Where You Want" Want to steer pretrained flow models without retraining? I spent months simplifying guidance methods: intuitive visuals, accessible math, & runnable code -- it's a Colab! 🔗 below. (btw that's my kayak)

MClem retweeted
Justin Salamon (@justin_salamon)
This is big. SOTA audio reasoning. SOTA video reasoning. SOTA audio captioning. SOTA sound event detection. Better than Gemini. Better than Qwen. TAC: Timestamped Audio Captioning 📑 paper: lnkd.in/getEz5xU 🌐 website with more demos: lnkd.in/gdw5TTuS
MClem retweeted
Sarah Drasner (@sarah_edo)
💥 I made a new drawing in my AI series, this time about Vector Databases, ANN, and HNSW. I hope it's useful! It might be good to look at the Transformers one in the series before this one, for additional background.
MClem retweeted
Hugging Models (@HuggingModels)
Qwen3.5 is here 🚀 397B params, just 17B active. Native multimodal agents for coding, reasoning, GUI + video. 200+ languages. Open weights. Real scale. The next frontier is open. 🔗 huggingface.co/Qwen/Qwen3.5-3…
MClem retweeted
Karan🧋 (@kmeanskaran)
"we used to import tensorFlow to train neural networks."
Karan🧋 tweet media