ZD1908

3.5K posts

@ZDi____

🇦🇷 25M ; ML Text-to-Speech/Audio, C++/Qt / DMs open

Latent space · Joined June 2024

424 Following 302 Followers

Pinned Tweet

ZD1908@ZDi____·10 Şub

Releasing Brontes: A modified Wave U-Net architecture for audio super-resolution. This one is trained to operate on NeuCodec outputs. I'm releasing a general 30M checkpoint on a variety of speech. See links in replies. All on MI300X thanks to @HotAisle @AIatAMD

ZD1908 retweeted

Mayank Mishra@MayankMish98·21h

Introducing M²RNN: Non-Linear RNNs with Matrix-Valued States for Scalable Language Modeling
We bring back non-linear recurrence to language modeling and show it's been held back by small state sizes, not by non-linearity itself.
📄 Paper: arxiv.org/abs/2603.14360
💻 Code: github.com/open-lm-engine…
🤗 Models: huggingface.co/collections/op…

398

80.8K

ZD1908@ZDi____·15h

If you're cold emailing people, you have to prepend [Not AI slop] to the subject line, and write the whole thing with your own two hands.

ZD1908@ZDi____·16h

Hacker News is such a primitive platform it hurts. Even 4chan has autorefresh.

ZD1908@ZDi____·16h

@Wolvan1 Bitwise Operator's shaking that voluminous derrière.

bigger-better-yee-haw@Wolvan1·17h

Post OCs and/or art ideas pls

148

ZD1908@ZDi____·1d

@CounterStrike The ArmAfication of Counter-Strike is upon us.

126

CS2@CounterStrike·1d

For your consideration, an update pertaining to Guns, Guides, and Games: steamcommunity.com/games/CSGO/ann…

4.5K

437

9.6K

ZD1908@ZDi____·1d

@ad0rnai
>terminally offline
>on X (formerly Twitter)
Do you also go to butcher shops looking for vegans?

478

Lan@ad0rnai·1d

I am hiring someone who is:
- terminally offline
- not in any group chats
- has covered the curriculum of the great books program
- idiosyncratic
- slightly off-putting

377

14.1K

ZD1908@ZDi____·1d

All of this was iterated start to end on a single AMD Instinct MI300X for ~5 days, thanks to @HotAisle and @AIatAMD. Audio quality isn't the best, but only so much one can do with few parameters. Scaling up will be key.

254

ZD1908@ZDi____·1d

GitHub: github.com/ZDisket/vits-e…
HuggingFace Demo: huggingface.co/spaces/ZDisket…
This is a highly upgraded VITS trained on LibriTTS-R + VCTK datasets, both fully open. Speaker encoder is Resemble AI's Resemblyzer.

ZD1908@ZDi____·1d

VITS EVOlution, my TTS model:
1. ~31M TTS model and speaker encoder in ONNX format. Faster than realtime on CPU
2. Natively outputs 48 kHz audio
3. Voice cloning, or voice blending--mix two or more speakers to make a new voice!
4. Apache 2.0, use anywhere without worry
Links in replies:
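The tweet does not say how voice blending is implemented. One common way to mix speakers is a weighted average of their speaker-encoder embeddings, rescaled to unit length. The sketch below illustrates that general idea only: the function name `blend_speakers` and the toy 3-dimensional vectors are hypothetical, and VITS EVOlution may do this differently.

```python
import math


def blend_speakers(embeddings, weights):
    """Weighted average of speaker embeddings, rescaled to unit length.

    Hypothetical helper for illustration; not taken from the VITS EVOlution code.
    """
    if not embeddings or len(embeddings) != len(weights):
        raise ValueError("need one weight per embedding")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(embeddings[0])
    # Weighted mean, computed one dimension at a time.
    mixed = [
        sum(w * e[i] for e, w in zip(embeddings, weights)) / total
        for i in range(dim)
    ]
    # Rescale to unit length (assumes the encoder emits unit-norm vectors).
    norm = math.sqrt(sum(x * x for x in mixed))
    if norm == 0:
        raise ValueError("blended embedding has zero length")
    return [x / norm for x in mixed]


# Toy embeddings standing in for real speaker-encoder outputs.
speaker_a = [1.0, 0.0, 0.0]
speaker_b = [0.0, 1.0, 0.0]
blended = blend_speakers([speaker_a, speaker_b], [0.5, 0.5])
```

With equal weights the result sits halfway between the two speakers; skewing the weights moves the blended voice toward one of them.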

120

ZD1908@ZDi____·2d

My model is working and ready for release tomorrow but I came down with a cold today.

ZD1908@ZDi____·2d

@sonofalli

372

85.6K

alli@sonofalli·2d

reporting to a middle-aged girl dad will change your life

279

9.1K

1.3M

ZD1908@ZDi____·2d

@giffmana Bigger models, plus a switch from mostly convnet-based architectures, usually U-Net (locally coherent, globally weak), to DiT (locally and globally strong), although newer conv designs are starting to emerge. x.com/miru_why/statu…

miru@miru_why

Reviving ConvNeXt for Efficient Convolutional Diffusion Models
github.com/star-kwon/FCDM
arxiv.org/abs/2603.09408…
the authors propose an improved convnext-based diffusion model architecture that reportedly matches DiT-XL/2 quality with 7x fewer training steps

6.1K

Lucas Beyer (bl16)@giffmana·2d

I have a question about last year's image-generation progress, wonder what y'all think. How did we go from all models consistently getting fingers wrong, to all models consistently getting them right? This "flip" seems to have happened basically across all companies/models at the ~same time. Even "random" non-frontier papers seem to get it right? Or they just cherry-pick the figures?

483

109K

ZD1908@ZDi____·3d

Using DistilHuBERT features as speaker encoder also failed. Time to throw GPT 5.4 at the Resemblyzer repo and have it modernize the pipeline.

ZD1908 retweeted

RoyalCities@RoyalCities·3d

After months of work, today I’m releasing Foundation-1. A SOTA text-to-sample model built specifically for music production workflows. It may also be the most advanced AI sample generator currently available - open or closed.
• ~7 GB VRAM
• Entirely local
• 100% free 😁

148

1.3K

106.9K

ZD1908@ZDi____·3d

@jparkjmc @nikitabier You could drop an email address instead.

252

jpark@jparkjmc·3d

got way more inbound than i anticipated
tried to go through all DMs and got rate limited 😵‍💫
@nikitabier can you un rate limit me, agi depends on it

jpark@jparkjmc

if you're an independent researcher and need
- compute
- tokens
- travel stipends
- any other resources
DM me! hillclimb grant releasing soon😁

6.8K

ZD1908@ZDi____·3d

Vercel: makes open source project
Cloudflare: forks it
Vercel:

Malte Ubl@cramforce

x.com/i/article/2033…

124

ZD1908@ZDi____·3d

@JasonBotterill French, dog name, French.

109

JB@JasonBotterill·3d

Codex has subagents now but bro what the fuck these agent names are somehow worse than Grok's Benjamin. I don't want Russell going through my family photos

OpenAI Developers@OpenAIDevs

Subagents are now available in Codex. You can accelerate your workflow by spinning up specialized agents to:
• Keep your main context window clean
• Tackle different parts of a task in parallel
• Steer individual agents as work unfolds

472

58.9K

ZD1908@ZDi____·3d

I should get into rocketry, I want to make guided missiles for delivery of medical supplies.

ZD1908@ZDi____·3d

@Alfauz19767861 Peronism is like pizza: there's a flavor for everyone.

233

Alfauz@Alfauz19767861·4d

My understanding of Peronism

102

201

2.9K

212K

ZD1908@ZDi____·4d

Well... I was going to release a zero-shot model today, but my speaker encoder sucks and collapses every voice into an audiobook reader, as it was trained on a small amount of data. I'm going to use DistilHuBERT features instead.
