Zhizheng Wu

35 posts

@drwuz

Researcher, builder

Los Gatos, CA · Joined August 2010
89 Following · 160 Followers
Zhizheng Wu retweeted
lester violeta @lesterphv
Our paper discussing the SVCC 2025 summary has been accepted to ICASSP 2026! 🥳 Check it out here: arxiv.org/abs/2509.15629 We're still working on an extension journal paper that covers more details about SVCC, so stay tuned 😄
Zhizheng Wu retweeted
lester violeta @lesterphv
The first SVCC 2025 baseline system is now out! 🥳 We introduce Serenade: A Singing Style Conversion Framework Based On Audio Infilling. This preliminary investigation covers the main difficulties of singing style conversion (SSC) and details our findings.
Zhizheng Wu retweeted
Rajesh Agarwal @Rajesh992510253
80+ AI tools to finish months of work in minutes.
1. Research: ChatGPT, Copilot, Gemini, Abacus, Perplexity
2. Image: Fotor, Dalle 3, Stability AI, Midjourney, Microsoft Designer
3. Copywriting: Rytr, Copy AI, Writesonic, Adcreative AI
4. Writing: Jasper, HIX AI, Jenny AI, Textblaze, Quillbot
5. Website: 10Web, Durable, Framer, Style AI
6. Video: Klap, Opus, Eightify, InVideo, HeyGen, Runway, ImgCreator AI, Morphstudio .xyz
7. Meeting: Tldv, Otter, Noty AI, Fireflies
8. SEO: VidIQ, Seona AI, BlogSEO, Keywrds ai
9. Chatbot: Droxy, Chatbase, Mutual info, Chatsimple
10. Presentation: Decktopus, Slides AI, Gamma AI, Designs AI, Beautiful AI
11. Automation: Make, Zapier, Xembly, Bardeen
12. Prompts: FlowGPT, Alicent AI, PromptBox, Promptbase, Snack Prompt
13. UI/UX: Figma, Uizard, UiMagic, Photoshop
14. Design: Canva, Flair AI, Designify, Clipdrop, Autodraw, Magician design
15. Logo Generator: Looka, Designs AI, Brandmark, Stockimg AI, Namecheap
16. Audio: Lovo ai, Eleven labs, Songburst AI, Adobe Podcast
17. Productivity: Merlin, Tinywow, Notion AI, Adobe Sensei, Personal AI
18. Social media management: Tapilo, Typefully, Hypefury, TweetHunter
Follow me for more.
Zhizheng Wu retweeted
SLT 2024 @ieee_slt
👥 Keynote highlights from industry and academia! 🤝 Supported by top tech leaders and innovators! 📅 Secure your spot now: 2024.ieeeslt.org/program/ #SLT2024
Zhizheng Wu retweeted
Cameron R. Wolfe, Ph.D. @cwolferesearch
LLM-as-a-Judge is one of the most widely-used techniques for evaluating LLM outputs, but how exactly should we implement LLM-as-a-Judge? To answer this question, let's look at a few widely-cited papers / blogs / tutorials, study their exact implementation of LLM-as-a-Judge, and try to find some useful patterns.

(1) Vicuna was one of the first models to use LLMs as an evaluator. Their approach differs depending on the problem being solved. Separate prompts are written for i) general, ii) coding, and iii) math questions. Each domain-specific prompt introduces some extra, relevant details compared to the vanilla prompt. For example:
- The coding prompt provides a list of desirable characteristics for a good solution.
- The math prompt asks the judge to first solve the question before generating a score.
Interestingly, the judge is given two model outputs within its prompt, but it is asked to score each output on a scale of 1-10 instead of just choosing the better output.

(2) AlpacaEval is one of the most widely-used LLM leaderboards, and it is entirely based on LLM-as-a-Judge! The current approach used by AlpacaEval is based upon GPT-4-Turbo and uses a very simple prompt that:
- Provides an instruction to the judge.
- Gives the judge two example responses to the instruction.
- Asks the judge to identify the better response based on human preferences.
Despite the simplicity, this strategy correlates very highly with human preference scores (i.e., 0.9+ Spearman correlation with Chatbot Arena).

(3) G-Eval was one of the first LLM-powered evaluation metrics shown to correlate well with human judgements. The key to its success was a two-stage prompting approach. First, the LLM is given the task / instruction as input and asked to generate a sequence of steps that should be used to evaluate a solution to this task (an approach called AutoCoT). Then, the LLM uses this reasoning strategy as input when generating an actual score, which is found to improve scoring accuracy!

(4) The LLM-as-a-Judge paper itself uses a pretty simple prompting strategy to score model outputs. However, the model is also asked to provide an explanation for its scores. Generating such an explanation resembles a chain-of-thought prompting strategy and is found to improve scoring accuracy. Going further, several different prompting strategies, including both pointwise and pairwise prompts, are explored and found to be effective within this paper.

Key takeaways. From these examples, we can arrive at a few common learnings:
- LLM judges are very good at identifying responses that are preferable to humans (due to training with RLHF).
- Creating specialized evaluation prompts for each domain / application is useful.
- Providing a scoring rubric or list of desirable properties for a good solution can be helpful to the LLM.
- Simple prompts can be extremely effective (don't make it overly complicated!).
- Providing (or generating) a reference solution for complex problems (e.g., math) is useful.
- CoT prompting (in various forms) is helpful.
- Both pairwise and pointwise prompts are commonly used.
- Pairwise prompts can either i) ask for each output to be scored or ii) ask for the better output to be identified.
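The pairwise pattern described in the thread above boils down to two pieces: building a judge prompt from an instruction plus two candidate responses, and parsing a verdict out of the judge's free-form explanation. A minimal sketch follows; the template wording and the `build_pairwise_prompt` / `parse_verdict` helpers are illustrative assumptions, not the exact prompts from Vicuna, AlpacaEval, or the LLM-as-a-Judge paper, and the actual call to a judge model is left out.

```python
# Sketch of pairwise LLM-as-a-Judge: prompt construction + verdict parsing.
# The judge model call itself (OpenAI, local model, etc.) is intentionally
# omitted; this only shows the prompt/parse scaffolding around it.

PAIRWISE_TEMPLATE = """You are an impartial judge. Compare the two responses
to the instruction below and decide which better follows the instruction
and would be preferred by a human reader.

[Instruction]
{instruction}

[Response A]
{response_a}

[Response B]
{response_b}

First explain your reasoning (chain-of-thought improves scoring accuracy),
then finish with exactly one line: "Verdict: A", "Verdict: B", or "Verdict: tie"."""


def build_pairwise_prompt(instruction: str, response_a: str, response_b: str) -> str:
    """Fill the pairwise judge template with the instruction and both candidates."""
    return PAIRWISE_TEMPLATE.format(
        instruction=instruction, response_a=response_a, response_b=response_b
    )


def parse_verdict(judge_output: str) -> str:
    """Extract the final 'Verdict:' line from the judge's free-form reasoning."""
    for line in reversed(judge_output.strip().splitlines()):
        if line.lower().startswith("verdict:"):
            return line.split(":", 1)[1].strip().upper()
    return "UNPARSED"  # judge did not follow the output format
```

Asking for the reasoning first and the verdict on a fixed final line keeps the chain-of-thought benefit while making the answer trivially machine-parseable.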
Steve Korshakov @Ex3NDR
@realamphion Very impressive! I wasn't able to train Voicebox to match your quality using only LibriLight.
Zhizheng Wu retweeted
SLT 2024 @ieee_slt
We're excited to announce the Call for Sponsorship for IEEE SLT 2024! Join us in Macao from Dec 2-5, 2024, to explore the latest in speech and language technology. Check out our sponsorship packages and enhance your organization's visibility!
Zhizheng Wu retweeted
SLT 2024 @ieee_slt
The IEEE Spoken Language Technology Workshop (SLT 2024) will be held in Macao, China, on Dec 2-5 later this year. We are calling for challenges! Make a proposal by March 13.
Zhizheng Wu @drwuz
The IEEE Spoken Language Technology Workshop (SLT 2024) will be held in Macao, China, on Dec 2-5 later this year. We are calling for challenges! Make a proposal by March 13.
Zhizheng Wu @drwuz
The IEEE Spoken Language Technology Workshop (SLT 2024) will be held in Macao, China, on Dec 2-5 later this year. Get your paper ready :) We are looking forward to seeing you in Macao!