Tom Chea (@TomChea56513) - Twitter Profile

Tom Chea @TomChea56513 · Sep 26

@Alibaba_Wan I cannot save the video after it is generated

Wan @Alibaba_Wan · Sep 24

Today, we're officially launching Wan2.5-Preview! It's set to reshape the future of visual generation with a new architecture and powerful features.

• Architectural Features: Native Multimodality, Deep Alignment
  ∘ Native Multimodal Architecture: Adopts a new, unified framework for both understanding and generation, flexibly supporting the input and output of text, images, video, and audio.
  ∘ Joint Multimodal Training: Achieves stronger modal alignment by jointly training on text, audio, and visual data, which is key to enabling audio-visual sync and greatly improved instruction following.
  ∘ Human Preference Alignment: Implements Reinforcement Learning from Human Feedback (RLHF) to continuously align with human preferences, enhancing image quality and video dynamics.

• Video Capabilities: A/V Synchronization, Cinematic Quality
  ∘ Synchronized A/V Generation: Natively supports high-fidelity, high-consistency video generation with synchronized audio, including multi-person vocals, sound effects, and BGM.
  ∘ Controllable Multimodal Input: Supports text, images, and audio as input sources for limitless creativity.
  ∘ Cinematic Aesthetics: Features powerful dynamics and structural stability with an upgraded cinematic control system, generating 1080p HD 10s videos of cinematic quality.

• Image Capabilities: Creative & Precise Control
  ∘ Advanced Image Generation: Greatly improved instruction following to support photorealistic quality, diverse artistic styles, creative typography, and professional-grade charts.
  ∘ Image Editing: Supports conversational, instruction-based image editing with pixel-level precision for tasks like multi-concept fusion, material transformation, product color swapping, and more.


