Alibaba Unveils Wan2.6 AI Video Generation Models

Dec 16, 2025

techcoffeehouse

Alibaba has unveiled the latest version of its artificial intelligence video generation models, launching the Wan2.6 series with new tools designed to support professional-grade content creation and multi-shot storytelling.

The Wan2.6 AI video generation models introduce a new reference-to-video capability that allows users to appear in AI-generated videos as themselves, using both their visual likeness and voice. The update also brings upgrades across Alibaba’s existing text-to-video, image-to-video and image generation models.

Reference-to-video AI enables consistent characters and voices

At the centre of the release is Wan2.6-R2V, a reference-to-video generation model that enables users to upload a character reference video containing both appearance and voice. Using text prompts, the model can then generate new scenes featuring the same character, while preserving visual identity and audio characteristics.

According to Alibaba, users can create AI-generated videos featuring people, animals or objects, either individually or together, while maintaining consistency in how subjects look and sound across scenes.

The company said Wan2.6-R2V is China’s first reference-to-video AI model capable of inserting real individuals or other subjects into generated video scenes with consistent visuals and audio. The feature is aimed at simplifying production workflows for short-form drama creators and other professional content producers.

Multi-shot storytelling added across Wan2.6 models

Beyond reference-to-video generation, Alibaba said the Wan2.6 series includes comprehensive upgrades to four existing models: its text-to-video model (Wan2.6-T2V), its image-to-video model (Wan2.6-I2V), and two image generation models (Wan2.6-image and Wan2.6-T2I).

The updated models introduce intelligent multi-shot storytelling capabilities, allowing creators to generate longer narratives composed of multiple connected scenes. Alibaba said this enables richer and more expressive stories, with visual consistency maintained throughout extended video sequences.

The company also highlighted improvements in audio-visual synchronisation and audio-to-video generation, which it said deliver more realistic scenes with enhanced sound effects and dialogue alignment.

Growing focus on AI tools for creators

The launch reflects Alibaba’s broader push into generative AI tools aimed at content creators, as competition intensifies among technology companies developing text, image and video generation models.

AI-powered video creation has increasingly been adopted for marketing, entertainment and short-form content production, particularly across Asia’s fast-growing creator economy.
