Wan 2.7 is an AI video generation model from Alibaba designed for creating short videos from natural language prompts. On this RunComfy page, Wan 2.7 is available as a text-to-video workflow that lets you generate 720p or 1080p clips with optional audio guidance, adjustable aspect ratios, controllable duration, negative prompts, prompt expansion, and seed control.
If you are searching for Wan 2.7 for AI video generation, this page covers the practical text-to-video use case on RunComfy. It is built for creators, marketers, developers, and teams who want to turn written ideas into short-form videos in the browser or through API access, without setting up their own inference stack.
Typical uses for Wan 2.7 include product videos, social media clips, ad concepts, brand visuals, visual storytelling tests, and fast creative prototyping.
Output specs:

- Resolution: 720p or 1080p
- Frame rate: varies by provider
- Duration: 2–15 s
- Aspect ratios: 16:9, 9:16, 1:1, 4:3, 3:4
- Audio: optional audio_url, or auto-generated background music
The Wan 2.7 text-to-video workflow on RunComfy accepts the following inputs:
| Parameter | Required | Type | Default | Range / Options | Description |
|---|---|---|---|---|---|
| prompt | Yes | string | — | Max 5000 chars | Text prompt describing the desired video. |
| audio_url | No | string | — | WAV/MP3, 3–30 s, ≤15 MB | URL of driving audio; if omitted, matching background music is auto-generated. |
| aspect_ratio | No | AspectRatioEnum | "16:9" | 16:9, 9:16, 1:1, 4:3, 3:4 | Aspect ratio of the generated video. |
| resolution | No | ResolutionEnum | "1080p" | 720p, 1080p | Output video resolution tier. |
| duration | No | DurationEnum | "5" | 2–15 (seconds) | Output video duration in seconds. |
| negative_prompt | No | string | — | Max 500 chars | Content to avoid in the video. |
| enable_prompt_expansion | No | boolean | true | true/false | Enable intelligent prompt rewriting. |
| seed | No | integer | — | 0–2147483647 | Random seed for reproducibility. |
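Before submitting a job, it can help to validate these parameters client-side. The sketch below is illustrative only: the parameter names and ranges come from the table above, but the `build_request` helper itself is hypothetical and not part of any official RunComfy SDK.

```python
# Client-side validation of the Wan 2.7 text-to-video parameters listed
# above. The build_request helper is hypothetical; only the parameter
# names, defaults, and ranges come from the table on this page.

ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
RESOLUTIONS = {"720p", "1080p"}

def build_request(prompt, *, audio_url=None, aspect_ratio="16:9",
                  resolution="1080p", duration=5, negative_prompt=None,
                  enable_prompt_expansion=True, seed=None):
    """Return a parameter dict, enforcing the documented ranges."""
    if not prompt or len(prompt) > 5000:
        raise ValueError("prompt is required and limited to 5000 characters")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if resolution not in RESOLUTIONS:
        raise ValueError("resolution must be '720p' or '1080p'")
    if not 2 <= int(duration) <= 15:
        raise ValueError("duration must be between 2 and 15 seconds")
    if negative_prompt and len(negative_prompt) > 500:
        raise ValueError("negative_prompt is limited to 500 characters")
    if seed is not None and not 0 <= seed <= 2147483647:
        raise ValueError("seed must be in [0, 2147483647]")
    body = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "duration": str(duration),  # the table lists duration as a string enum
        "enable_prompt_expansion": enable_prompt_expansion,
    }
    # Optional fields are omitted entirely when unset.
    if audio_url is not None:
        body["audio_url"] = audio_url
    if negative_prompt is not None:
        body["negative_prompt"] = negative_prompt
    if seed is not None:
        body["seed"] = seed
    return body

req = build_request("A red fox running through snow at dusk",
                    resolution="720p", duration=8, seed=42)
```

Validating locally surfaces out-of-range values (for example, a 20-second duration) before any credits are spent on a rejected request.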
Pricing on RunComfy is billed per second of generated video: $0.09 per second for 720p and $0.13 per second for 1080p.

| Resolution | Rate (per second) | Example 5 s | Example 10 s | Example 15 s |
|---|---|---|---|---|
| 720p | $0.09 | $0.45 | $0.90 | $1.35 |
| 1080p | $0.13 | $0.65 | $1.30 | $1.95 |
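Since billing is a flat per-second rate, estimating a render's cost is simple multiplication. A minimal sketch, assuming only the rates quoted above (the `estimate_cost` helper is hypothetical):

```python
# Cost estimate for a Wan 2.7 render at RunComfy's per-second rates
# quoted above: $0.09/s for 720p, $0.13/s for 1080p.

RATE_PER_SECOND = {"720p": 0.09, "1080p": 0.13}

def estimate_cost(resolution: str, duration_s: int) -> float:
    """Return the USD cost of a clip, rounded to the cent."""
    return round(RATE_PER_SECOND[resolution] * duration_s, 2)

print(estimate_cost("720p", 5))    # 0.45
print(estimate_cost("1080p", 15))  # 1.95
```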
Wan 2.7 fits a wide range of video workflows on RunComfy, from social clips and product reels to narrative shorts and rapid creative prototyping.
If you like Wan 2.7, explore these related options on RunComfy:
- Veo 3.1 Fast: create rich cinematic clips from images or text.
- A next-gen tool turning prompts into cinematic 4K video clips with audio.
- Seedance 2.0: film-quality video generation with stunning visual fidelity and cinematic motion.
- Kling 2.1 Standard: animate a single image into a smooth video.
- Hailuo 02 Pro: animate an image into a smooth 6 s video.
- An AI-driven motion conversion tool enabling precise, stable animation creation.
Wan 2.7 currently produces 720p or 1080p videos lasting between 2 and 15 seconds; true 4K generation is not yet supported. The text-to-video model caps reference inputs at five media sources (image or video, plus an optional voice input), and prompts are constrained to around 1,500 tokens to ensure generation stability.
Supported aspect ratios are 16:9, 9:16, 1:1, 4:3, and 3:4, with 1080p and 16:9 as the defaults. Other framings can be approximated through cropping or post-scaling in your workflow.
To move from a RunComfy Model prototype to production, first verify your Wan 2.7 settings inside the Model. Then switch to the RunComfy API, which mirrors the Model's text-to-video endpoints: generate an API key, fund USD credits for production, and map your prompt and media parameters according to the API documentation.
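The migration flow above can be sketched as a small script. Note that the endpoint URL, auth header scheme, and response fields below are ASSUMPTIONS for illustration; substitute the values from the official RunComfy API documentation for Wan 2.7.

```python
# Sketch of a production call through the RunComfy API.
# The endpoint path and Bearer-token auth are assumptions -- check the
# official API docs before using this shape in production.
import os

import requests

def make_headers(api_key: str) -> dict:
    # Bearer-token auth is an assumption; verify the scheme in the docs.
    return {"Authorization": f"Bearer {api_key}"}

def submit_job(endpoint: str, api_key: str, payload: dict) -> dict:
    """POST a text-to-video job and return the parsed JSON response."""
    resp = requests.post(endpoint, json=payload,
                         headers=make_headers(api_key), timeout=60)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__" and "RUNCOMFY_API_KEY" in os.environ:
    job = submit_job(
        "https://api.runcomfy.net/...",  # hypothetical endpoint; see API docs
        os.environ["RUNCOMFY_API_KEY"],
        {"prompt": "A timelapse of clouds over a mountain lake",
         "resolution": "1080p", "aspect_ratio": "16:9", "duration": "5"})
    print(job)  # job id / video URL, per the API docs
```

Keeping the payload identical to the parameters you validated in the Model UI is what makes the prototype-to-production handoff mechanical rather than a rewrite.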
Wan 2.7 improves visual sharpness, motion smoothness, and style fidelity relative to Wan 2.6. It introduces start/end-frame control, 9-grid image inputs, and subject plus voice references, which make the text-to-video process more structured and identity-consistent.
Unlike Seedance or Kling, Wan 2.7 emphasizes multi-reference conditioning and fine-grained control, allowing precise style retention and motion continuity. In text-to-video tasks, users often report smoother transitions and more accurate lip synchronization.
Wan 2.7 excels at short-form creative content such as narrative clips, product reels, and character storytelling. Its text-to-video mode is optimized for workflows requiring high fidelity, subject identity consistency, and integrated voice or sound.
Wan 2.7 automatically includes synchronized audio, covering both speech and environmental sound. In text-to-video generation, it can also use optional voice references for better vocal style matching and lip-sync accuracy.
Wan 2.7 outperforms older versions when projects demand subject stability, natural motion, or stylistic precision. Its text-to-video generation engine minimizes visual drift and improves texture detailing through upgraded consistency models.
The model accepts up to five media references, including images or videos, and an optional voice file. When performing text-to-video creation, this enables direct control of visual style, motion cues, and identity consistency.
For commercial use of Wan 2.7, check the official Wan AI license terms and RunComfy’s usage policies. The text-to-video outputs can often be used in monetized content, but always verify rights for any included references or likenesses.
RunComfy is the premier ComfyUI platform, offering a ComfyUI online environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Models, enabling artists to harness the latest AI tools to create incredible art.





