HappyHorse 1.0 on RunComfy uses Alibaba's async video-synthesis API with the happyhorse-1.0-t2v model. You provide a text prompt and choose a supported resolution/aspect-ratio combination, duration, optional seed, and whether the provider watermark should be included.
- Output format: video
- Resolution tier: 720P or 1080P
- Duration: 3–15 seconds
- Aspect ratio: 16:9, 9:16, 1:1, 4:3, 3:4
- Audio: not exposed in this template
| Parameter | Required | Type | Default | Range / Options | Description |
|---|---|---|---|---|---|
| prompt | Yes | string | — | max 2500 chars | Describe the scene, subject, motion, camera, lighting, and style for the video. |
| aspect_ratio | No | string | 16:9 | 16:9, 9:16, 1:1, 4:3, 3:4 | Aspect ratio of the generated video. |
| resolution | No | string | 1080P | 720P, 1080P | Output video resolution tier. |
| duration | No | integer | 5 | 3–15 | Output video duration in seconds. |
| seed | No | integer | 0 | 0 to 2147483647 | Optional random seed. Use 0 to let the provider choose one automatically. |
| watermark | No | boolean | true | true, false | Whether to include the provider watermark in the generated video. |
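The parameter table maps directly onto a request payload for the async API. As a minimal sketch, the hypothetical Python helper below validates each parameter against the documented ranges before building the payload; the field names follow the table, but the helper name is an illustration and the actual submit/poll endpoints and authentication of the provider API are not shown.

```python
# Hypothetical helper: validates parameters against the documented
# ranges and builds a happyhorse-1.0-t2v request payload.
# Field names follow the parameter table above; the async
# submit/poll endpoints themselves are not part of this sketch.

ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
RESOLUTIONS = {"720P", "1080P"}

def build_t2v_payload(prompt, aspect_ratio="16:9", resolution="1080P",
                      duration=5, seed=0, watermark=True):
    if not prompt or len(prompt) > 2500:
        raise ValueError("prompt is required and limited to 2500 characters")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if resolution not in RESOLUTIONS:
        raise ValueError("resolution must be 720P or 1080P")
    if not (3 <= int(duration) <= 15):
        raise ValueError("duration must be 3-15 seconds")
    if not (0 <= int(seed) <= 2147483647):
        raise ValueError("seed must be in [0, 2147483647]")
    return {
        "model": "happyhorse-1.0-t2v",
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "duration": int(duration),
        "seed": int(seed),          # 0 lets the provider pick a seed
        "watermark": bool(watermark),
    }
```

For example, `build_t2v_payload("A red fox running through snow at dusk")` returns a payload with all defaults from the table, while an out-of-range `duration=20` raises a `ValueError` before anything is sent to the provider.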
HappyHorse 1.0 is a next-generation AI video model ranked #1 on the Artificial Analysis Video Arena for both text-to-video (Elo 1333) and image-to-video (Elo 1392). It generates native 1080p video with advanced motion synthesis, multi-shot character consistency, and multilingual support across six languages.
The Artificial Analysis Video Arena ranks models through blind user voting: participants compare two videos generated from the same prompt without knowing which model made which, then pick the better result. Votes feed into an Elo rating system. HappyHorse 1.0 holds the highest Elo in both the text-to-video and image-to-video (no audio) categories as of April 2026.
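The Elo mechanism behind such pairwise-vote leaderboards can be sketched in a few lines. This is the standard Elo update rule, not the arena's actual implementation; the K-factor of 32 is an assumption, since Artificial Analysis does not publish its parameters here.

```python
def elo_update(rating_a, rating_b, score_a, k=32):
    """Standard Elo update after one pairwise vote.

    score_a is 1.0 if model A's video won the blind comparison,
    0.0 if it lost, 0.5 for a tie. The K-factor of 32 is an
    assumption; the arena's actual parameters are not published here.
    """
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta
```

A win moves the winner up and the loser down by the same amount, so total rating is conserved; beating a much lower-rated model yields only a small gain, which is why a sustained top Elo requires winning consistently against strong opponents.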
The model outputs native 1080p HD resolution. Video includes rich color grading, accurate lighting, and film-grade detail suitable for broadcast and professional production without additional post-processing.
Yes, the model generates synchronized audio alongside video in one pass, including dialogue, ambient sounds, and Foley effects, and ranks #2 in the with-audio categories on the Artificial Analysis leaderboard. Note, however, that audio output is not exposed in this template.
Six languages are natively supported: Chinese, English, Japanese, Korean, German, and French. Prompts in any supported language produce high-quality video with full linguistic nuance.
Multi-shot storytelling allows the model to generate video sequences with multiple shots while maintaining consistency in characters, wardrobe, visual style, and atmosphere across scene transitions — eliminating the need for manual editing between clips.
Yes. The model supports both text-to-video and image-to-video through a unified pipeline. Upload a static image to animate it with intelligent motion synthesis, or describe a scene entirely through text.
RunComfy is the premier ComfyUI platform, offering an online ComfyUI environment and services, along with ComfyUI workflows featuring stunning visuals. RunComfy also provides AI Models, enabling artists to harness the latest AI tools to create incredible art.