ComfyUI Nvidia Cosmos Text & Image to Video Workflow
What is the Nvidia Cosmos Workflow
This workflow generates videos with the newly released Nvidia Cosmos models in ComfyUI. It covers both text-to-video and image-to-video generation, so you can create videos from either a text description or still images using the Cosmos 7B and 14B models. Cosmos pairs these models with an efficient video VAE for processing frames.
Key Features of Nvidia Cosmos
- Dual Generation Modes: Nvidia Cosmos offers both text-to-video and image-to-video generation
- Guaranteed Motion: Always generates videos with movement when using 121 frames
- Effective Negative Prompts: Non-distilled model ensures better control through negative prompts
- Flexible Image Control: Generate from the last frame or create transitions between images
- Ultra-Efficient VAE: Nvidia Cosmos employs a refined VAE system for smooth, high-quality video generation
- High Resolution Support: Create videos at resolutions of 704x704 and above
- Precise Frame Control: Optimized for 121-frame sequences
- Smart Image Interpolation: Generate smooth transitions between reference images
How to Use the Nvidia Cosmos Workflow
The Nvidia Cosmos workflow contains two main parts: _text-to-video_ and _image-to-video_ generation. By default, the _image-to-video_ group is bypassed. To switch between the two modes:
- For _text-to-video_: Keep the _image-to-video_ group bypassed (default setting)
- For _image-to-video_: Right-click the _image-to-video_ group and select `Set Group Nodes to Always`
1. Text to Video Generation with Nvidia Cosmos
Setup and Requirements
- Choose your preferred Nvidia Cosmos model size (7B recommended for starting)

- Set resolution (Default 1280x704; minimum 704x704)
- Frame settings:
  - Length: 121 frames (The model performs optimally with a length of 121; deviating too much from this can result in subpar video quality.)
  - Frame rate: 24.00 (default rate for optimal quality)

<img src="https://cdn.runcomfy.net/workflow_assets/1184/readme02.webp" alt="Nvidia Cosmos" width="350"/> <img src="https://cdn.runcomfy.net/workflow_assets/1184/readme03.webp" alt="Nvidia Cosmos" width="350"/>
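The frame settings above imply a clip length of roughly five seconds (121 frames at 24 fps). The sketch below works through that arithmetic and checks a set of values against the limits listed in this section. The helper names are hypothetical and are not part of ComfyUI or Cosmos; they only restate the numbers given above.

```python
from __future__ import annotations

# Hypothetical helpers (not part of ComfyUI): sanity-check the settings
# described above before queueing a run.
RECOMMENDED_LENGTH = 121  # frames; the length the workflow is tuned for
DEFAULT_FPS = 24.0        # default frame rate
MIN_SIDE = 704            # minimum width and height stated above


def clip_duration_seconds(length: int = RECOMMENDED_LENGTH,
                          fps: float = DEFAULT_FPS) -> float:
    """Playback duration of a clip with `length` frames at `fps`."""
    return length / fps


def check_settings(width: int, height: int, length: int) -> list[str]:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    warnings = []
    if width < MIN_SIDE or height < MIN_SIDE:
        warnings.append(
            f"{width}x{height} is below the {MIN_SIDE}x{MIN_SIDE} minimum")
    if length != RECOMMENDED_LENGTH:
        warnings.append(
            f"length {length} differs from the recommended {RECOMMENDED_LENGTH}")
    return warnings


print(round(clip_duration_seconds(), 2))  # 121 / 24, about 5.04 seconds
print(check_settings(1280, 704, 121))     # default settings: no warnings
print(check_settings(640, 480, 60))       # two warnings
```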
Sampling Parameters for Nvidia Cosmos
- Sampler: `res_multistep` (Nvidia's recommended sampler for Cosmos)
- Scheduler: `karras` (default for stability)
- Steps: `20` (higher = better quality but slower; lower = faster but less detailed)
- CFG: `6.5` (prompt guidance strength)
- Denoise: `1.00` (1.00 = complete transformation; lower values keep more original content)
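For readers who drive ComfyUI from a script, the sampling parameters above could be written as a node entry in ComfyUI's API-format workflow JSON. This is a sketch under two assumptions: that the workflow samples with the built-in `KSampler` node, and that the node id `"3"` (a placeholder) matches nothing in particular. The `seed` value and the `model`, `positive`, `negative`, and `latent_image` connections are left out because they depend on the rest of the graph.

```python
import json

# Sampler settings from the list above, keyed by the input names of
# ComfyUI's built-in KSampler node (assumed to be the sampler in use).
sampler_inputs = {
    "sampler_name": "res_multistep",
    "scheduler": "karras",
    "steps": 20,
    "cfg": 6.5,
    "denoise": 1.0,
}

# Hypothetical API-format fragment; "3" is a placeholder node id, and the
# seed plus the model/conditioning/latent links are intentionally omitted.
ksampler_node = {"3": {"class_type": "KSampler", "inputs": sampler_inputs}}

print(json.dumps(ksampler_node, indent=2))
```

Keeping the values in one dictionary makes it easy to reuse identical sampling settings for both text-to-video and image-to-video runs.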
Prompting Tips for Nvidia Cosmos
- Use detailed, multi-sentence prompts for better results
- Include comprehensive negative prompts
- Short prompts may generate coherent videos but might not strictly follow instructions
2. Image to Video Generation with Nvidia Cosmos
Setup and Requirements
- Same base requirements as Nvidia Cosmos text-to-video
- Supports `start_image` and `end_image` inputs
Reference Image Options
- Set a `start_image` or `end_image`, or both at the same time
- Images work best when similar in style and content (for smooth transitions)
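The three valid combinations of reference images can be summarized in a short sketch. The functions below are hypothetical helpers for illustration, not ComfyUI APIs, and the file names are placeholders; only the `start_image` and `end_image` input names come from the workflow.

```python
from typing import Dict, Optional


def image_inputs(start_image: Optional[str] = None,
                 end_image: Optional[str] = None) -> Dict[str, str]:
    """Collect whichever reference images were supplied (hypothetical helper)."""
    if start_image is None and end_image is None:
        raise ValueError(
            "image-to-video needs a start_image, an end_image, or both")
    inputs = {}
    if start_image is not None:
        inputs["start_image"] = start_image
    if end_image is not None:
        inputs["end_image"] = end_image
    return inputs


def describe_mode(inputs: Dict[str, str]) -> str:
    """Name the kind of generation implied by the supplied images."""
    if "start_image" in inputs and "end_image" in inputs:
        return "transition from start_image to end_image"
    if "start_image" in inputs:
        return "video beginning at start_image"
    return "video ending at end_image"


# Placeholder file names, for illustration only.
print(describe_mode(image_inputs(start_image="first.png")))
print(describe_mode(image_inputs(end_image="last.png")))
print(describe_mode(image_inputs("first.png", "last.png")))
```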

Key Parameters
- Identical sampling settings to text-to-video mode
- Maintains same video quality standards
Advanced Tips for Nvidia Cosmos
- For higher quality results with more VRAM, try the Nvidia Cosmos 14B model
- Ensure prompts are descriptive and detailed for best results
- Experiment with different image pairs for unique transitions
More Information about Nvidia Cosmos
For more details and updates about Nvidia Cosmos, visit the official Nvidia Cosmos page.

