PixVerse 5.5 text to video API | Pricing & Docs

pixverse/pixverse/v5.5/text-to-video

Transform text prompts into cinematic short videos with synchronized audio, multi-shot storytelling, and consistent characters using PixVerse 5.5's fast, diffusion-transformer video generation engine.

1. Get started

Use RunComfy's API to run pixverse/pixverse/v5.5/text-to-video. For accepted inputs and outputs, see the Schema section (5).

curl --request POST \
  --url https://model-api.runcomfy.net/v1/models/pixverse/pixverse/v5.5/text-to-video \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <token>" \
  --data '{
    "prompt": "A girl in a yellow raincoat is walking in the rain while holding an umbrella. She calmly steps over wet pavement scattered with autumn leaves. The background features a softly blurred street scene with muted colors. The camera remains fixed, capturing the serene moment steadily."
  }'

2. Authentication

Set the YOUR_API_TOKEN environment variable to your API key (manage keys in your Profile), then send it on every request as a Bearer token in the Authorization header: Authorization: Bearer $YOUR_API_TOKEN. The curl examples in this document show the token as the placeholder <token>.
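For callers working from Python rather than curl, the header construction described above might look like the sketch below. The helper name auth_headers is hypothetical (it is not part of any RunComfy SDK); only the environment variable name and the Bearer scheme come from this page.

```python
import os


def auth_headers(env_var: str = "YOUR_API_TOKEN") -> dict[str, str]:
    """Build request headers that carry the API key as a Bearer token.

    auth_headers is an illustrative helper, not a RunComfy function.
    """
    token = os.environ.get(env_var)
    if not token:
        # Fail early instead of sending an unauthenticated request.
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source files and shell history.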

3. API reference

Submit a request

Submit an asynchronous generation job and immediately receive a request_id plus URLs to check status, fetch results, and cancel.

curl --request POST \
  --url https://model-api.runcomfy.net/v1/models/pixverse/pixverse/v5.5/text-to-video \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <token>" \
  --data '{
    "prompt": "A girl in a yellow raincoat is walking in the rain while holding an umbrella. She calmly steps over wet pavement scattered with autumn leaves. The background features a softly blurred street scene with muted colors. The camera remains fixed, capturing the serene moment steadily."
  }'
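The same submission can be sketched in Python with only the standard library. The function names build_submit_request and submit are hypothetical helpers introduced for illustration; the URL, headers, and the prompt field mirror the curl call above, and the response is assumed to be a JSON object.

```python
import json
import urllib.request

BASE_URL = "https://model-api.runcomfy.net/v1"
MODEL_ID = "pixverse/pixverse/v5.5/text-to-video"


def build_submit_request(token: str, prompt: str, **options) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that submits a generation job.

    Extra keyword arguments are passed through as input fields,
    for example duration=8 or resolution="720p".
    """
    payload = {"prompt": prompt, **options}
    return urllib.request.Request(
        url=f"{BASE_URL}/models/{MODEL_ID}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def submit(token: str, prompt: str, **options) -> dict:
    """Send the request and return the decoded JSON body (assumed to be an object)."""
    with urllib.request.urlopen(build_submit_request(token, prompt, **options)) as resp:
        return json.load(resp)
```

Separating request construction from sending makes it possible to inspect the URL and payload without making a network call.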

Monitor request status

Fetch the current state for a request_id ("in_queue", "in_progress", "completed", or "cancelled").

curl --request GET \
  --url https://model-api.runcomfy.net/v1/requests/{request_id}/status \
  --header "Authorization: Bearer <token>"
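Because jobs run asynchronously, a client typically polls this endpoint until the job reaches a final state. The sketch below is one way to do that in Python. It assumes the status response is a JSON object whose state is stored under a "status" key, which this page does not confirm; the helper names, the polling interval, and the timeout are likewise illustrative choices.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://model-api.runcomfy.net/v1"
# States after which the job will not change again, per the list above.
TERMINAL_STATES = {"completed", "cancelled"}


def fetch_status(token: str, request_id: str) -> str:
    """Return the current state string for a request.

    Assumption: the JSON body exposes the state under the key "status".
    """
    req = urllib.request.Request(
        url=f"{BASE_URL}/requests/{request_id}/status",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["status"]


def wait_until_done(
    get_status: Callable[[], str],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status repeatedly until a terminal state or the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_status()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"request still '{state}' after {timeout} seconds")
        sleep(interval)
```

A call such as wait_until_done(lambda: fetch_status(token, request_id)) ties the two functions together; passing the status getter as a callable keeps the loop independent of the HTTP details.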

Retrieve request results

Retrieve the final outputs and metadata for the given request_id; if the job is not complete, the response returns the current state so you can continue polling.

curl --request GET \
  --url https://model-api.runcomfy.net/v1/requests/{request_id}/result \
  --header "Authorization: Bearer <token>"

Cancel a request

Cancel a queued job by request_id; in-progress jobs cannot be cancelled.

curl --request POST \
  --url https://model-api.runcomfy.net/v1/requests/{request_id}/cancel \
  --header "Authorization: Bearer <token>"
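Since only queued jobs can be cancelled, a client may want to check the last known state before calling this endpoint. The following Python sketch shows that guard together with the request itself; is_cancellable and build_cancel_request are hypothetical helper names, not part of the API.

```python
import urllib.request

BASE_URL = "https://model-api.runcomfy.net/v1"


def is_cancellable(state: str) -> bool:
    """Only jobs still waiting in the queue can be cancelled."""
    return state == "in_queue"


def build_cancel_request(token: str, request_id: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that cancels a queued job."""
    return urllib.request.Request(
        url=f"{BASE_URL}/requests/{request_id}/cancel",
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
```

A job can leave the queue between the status check and the cancel call, so the caller should still be prepared for the cancel request to fail.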

4. File inputs

Hosted file (URL)

Provide a publicly reachable HTTPS URL. Ensure the host allows server-side fetches (no login/cookies required) and isn't rate-limited or blocking bots. Recommended limits: images ≤ 50 MB (~4K), videos ≤ 100 MB (~2–5 min @ 720p). Prefer stable or pre-signed URLs for private assets.
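A client-side pre-check can catch the most obvious problems before a URL is submitted. The Python sketch below covers only what can be verified locally: the HTTPS scheme and the recommended size limits. The function names are hypothetical, and treating 1 MB as 1,000,000 bytes is an assumption, since this page does not define the unit. Whether the host actually permits server-side fetches cannot be checked this way.

```python
from urllib.parse import urlparse

# Recommended limits from this page; 1 MB = 1_000_000 bytes is an assumption.
RECOMMENDED_MAX_BYTES = {
    "image": 50 * 1_000_000,
    "video": 100 * 1_000_000,
}


def is_https_url(url: str) -> bool:
    """True when the URL uses HTTPS and names a host."""
    parts = urlparse(url)
    return parts.scheme == "https" and bool(parts.netloc)


def within_recommended_size(kind: str, size_bytes: int) -> bool:
    """True when a file of the given kind ("image" or "video") fits the recommended limit."""
    return size_bytes <= RECOMMENDED_MAX_BYTES[kind]
```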

5. Schema

Input schema

{
  "type": "object",
  "title": "Input schema",
  "required": [
    "prompt"
  ],
  "properties": {
    "prompt": {
      "title": "Prompt",
      "description": "",
      "type": "string",
      "default": "A girl in a yellow raincoat is walking in the rain while holding an umbrella. She calmly steps over wet pavement scattered with autumn leaves. The background features a softly blurred street scene with muted colors. The camera remains fixed, capturing the serene moment steadily."
    },
    "aspect_ratio": {
      "title": "Aspect Ratio (W:H)",
      "description": "The aspect ratio of the generated video.",
      "type": "string",
      "enum": [
        "16:9",
        "4:3",
        "1:1",
        "3:4",
        "9:16"
      ],
      "default": "16:9"
    },
    "resolution": {
      "title": "Resolution",
      "description": "The resolution of the generated video.",
      "type": "string",
      "enum": [
        "360p",
        "540p",
        "720p",
        "1080p"
      ],
      "default": "720p"
    },
    "duration": {
      "title": "Duration",
      "description": "The duration of the generated video in seconds. 1080p videos are limited to 5 or 8 seconds.",
      "type": "integer",
      "enum": [
        5,
        8,
        10
      ],
      "default": 5
    },
    "negative_prompt": {
      "title": "Negative Prompt",
      "description": "Negative prompt to be used for the generation.",
      "type": "string",
      "default": ""
    },
    "style": {
      "title": "Style",
      "description": "The style of the generated video.",
      "type": "string",
      "enum": [
        "anime",
        "3d_animation",
        "clay",
        "comic",
        "cyberpunk"
      ],
      "default": "anime"
    },
    "seed": {
      "title": "Seed",
      "description": "",
      "type": "integer",
      "default": 0
    },
    "generate_audio_switch": {
      "title": "Generate Audio",
      "description": "Enable audio generation (BGM, SFX, dialogue).",
      "type": "boolean",
      "default": false
    },
    "generate_multi_clip_switch": {
      "title": "Generate Multi-clip",
      "description": "Enable multi-clip generation with dynamic camera changes.",
      "type": "boolean",
      "default": false
    },
    "thinking_type": {
      "title": "Prompt Optimization Mode",
      "description": "Prompt optimization mode: 'enabled' to optimize, 'disabled' to turn off, 'auto' for model decision.",
      "type": "string",
      "enum": [
        "enabled",
        "disabled",
        "auto"
      ],
      "default": "auto"
    }
  }
}
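Checking a payload against this schema before submitting can save a round trip. The Python sketch below encodes the enums, types, and defaults listed above, plus the 1080p duration limit stated in the duration description. The name validate_input and the wording of the error messages are illustrative; the server's own validation may differ.

```python
# Allowed values copied from the input schema above.
ENUMS = {
    "aspect_ratio": ("16:9", "4:3", "1:1", "3:4", "9:16"),
    "resolution": ("360p", "540p", "720p", "1080p"),
    "duration": (5, 8, 10),
    "style": ("anime", "3d_animation", "clay", "comic", "cyberpunk"),
    "thinking_type": ("enabled", "disabled", "auto"),
}
BOOLEAN_FIELDS = ("generate_audio_switch", "generate_multi_clip_switch")


def validate_input(payload: dict) -> list[str]:
    """Return a list of problems found in the payload (empty when none)."""
    errors = []
    if not isinstance(payload.get("prompt"), str):
        errors.append("prompt is required and must be a string")
    if "negative_prompt" in payload and not isinstance(payload["negative_prompt"], str):
        errors.append("negative_prompt must be a string")
    for field, allowed in ENUMS.items():
        if field in payload and payload[field] not in allowed:
            errors.append(f"{field} must be one of {list(allowed)}")
    for field in BOOLEAN_FIELDS:
        if field in payload and not isinstance(payload[field], bool):
            errors.append(f"{field} must be a boolean")
    seed = payload.get("seed", 0)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(seed, bool) or not isinstance(seed, int):
        errors.append("seed must be an integer")
    # Schema defaults apply when a field is omitted: 720p and 5 seconds.
    if payload.get("resolution", "720p") == "1080p" and payload.get("duration", 5) == 10:
        errors.append("1080p videos are limited to 5 or 8 seconds")
    return errors
```

For example, a payload with "resolution": "1080p" and "duration": 10 passes both enum checks individually but is rejected by the combined rule.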

Output schema

{
  "output": {
    "type": "object",
    "properties": {
      "image": {
        "type": "string",
        "format": "uri",
        "description": "single image URL"
      },
      "video": {
        "type": "string",
        "format": "uri",
        "description": "single video URL"
      },
      "images": {
        "type": "array",
        "description": "multiple image URLs",
        "items": {
          "type": "string",
          "format": "uri"
        }
      },
      "videos": {
        "type": "array",
        "description": "multiple video URLs",
        "items": {
          "type": "string",
          "format": "uri"
        }
      }
    }
  }
}
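The output object can carry a single URL, a list of URLs, or both, so client code usually normalizes it before downloading. The Python sketch below does that. It assumes the caller has already extracted the "output" object from the result response, and collect_media_urls is a hypothetical helper name.

```python
def collect_media_urls(output: dict) -> dict[str, list[str]]:
    """Merge the singular and plural fields of an output object into two lists.

    Duplicates are dropped while the original order is preserved.
    """
    def merged(single_key: str, plural_key: str) -> list[str]:
        urls = []
        if isinstance(output.get(single_key), str):
            urls.append(output[single_key])
        urls.extend(output.get(plural_key) or [])
        # dict.fromkeys keeps first occurrences in insertion order.
        return list(dict.fromkeys(urls))

    return {
        "videos": merged("video", "videos"),
        "images": merged("image", "images"),
    }
```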
