ComfyUI Node: Phoneme To Mouth Shapes

Class Name

PhonemeToMouthShapes

Category
Trent/LipSync
Author
TrentHunter82 (Account age: 0 days)
Extension
TrentNodes
Latest Updated
2026-03-20
Github Stars
0.03K

How to Install TrentNodes

Install this extension through the ComfyUI Manager:
  1. Click the Manager button in the main menu.
  2. Click the Custom Nodes Manager button.
  3. Enter TrentNodes in the search bar.
After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

Phoneme To Mouth Shapes Description

Transforms phoneme timing data into mouth shape indices for realistic lip-sync animations.

Phoneme To Mouth Shapes:

The PhonemeToMouthShapes node converts phoneme timing data into a per-frame sequence of mouth shape indices for lip-sync animation. It is aimed at animators and AI artists who need mouth movements synchronized to an audio track. A mapping table translates each phoneme into a standard mouth shape, so the output is compatible with conventional animation mouth charts. The node is useful for character animation in general, including non-human characters.

Phoneme To Mouth Shapes Input Parameters:

phoneme_data

This parameter represents the phoneme timing data, which is typically obtained from an audio-to-phoneme conversion process. It is a list of dictionaries containing information about the timing and type of each phoneme detected in the audio. This data is crucial for determining the sequence of mouth shapes that will be used in the animation.
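The page does not list the exact dictionary keys, so the following is only a sketch of what such a list might look like. The key names `phoneme`, `start`, and `end` (times in seconds) and the helper `is_valid_phoneme_data` are assumptions made for illustration, not the node's confirmed schema.

```python
from typing import Any

# Hypothetical key names; the node's real schema may differ.
REQUIRED_KEYS = ("phoneme", "start", "end")


def is_valid_phoneme_data(data: Any) -> bool:
    """Return True if data is a list of dicts with the assumed keys and ordered timings."""
    if not isinstance(data, list):
        return False
    for entry in data:
        if not isinstance(entry, dict):
            return False
        if any(key not in entry for key in REQUIRED_KEYS):
            return False
        if entry["start"] > entry["end"]:
            return False
    return True


# Illustrative timing data for the word "hello" in ARPAbet symbols.
example = [
    {"phoneme": "HH", "start": 0.00, "end": 0.08},
    {"phoneme": "AH", "start": 0.08, "end": 0.20},
    {"phoneme": "L", "start": 0.20, "end": 0.28},
    {"phoneme": "OW", "start": 0.28, "end": 0.50},
]
```

Checking the shape of the data before wiring it into the node can make a format mismatch easier to spot.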

duration

The duration parameter specifies the total length of the audio in seconds. It is a floating-point value with a default of 1.0, a minimum of 0.1, and a maximum of 3600.0. This parameter is important for calculating the timing of mouth shape transitions, ensuring they align with the audio's duration.

fps

The fps parameter stands for frames per second and determines the frame rate of the resulting animation. It is a floating-point value with a default of 24.0, a minimum of 1.0, and a maximum of 120.0. This parameter affects the smoothness and timing of the mouth shape transitions, with higher values resulting in smoother animations.
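To make the relationship between duration, fps, and per-frame timing concrete, here is a minimal sketch. It assumes the frame count is the duration multiplied by fps, rounded up, and that each frame takes the phoneme active at its start time. The node's actual rounding and sampling rules are not documented here, and the function names, the `start`/`end`/`phoneme` keys, and the `SIL` silence label are hypothetical.

```python
import math


def frame_count_for(duration: float, fps: float) -> int:
    """Number of frames needed to cover the audio (assumed: round up)."""
    # Round the product first so float noise such as 3.0000000000000004
    # does not add a spurious extra frame.
    return max(1, math.ceil(round(duration * fps, 6)))


def phoneme_at_frame(phoneme_data, frame: int, fps: float, silence: str = "SIL") -> str:
    """Return the phoneme whose [start, end) interval contains the frame's start time."""
    t = frame / fps
    for entry in phoneme_data:
        if entry["start"] <= t < entry["end"]:
            return entry["phoneme"]
    return silence  # no phoneme is active at this time


data = [
    {"phoneme": "HH", "start": 0.00, "end": 0.08},
    {"phoneme": "AH", "start": 0.08, "end": 0.20},
]
```

At 24 fps each frame covers about 42 ms, so a phoneme shorter than that can fall between two sample times and be skipped entirely under this sampling scheme.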

mapping_type

This parameter selects the phoneme-to-viseme mapping used to convert phonemes into mouth shapes. The options are "arpabet", "ipa", and "simplified", with "arpabet" as the default. Choose the mapping that matches the phoneme notation of your phoneme_data; the choice also influences the accuracy and style of the generated mouth shapes.
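A phoneme-to-viseme mapping is essentially a lookup table. The sketch below shows the idea for a handful of ARPAbet symbols. The viseme names, their index values, and the grouping of phonemes are illustrative assumptions and do not reproduce the table that TrentNodes ships.

```python
# Hypothetical viseme indices; a real mouth chart may number shapes differently.
REST, OPEN, WIDE, ROUND, CLOSED, TEETH = range(6)

# Partial, illustrative ARPAbet table (not the node's actual mapping).
ARPABET_TO_VISEME = {
    "AA": OPEN, "AE": OPEN, "AH": OPEN,
    "IY": WIDE, "EH": WIDE, "IH": WIDE,
    "OW": ROUND, "UW": ROUND, "W": ROUND,
    "B": CLOSED, "M": CLOSED, "P": CLOSED,
    "F": TEETH, "V": TEETH,
}


def to_viseme(phoneme: str) -> int:
    """Map one ARPAbet symbol to a viseme index, falling back to REST."""
    # ARPAbet vowels can carry a stress digit (0, 1, or 2), e.g. "AH1".
    key = phoneme.upper().rstrip("012")
    return ARPABET_TO_VISEME.get(key, REST)
```

Several phonemes share one mouth shape (for example B, M, and P all close the lips), which is why a viseme table is much smaller than the phoneme set it covers.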

hold_frames

The hold_frames parameter specifies the minimum number of frames each mouth shape should be held for during the animation. It is an integer value with a default of 2, a minimum of 1, and a maximum of 10. This parameter helps control the pacing of mouth shape changes, preventing rapid flickering and ensuring smoother transitions.
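One way to picture a minimum hold is as a pass over the per-frame sequence that delays any shape change until the current shape has been shown for at least hold_frames frames. The following sketch implements that idea. It is an assumed approach with a hypothetical function name, not the node's documented algorithm.

```python
def enforce_hold(sequence, hold_frames):
    """Delay shape changes until the current shape has lasted hold_frames frames."""
    if not sequence:
        return []
    result = [sequence[0]]
    held = 1  # frames the current shape has been shown so far
    for shape in sequence[1:]:
        if shape == result[-1]:
            result.append(shape)
            held += 1
        elif held < hold_frames:
            # Too early to switch: repeat the current shape instead.
            result.append(result[-1])
            held += 1
        else:
            result.append(shape)
            held = 1
    return result
```

The output always has the same length as the input, so the frame count is unaffected. Short shapes are absorbed by their predecessor rather than stretched.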

smoothing

The smoothing parameter is a boolean that determines whether smoothing should be applied to the mouth shape sequence. It has a default value of True. Smoothing helps reduce flickering and abrupt changes between mouth shapes, resulting in more natural and visually appealing animations.
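The page does not say which smoothing method the node uses. As an illustration of the general idea, the sketch below removes isolated single-frame shapes whose neighbors on both sides agree. The function name and the specific rule are assumptions.

```python
def smooth(sequence):
    """Replace a one-frame shape with its neighbor's shape when both neighbors match."""
    result = list(sequence)
    for i in range(1, len(result) - 1):
        previous_shape = result[i - 1]  # already-smoothed left neighbor
        next_shape = sequence[i + 1]    # original right neighbor
        if previous_shape == next_shape and result[i] != previous_shape:
            result[i] = previous_shape
    return result
```

A rule like this removes flicker without moving the frame at which a genuine shape change begins, because runs of two or more frames are left alone.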

Phoneme To Mouth Shapes Output Parameters:

mouth_sequence

The mouth_sequence output is a list of integers representing the sequence of mouth shape indices for each frame of the animation. These indices correspond to specific mouth shapes defined in standard animation mouth charts, allowing for precise synchronization with the audio.

frame_count

The frame_count output is an integer representing the total number of frames in the generated mouth shape sequence. This value is important for understanding the length of the animation and ensuring it matches the duration of the audio.

Phoneme To Mouth Shapes Usage Tips:

  • To achieve the most natural-looking lip-sync animations, experiment with different mapping_type options to find the one that best suits your audio and animation style.
  • Adjust the hold_frames parameter to control the pacing of mouth shape changes. Increasing this value can help reduce flickering and create smoother transitions between shapes.
  • Enable smoothing to enhance the visual quality of your animations by minimizing abrupt changes between mouth shapes.

Phoneme To Mouth Shapes Common Errors and Solutions:

"Invalid phoneme data format"

  • Explanation: This error occurs when the phoneme_data input is not in the expected list of dictionaries format.
  • Solution: Ensure that the phoneme_data is correctly formatted as a list of dictionaries, each containing timing and phoneme type information.

"Duration must be between 0.1 and 3600.0 seconds"

  • Explanation: The duration parameter is set outside the allowed range.
  • Solution: Adjust the duration value to be within the specified range of 0.1 to 3600.0 seconds.

"FPS value out of range"

  • Explanation: The fps parameter is set below 1.0 or above 120.0.
  • Solution: Set the fps value within the valid range to ensure proper frame rate for the animation.

"Unsupported mapping type"

  • Explanation: The mapping_type provided is not one of the supported options.
  • Solution: Choose a valid mapping_type from the available options: "arpabet", "ipa", or "simplified".
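The checks behind these messages can be reproduced before the node runs, for example in a script that prepares inputs. The sketch below uses the ranges and message strings listed on this page. The function name and the use of ValueError are assumptions, since the node's own validation code is not shown.

```python
VALID_MAPPINGS = ("arpabet", "ipa", "simplified")


def validate_inputs(phoneme_data, duration, fps, mapping_type):
    """Raise ValueError with the documented message if an input is out of range."""
    is_list_of_dicts = isinstance(phoneme_data, list) and all(
        isinstance(entry, dict) for entry in phoneme_data
    )
    if not is_list_of_dicts:
        raise ValueError("Invalid phoneme data format")
    if not 0.1 <= duration <= 3600.0:
        raise ValueError("Duration must be between 0.1 and 3600.0 seconds")
    if not 1.0 <= fps <= 120.0:
        raise ValueError("FPS value out of range")
    if mapping_type not in VALID_MAPPINGS:
        raise ValueError("Unsupported mapping type")
```

Calling this helper with the values you plan to pass to the node surfaces the first failing check, in the same order as the errors above.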

Copyright 2025 RunComfy. All Rights Reserved.