ComfyUI Node: ChatMusician

Category: VLM Nodes/Audio
Author: gokayfem (Account age: 1058 days)

How to Install VLM_nodes

Install this extension via the ComfyUI Manager by searching for VLM_nodes:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter VLM_nodes in the search bar.
After installation, click the Restart button to restart ComfyUI. Then manually refresh your browser to clear the cache and load the updated list of nodes.

ChatMusician Description

Generates musical compositions from text prompts using a language model, converting the output to ABC notation and synthesizing audio files.


ChatMusician is a node that generates musical compositions from text prompts. A language model interprets the prompt and writes a musical score in ABC notation, and the node then synthesizes that score into audio. This makes it useful for AI artists who want to create music from textual descriptions without writing notation by hand.

ChatMusician Input Parameters:


The prompt parameter is a string that serves as the initial textual input for the language model. This text is used to guide the model in generating a musical composition. The content of the prompt significantly influences the style and structure of the resulting music. There are no strict constraints on the length or content of the prompt, but more detailed prompts can lead to more specific and tailored musical outputs.


The model parameter specifies the language model to be used for generating the musical composition. This model interprets the prompt and generates the corresponding ABC notation for the music. The choice of model can affect the quality and style of the generated music, as different models may have varying capabilities and training data.


The max_tokens parameter defines the maximum number of tokens the language model can generate in response to the prompt. This parameter controls the length of the generated musical composition. Higher values allow for longer compositions, while lower values restrict the output length. The default value is typically set by the model's configuration.


The temperature parameter controls the randomness of the language model's output. A higher temperature value results in more random and creative outputs, while a lower value produces more deterministic and focused results. The default value is usually around 1.0, with a typical range between 0.7 and 1.5.
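The effect of temperature can be illustrated with a minimal sketch. It assumes the common formulation in which the model's raw scores (logits) are divided by the temperature before a softmax. The function name and the example scores are hypothetical and are not taken from the node's source.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing them by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(logits, temperature=0.5)  # favors the top token
flat = softmax_with_temperature(logits, temperature=1.5)   # spreads probability out
```

With the lower temperature, the most likely token takes a larger share of the probability, so sampling becomes more predictable. With the higher temperature, less likely tokens are chosen more often.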


The top_p parameter controls nucleus sampling: at each step, sampling is restricted to the smallest set of most probable tokens whose cumulative probability reaches p. This helps control the diversity of the generated text. A value of 1.0 includes all possible tokens, while lower values restrict the output to more probable tokens. The default value is often set to 0.9.
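As a rough sketch of nucleus sampling, the filter below keeps the most probable tokens until their cumulative probability reaches top_p and then renormalizes. The function name and the example distribution are assumptions for illustration only.

```python
def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of top tokens whose cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    # Renormalize so the surviving probabilities sum to 1.
    return {i: probs[i] / total for i in kept}

# Hypothetical next-token distribution over four tokens.
probs = [0.5, 0.3, 0.15, 0.05]
nucleus = top_p_filter(probs, top_p=0.9)
```

Here the least likely token falls outside the nucleus and can no longer be sampled.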


The top_k parameter restricts sampling at each step to the k most probable tokens. Like top_p, it controls the diversity of the generated text. A value of 0 disables this filter, while higher values allow for more diverse outputs. The default value is typically set to 50.
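A minimal sketch of top-k filtering follows, assuming that a value of 0 disables the filter as described above. The function name and the example distribution are hypothetical.

```python
def top_k_filter(probs, top_k=50):
    """Keep only the top_k most probable tokens; top_k <= 0 keeps everything."""
    if top_k <= 0 or top_k >= len(probs):
        kept = list(range(len(probs)))
    else:
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        kept = ranked[:top_k]
    total = sum(probs[i] for i in kept)
    # Renormalize so the surviving probabilities sum to 1.
    return {i: probs[i] / total for i in kept}

# Hypothetical next-token distribution over four tokens.
probs = [0.5, 0.3, 0.15, 0.05]
top_two = top_k_filter(probs, top_k=2)
unfiltered = top_k_filter(probs, top_k=0)
```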


The frequency_penalty parameter lowers the likelihood of a token in proportion to how many times it has already been generated. Higher values discourage repetition, leading to more varied outputs. The default value is usually set to 0, with a typical range between 0 and 1.


The presence_penalty parameter applies a flat penalty to any token that has already appeared in the generated text, regardless of how often it appeared. Higher values encourage the model to introduce new content, while lower values result in more conservative outputs. The default value is often set to 0, with a typical range between 0 and 1.
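The difference between the two penalties can be sketched as follows. This assumes one common formulation, in which the frequency penalty is multiplied by a token's count and the presence penalty is subtracted once for any token that has appeared. The backend used by the node may compute this differently, and the names below are hypothetical.

```python
from collections import Counter

def apply_penalties(logits, generated, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the scores of tokens that already appear in the generated text."""
    counts = Counter(generated)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token in adjusted:
            # Frequency term grows with each repeat; presence term is applied once.
            adjusted[token] -= count * frequency_penalty + presence_penalty
    return adjusted

# Hypothetical scores for three note tokens, after "C C D" has been generated.
logits = {"C": 1.0, "D": 1.0, "E": 1.0}
adjusted = apply_penalties(logits, ["C", "C", "D"],
                           frequency_penalty=0.5, presence_penalty=0.2)
```

In this example the twice-used token is penalized most, the once-used token less, and the unused token not at all.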


The repeat_penalty parameter penalizes tokens that have already been generated, which helps reduce redundancy in the output. A value of 1.0 applies no penalty, and higher values penalize repetition more strongly. The default value is typically set to 1.0, with a typical range between 1.0 and 2.0.
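One widely used form of repetition penalty divides positive scores by the penalty and multiplies negative scores by it, so a value of 1.0 leaves scores unchanged. The sketch below assumes that form. Whether the node's backend matches it exactly is not confirmed by this page, and the names are hypothetical.

```python
def apply_repeat_penalty(logits, generated, repeat_penalty=1.1):
    """Make already-generated tokens less likely by a multiplicative penalty."""
    adjusted = dict(logits)
    for token in set(generated):
        if token in adjusted:
            score = adjusted[token]
            # Either branch moves the score toward "less likely".
            if score > 0:
                adjusted[token] = score / repeat_penalty
            else:
                adjusted[token] = score * repeat_penalty
    return adjusted

# Hypothetical scores, after the tokens "C" and "D" have been generated.
logits = {"C": 2.0, "D": -1.0, "E": 0.5}
penalized = apply_repeat_penalty(logits, ["C", "D"], repeat_penalty=2.0)
unchanged = apply_repeat_penalty(logits, ["C", "D"], repeat_penalty=1.0)
```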


The seed parameter sets the random seed for the language model's generation process. This ensures reproducibility of the generated outputs. If the same seed and parameters are used, the model will produce the same output. The default value is usually set to a random number.
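The idea behind seeding can be shown with Python's standard random module standing in for the model's sampler. The vocabulary and weights are hypothetical. The point is only that a fixed seed makes the sequence of random choices repeatable.

```python
import random

def sample_tokens(vocab, weights, n, seed):
    """Draw n tokens from a weighted vocabulary using a seeded generator."""
    rng = random.Random(seed)  # private generator, unaffected by global state
    return rng.choices(vocab, weights=weights, k=n)

# Hypothetical note tokens and sampling weights.
vocab = ["C", "D", "E", "G"]
weights = [0.4, 0.3, 0.2, 0.1]

first = sample_tokens(vocab, weights, n=8, seed=42)
second = sample_tokens(vocab, weights, n=8, seed=42)
```

Both calls return the same sequence because they use the same seed and the same inputs.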


The sample_rate parameter defines the sample rate of the synthesized audio output. This parameter affects the quality and size of the audio file. Common values include 16000, 22050, and 44100 Hz, with 44100 Hz being the standard for high-quality audio.
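To make the size trade-off concrete, the sketch below estimates the size of uncompressed audio data. It assumes mono 16-bit PCM, which is an assumption for illustration and not a documented property of the node's output.

```python
def pcm_size_bytes(duration_s, sample_rate, channels=1, bytes_per_sample=2):
    """Estimate the size of uncompressed PCM audio data (without file headers)."""
    n_samples = int(round(duration_s * sample_rate))
    return n_samples * channels * bytes_per_sample

# Ten seconds of mono 16-bit audio at two common sample rates.
size_high = pcm_size_bytes(10, 44100)
size_low = pcm_size_bytes(10, 16000)
```

For the same duration, the 44100 Hz audio needs roughly 2.8 times as much data as the 16000 Hz audio.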

ChatMusician Output Parameters:


The abc_notation output is a string containing the musical composition in ABC notation. This notation is a text-based format for representing music scores, which can be easily interpreted and modified. It serves as an intermediate representation of the music before synthesis.
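For orientation, a short ABC tune and a minimal way to separate its header fields from its body are sketched below. The tune is a made-up example, not output from the model, and split_abc is a hypothetical helper. In ABC notation, X: is the reference number, T: the title, M: the meter, L: the default note length, and K: the key, which also ends the header.

```python
ABC_TUNE = """X:1
T:Example Tune
M:4/4
L:1/8
K:D
D2 F2 A2 d2 | A2 F2 D4 |]
"""

def split_abc(abc_text):
    """Split an ABC tune into a dict of header fields and the tune body."""
    headers, body = {}, []
    in_body = False
    for line in abc_text.strip().splitlines():
        is_field = len(line) > 1 and line[0].isalpha() and line[1] == ":"
        if not in_body and is_field:
            headers[line[0]] = line[2:].strip()
            if line[0] == "K":
                in_body = True  # the K: field is the last header line
        else:
            in_body = True
            body.append(line)
    return headers, "\n".join(body)

headers, body = split_abc(ABC_TUNE)
```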


The audio output is a list of audio samples representing the synthesized music. This audio data can be played back or further processed as needed. The quality and characteristics of the audio depend on the sample rate and the synthesizer used.
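If the audio is available as a plain list of float samples between -1.0 and 1.0 (an assumption, since the exact container the node returns is not specified here), it can be written to a WAV file with Python's standard library. The sine tone below is only a stand-in for real node output, and the helper names are hypothetical.

```python
import io
import math
import struct
import wave

def sine_samples(freq_hz, duration_s, sample_rate):
    """Generate a sine tone as float samples, standing in for synthesized music."""
    n = int(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def to_wav_bytes(samples, sample_rate):
    """Encode float samples as mono 16-bit PCM WAV data."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        frames = b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
        )
        wav.writeframes(frames)
    return buf.getvalue()

samples = sine_samples(440.0, 0.5, 16000)
wav_bytes = to_wav_bytes(samples, 16000)
```

Writing wav_bytes to a file with a .wav extension gives a file that standard audio players can open.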


The sample_rate output is an integer representing the sample rate of the synthesized audio. This value matches the sample_rate input parameter and indicates the number of samples per second in the audio file.

ChatMusician Usage Tips:

  • Experiment with different prompt texts to explore various musical styles and compositions.
  • Adjust the temperature parameter to balance creativity and coherence in the generated music.
  • Use the seed parameter to reproduce specific outputs for consistency in your projects.
  • Combine ChatMusician with other nodes to create complex audio workflows and enhance your creative process.

ChatMusician Common Errors and Solutions:

"Model not found"

  • Explanation: The specified language model could not be located or loaded.
  • Solution: Ensure that the model name is correct and that the model is properly installed and accessible.

"Invalid ABC notation"

  • Explanation: The generated ABC notation is not valid or cannot be parsed.
  • Solution: Check the prompt and parameters for any issues that might lead to invalid output. Adjust the prompt or parameters to generate a valid ABC notation.
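Before adjusting the prompt, it can help to check whether the generated text has the basic shape of an ABC tune. The sketch below is a hypothetical sanity check, not the node's own validator. It only verifies that the tune starts with an X: field, contains a K: field, and has some body after it.

```python
def abc_problems(abc_text):
    """Return a list of basic structural problems found in an ABC tune."""
    lines = [ln.strip() for ln in abc_text.strip().splitlines() if ln.strip()]
    problems = []
    if not lines or not lines[0].startswith("X:"):
        problems.append("missing X: reference number on the first line")
    key_lines = [i for i, ln in enumerate(lines) if ln.startswith("K:")]
    if not key_lines:
        problems.append("missing K: key field")
    elif key_lines[0] == len(lines) - 1:
        problems.append("no tune body after the K: field")
    return problems

good = "X:1\nT:Example\nM:4/4\nL:1/8\nK:C\nCDEF GABc |]"
bad = "T:No header\nCDEF GABc |]"

good_report = abc_problems(good)
bad_report = abc_problems(bad)
```

An empty list means the text passed these basic checks. It does not guarantee that a synthesizer will accept every note in the body.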

"Audio synthesis failed"

  • Explanation: The synthesizer encountered an error while rendering the audio from the ABC notation.
  • Solution: Verify that the ABC notation is correct and that the synthesizer is functioning properly. Adjust the parameters or try a different prompt to resolve the issue.

© Copyright 2024 RunComfy. All Rights Reserved.
