Generate spatial tokens from image coordinates and labels for AI image editing and annotation tasks.
The QwenSpatialTokenGenerator node generates spatial tokens from image coordinates and labels for spatial editing and annotation tasks. It is aimed at AI artists and developers who need to manipulate or analyze images based on spatial data. It converts image coordinates into several token formats, so its output can feed either structured data processing or natural language workflows. The generator does not rely on templates or assumptions about the input.
This parameter represents the input image that you wish to use for spatial editing. It is crucial as it serves as the base from which spatial tokens are generated. The image should be in a compatible format, and its quality can impact the accuracy and detail of the spatial tokens produced.
The prompt is a string input that can be multiline and is auto-populated from the spatial editor. It serves as a guide or instruction set for generating spatial tokens, allowing for customized and context-specific token generation. The default value is an empty string, and it can be adjusted to fit the specific needs of your project.
The output_format parameter determines the format in which the spatial tokens are output. Options are structured_json, xml_tags, natural_language, and traditional_tokens. The default is structured_json, which is recommended because it is structured and easy to parse. The other formats serve different needs: xml_tags produces HTML-like elements, natural_language produces human-readable sentences, and traditional_tokens is provided for legacy compatibility.
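To make the difference between the four formats listed above concrete, here is a minimal sketch that renders one labeled bounding box in each style. The function name `render_region`, the field names, and every output layout are assumptions made for illustration; the node's actual token layout is not documented here and may differ. The traditional_tokens branch uses a Qwen-VL-style box token layout, assumed only as an example.

```python
import json
from xml.sax.saxutils import quoteattr

def render_region(label, box, output_format="structured_json"):
    """Render one labeled box (x1, y1, x2, y2) in a hypothetical layout.

    Every layout below is an illustrative assumption, not the node's
    documented output.
    """
    x1, y1, x2, y2 = box
    if output_format == "structured_json":
        # Machine-readable: easy to parse downstream.
        return json.dumps({"label": label, "bbox": [x1, y1, x2, y2]})
    if output_format == "xml_tags":
        # HTML-like element; quoteattr escapes and quotes the label.
        return f'<region label={quoteattr(label)} bbox="{x1},{y1},{x2},{y2}"/>'
    if output_format == "natural_language":
        # Human-readable sentence.
        return f"The {label} spans from ({x1}, {y1}) to ({x2}, {y2})."
    if output_format == "traditional_tokens":
        # Special-token layout (assumed for illustration).
        return (f"<|object_ref_start|>{label}<|object_ref_end|>"
                f"<|box_start|>({x1},{y1}),({x2},{y2})<|box_end|>")
    raise ValueError(f"unknown output_format: {output_format}")

if __name__ == "__main__":
    for fmt in ("structured_json", "xml_tags",
                "natural_language", "traditional_tokens"):
        print(fmt, "->", render_region("cat", (10, 20, 110, 220), fmt))
```

Running the script prints the same region four ways, which is a quick way to judge which style fits a given downstream consumer.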
debug_mode is a boolean parameter that, when enabled, provides additional debugging information during token generation. The default value is False. Enabling it can help with troubleshooting and with understanding how the node processes its input, especially when results are unexpected.
This output is the image with visual annotations based on the generated spatial tokens. It provides a visual reference for the spatial data, allowing you to see how the tokens correspond to specific areas or features within the image.
The output prompt is a string that reflects the final set of instructions or descriptions generated during the tokenization process. It can be used to understand the context and details of the spatial tokens produced.
This output provides a string containing detailed debugging information, which is especially useful if the debug mode is enabled. It includes logs and messages that can help diagnose issues or understand the processing steps taken by the node.
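The inputs and outputs described above can be summarized in the way ComfyUI custom nodes usually declare them, through an `INPUT_TYPES` classmethod and a `RETURN_TYPES` tuple. This is a sketch only: the class name `SpatialTokenNodeSketch`, the input name `image`, the placement of every input under `required`, and the pass-through body are assumptions, not the node's actual source.

```python
class SpatialTokenNodeSketch:
    """Hypothetical declaration mirroring the documented parameters."""

    OUTPUT_FORMATS = ["structured_json", "xml_tags",
                      "natural_language", "traditional_tokens"]

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                # Input image used as the base for spatial editing.
                "image": ("IMAGE",),
                # Multiline string, empty by default.
                "prompt": ("STRING", {"multiline": True, "default": ""}),
                # A list of strings declares a dropdown in ComfyUI.
                "output_format": (cls.OUTPUT_FORMATS,
                                  {"default": "structured_json"}),
                # Off by default.
                "debug_mode": ("BOOLEAN", {"default": False}),
            }
        }

    # Annotated image, final prompt, debug information.
    RETURN_TYPES = ("IMAGE", "STRING", "STRING")
    FUNCTION = "generate"
    CATEGORY = "example"  # assumed category name

    def generate(self, image, prompt, output_format, debug_mode):
        # Placeholder: a real node would annotate the image and build tokens.
        debug_info = f"format={output_format}" if debug_mode else ""
        return (image, prompt, debug_info)
```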
Try the different output_format options to find the one that best suits your project's needs. For structured data processing, structured_json is recommended, while natural_language may be more suitable for applications requiring human-readable output.
Enable debug_mode if you encounter unexpected results or need to understand the internal processing of the node. This can provide valuable insights and help troubleshoot potential issues.
If an error occurs, enabling debug_mode can provide additional information to help identify its root cause.
An error can occur if the prompt is not correctly formatted or if there is an issue with the spatial token data.
Ensure that the prompt and any spatial token data are correctly formatted as JSON if using structured_json output. If using other formats, ensure that the data adheres to the expected structure for those formats.
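The advice to make sure the prompt is correctly formatted as JSON can be checked before the workflow runs. The helper below is a minimal sketch using Python's standard `json` module; the name `check_prompt_json` is hypothetical, and the check only confirms that the string parses as JSON, not that it matches whatever schema the node expects.

```python
import json

def check_prompt_json(prompt):
    """Return (ok, message) describing whether prompt parses as JSON."""
    if not prompt.strip():
        return False, "prompt is empty"
    try:
        json.loads(prompt)
    except json.JSONDecodeError as err:
        # lineno, colno, and msg point at the first syntax problem.
        return False, (f"invalid JSON at line {err.lineno}, "
                       f"column {err.colno}: {err.msg}")
    return True, "ok"

if __name__ == "__main__":
    print(check_prompt_json('{"label": "cat", "bbox": [10, 20, 110, 220]}'))
    print(check_prompt_json('{"label": "cat", "bbox": [10, 20'))
```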