Efficiently manage image directories by removing duplicates using perceptual hashing for AI artists.
The DedupImageFiles | Dedup Image Files 🐑 node cleans up image directories by identifying and removing duplicate image files. It compares images by perceptual hash rather than by exact file contents, so it can detect duplicates that differ slightly from one another. The max_distance_threshold parameter controls how sensitive the duplicate detection is. The main benefits are saved storage space and an organized image library free of redundant files. The node is particularly useful for AI artists who work with large image collections and need datasets without unnecessary duplicates.
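The page does not say which perceptual hash the node uses. As a minimal sketch of the general idea, the example below implements an average hash, one common perceptual hash, in pure Python. It assumes the image has already been reduced to a small grayscale grid; the function name `average_hash` and the sample grids are illustrative, not taken from the node.

```python
def average_hash(pixels):
    """Return an integer hash with one bit per pixel.

    A bit is 1 when the pixel is brighter than the mean of the grid.
    `pixels` is a list of rows of grayscale values (0-255).
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


# A 4x4 grayscale thumbnail: dark on the left, bright on the right.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [18, 28, 208, 218],
]
# A slightly brightened copy and an inverted copy of the same thumbnail.
brightened = [[min(255, v + 5) for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

same = average_hash(original) == average_hash(brightened)
different = average_hash(original) != average_hash(inverted)
```

Because each bit records only whether a pixel is above the mean, a uniform brightness change leaves the hash unchanged, while a genuinely different image flips many bits.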
The directory parameter specifies the path to the folder in which the node searches for duplicate image files. Provide a valid directory path on your system that the node can access, since only images under this path are processed. The parameter has no minimum or maximum value.
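Checking the path before running a destructive operation is cheap. The sketch below shows one way to validate a directory and list its image files with `pathlib`. The helper name `list_image_files` and the extension set are assumptions for illustration; the node's own file filter is not documented here.

```python
from pathlib import Path

# Assumed set of image extensions; the node's actual filter may differ.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def list_image_files(directory):
    """Return the image files directly inside `directory`, sorted by path."""
    root = Path(directory)
    if not root.is_dir():
        raise ValueError(f"Not a valid directory: {directory}")
    return sorted(
        path
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Listing the files first lets you confirm that the path points at the intended collection before any file is removed.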
The max_distance_threshold parameter determines the sensitivity of the duplicate detection process. It sets the maximum Hamming distance (the number of differing bits) allowed between two image hashes for the images to be considered duplicates. A lower threshold makes detection stricter, while a higher threshold tolerates more variation between duplicates and raises the risk of deleting images that are merely similar. The valid range is not specified, but the value should be a small non-negative integer; a threshold of 0 would match only images whose hashes are identical. The default value is not stated here, so check it in the node before relying on it.
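To make the threshold concrete, here is a small sketch of a Hamming distance comparison between two integer hashes. The function names are illustrative, and the inclusive comparison (`<=`) is an assumption based on the description of the threshold as a maximum allowable distance.

```python
def hamming_distance(hash_a, hash_b):
    """Count the bit positions at which two integer hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_duplicate(hash_a, hash_b, max_distance_threshold):
    """Treat two hashes as duplicates when their distance is within the threshold."""
    return hamming_distance(hash_a, hash_b) <= max_distance_threshold


hash_a = 0b1111000011110000
hash_b = 0b1111000011110011  # differs from hash_a in the last two bits

distance = hamming_distance(hash_a, hash_b)   # 2
lenient = is_duplicate(hash_a, hash_b, 2)     # True
strict = is_duplicate(hash_a, hash_b, 1)      # False
```

With these two hashes, a threshold of 2 treats the images as duplicates and a threshold of 1 does not, which is the trade-off the parameter exposes.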
The trigger_signal parameter is an optional input that can be used to initiate the deduplication process. It is not required for the node to run, and its presence or absence does not affect the deduplication results. It is useful for integrating the node into automated workflows where specific conditions or events should trigger deduplication.
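ComfyUI custom nodes declare their inputs through an `INPUT_TYPES` classmethod that separates `required` from `optional` entries. The skeleton below is a hypothetical sketch of how the inputs and outputs described on this page could be declared. It is not the node's actual source: the class name, the placeholder default, the `"*"` wildcard type for trigger_signal, and the category are all assumptions.

```python
class DedupImageFilesSketch:
    """Hypothetical skeleton for illustration; not the real node implementation."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "directory": ("STRING", {"default": ""}),
                # Placeholder default; the real default is not documented here.
                "max_distance_threshold": ("INT", {"default": 0, "min": 0}),
            },
            "optional": {
                # Assumed wildcard type so any upstream output can be connected.
                "trigger_signal": ("*", {}),
            },
        }

    RETURN_TYPES = ("INT", "STRING")
    RETURN_NAMES = ("deleted_count", "log_message")
    FUNCTION = "dedup"
    CATEGORY = "utils"  # assumed category

    def dedup(self, directory, max_distance_threshold, trigger_signal=None):
        # trigger_signal only orders execution in the graph; its value is ignored.
        return (0, f"dry run for {directory}")
```

Declaring trigger_signal as optional with a default of `None` is what lets the node run whether or not anything is connected to it.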
The deleted_count output provides the number of duplicate image files that were successfully deleted from the specified directory. This count shows the extent of the deduplication performed and can be used to verify that the node did what you expected.
The log_message output contains a detailed log of the deduplication process, including information about the files deleted and any issues encountered. The log is valuable for tracking the node's activity and troubleshooting problems that arise during execution.
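The two outputs can be pictured with a dry-run sketch that decides which files would be removed without touching the disk. Everything here is hypothetical: the function name `plan_deduplication`, the keep-the-first-file policy, and the log wording are assumptions, not the node's actual behavior.

```python
def plan_deduplication(hashes, max_distance_threshold):
    """Return (deleted_count, log_message) for a dry run.

    `hashes` maps file names to integer perceptual hashes. Files are visited
    in name order, and a file is marked for deletion when its hash is within
    the threshold of a file that was already kept.
    """
    kept = []  # (name, hash) pairs for files that stay
    log_lines = []
    deleted_count = 0
    for name in sorted(hashes):
        file_hash = hashes[name]
        match = next(
            (
                kept_name
                for kept_name, kept_hash in kept
                if bin(kept_hash ^ file_hash).count("1") <= max_distance_threshold
            ),
            None,
        )
        if match is None:
            kept.append((name, file_hash))
        else:
            deleted_count += 1
            log_lines.append(f"would delete {name} (duplicate of {match})")
    return deleted_count, "\n".join(log_lines)


hashes = {"a.png": 0b1010, "b.png": 0b1011, "c.png": 0b0101}
count, log = plan_deduplication(hashes, max_distance_threshold=1)
```

Running a plan like this before a real deletion pass is one way to see how a given threshold behaves on your own files.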
Ensure that the directory parameter is set to the correct path where your images are stored to avoid accidental deletion of important files.
Adjust the max_distance_threshold parameter based on the level of similarity you want to allow between duplicates. A lower threshold is recommended for high precision, while a higher threshold can be used for more lenient deduplication.
Use the trigger_signal parameter to automate the deduplication process in larger workflows, ensuring that it runs only when specific conditions are met.
<filename>: <error>
<file>: <error>