Load Dataset From Folder:
The Sage_Load_Dataset_From_Folder node loads a dataset of images and their corresponding captions from a specified directory. It is aimed at training workflows, where every image must be correctly paired with its caption. Optional prefixes and suffixes can be applied to the captions as they are loaded, giving you control over how the text is formatted. By automating the loading and pairing, the node streamlines the preparation of large datasets and ensures that images and captions are aligned and ready for further processing or training.
Load Dataset From Folder Input Parameters:
dataset_path
The dataset_path parameter specifies the directory where the dataset of images and captions is located. There are no minimum or maximum values for this parameter, but it must be a valid path to an existing directory containing the dataset files. If the path is wrong, the dataset cannot be loaded.
prefix
The prefix parameter allows you to add a specific string at the beginning of each caption. This can be useful for categorizing or tagging the data in a way that is meaningful for your specific use case. The default value is an empty string, meaning no prefix will be added unless specified. This parameter is optional and can be customized to suit your needs.
suffix
Similar to the prefix, the suffix parameter lets you append a string to the end of each caption. This can help in further categorizing or providing additional context to the captions. The default value is an empty string, and it is optional, allowing you to decide whether or not to use it based on your requirements.
separator
The separator parameter defines the character or string used to separate the prefix, caption, and suffix. By default, it is set to a space (" "), ensuring that the components are clearly delineated. This parameter is optional and can be adjusted to any string that suits your formatting needs.
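The way prefix, caption, suffix, and separator combine can be sketched as follows. This is a minimal illustration, not the node's actual code: the function name format_caption is hypothetical, and skipping empty parts (so that an unset prefix or suffix leaves no dangling separator) is an assumption about the behavior.

```python
def format_caption(caption: str, prefix: str = "", suffix: str = "", separator: str = " ") -> str:
    """Join prefix, caption, and suffix with the separator.

    Assumption: empty parts are skipped, so an unset prefix or suffix
    does not leave a leading or trailing separator.
    """
    parts = [part for part in (prefix, caption.strip(), suffix) if part]
    return separator.join(parts)


# With a prefix, a suffix, and a custom separator.
print(format_caption("a cat on a sofa", prefix="photo", suffix="high quality", separator=", "))
# → photo, a cat on a sofa, high quality

# With the defaults, the caption passes through unchanged.
print(format_caption("a cat on a sofa"))
# → a cat on a sofa
```

A separator such as ", " suits tag-style captions, while the default space suits natural-language captions.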
Load Dataset From Folder Output Parameters:
images
The images output parameter provides a collection of images that have been loaded from the specified dataset directory. This output is essential as it represents the visual data that will be used in training or other processing tasks. The images are returned as a list, allowing for easy iteration and manipulation in subsequent steps.
filenames
The filenames output parameter contains the names of the image files that have been loaded. This output is important for tracking and referencing the specific images within the dataset. It helps in maintaining a clear association between the images and their corresponding captions.
captions
The captions output parameter delivers the text descriptions or annotations associated with each image. These captions are crucial for tasks that require image-text pairing, such as training models that rely on both visual and textual data. The captions are processed with any specified prefixes, suffixes, and separators, ensuring they are formatted as intended.
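To make the relationship between the filenames and captions outputs concrete, here is a rough sketch of pairing images with captions in a folder. It is based on assumptions rather than the node's source: the function name pair_images_with_captions is hypothetical, captions are assumed to live in a .txt file sharing the image's base name, and a missing caption file is assumed to yield an empty caption. Image decoding is left out to keep the example dependency-free.

```python
from pathlib import Path

# Assumed extension list, taken from the error notes on this page.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def pair_images_with_captions(dataset_path: str) -> tuple[list[str], list[str]]:
    """Return (filenames, captions), index-aligned, for images in dataset_path.

    Assumption: the caption for "cat.png" is stored in "cat.txt" next to it;
    an image without a caption file gets an empty caption.
    """
    folder = Path(dataset_path)
    filenames: list[str] = []
    captions: list[str] = []
    for image_path in sorted(folder.iterdir()):
        # Skip subfolders and anything that is not a supported image.
        if not image_path.is_file() or image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        caption_path = image_path.with_suffix(".txt")
        if caption_path.is_file():
            caption = caption_path.read_text(encoding="utf-8").strip()
        else:
            caption = ""
        filenames.append(image_path.name)
        captions.append(caption)
    return filenames, captions
```

Because both lists are built in the same loop, the caption at a given index always belongs to the filename at that index.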
Load Dataset From Folder Usage Tips:
- Ensure that the dataset_path is correctly specified and points to a directory containing both images and captions to avoid loading errors.
- Utilize the prefix and suffix parameters to add meaningful context to your captions, which can be beneficial for categorization or tagging purposes.
- Adjust the separator parameter to match the formatting requirements of your specific use case, ensuring that the prefix, caption, and suffix are clearly separated.
Load Dataset From Folder Common Errors and Solutions:
Dataset directory not found
- Explanation: This error occurs when the specified dataset_path does not exist or is incorrect.
- Solution: Verify that the dataset_path is correct and points to an existing directory containing the dataset.
No images found in the specified directory
- Explanation: This error indicates that the directory specified by dataset_path does not contain any valid image files.
- Solution: Ensure that the directory contains image files with supported extensions such as .png, .jpg, .jpeg, or .webp.
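Both of the errors above can be caught before running a workflow with a small pre-flight check. The sketch below is illustrative only: check_dataset_folder is a hypothetical helper, the exception types are a choice made for this example, and the extension set is limited to the ones listed above, so the node itself may accept others.

```python
from pathlib import Path

# Extensions named in the solution above; the node may support more.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def check_dataset_folder(dataset_path: str) -> list[Path]:
    """Return the image files in dataset_path, or raise if the folder is unusable."""
    folder = Path(dataset_path)
    if not folder.is_dir():
        raise FileNotFoundError(f"Dataset directory not found: {dataset_path}")
    images = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
    if not images:
        raise ValueError(f"No images found in the specified directory: {dataset_path}")
    return images
```

Running a check like this on the value you intend to pass as dataset_path shows which of the two conditions applies.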
Invalid prefix or suffix
- Explanation: This error may occur if the prefix or suffix contains unsupported characters or formatting.
- Solution: Check the prefix and suffix for any unusual characters and ensure they are formatted correctly. Adjust them as needed to fit the desired output format.
