Load Checkpoint (Quantized):
The QuantizedModelLoader is a specialized node for loading quantized models within the ComfyUI framework. It handles models that have been quantized to reduce their size and memory footprint, which is especially valuable in environments with limited computational resources. The node supports several quantization formats, including INT8 and FP8 variants, and can automatically detect the format of a given model file, allowing for seamless integration. By selecting custom operations tailored to each quantization type, the QuantizedModelLoader loads models efficiently while preserving the model's performance. This node is particularly useful for AI artists and developers who need to run large models on hardware with limited capabilities, as it provides a streamlined, automated approach to model loading and execution.
Load Checkpoint (Quantized) Input Parameters:
quant_format
The quant_format parameter determines the quantization format of the model being loaded. It can be set to specific formats such as "int8_tensorwise", "int8_blockwise", "float8_e4m3fn_blockwise", "float8_e4m3fn_rowwise", "mxfp8", "nvfp4", or "auto" for automatic detection. This parameter impacts the selection of custom operations used during model loading, which can affect the model's performance and compatibility. The default value is "auto", which allows the node to automatically detect the appropriate format based on the model file.
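The relationship between the format strings and the custom operations can be sketched as follows. This is a hypothetical illustration, not the node's actual code; the ops-family names (`HybridINT8Ops`, `HybridFP8Ops`) are taken from the error messages documented below, and the grouping of non-INT8 formats under the FP8 family is an assumption.

```python
# Hypothetical sketch of quant_format validation and ops selection.
# The format strings match those accepted by the node; the dispatch
# logic itself is illustrative, not the real implementation.
QUANT_FORMATS = [
    "int8_tensorwise",
    "int8_blockwise",
    "float8_e4m3fn_blockwise",
    "float8_e4m3fn_rowwise",
    "mxfp8",
    "nvfp4",
]

def select_ops(quant_format: str) -> str:
    """Return the custom-ops family assumed to back a quantization format."""
    if quant_format not in QUANT_FORMATS:
        raise ValueError(f"Unknown quant_format: {quant_format!r}")
    if quant_format.startswith("int8"):
        return "HybridINT8Ops"
    # FP8 and related low-bit formats are assumed to share one ops family.
    return "HybridFP8Ops"
```

Passing `"auto"` would bypass this dispatch until detection has resolved a concrete format string.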
ckpt_path
The ckpt_path parameter specifies the file path to the model checkpoint that needs to be loaded. This parameter is crucial as it directs the node to the exact location of the model file, enabling the loading process. The path must be accurate and accessible to ensure successful model loading. There are no specific minimum or maximum values, but it must be a valid file path.
Load Checkpoint (Quantized) Output Parameters:
model
The model output parameter represents the loaded quantized model. This output is crucial as it provides the fully constructed model ready for inference or further processing. The model is built from the state dictionary extracted from the checkpoint file, and its structure and operations are tailored based on the detected or specified quantization format. This ensures that the model operates efficiently and effectively within the constraints of the quantization method used.
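The load path described above (read the state dictionary, resolve the quantization format, build the model with matching ops) can be sketched roughly as below. The detection heuristic shown, looking for per-tensor scale keys, is purely an assumption for illustration; the node's real detection logic is not documented here.

```python
# Hedged sketch of the load flow. detect_format's heuristic is assumed,
# not the node's actual logic; the returned dict stands in for the model.
def detect_format(state_dict: dict) -> str:
    """Guess the quantization format from state-dict keys (assumption)."""
    if any(k.endswith(".weight_scale") for k in state_dict):
        return "int8_tensorwise"  # assumed marker for tensorwise INT8
    raise RuntimeError("Format detection failed")

def load_quantized(state_dict: dict, quant_format: str = "auto"):
    """Resolve the format, then build the model with matching custom ops."""
    if quant_format == "auto":
        quant_format = detect_format(state_dict)
    # ... construct model layers using ops chosen for quant_format ...
    return {"format": quant_format, "keys": len(state_dict)}
```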
Load Checkpoint (Quantized) Usage Tips:
- Ensure that the ckpt_path is correctly specified and points to a valid model checkpoint file to avoid loading errors.
- Use the "auto" setting for quant_format to let the node automatically detect the best-suited quantization format, which can simplify the loading process and reduce the risk of compatibility issues.
- Familiarize yourself with the different quantization formats supported by the node to better understand how they might impact model performance and resource usage.
Load Checkpoint (Quantized) Common Errors and Solutions:
Format detection failed
- Explanation: This error occurs when the node is unable to automatically detect the quantization format of the model file.
- Solution: Verify that the ckpt_path is correct and that the model file is not corrupted. If the issue persists, manually specify the quant_format to bypass automatic detection.
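The "try auto, then fall back to an explicit format" recovery suggested above can be expressed as a small wrapper. `load_fn` here is a hypothetical stand-in for the node's loading routine:

```python
def load_with_fallback(load_fn, ckpt_path: str, fallback_format: str):
    """Try automatic format detection; retry with an explicit format.

    load_fn is a stand-in for the node's loading routine (assumption);
    it is expected to raise RuntimeError when detection fails.
    """
    try:
        return load_fn(ckpt_path, quant_format="auto")
    except RuntimeError:  # e.g. "Format detection failed"
        return load_fn(ckpt_path, quant_format=fallback_format)
```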
HybridINT8Ops not available
- Explanation: This error indicates that the necessary operations for handling INT8 quantization are not available.
- Solution: Ensure that the required dependencies for INT8 operations are installed and correctly configured in your environment.
HybridFP8Ops not available
- Explanation: This error suggests that the operations needed for FP8 quantization are missing.
- Solution: Check that all necessary libraries and dependencies for FP8 operations are installed and properly set up in your system.
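For both "not available" errors, a quick way to confirm that a dependency is installed is to check whether its module can be found before launching a workflow. The module names backing the INT8/FP8 ops are not documented here, so the name passed in is whatever your environment's error message points to:

```python
import importlib.util

def ops_available(module_name: str) -> bool:
    """Return True if the module backing a custom-ops family is importable."""
    return importlib.util.find_spec(module_name) is not None
```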
