ComfyUI Extension: ComfyUI-TurboQuant

Repo Name: ComfyUI-TurboQuant
Author: Scottcjn (Account age: 1243 days)
Nodes: 2
Latest Updated: 2026-05-18
GitHub Stars: 0.02K

How to Install ComfyUI-TurboQuant

Install this extension via the ComfyUI Manager by searching for ComfyUI-TurboQuant:

  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI-TurboQuant in the search bar and install the extension from the results.

After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.

ComfyUI-TurboQuant Description

ComfyUI-TurboQuant reduces the VRAM used by the attention KV cache in ComfyUI by approximately 4.5 times. It uses TQ3 KV cache compression, which combines 3-bit Lloyd-Max quantization with Fast Walsh-Hadamard Transform decorrelation.

ComfyUI-TurboQuant Introduction

ComfyUI-TurboQuant is an extension that reduces the memory used by AI models, particularly those built on transformer architectures. It compresses the Key-Value (KV) cache in attention layers, which is useful for AI artists running large models on hardware with limited VRAM. The KV cache shrinks by approximately 4.5 times through 3-bit Lloyd-Max quantization combined with a Fast Walsh-Hadamard Transform for decorrelation. The freed VRAM lets you work with larger models or more complex projects without upgrading your hardware.
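To see where a figure near 4.5 times can come from, the arithmetic can be sketched as below. This is only an illustration under stated assumptions: an fp16 (16-bit) baseline, 3-bit codes, and one 16-bit scale stored per block of 32 values. The block size, the scale format, and the example model dimensions are hypothetical and not taken from the extension; they happen to land close to the stated ratio.

```python
def bits_per_value(code_bits=3, block_size=32, scale_bits=16):
    """Average storage per value: the code itself plus its share of the block scale.

    block_size and scale_bits are assumptions made for illustration only.
    """
    return code_bits + scale_bits / block_size


def compression_ratio(baseline_bits=16, **layout):
    """Ratio between an uncompressed baseline and the quantized layout."""
    return baseline_bits / bits_per_value(**layout)


def kv_cache_bytes(tokens, layers, heads, head_dim, bits):
    """Size of a KV cache in bytes; the factor 2 covers keys and values."""
    return 2 * tokens * layers * heads * head_dim * bits / 8


# Hypothetical model dimensions, chosen only to make the numbers concrete.
before = kv_cache_bytes(4096, 24, 16, 64, bits=16)
after = kv_cache_bytes(4096, 24, 16, 64, bits=bits_per_value())
print(f"ratio: {compression_ratio():.2f}x")
print(f"before: {before / 2**20:.0f} MiB, after: {after / 2**20:.0f} MiB")
```

Pure 3-bit storage would give 16 / 3, or about 5.3 times; any per-block metadata pulls the real figure below that, which is consistent with a stated ratio of roughly 4.5 times.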

How ComfyUI-TurboQuant Works

At its core, ComfyUI-TurboQuant uses a technique called TQ3 quantization to compress the KV tensors. Imagine storing a large, detailed painting in a smaller space without losing much of its detail. ComfyUI-TurboQuant does something similar: it applies a series of transformations that make the data easier to compress, then stores each value with far fewer bits while preserving its essential characteristics.

Here's a simplified breakdown of the process:

  1. Normalization: The data is first normalized, which means adjusting its values to fit within a certain range. This is like adjusting the brightness of an image so that all its details are visible.
  2. Transformation: The Fast Walsh-Hadamard Transform is applied to decorrelate the data. Think of this as rearranging the colors in a painting to make it easier to compress.
  3. Sign Flips: Random sign flips are used to spread the data's energy evenly, similar to how a painter might use different brush strokes to distribute paint evenly across a canvas.
  4. Scaling: The data is scaled to fit within a specific range, ensuring that it can be accurately represented with fewer bits.
  5. Quantization: Finally, each value is replaced by the nearest entry in a codebook with 8 levels, so it can be stored in 3 bits. This is akin to reducing the number of colors in a painting while maintaining its overall appearance.

The result is a compressed version of the original data with a cosine similarity of over 0.97 to the original, meaning the reconstructed tensors point in nearly the same direction as the uncompressed ones.
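The steps above can be sketched in plain Python as follows. This is a minimal illustration, not the extension's code: the codebook holds the classic 8-level Lloyd-Max reconstruction levels for a unit-variance Gaussian (rounded), the block length of 128 is arbitrary, and all function names are hypothetical. The sketch applies the sign flips before the transform, which is the usual arrangement for a randomized Hadamard rotation; the extension's actual ordering is not confirmed here.

```python
import math
import random

# Assumption: rounded 8-level Lloyd-Max levels for a unit-variance Gaussian.
# The extension's actual codebook is not given in this guide.
CODEBOOK = [-2.152, -1.344, -0.756, -0.2451, 0.2451, 0.756, 1.344, 2.152]


def fwht(values):
    """Orthonormal Fast Walsh-Hadamard Transform; length must be a power of two."""
    out = list(values)
    n = len(out)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in out]


def nearest_code(value):
    """Index (0..7) of the codebook level closest to value."""
    return min(range(len(CODEBOOK)), key=lambda k: abs(CODEBOOK[k] - value))


def compress(block, signs):
    """Return (norm, codes) for one block; each code fits in 3 bits."""
    n = len(block)
    norm = math.sqrt(sum(v * v for v in block)) or 1.0
    unit = [v / norm for v in block]                 # normalization
    flipped = [v * s for v, s in zip(unit, signs)]   # random sign flips
    rotated = fwht(flipped)                          # decorrelating transform
    scaled = [v * math.sqrt(n) for v in rotated]     # scale to roughly unit variance
    return norm, [nearest_code(v) for v in scaled]   # 8-level quantization


def decompress(norm, codes, signs):
    """Undo the steps of compress in reverse order."""
    n = len(codes)
    scaled = [CODEBOOK[c] / math.sqrt(n) for c in codes]
    unrotated = fwht(scaled)  # the orthonormal FWHT is its own inverse
    return [v * s * norm for v, s in zip(unrotated, signs)]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


rng = random.Random(0)
block = [rng.gauss(0.0, 1.0) for _ in range(128)]
signs = [rng.choice((-1.0, 1.0)) for _ in range(128)]
norm, codes = compress(block, signs)
restored = decompress(norm, codes, signs)
print(f"cosine similarity: {cosine_similarity(block, restored):.4f}")
```

Only the 3-bit codes and the per-block norm need to be stored; the sign pattern can be regenerated from a fixed seed. Comparing `block` with `restored` is also a quick way to check reconstruction quality on your own data.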

ComfyUI-TurboQuant Features

ComfyUI-TurboQuant offers several features that enhance its functionality and usability:

  • TurboQuant KV Patch: This feature allows you to patch a model's attention layers to enable KV tensor compression. By doing so, you can significantly reduce the memory required for these layers, making it easier to work with large models on less powerful hardware.
  • TurboQuant Info: After an inference run, this node reports detailed statistics about the compression, showing how much memory was saved and how effective the compression was.

These features can be customized based on your needs. For instance, you can choose to enable or disable the TurboQuant KV Patch depending on whether you need to compress the KV cache for a particular project.
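For orientation, a patch node with an on/off switch could look roughly like the skeleton below. It follows the general ComfyUI custom node convention (`INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`, `CATEGORY`, `NODE_CLASS_MAPPINGS`), but the class name, the `enabled` input, and the category are hypothetical; this is not the extension's actual node, and the real inputs may differ.

```python
class TurboQuantKVPatchSketch:
    """Hypothetical skeleton of a model-patching node; not the extension's class."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "model": ("MODEL",),
                # Assumed toggle so the patch can be switched off per workflow.
                "enabled": ("BOOLEAN", {"default": True}),
            }
        }

    RETURN_TYPES = ("MODEL",)
    FUNCTION = "patch"
    CATEGORY = "model_patches"

    def patch(self, model, enabled):
        if not enabled:
            # Pass the model through untouched when compression is disabled.
            return (model,)
        # Work on a copy so other branches of the workflow keep the original.
        patched = model.clone()
        # A real implementation would install its attention hook on `patched` here.
        return (patched,)


# ComfyUI discovers nodes through this mapping in the extension's package.
NODE_CLASS_MAPPINGS = {"TurboQuantKVPatchSketch": TurboQuantKVPatchSketch}
```

Returning the input model unchanged when the switch is off makes it easy to compare a workflow with and without compression without rewiring it.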

Troubleshooting ComfyUI-TurboQuant

While using ComfyUI-TurboQuant, you might encounter some common issues. Here are a few troubleshooting tips to help you resolve them:

  • Issue: Model Not Loading Properly: Ensure that the TurboQuant KV Patch is correctly applied. Double-check the input and output settings to confirm that the model is patched as expected.
  • Issue: Unexpected Compression Results: If the compression statistics seem off, verify that the model's attention layers are compatible with the TQ3 quantization process. Some models may require specific configurations to achieve optimal results.
  • Issue: Performance Degradation: If you notice a drop in performance, consider adjusting the quantization settings or reviewing the model's compatibility with the extension.

For further assistance, you can refer to community forums or documentation specific to ComfyUI-TurboQuant.

Learn More about ComfyUI-TurboQuant

To deepen your understanding of ComfyUI-TurboQuant and its applications, consider exploring the following resources:

  • Tutorials: Look for online tutorials that provide step-by-step guidance on using ComfyUI-TurboQuant effectively in your projects.
  • Community Forums: Join forums where AI artists and developers discuss their experiences with ComfyUI-TurboQuant. These platforms are great for asking questions and sharing insights.
  • Documentation: Review the official documentation for detailed technical information and advanced usage scenarios.

By leveraging these resources, you can enhance your skills and make the most of ComfyUI-TurboQuant in your AI art projects.
