
ComfyUI Node: INT8 Kernel Config

Class Name

INT8KernelConfigTuner

Category
loaders
Author
SparknightLLC (Account age: 683 days)
Extension
ComfyUI-INT8-Toolkit
Latest Updated
2026-06-23
Github Stars
0.03K

How to Install ComfyUI-INT8-Toolkit

Install this extension via the ComfyUI Manager by searching for ComfyUI-INT8-Toolkit:
  1. Click the Manager button in the main menu.
  2. Select the Custom Nodes Manager button.
  3. Enter ComfyUI-INT8-Toolkit in the search bar.
After installation, click the Restart button to restart ComfyUI. Then, manually refresh your browser to clear the cache and access the updated list of nodes.


INT8 Kernel Config Description

Configures the Triton kernel settings used for INT8 matrix multiplication so that INT8 models run efficiently, either from manually chosen values or from an optional microbenchmark that picks the fastest candidate.

INT8 Kernel Config:

The INT8KernelConfigTuner node configures the Triton kernel settings used for INT8 matrix multiplication operations, which are crucial for efficient model execution. You can either set the kernel parameters manually or enable a microbenchmark that times candidate configurations and applies the fastest one to the model. This lets AI artists use INT8 models for faster inference without delving into the complexities of kernel optimization, and it keeps kernel configuration accessible to users who may not have a deep technical background.

INT8 Kernel Config Input Parameters:

model

This parameter represents the INT8 model whose Triton kernel settings need to be synchronized during sampling. It ensures that the kernel configurations are applied to the correct model, facilitating efficient execution.

run_microbench

This boolean parameter, with a default value of False, determines whether to benchmark candidate kernel settings and use the fastest result for the model. Running a microbenchmark can help identify the most efficient kernel configuration, optimizing model performance.

block_m

This integer parameter specifies the Triton BLOCK_M tile size for fixed INT8 matrix multiplication kernels. It ranges from 16 to 512, with a default value of 128. Adjusting this value can impact the performance of the kernel by changing the size of the matrix tiles processed in parallel.

block_n

This integer parameter defines the Triton BLOCK_N tile size for fixed INT8 matrix multiplication kernels. Like block_m, it ranges from 16 to 512, with a default value of 128. It affects how the matrix multiplication is partitioned along the N dimension, influencing execution speed and efficiency.

block_k

This parameter sets the Triton BLOCK_K reduction tile size for fixed INT8 matrix multiplication kernels. It ranges from 16 to 512, with a default value of 64. This value determines the size of the reduction tiles, impacting the kernel's computational efficiency.
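To make the effect of the three tile sizes concrete, here is a minimal sketch of how a tiled matrix multiplication is usually partitioned. It assumes the common layout in which each program computes one BLOCK_M x BLOCK_N output tile and loops over K in BLOCK_K steps; the function name tile_grid is hypothetical and is not part of ComfyUI-INT8-Toolkit.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def tile_grid(m: int, n: int, k: int,
              block_m: int = 128, block_n: int = 128, block_k: int = 64):
    """Return (programs, k_steps) for an (m x k) @ (k x n) tiled matmul.

    Assumption: one program per BLOCK_M x BLOCK_N output tile, each looping
    over the K dimension in BLOCK_K-sized reduction steps.
    """
    programs = ceil_div(m, block_m) * ceil_div(n, block_n)
    k_steps = ceil_div(k, block_k)
    return programs, k_steps


# Node defaults applied to the default benchmark shape (M=2048, K=2048, N=4096).
print(tile_grid(2048, 4096, 2048))  # -> (512, 32)
```

Larger tiles mean fewer programs doing more work each, while smaller tiles mean more programs with less work each; which is faster depends on the GPU and the matrix shape.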

group_size_m

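This integer parameter specifies the Triton GROUP_SIZE_M launch grouping value, ranging from 1 to 64, with a default of 8. In Triton's usual matrix multiplication layout, this value sets how many rows of output tiles are grouped together in launch order (rather than how threads are grouped), which mainly affects cache reuse between neighboring programs and therefore performance.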
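The grouping can be illustrated in plain Python. The sketch below reproduces the program-ordering arithmetic from the official Triton matrix multiplication tutorial; whether the toolkit's kernel uses exactly this mapping is an assumption, and grouped_tile is a hypothetical helper name.

```python
def grouped_tile(pid: int, num_pid_m: int, num_pid_n: int, group_size_m: int):
    """Map a linear program id to an output tile (pid_m, pid_n).

    Follows the grouped ordering from the Triton matmul tutorial: programs are
    launched in groups of `group_size_m` tile rows, walking column by column
    within each group so that nearby programs share input tiles.
    """
    num_pid_in_group = group_size_m * num_pid_n
    group_id = pid // num_pid_in_group
    first_pid_m = group_id * group_size_m
    # The last group may be smaller when num_pid_m is not a multiple of group_size_m.
    rows_in_group = min(num_pid_m - first_pid_m, group_size_m)
    pid_m = first_pid_m + (pid % num_pid_in_group) % rows_in_group
    pid_n = (pid % num_pid_in_group) // rows_in_group
    return pid_m, pid_n


# A 4 x 4 grid of output tiles with groups of 2 tile rows.
order = [grouped_tile(pid, 4, 4, 2) for pid in range(16)]
print(order[:4])  # -> [(0, 0), (1, 0), (0, 1), (1, 1)]
```

With a group size of 1 the mapping degenerates to plain row-major order; larger values keep consecutive programs within a narrow band of rows.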

num_warps

This parameter defines the number of Triton warps per program, ranging from 1 to 16, with a default value of 4. Warps are groups of threads that execute instructions in lockstep, and adjusting this value can optimize resource utilization.

num_stages

This integer parameter sets the number of Triton pipeline stages, ranging from 1 to 8, with a default of 4. It determines the depth of the pipeline, influencing latency and throughput of the kernel execution.
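The six kernel parameters and their documented ranges can be collected into a small validated structure. This is only an illustrative sketch: the KernelConfig class and the RANGES table are hypothetical names, with limits and defaults copied from the descriptions above, and the warp size of 32 threads is an assumption that holds for NVIDIA GPUs.

```python
from dataclasses import dataclass

# (min, max) limits as listed on this page for each node input.
RANGES = {
    "block_m": (16, 512),
    "block_n": (16, 512),
    "block_k": (16, 512),
    "group_size_m": (1, 64),
    "num_warps": (1, 16),
    "num_stages": (1, 8),
}


@dataclass(frozen=True)
class KernelConfig:
    """Hypothetical container mirroring the node's fixed-kernel inputs."""
    block_m: int = 128
    block_n: int = 128
    block_k: int = 64
    group_size_m: int = 8
    num_warps: int = 4
    num_stages: int = 4

    def __post_init__(self):
        # Reject values outside the documented ranges.
        for name, (lo, hi) in RANGES.items():
            value = getattr(self, name)
            if not lo <= value <= hi:
                raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")

    def threads_per_program(self, warp_size: int = 32) -> int:
        """Threads per program, assuming `warp_size` threads per warp."""
        return self.num_warps * warp_size


default = KernelConfig()
print(default.threads_per_program())  # -> 128
```

Validating up front gives a clear error message before any kernel is compiled, which is usually easier to diagnose than a failure inside the kernel launch.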

bench_m

This parameter specifies the M dimension used by the optional synthetic kernel microbenchmark, ranging from 64 to 16384, with a default of 2048. It defines the size of the matrix dimension for benchmarking purposes.

bench_k

This parameter sets the K dimension for the synthetic kernel microbenchmark. Like bench_m, it ranges from 64 to 16384, with a default of 2048. It is the size of the reduction dimension shared by the two benchmarked matrices.

bench_n

This parameter defines the N dimension for the synthetic kernel microbenchmark, with a default value of 4096. Together with bench_m and bench_k, it fixes the matrix shape on which candidate configurations are timed.

bench_warmup

This integer parameter specifies the number of warmup iterations before timing each candidate kernel configuration, ranging from 1 to 20, with a default of 2. Warmup iterations help stabilize performance measurements.

bench_iterations

This parameter sets the number of timed iterations per candidate kernel configuration, ranging from 2 to 100, with a default of 6. It determines how many times each configuration is tested to ensure accurate benchmarking results.
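The warmup and timed-iteration settings follow a standard benchmarking pattern, sketched below. This is not the toolkit's implementation: pick_fastest, run, and clock are hypothetical names, and the candidates can be any values that run accepts.

```python
import time


def pick_fastest(candidates, run, warmup=2, iterations=6, clock=time.perf_counter):
    """Return (best_candidate, mean_seconds_per_iteration).

    For each candidate: call run(candidate) `warmup` times untimed, then time
    `iterations` calls and keep the candidate with the lowest mean time.
    """
    best, best_time = None, float("inf")
    for candidate in candidates:
        for _ in range(warmup):
            run(candidate)  # untimed: lets compilation and caches settle
        start = clock()
        for _ in range(iterations):
            run(candidate)
        mean_time = (clock() - start) / iterations
        if mean_time < best_time:
            best, best_time = candidate, mean_time
    return best, best_time
```

The defaults mirror bench_warmup (2) and bench_iterations (6). When timing real GPU kernels, the device must be synchronized before each clock reading, because kernel launches return before the work finishes; the sketch leaves that to the run callable.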

bench_include_scalar

This boolean parameter, with a default value of False, indicates whether to include scalar-weight kernel candidates in the benchmark. It is typically left off for per-row INT8 models to focus on more relevant configurations.

INT8 Kernel Config Output Parameters:

MODEL

The MODEL output is the INT8 model with the selected or benchmarked Triton kernel configuration applied, ready to be passed to downstream nodes for sampling.

INT8 Kernel Config Usage Tips:

  • To achieve optimal performance, consider enabling run_microbench to automatically benchmark and select the best kernel configuration for your model.
  • Adjust the block_m, block_n, and block_k parameters based on the specific dimensions of your model's matrices to enhance execution efficiency.
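As a rough illustration of matching tile sizes to matrix dimensions, the sketch below enumerates power-of-two tile sizes within the documented 16 to 512 range and drops any tile larger than the corresponding dimension. The helper names are hypothetical, and restricting candidates to powers of two is an assumption about which values are sensible for Triton kernels, not a rule stated by the extension.

```python
from itertools import product


def powers_of_two(lo: int, hi: int):
    """Return all powers of two v with lo <= v <= hi."""
    values, v = [], 1
    while v <= hi:
        if v >= lo:
            values.append(v)
        v *= 2
    return values


def candidate_tiles(m: int, n: int, k: int, lo: int = 16, hi: int = 512):
    """Return (block_m, block_n, block_k) tuples that fit inside an m x k @ k x n matmul."""
    sizes = powers_of_two(lo, hi)
    fit_m = [b for b in sizes if b <= m]
    fit_n = [b for b in sizes if b <= n]
    fit_k = [b for b in sizes if b <= k]
    return list(product(fit_m, fit_n, fit_k))


print(len(candidate_tiles(2048, 4096, 2048)))  # -> 216 (6 * 6 * 6)
```

A list like this is a reasonable starting point for manual experiments when run_microbench is disabled.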

INT8 Kernel Config Common Errors and Solutions:

INT8 Kernel Config: Triton kernel module unavailable

  • Explanation: This error occurs when the Triton kernel module is not available or cannot be imported.
  • Solution: Ensure that the Triton library is correctly installed and accessible in your environment.
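A quick way to check whether Triton can be found is to run a small script with the same Python interpreter that ComfyUI uses. The sketch below relies only on the standard library; module_available is a hypothetical helper name.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


if module_available("triton"):
    print("Triton is installed in this environment.")
else:
    print("Triton was not found; install it into the Python environment ComfyUI uses.")
```

This only confirms that the package can be located. An import can still fail afterwards, for example because of an incompatible GPU driver or PyTorch build.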

INT8 Kernel Config: microbench failed

  • Explanation: This error indicates that the microbenchmarking process encountered an issue and could not complete successfully.
  • Solution: Check the input parameters for the microbenchmark, such as bench_m, bench_k, and bench_n, to ensure they are within valid ranges and try running the benchmark again.
