Reference Latent+:
ReferenceLatentPlus is an enhanced version of the ReferenceLatent node. It improves how reference latents are integrated in the models that consume them, such as Flux, Flux2/Klein, Lumina2, Z-Image, Wan, Hunyuan, and Qwen-Image. It addresses several limitations of the original ReferenceLatent by adding signed per-image strength, per-image timestep gating, and per-image compositional masking. A built-in MediaPipe auto-mask segments faces, bodies, clothes, and backgrounds, so a reference can be limited to the relevant region. The node also manages attention cost through a megapixel cap and accepts 1 to 4 image inputs without manual VAE Encode wiring. It was tuned empirically on the Flux2/Klein 9B model; the mechanisms apply to other models as well, but the optimal values may differ.
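The megapixel cap can be pictured with a short sketch. It assumes the cap works by downscaling a reference image until its pixel count fits under a limit while keeping the aspect ratio. The function name `cap_megapixels`, the 1.0 MP default, and the rounding to multiples of 16 are illustrative assumptions, not values taken from the node.

```python
import math


def cap_megapixels(width, height, max_megapixels=1.0, multiple=16):
    """Scale (width, height) down so the area stays under the cap.

    Hypothetical helper: the real node may cap, round, or resample differently.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    max_pixels = max_megapixels * 1_000_000
    # Never upscale: the scale factor is at most 1.0.
    scale = min(1.0, math.sqrt(max_pixels / (width * height)))
    # Round down to a multiple so the result never exceeds the cap.
    new_width = max(multiple, int(width * scale) // multiple * multiple)
    new_height = max(multiple, int(height * scale) // multiple * multiple)
    return new_width, new_height


small = cap_megapixels(512, 512)    # already under the cap, left unchanged
large = cap_megapixels(2048, 2048)  # scaled down to fit under 1.0 MP
```

The number of reference tokens grows roughly with pixel area, so bounding the area is one simple way to bound the attention cost that each reference adds.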
Reference Latent+ Input Parameters:
reference_latents
This parameter supplies the reference latents that the node processes. It accepts 1 to 4 images, which serve as guides for the model. The node encodes these images itself, so no manual VAE Encode wiring is needed. The influence of each reference image can be adjusted individually.
strength
The strength parameter controls how much each reference image influences the model's output. It is a signed value, set per image, so it can be positive or negative. Use it to balance the reference images against the primary input.
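This page does not say how the node applies strength internally. As one simple way to read "signed", the sketch below treats a positive strength as moving values toward the reference and a negative strength as moving them away. The function `apply_signed_strength` and the linear formula are assumptions made for illustration only.

```python
def apply_signed_strength(base, reference, strength):
    """Move each base value toward (strength > 0) or away from
    (strength < 0) the matching reference value.

    Hypothetical illustration, not the node's actual implementation.
    """
    if len(base) != len(reference):
        raise ValueError("base and reference must have the same length")
    return [b + strength * (r - b) for b, r in zip(base, reference)]


base = [0.0, 1.0, 2.0]
reference = [1.0, 1.0, 0.0]

toward = apply_signed_strength(base, reference, 0.5)     # halfway toward the reference
away = apply_signed_strength(base, reference, -0.5)      # pushed in the opposite direction
unchanged = apply_signed_strength(base, reference, 0.0)  # zero strength has no effect
```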
timestep_gating
This parameter gates each reference image by timestep: you choose the timesteps of the model's sampling process during which that image is applied. Because the gate is set per image, different references can act at different stages of sampling, which gives finer control over when each one influences the output.
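Per-image timestep gating can be sketched as a window over sampling progress. The example assumes progress is expressed as a fraction from 0.0 (first step) to 1.0 (last step) and that a gated-off image contributes a strength of zero. The names `gate_strength`, `gated_schedule`, `start`, and `end` are hypothetical and may not match the node's real inputs.

```python
def gate_strength(strength, progress, start=0.0, end=1.0):
    """Return strength while start <= progress <= end, otherwise 0.0."""
    if not 0.0 <= start <= end <= 1.0:
        raise ValueError("expected 0.0 <= start <= end <= 1.0")
    return strength if start <= progress <= end else 0.0


def gated_schedule(strength, steps, start, end):
    """Effective strength of one reference image at each sampling step."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    schedule = []
    for step in range(steps):
        progress = step / (steps - 1) if steps > 1 else 0.0
        schedule.append(gate_strength(strength, progress, start, end))
    return schedule


# Two references with different windows over a 5-step run.
image_a = gated_schedule(1.0, 5, 0.0, 0.5)   # active in the first half only
image_b = gated_schedule(-0.5, 5, 0.5, 1.0)  # negative strength, second half only
```

With five steps the progress values are 0.0, 0.25, 0.5, 0.75, and 1.0, so `image_a` is active for the first three steps and `image_b` for the last three.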
compositional_masking
Compositional masking is a feature that allows for the selective application of reference images to specific regions of the output. This parameter utilizes a built-in MediaPipe auto-mask to segment areas such as faces, bodies, clothes, and backgrounds, ensuring that reference images are applied only where desired. This enhances the precision and relevance of the reference images in the final output.
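The effect of a mask can be illustrated with a per-cell blend. This sketch assumes the mask holds values from 0.0 to 1.0, where 1.0 means "use the reference here" and 0.0 means "keep the base". The mask is written by hand; in the node it would come from the MediaPipe auto-mask. The function `composite_with_mask` is a hypothetical name, and the node's actual compositing may differ.

```python
def composite_with_mask(base, reference, mask):
    """Blend two 2D grids cell by cell using a mask of the same shape."""
    if not (len(base) == len(reference) == len(mask)):
        raise ValueError("base, reference, and mask must have the same height")
    result = []
    for base_row, ref_row, mask_row in zip(base, reference, mask):
        if not (len(base_row) == len(ref_row) == len(mask_row)):
            raise ValueError("base, reference, and mask must have the same width")
        result.append([
            m * r + (1.0 - m) * b
            for b, r, m in zip(base_row, ref_row, mask_row)
        ])
    return result


base = [[0.0, 0.0], [0.0, 0.0]]
reference = [[10.0, 10.0], [10.0, 10.0]]
mask = [[1.0, 0.0], [0.5, 0.0]]  # hand-written stand-in for a face or body mask

blended = composite_with_mask(base, reference, mask)
```

Only the masked cells pick up the reference: the top-left cell takes it fully, the bottom-left cell takes half, and the right column keeps the base values.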
Reference Latent+ Output Parameters:
conditioned_output
The conditioned output is the result of processing the reference latents with the input parameters. It integrates the reference images with the primary input according to the strength, timestep gating, and compositional masking settings.
Reference Latent+ Usage Tips:
- Experiment with the strength parameter to find the right balance between the reference images and the primary input, as this can significantly affect the artistic outcome.
- Utilize the compositional masking feature to apply reference images selectively, enhancing specific regions of the output for more targeted artistic effects.
- Adjust the timestep gating to control at which timesteps of the sampling process each reference image is applied, for example to limit a reference to only part of the process.
Reference Latent+ Common Errors and Solutions:
"Invalid reference_latents input"
- Explanation: This error occurs when the input for reference_latents is not correctly formatted or exceeds the allowed number of images.
- Solution: Ensure that you are providing between 1 and 4 images as input for reference_latents, and verify that they are in the correct format.
"Strength value out of range"
- Explanation: The strength parameter has been set to a value outside the acceptable range.
- Solution: Check the strength values and ensure they are within the expected range, adjusting them to be either positive or negative as needed for your specific use case.
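Both errors above can be caught before running a workflow with a small pre-check. This is a sketch, not the node's own validation: the function `validate_inputs` is hypothetical, and because this page does not state the accepted strength range, the bounds are passed in as arguments rather than hard-coded.

```python
MAX_REFERENCES = 4  # this page states that 1 to 4 images are supported


def validate_inputs(reference_latents, strengths, min_strength, max_strength):
    """Raise ValueError if the reference count or any strength is invalid."""
    count = len(reference_latents)
    if not 1 <= count <= MAX_REFERENCES:
        raise ValueError(
            f"Invalid reference_latents input: expected 1 to "
            f"{MAX_REFERENCES} images, got {count}"
        )
    if len(strengths) != count:
        raise ValueError(
            f"expected one strength per image ({count}), got {len(strengths)}"
        )
    for index, value in enumerate(strengths):
        if not min_strength <= value <= max_strength:
            raise ValueError(
                f"Strength value out of range: image {index} has {value}, "
                f"allowed range is [{min_strength}, {max_strength}]"
            )


# Placeholder bounds for illustration only, not the node's real limits.
validate_inputs(["img_a", "img_b"], [1.0, -0.5], min_strength=-2.0, max_strength=2.0)
```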
