This node is based on the research paper “Looking Backward: Streaming Video-to-Video Translation with Feature Banks”. See the Acknowledgments section for details.

Introduction

The Feature Bank node improves temporal consistency (the visual coherence between consecutive frames) in generative video models by caching and reusing features from earlier frames. This reduces flickering and produces smoother transitions. Because you control how much the cached features influence each frame, the node also gives fine-grained control over stylistic consistency, which is especially valuable in real-time contexts where visual stability is essential.

Visual Comparison

The video below compares a baseline workflow with a real-time workflow using Feature Bank. Compared to the baseline, the Feature Bank provides:

  • Reduced Flickering – Backgrounds and recurring objects appear more stable.
  • Improved Temporal Consistency – Objects retain consistent form, texture, and color across frames.
  • Smoother Motion – Transitions feel more natural, with fewer visual jumps or jitters.

Side-by-side comparison (T2I workflow)

How It Works

The Feature Bank node enhances temporal consistency by integrating with the self-attention layers of the diffusion model. It caches features from previous frames and re-injects them into future ones—stabilizing transitions without altering the underlying model.

It performs three core steps:

  1. Caching – Stores attention features at regular intervals.
  2. Filtering – Selects relevant cached features using cosine similarity.
  3. Injection – Blends selected features into the current frame based on the configured strength.
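
The three steps above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the StreamPack implementation: each frame is reduced to a single flat feature vector, only the best-matching cached entry is used, the blend is a simple linear interpolation, and the names FeatureBankSketch and process are hypothetical.

```python
import math
from collections import deque


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


class FeatureBankSketch:
    """Hypothetical illustration of cache / filter / inject on flat vectors."""

    def __init__(self, cache_interval=4, strength=0.5, threshold=0.9, max_frames=4):
        self.cache_interval = cache_interval
        self.strength = strength
        self.threshold = threshold
        # A bounded deque drops the oldest entry once max_frames is reached.
        self.bank = deque(maxlen=max_frames)
        self.frame_index = 0

    def process(self, features):
        # Filtering: keep the most similar cached entry at or above the threshold.
        best, best_sim = None, self.threshold
        for cached in self.bank:
            sim = cosine_similarity(features, cached)
            if sim >= best_sim:
                best, best_sim = cached, sim

        # Injection: linear blend (assumed) between current and cached features.
        if best is not None:
            s = self.strength
            out = [(1 - s) * f + s * c for f, c in zip(features, best)]
        else:
            out = list(features)

        # Caching: store the incoming features every cache_interval frames.
        if self.frame_index % self.cache_interval == 0:
            self.bank.append(list(features))
        self.frame_index += 1
        return out


bank = FeatureBankSketch(cache_interval=1, strength=0.5, threshold=0.9, max_frames=2)
out0 = bank.process([1.0, 0.0])  # bank is empty, so the frame passes through
out1 = bank.process([1.0, 0.1])  # close to the cached frame, so it is blended
out2 = bank.process([0.0, 1.0])  # dissimilar to every cached frame, unchanged
print(out0, out1, out2, len(bank.bank))
```

In a real diffusion model the cached items would be attention tensors rather than short lists, but the control flow (bounded cache, similarity gate, strength-weighted blend) follows the same pattern.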

The node supports any self-attention-based model (e.g., Stable Diffusion) and works with both text-to-image (T2I) and image-to-image (I2I) workflows.

Adding the Node

An example T2I workflow using the Feature Bank node is available here.

  1. Install the StreamPack nodes – Ensure the StreamPack custom nodes are installed in your ComfyUI setup. Follow the installation instructions for a step-by-step guide.
  2. Open the ComfyUI graph editor – Right-click on an empty area of the canvas and choose Add Node.
  3. Find the Feature Bank node – Search for Feature Bank under the StreamPack/model_patches/unet category.
  4. Place it in the workflow – Insert the node between your Model Loader and Sampling Node.
  5. Connect inputs and configure parameters – Wire up the inputs and outputs. The node runs automatically using the parameters you set.

Feature Bank Node Integration (T2I workflow)
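
For readers who build workflows programmatically, the placement described above can be sketched as a fragment in the style of ComfyUI's API workflow format, where each link is written as [source_node_id, output_index]. This is a hedged sketch: CheckpointLoaderSimple and KSampler are standard ComfyUI node types, but the class name "FeatureBank", its "model" input name, the checkpoint file name, and the parameter values are all assumptions made for illustration. Other required sampler inputs are omitted.

```python
# Partial workflow: only the MODEL wiring is shown.
workflow_fragment = {
    "1": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "model.safetensors"},  # placeholder file name
    },
    "2": {
        "class_type": "FeatureBank",  # hypothetical class name for the node
        "inputs": {
            "model": ["1", 0],  # MODEL output of the loader (assumed input name)
            "feature_cache_interval": 4,          # placeholder value
            "use_feature_injection": True,
            "feature_injection_strength": 0.8,    # placeholder value
            "feature_similarity_threshold": 0.9,  # placeholder value
            "feature_bank_max_frames": 4,         # placeholder value
        },
    },
    "3": {
        "class_type": "KSampler",
        "inputs": {"model": ["2", 0]},  # sampler receives the patched model
    },
}


def model_chain(workflow, node_id):
    """Follow 'model' links backwards and return class types from loader to sampler."""
    chain = []
    while node_id is not None:
        node = workflow[node_id]
        chain.append(node["class_type"])
        link = node["inputs"].get("model")
        node_id = link[0] if link else None
    return chain[::-1]


print(model_chain(workflow_fragment, "3"))
```

The helper simply confirms the ordering: the model flows from the loader, through the Feature Bank node, into the sampler.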

Parameters

The following parameters can be configured in the Feature Bank node:

feature_cache_interval
integer

Feature Cache Interval: Determines how often features from the attention layers are cached (e.g., every 4 frames). Lower values improve consistency but increase memory usage.

use_feature_injection
boolean

Use Feature Injection: Enables or disables the influence of cached features. Set to true to activate the Feature Bank.

feature_injection_strength
float

Feature Injection Strength: Controls how much influence cached features have on the current frame (0 = none, 1 = full). Values between 0 and 1 blend cached features with the current frame's own features.

feature_similarity_threshold
float

Feature Similarity Threshold: Sets the minimum cosine similarity a cached feature must have with the current frame to be reused. Higher values restrict reuse to closely matching features.

feature_bank_max_frames
integer

Feature Bank Max Frames: Maximum number of past frames kept in the cache. Once the limit is reached, the oldest frames are discarded automatically.
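
As a quick reference, the parameters above can be grouped into one settings object with basic range checks. This is a sketch only: the FeatureBankSettings class and its problems method are hypothetical, the default values are placeholders rather than the node's documented defaults, and the lower bound of 1 for the two integer parameters is an assumption. The 0 to 1 range for strength comes from the description above, and the threshold bound reflects the mathematical range of cosine similarity.

```python
from dataclasses import dataclass


@dataclass
class FeatureBankSettings:
    """Hypothetical container; field names follow the node's parameters."""

    feature_cache_interval: int = 4            # placeholder, not a documented default
    use_feature_injection: bool = True
    feature_injection_strength: float = 0.8    # placeholder
    feature_similarity_threshold: float = 0.9  # placeholder
    feature_bank_max_frames: int = 4           # placeholder

    def problems(self):
        """Return a list of range violations (empty if the settings look sane)."""
        issues = []
        if self.feature_cache_interval < 1:
            issues.append("feature_cache_interval must be at least 1")
        if not 0.0 <= self.feature_injection_strength <= 1.0:
            issues.append("feature_injection_strength must be between 0 and 1")
        if not -1.0 <= self.feature_similarity_threshold <= 1.0:
            issues.append("feature_similarity_threshold must be a valid cosine similarity")
        if self.feature_bank_max_frames < 1:
            issues.append("feature_bank_max_frames must be at least 1")
        return issues


print(FeatureBankSettings().problems())
print(FeatureBankSettings(feature_injection_strength=1.5).problems())
```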

Strengths and Limitations

While the Feature Bank offers major stability and stylistic benefits, it also introduces a few trade-offs depending on your use case.

Strengths

  • Improved Temporal Consistency – Reduces flickering and stabilizes frame-to-frame transitions.
  • Stylistic Control – Adjust the influence of past frames for consistent or evolving styles.
  • Creative Exploration – Enables novel stylistic variations by selectively injecting past features.

Limitations

  • Reduced Novelty – A high injection strength can make the output rely too heavily on past frames, limiting new details.
  • Memory Usage – Storing features increases memory consumption, especially at high resolution.
  • Parameter Tuning – May require iteration to balance consistency and variation effectively.
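
To get a feel for the memory trade-off listed above, a back-of-the-envelope estimate can help. The sketch below rests on several assumptions that are not taken from the node itself: the bank holds two tensors (keys and values) per attention layer per cached frame, every layer is treated as running at full latent resolution (which overestimates, since deeper layers see fewer tokens), the latent is 8 times smaller than the image per side, and the layer count and channel width are placeholders. The function name estimate_bank_bytes is hypothetical.

```python
def estimate_bank_bytes(max_frames, num_layers, height, width, channels,
                        downscale=8, bytes_per_value=2, tensors_per_layer=2):
    """Rough upper-bound estimate of cached feature memory in bytes.

    All arguments are illustrative; bytes_per_value=2 corresponds to fp16.
    """
    tokens = (height // downscale) * (width // downscale)
    return (max_frames * num_layers * tensors_per_layer
            * tokens * channels * bytes_per_value)


small = estimate_bank_bytes(max_frames=4, num_layers=16, height=512, width=512, channels=320)
large = estimate_bank_bytes(max_frames=4, num_layers=16, height=1024, width=1024, channels=320)

print(f"512x512:   {small / 2**20:.0f} MiB")
print(f"1024x1024: {large / 2**20:.0f} MiB")
```

Whatever the real constants are, the token count grows with the image area, so doubling both height and width roughly quadruples the cache size. Lowering feature_bank_max_frames scales it down linearly.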

Acknowledgments

This node is based on the research paper “Looking Backward: Streaming Video-to-Video Translation with Feature Banks” by Liang et al. (2024). We thank the original authors for their contributions and collaboration in adapting this concept for ComfyUI.