Introduction
Build real-time AI video and audio workflows using the ComfyStream toolkit for ComfyUI.
Overview
The ComfyStream toolkit adds real-time video and audio capabilities to ComfyUI for building interactive, AI-powered media workflows. It extends ComfyUI with specialized tools for streaming, live processing, and on-the-fly workflow updates, including:
- ComfyStream – A custom node that streams audio and video from your webcam and microphone into ComfyUI for real-time AI processing, then returns the processed output.
- ComfyUI-Stream-Pack – A collection of custom nodes designed to support advanced real-time audio and video workflows.
To help you get started, ComfyStream also includes a knowledge base with foundational workflows, optimization tips, and in-depth guides on all tools and nodes.
Getting Started
To get started, follow the installation guide below or explore the stream pack for additional nodes.
Install ComfyStream
Step-by-step instructions to install ComfyStream and start creating real-time workflows.
Check out the ComfyUI-Stream-Pack
A collection of custom nodes for building real-time audio and video workflows.
How It Works
ComfyStream processes audio and video streams in real time by combining three components: a WebRTC server for low-latency, bidirectional communication; a custom tensor-based pipeline that converts media frames to and from tensors; and ComfyUI’s EmbeddedComfyClient for AI inference.
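To make the frame-to-tensor step concrete, here is a minimal sketch of one way such a conversion could look. It is not ComfyStream's actual implementation. It assumes that decoded video frames arrive as height x width x 3 `uint8` arrays and that the workflow expects batched float images in the range [0, 1] with a batch, height, width, channel layout. NumPy arrays stand in for framework tensors, and the function names `frame_to_tensor` and `tensor_to_frame` are hypothetical.

```python
import numpy as np


def frame_to_tensor(frame: np.ndarray) -> np.ndarray:
    """Convert an HxWxC uint8 frame to a 1xHxWxC float32 array in [0, 1]."""
    if frame.dtype != np.uint8 or frame.ndim != 3:
        raise ValueError("expected an HxWxC uint8 frame")
    # Scale to [0, 1] and add a leading batch dimension.
    return (frame.astype(np.float32) / 255.0)[np.newaxis, ...]


def tensor_to_frame(tensor: np.ndarray) -> np.ndarray:
    """Convert a 1xHxWxC float array in [0, 1] back to an HxWxC uint8 frame."""
    # Drop the batch dimension and clamp values a model may push out of range.
    image = np.clip(tensor[0], 0.0, 1.0)
    # Round to the nearest integer before casting to avoid truncation bias.
    return np.rint(image * 255.0).astype(np.uint8)


# Example: a tiny synthetic 2x3 RGB frame survives the round trip unchanged.
frame = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
tensor = frame_to_tensor(frame)
restored = tensor_to_frame(tensor)
```

Clamping on the way back matters because model outputs can overshoot the valid range slightly, and an unclamped cast to `uint8` would wrap around instead of saturating.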
Data Flow Overview
Here’s how the system processes live audio and video end-to-end:
- Input: WebRTC receives video, audio, and control data from the client.
- Workflow Injection: The pipeline dynamically modifies the ComfyUI workflow by replacing standard input/output nodes with custom tensor nodes.
- Inference: The EmbeddedComfyClient processes incoming tensors in real time using the updated workflow.
- Output Conversion: Processed tensors are converted back to video and audio and streamed to the client via WebRTC.
- Live Control: A control channel allows the client to update the workflow or modify parameters on the fly, without restarting the session.
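The Workflow Injection step above can be illustrated with a short sketch. It assumes the workflow is in ComfyUI's API format, a dictionary that maps node IDs to a `class_type` and an `inputs` dictionary, where links to other nodes are written as `[node_id, output_index]` lists. The replacement node names `LoadTensor` and `SaveTensor`, the mapping tables, and the function `inject_tensor_nodes` are assumptions chosen for illustration, not a description of ComfyStream's internal code.

```python
import copy

# Hypothetical mapping from file-based nodes to tensor-based replacements.
INPUT_REPLACEMENTS = {"LoadImage": "LoadTensor"}
OUTPUT_REPLACEMENTS = {"PreviewImage": "SaveTensor", "SaveImage": "SaveTensor"}


def inject_tensor_nodes(workflow: dict) -> dict:
    """Return a copy of the workflow with input/output nodes swapped out."""
    patched = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    for node in patched.values():
        class_type = node.get("class_type")
        inputs = node.get("inputs", {})
        if class_type in INPUT_REPLACEMENTS:
            node["class_type"] = INPUT_REPLACEMENTS[class_type]
            # The live stream supplies the frames, so no file path is needed.
            node["inputs"] = {}
        elif class_type in OUTPUT_REPLACEMENTS:
            node["class_type"] = OUTPUT_REPLACEMENTS[class_type]
            # Keep only links to upstream nodes; drop file-related settings.
            node["inputs"] = {
                name: value for name, value in inputs.items()
                if isinstance(value, list)
            }
    return patched


# Example: a three-node workflow that loads, inverts, and previews an image.
workflow = {
    "1": {"class_type": "LoadImage", "inputs": {"image": "example.png"}},
    "2": {"class_type": "ImageInvert", "inputs": {"image": ["1", 0]}},
    "3": {"class_type": "PreviewImage", "inputs": {"images": ["2", 0]}},
}
patched = inject_tensor_nodes(workflow)
```

Because node IDs are left unchanged, the links from downstream nodes (such as node "2" reading from node "1") remain valid after the swap.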
The following diagram visualizes the end-to-end data flow: