LipDub AI modifies on-screen performances to perfectly match target audio tracks.
It begins by analyzing who’s on screen and when they speak, intelligently grouping and labeling identities across all uploaded training footage.
From there, LipDub AI learns how each identity articulates while speaking, tracking every detail: lips, lower face, facial hair, and even how the neck and shirt collar move. The resulting generative model recreates the most realistic possible version of the original performance.
Once trained, LipDub AI flawlessly syncs the source performance to the new audio, delivering a result that looks and feels completely real.
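The workflow above breaks down into three stages: group identities across the footage, train a per-identity articulation model, and re-sync the performance to new audio. The sketch below illustrates that flow conceptually; LipDub AI's internals are proprietary, so every name, data structure, and function here is hypothetical and stands in for the real system.

```python
from dataclasses import dataclass, field

# Conceptual illustration only -- none of these names reflect LipDub AI's
# actual API. Each function stands in for one stage of the described pipeline.

@dataclass
class Identity:
    name: str
    frames: list = field(default_factory=list)  # frames where this person speaks

def group_identities(footage):
    """Stage 1 (hypothetical): cluster and label who is on screen and when
    they speak, across all uploaded training footage."""
    groups = {}
    for frame in footage:
        speaker = frame["speaker"]
        groups.setdefault(speaker, Identity(speaker)).frames.append(frame)
    return groups

def train_articulation_model(identity):
    """Stage 2 (stub): learn how this identity articulates -- lips, lower
    face, facial hair, neck, and collar motion."""
    return {"identity": identity.name, "trained_on": len(identity.frames)}

def sync_to_audio(model, new_audio):
    """Stage 3 (stub): re-render the source performance to match new audio."""
    return f"{model['identity']} synced to {new_audio}"

# Toy usage with fabricated placeholder data:
footage = [
    {"speaker": "Alice", "t": 0},
    {"speaker": "Bob", "t": 1},
    {"speaker": "Alice", "t": 2},
]
models = {name: train_articulation_model(ident)
          for name, ident in group_identities(footage).items()}
print(sync_to_audio(models["Alice"], "new_take.wav"))
```

The stubs return placeholder values; in a real system each stage would involve face detection and clustering, a trained generative model, and a rendering pass.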