Viseme
In short: A viseme is the visual representation of a phoneme, describing the specific mouth shape a speaker forms when producing a particular speech sound.
About Viseme
Visemes are the visual building blocks of lip sync technology. While phonemes represent individual sounds in spoken language, visemes represent the corresponding mouth shapes that produce those sounds.
In AI lip sync pipelines, the system maps detected phonemes from an audio track to their corresponding visemes, then renders those mouth shapes onto the target face frame by frame. Multiple phonemes can map to the same viseme since some distinct sounds produce visually similar mouth positions.
How Viseme Connects to Lip Sync
Viseme relates to several other concepts in the AI lip sync pipeline: Phoneme , and Face Landmark Detection .
Explore More
Related Terms
Try AI Lip Sync
Experience studio-quality lip synchronization for videos in any language.