US20260024264
2026-01-22
Physics
G06T13/40
A method and apparatus for personalized video content modification integrates advertisements into existing video content. The method identifies a section of a video in which a character speaks, samples that speaker's voice, and uses speech synthesis to create an audio track that delivers an advertising script. The character's lip movements are altered to match the synthesized speech, so the modified section looks natural and the structure of the original video is left unchanged.
This approach addresses the limitations of traditional product placement (PPL) in video content. As digital content consumption grows, more varied advertising methods are needed that cost less and fit into the content naturally. The system uses speech encoding and clustering techniques to modify video segments dynamically, which makes it more flexible and cost-effective than manual editing or re-recording.
The system works by obtaining a target section from a video, sampling speech data from the audio track, and generating speech synthesis data for an advertising script. The lip movements in the video are then adjusted to match the new script. Because the advertising script can be chosen per viewer, the same process delivers personalized advertisements: relevant ad segments are selected based on user demographics or behavior, so each viewer sees content tailored to them.
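The selection steps in this paragraph can be illustrated with a minimal sketch. It is not the patented implementation: the `Segment` and `AdScript` structures, the interest-overlap scoring, and the rule that a target section must be at least as long as the script's estimated spoken duration are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Segment:
    """One spoken span of the audio track (hypothetical structure)."""
    start: float   # seconds
    end: float     # seconds
    speaker: str   # label of the character who is speaking
    text: str


@dataclass(frozen=True)
class AdScript:
    """An advertising script with hypothetical targeting fields."""
    text: str
    interests: frozenset
    est_seconds: float  # assumed spoken duration of the script


def pick_ad(ads: Sequence[AdScript], user_interests: frozenset) -> Optional[AdScript]:
    """Return the ad sharing the most interests with the user, or None if no ad overlaps."""
    best = max(ads, key=lambda ad: len(ad.interests & user_interests), default=None)
    if best is None or not (best.interests & user_interests):
        return None
    return best


def pick_target_section(segments: Sequence[Segment], ad: AdScript) -> Optional[Segment]:
    """Return the first spoken segment long enough to hold the ad script (assumed rule)."""
    for seg in segments:
        if seg.end - seg.start >= ad.est_seconds:
            return seg
    return None
```

In a full pipeline, the chosen segment's audio would be passed to a speech encoder and synthesizer, and its frames to a lip-synchronization model. Those stages are left out here because the source does not specify them.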
The technology involves obtaining text data from the video's audio track, extracting the relevant sections, and using a speech encoder to sample data for each speaker. Lip movements are modified by identifying the speaking character and adjusting that character's mouth region to align with the script. The system can dynamically replace video sections with the synthesized content while keeping playback smooth and the result specific to each user.
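As a rough illustration of how encoded speech samples might be grouped by speaker, the sketch below uses greedy cosine-similarity clustering. The source mentions speech encoders and clustering but does not name an algorithm, so this technique, the `threshold` value, and the assumption that each sample is already a fixed-length embedding vector are all illustrative choices.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cluster_by_speaker(embeddings: Sequence[Sequence[float]],
                       threshold: float = 0.8) -> List[List[int]]:
    """Greedy clustering of speech embeddings (assumed to come from a speech encoder).

    Each embedding joins the first cluster whose centroid is at least
    `threshold` similar; otherwise it starts a new cluster.
    Returns lists of embedding indices, one list per presumed speaker.
    """
    centroids: List[List[float]] = []
    members: List[List[int]] = []
    for idx, emb in enumerate(embeddings):
        for c, centroid in enumerate(centroids):
            if cosine(emb, centroid) >= threshold:
                members[c].append(idx)
                n = len(members[c])
                # Update the running mean of the cluster.
                centroids[c] = [(m * (n - 1) + e) / n for m, e in zip(centroid, emb)]
                break
        else:
            # No existing cluster was similar enough: start a new one.
            centroids.append(list(emb))
            members.append([idx])
    return members
```

The samples in one cluster could then serve as the voice reference for synthesizing that character's advertising line. A production system would more likely use an established clustering method over real encoder outputs.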
This method overcomes the challenges of conventional advertising systems by automating the integration of personalized ads into video streams. It allows for scalable, dynamic customization without disrupting playback or requiring multiple content versions. By leveraging speech synthesis and lip movement synchronization, the system provides a robust solution for modern advertising needs in digital media.
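One way to picture customization without multiple content versions is a per-user playback plan that splices a synthesized section into the single original video. The sketch below is an assumption about how such a plan could be represented. The function name, the `(source, start, end)` tuple layout, and the source labels are hypothetical and do not come from the source.

```python
from typing import List, Tuple

# (source label, start seconds, end seconds) within that source
Entry = Tuple[str, float, float]


def splice_playlist(total: float, start: float, end: float,
                    original: str, replacement: str) -> List[Entry]:
    """Build a playback plan that swaps [start, end) of the original for a replacement clip.

    The replacement is assumed to have the same duration as the section it
    replaces, so the overall timeline length is unchanged.
    """
    if not (0 <= start < end <= total):
        raise ValueError("target section must lie inside the video")
    plan: List[Entry] = []
    if start > 0:
        plan.append((original, 0.0, start))       # untouched lead-in
    plan.append((replacement, 0.0, end - start))  # synthesized ad section
    if end < total:
        plan.append((original, end, total))       # untouched remainder
    return plan
```

For a 120-second video with a target section from 30 s to 42 s, the plan contains three entries: the original up to 30 s, the 12-second synthesized clip, and the original from 42 s onward. Only the middle entry differs between users.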