Press Release »SRI’s Spotting Audio-Visual Inconsistencies (SAVI) techniques detect tampered videos by identifying discrepancies between the audio and visual tracks. For example, the system can detect when lip synchronization is slightly off or when there is an unexplained visual “jerk” in the video. It can also flag a video as possibly tampered if the visual scene is outdoors but analysis of the reverberation properties of the audio track indicates the recording was made in a small room.
This video shows how the SAVI system detects speaker inconsistencies. First, the system detects the speaker’s face, tracks it throughout the video clip, and verifies that it belongs to the same person for the entire clip. It then determines when she is likely to be speaking by tracking when her mouth movements are consistent with speech. «
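To make the lip-synchronization idea concrete, here is a minimal sketch of one way such a check could work. It is not SRI's implementation: the function names (`best_lag`, `flag_sync`), the per-frame inputs (a mouth-opening measurement and an audio energy value for each video frame), and the tolerance are all assumptions made for illustration. The sketch slides the audio signal against the mouth signal, picks the offset where the two correlate best, and flags the clip if that offset is larger than expected.

```python
from statistics import mean


def _correlation(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den if den else 0.0


def best_lag(mouth_open, audio_energy, max_lag=5):
    """Return the frame offset at which audio best matches mouth movement.

    A positive result means the audio trails the video by that many frames.
    Both inputs are hypothetical per-frame measurements of equal length.
    """
    n = len(mouth_open)
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = mouth_open[: n - lag], audio_energy[lag:]
        else:
            xs, ys = mouth_open[-lag:], audio_energy[: n + lag]
        scores[lag] = _correlation(xs, ys)
    return max(scores, key=scores.get)


def flag_sync(mouth_open, audio_energy, tolerance=1, max_lag=5):
    """Flag a clip as possibly tampered if the best offset exceeds the tolerance."""
    return abs(best_lag(mouth_open, audio_energy, max_lag)) > tolerance


# Synthetic example: the audio is the mouth signal delayed by three frames.
mouth = [0, 0, 1, 3, 1, 0, 0, 2, 4, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
audio = [0, 0, 0] + mouth[:-3]
print(best_lag(mouth, audio), flag_sync(mouth, audio))
```

A real system would need robust face and mouth tracking and a far more careful audio model; the sketch only shows how an offset between the two tracks can be turned into a tamper flag.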