Cross-Modal Analysis of Audio-Visual Film Montage

By Matthias Zeppelzauer, Dalibor Mitrovic, and Christian Breiteneder


A stylistic device frequently employed by filmmakers is the synchronous montage (composition) of audio and visual elements. Synchronous montage helps to increase tension and tempo in a scene and highlights important events in the story. Sequences with synchronous montage usually contain rich semantics that are relevant for understanding a movie. This property is currently not exploited in automated indexing, annotation, and summarization of movies. We propose a cross-modal approach that extracts sequences with synchronous audio-visual montage from a movie. Experiments confirm that the extracted sequences have high semantic relevance. Consequently, they represent a useful basis for different high-level movie abstraction tasks such as automated movie annotation and movie summarization.


M. Zeppelzauer, D. Mitrovic, C. Breiteneder: "Cross-Modal Analysis of Audio-Visual Film Montage"; Talk: International Conference on Computer Communications and Networks, Multimedia Computing and Communication Workshop, Maui, Hawaii; 08-02-2011; in: "International Conference on Computer Communications and Networks (ICCCN)", IEEE eXpress Conference Publishing, (2011), ISBN: 978-1-4577-0636-3; 6 pages.


