About this Project
In our project, we concentrate on video processing tasks that arise in the context of interactive multimedia applications. In particular, we investigate the need for user interaction in fundamental video editing steps such as the insertion or deletion of objects, or the combination of videos from different sources ("video compositing"). These tasks currently require a significant amount of user assistance. A prerequisite for their automation is the recovery of a 3D model of the video scene, which then provides the framework for the proper handling of occlusions and perspective distortions.
During the first part of our study, we perform experiments with two or more video cameras in various spatial arrangements in order to better understand the advantages and limitations of different stereo configurations for the subsequent 3D analysis. The main part of our investigations then deals with the development of suitable video processing algorithms for the extraction of spatial information from a dynamic video scene. In this context, we focus on methods for the automated segmentation and tracking of video objects and their combination with stereo-based depth analysis. The final goal of the project is to generate a high-level description of the video scene that relies on the identification of semantically meaningful video objects and their location in 3D space. Such a representation provides the basis for the subsequent encoding of the video content according to the MPEG-4 and MPEG-7 standards for video compression and description.
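The core geometric relation behind the stereo-based depth analysis mentioned above can be sketched as follows. For a rectified stereo pair, depth is inversely proportional to disparity: Z = f·B / d, where f is the focal length in pixels, B the camera baseline, and d the disparity between matched pixels. The function and parameter values below are an illustrative sketch, not taken from the project itself.

```python
def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Return the depth (in metres) of a scene point from its stereo disparity.

    Assumes a rectified stereo pair: Z = f * B / d, with f in pixels,
    B in metres, and d in pixels. Illustrative only.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# Illustrative numbers: f = 800 px, baseline B = 0.12 m, disparity d = 16 px
z = depth_from_disparity(16.0, 800.0, 0.12)  # -> 6.0 metres
```

This relation also makes the trade-off between stereo configurations concrete: a wider baseline B increases disparity for a given depth and thus improves depth resolution, but makes correspondence matching harder, which is exactly the kind of trade-off the camera-arrangement experiments explore.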
Funding provided by
Austrian Science Fund (FWF) - P15663