Interactive Media Systems, TU Vienna

Dense Stereo Matching for Urban Outdoor Scenes

Thesis by Katrin Lasinger

Supervision by Margrit Gelautz, Konrad Schindler, and Silvano Galliani


Dense stereo matching is an active research topic in computer vision. Depth information is extracted from a dense correspondence search between two or more images of the same scene, taken from different camera positions. The extracted depth can be used in applications such as robotic navigation, automated driving, or 3D reconstruction of objects and buildings. In this work we focus on dense stereo matching for urban outdoor environments. We start from the recently published PatchMatch Stereo approach by Bleyer et al. [8], since it seems suitable for our purpose in terms of memory consumption and scalability to high-resolution images, and we extend its idea to multi-view stereo. Our algorithm is tested on different urban outdoor image sets, including image pairs from cameras mounted on a car, panoramic images of urban areas, multi-view data from historic sites, and aerial image data. For the correspondence search, we perform experiments with different cost functions.
PatchMatch Stereo is a local stereo matching approach that estimates a 3D plane at each pixel position, thereby extracting not only disparity values but also surface normals. The algorithm is based on a randomized approximate correspondence search. Initially, a random plane is selected for each pixel position. Good plane estimates are then propagated to neighboring pixels and further refined in an iterative process. We transform the PatchMatch Stereo approach to scene space in order to estimate depth values directly and to work with non-rectified images. Mapping from one image to another is done with plane-induced homographies, using the estimated plane (normal and depth) at each pixel position. Processing in scene space allows us to combine multiple images directly. The major contribution of our work is a multi-view stereo matching approach. The use of more than two images facilitates the handling of partially occluded image regions and therefore leads to more robust results. Our approach is quantitatively evaluated on existing benchmark data for two-view and multi-view image sequences, and the results are compared with reported values of state-of-the-art stereo matching methods.
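
The plane-induced homography mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation; it only shows the standard textbook relation under stated assumptions: the plane is written as n^T X = d in the coordinate frame of the first camera, the second camera sees X2 = R X1 + t, and both cameras are pinhole cameras with intrinsics K1 and K2. The function names `plane_induced_homography` and `warp_pixel` and all numeric values are hypothetical and chosen for illustration.

```python
import numpy as np


def plane_induced_homography(K1, K2, R, t, n, d):
    """Homography taking pixels of camera 1 to camera 2 via a scene plane.

    Assumed conventions (illustrative, not taken from the thesis):
      - the plane satisfies n^T X = d for points X in camera-1 coordinates,
      - a point X1 in camera-1 coordinates maps to X2 = R @ X1 + t.
    Under these conventions H = K2 (R + t n^T / d) K1^{-1}.
    """
    n = np.asarray(n, dtype=float).reshape(3, 1)
    t = np.asarray(t, dtype=float).reshape(3, 1)
    return K2 @ (R + (t @ n.T) / d) @ np.linalg.inv(K1)


def warp_pixel(H, x, y):
    """Apply a 3x3 homography to pixel (x, y) and dehomogenize the result."""
    p = H @ np.array([x, y, 1.0])
    return p[:2] / p[2]


# Hypothetical setup: identical intrinsics, pure sideways translation,
# and a fronto-parallel plane at depth 5 in front of camera 1.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([-0.2, 0.0, 0.0])
H = plane_induced_homography(K, K, R, t, n=[0.0, 0.0, 1.0], d=5.0)
warped = warp_pixel(H, 420.0, 290.0)
```

In this setup the scene point (1, 0.5, 5) lies on the plane and projects to pixel (420, 290) in the first camera, so `warped` should agree with its projection in the second camera. In a PatchMatch-style search, each candidate plane at a pixel would define such a homography, and the matching cost would be evaluated on the warped patch.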


K. Lasinger: "Dense Stereo Matching for Urban Outdoor Scenes"; Supervisor: M. Gelautz, K. Schindler, S. Galliani; Institut für Softwaretechnik und interaktive Systeme, 2015; final examination: 01-12-2015.

