Master Thesis - Region-Based Optical Flow Estimation with Treatment of Occlusions

Thesis by Christoph Rhemann

Supervision by Margrit Gelautz and Michael Bleyer

Abstract

The estimation of optical flow plays a key role in several computer vision problems, including motion detection and segmentation, frame interpolation, three-dimensional scene reconstruction, robot navigation, video shot detection, mosaicking and video compression. In this work we propose a new algorithm for computing a dense optical flow field between two or more images of a video sequence, which tackles the inherent problems of conventional optical flow algorithms. These algorithms usually perform poorly in regions of low texture as well as near motion boundaries. We address these problems by segmenting the reference frame into regions of homogeneous color. The color segmentation incorporates the assumption that motion varies smoothly inside regions of homogeneous color and that motion discontinuities coincide with the borders of those regions. An affine motion model describes the motion inside a segment. To initialize the model parameters, we estimate a sparse set of correspondences. Layers, which represent the dominant motions likely to occur in the scene, are extracted from the initial segments. Every color segment is then assigned to exactly one layer. This assignment is optimized by minimizing a global cost function with a graph-based technique.
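To make the affine motion model concrete, the following is a minimal sketch of how its six parameters could be initialized from a sparse set of correspondences inside one segment. It assumes the common parameterization u = a0 + a1*x + a2*y, v = b0 + b1*x + b2*y and a plain least-squares fit; the function names (fit_affine_flow, affine_flow, solve3) are hypothetical, and the thesis may use a different, more robust estimator.

```python
def solve3(m, b):
    """Solve the 3x3 system m * p = b by Gaussian elimination with partial pivoting."""
    a = [list(row) + [rhs] for row, rhs in zip(m, b)]  # augmented matrix
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate correspondences (e.g. collinear points)")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    p = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        s = a[r][n] - sum(a[r][c] * p[c] for c in range(r + 1, n))
        p[r] = s / a[r][r]
    return p


def fit_affine_flow(points, flows):
    """Least-squares fit of an affine flow model to sparse correspondences.

    points: list of (x, y) positions in the reference frame
    flows:  list of (u, v) displacements measured at those positions
    Returns ([a0, a1, a2], [b0, b1, b2]) with
    u = a0 + a1*x + a2*y and v = b0 + b1*x + b2*y.
    """
    if len(points) != len(flows) or len(points) < 3:
        raise ValueError("need at least three correspondences")
    ata = [[0.0] * 3 for _ in range(3)]  # normal-equation matrix A^T A
    atu = [0.0] * 3                      # A^T u
    atv = [0.0] * 3                      # A^T v
    for (x, y), (u, v) in zip(points, flows):
        row = (1.0, x, y)
        for i in range(3):
            atu[i] += row[i] * u
            atv[i] += row[i] * v
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return solve3(ata, atu), solve3(ata, atv)


def affine_flow(params, x, y):
    """Evaluate the fitted model at pixel (x, y), giving a dense flow value."""
    (a0, a1, a2), (b0, b1, b2) = params
    return a0 + a1 * x + a2 * y, b0 + b1 * x + b2 * y


# Usage with made-up correspondences for a single segment.
pts = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
flw = [(1.0, 0.5), (2.0, 0.5), (1.0, 1.5), (2.0, 1.5)]
model = fit_affine_flow(pts, flw)
u, v = affine_flow(model, 5.0, 5.0)
```

Once fitted, the model yields a flow vector for every pixel of the segment, which is what allows a sparse set of matches to produce a dense field even in weakly textured regions.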

The cost function is defined on the pixel level as well as on the segment level. On the pixel level, a data term measures pixel similarity under the current flow field. Furthermore, occluded pixels are detected symmetrically. The segment level is connected to the pixel level in such a way that the segmentation information is enforced on the pixel level. Additionally, a smoothness term is defined on the segment level.
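As a rough illustration of how a segment-to-layer assignment could be scored, the sketch below evaluates a simplified cost made of a data term plus a segment-level smoothness term. The exact form is an assumption, since the abstract does not state it: data costs are taken as already aggregated per segment and layer, smoothness is a Potts-style penalty weighted by shared border length, and the occlusion handling and pixel-to-segment coupling described above are omitted. All names (assignment_cost, data_cost, neighbors, smooth_weight) are hypothetical.

```python
def assignment_cost(data_cost, neighbors, labels, smooth_weight):
    """Evaluate a simplified cost of assigning each segment to one layer.

    data_cost[s][l]: assumed pre-aggregated pixel dissimilarity of segment s
                     when warped with the motion model of layer l
    neighbors:       list of (s, t, border_length) for adjacent segments
    labels[s]:       layer index currently assigned to segment s
    smooth_weight:   weight of the segment-level smoothness term
    """
    # Data term: how well each segment's assigned layer explains its pixels.
    data = sum(data_cost[s][labels[s]] for s in range(len(labels)))
    # Smoothness term: penalize adjacent segments placed in different layers,
    # in proportion to the length of their common border.
    smooth = sum(
        smooth_weight * border
        for s, t, border in neighbors
        if labels[s] != labels[t]
    )
    return data + smooth


# Usage with made-up numbers: three segments, two layers.
costs = [[1.0, 5.0], [4.0, 2.0], [3.0, 3.5]]
adjacency = [(0, 1, 10.0), (1, 2, 4.0)]
split = assignment_cost(costs, adjacency, [0, 1, 1], smooth_weight=0.5)
merged = assignment_cost(costs, adjacency, [0, 0, 0], smooth_weight=0.5)
```

The sketch only evaluates a given assignment; finding the assignment that minimizes the cost is the job of the graph-based optimization, which is not reproduced here.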

Furthermore, we allow our algorithm to use multiple input frames in order to discriminate the motion of different layers when the interframe motion is small.

Finally, we demonstrate the good performance and robustness of our approach with results obtained from standard test sequences as well as one self-recorded video.

Reference

Master Thesis