In this paper, we address the challenge of handling non-static environments in VSLAM. We propose to include semantic information obtained by deep learning methods in the traditional geometric pipeline. Specifically, we compute a confidence measure for each map point as a function of its semantic class (car, person, building, etc.) and its detection consistency over time. This confidence then guides how each point is used in the mapping and localization stages. Points with high confidence are used to verify points with low confidence in order to select the final set of points for pose computation and mapping. Furthermore, we can handle map points whose state may change between static and dynamic (e.g., a car can be parked or in motion). Evaluating our method on public datasets, we show that it successfully handles challenging dynamic scenes that cause state-of-the-art baseline VSLAM algorithms to fail, while maintaining performance on static scenes. Code is available at github.com/mthz/slamantic
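To make the idea of a per-point confidence concrete, the following is a minimal illustrative sketch and not the implementation from the released code. It assumes, purely for illustration, that the confidence is the product of a per-class static prior and the fraction of frames in which the point was consistently re-detected. The names `STATIC_PRIOR`, `MapPoint`, `confidence`, and `split_by_confidence`, as well as all numeric values and the threshold, are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical per-class priors for a point being static (illustrative values only).
STATIC_PRIOR = {"building": 0.95, "car": 0.4, "person": 0.1}


@dataclass
class MapPoint:
    semantic_class: str
    observations: int  # frames in which the point was expected to be visible
    consistent: int    # frames in which it was consistently re-detected


def confidence(point: MapPoint, default_prior: float = 0.5) -> float:
    """Assumed form: class prior times detection consistency over time."""
    prior = STATIC_PRIOR.get(point.semantic_class, default_prior)
    if point.observations == 0:
        return 0.0  # no evidence yet, so no confidence
    return prior * point.consistent / point.observations


def split_by_confidence(points, threshold: float = 0.5):
    """Partition points into high- and low-confidence sets (threshold is assumed)."""
    high = [p for p in points if confidence(p) >= threshold]
    low = [p for p in points if confidence(p) < threshold]
    return high, low


if __name__ == "__main__":
    points = [
        MapPoint("building", observations=10, consistent=10),
        MapPoint("car", observations=10, consistent=5),
    ]
    high, low = split_by_confidence(points)
    print(len(high), len(low))
```

In such a scheme, the high-confidence set would be used first for pose estimation, and each low-confidence point would then be checked against that estimate before being admitted to the final set.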