Urban Scene Understanding

Sven Sickert, Clemens-Alexander Brust, Marcel Simon, and Erik Rodner

Incorporating Spatial Priors in CNNs

Classifying single image patches is important in many different applications, such as road detection or scene understanding. In this paper, we present convolutional patch networks, which are convolutional networks learned to distinguish different image patches and which can be used for pixel-wise labeling. We also show how to incorporate spatial information of the patch as an input to the network, which allows for learning spatial priors for certain categories jointly with an appearance model. In particular, we focus on road detection and urban scene understanding, two application areas where we are able to achieve state-of-the-art results on the KITTI as well as on the LabelMeFacade dataset. Furthermore, our paper offers a guideline for people working in the area and desperately wandering through all the painstaking details that render training convolutional networks on image patches extremely difficult.
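To make the idea of a spatial input concrete, here is a minimal sketch of how a patch could be paired with the position of its center pixel before being passed to a network. The function name, the zero padding at the image border, and the normalization of coordinates to [0, 1] are assumptions made for illustration; they are not taken from the paper or its released code.

```python
def patch_with_position(image, row, col, half_size):
    """Return (patch, (y_norm, x_norm)) for the pixel at (row, col).

    `image` is a list of rows of grayscale values. Pixels outside the
    image are filled with zeros (an assumed padding choice).
    """
    height, width = len(image), len(image[0])
    patch = []
    for r in range(row - half_size, row + half_size + 1):
        patch_row = []
        for c in range(col - half_size, col + half_size + 1):
            inside = 0 <= r < height and 0 <= c < width
            patch_row.append(image[r][c] if inside else 0)
        patch.append(patch_row)
    # Normalized center coordinates in [0, 1]; these would be the extra
    # inputs from which a network could learn a spatial prior.
    y_norm = row / (height - 1) if height > 1 else 0.0
    x_norm = col / (width - 1) if width > 1 else 0.0
    return patch, (y_norm, x_norm)


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
patch, position = patch_with_position(image, 0, 2, 1)
print(patch)
print(position)
```

With such an input, a category like road, which mostly appears in the lower part of an image, can be favored there without hand-crafting the prior.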

Code and dataset are available on GitHub! More information on the dataset can be found in our section on datasets.

Exploitation of Context Cues using Iterative Context Forests

In this paper, we present a new combined approach for feature extraction, classification, and context modelling in an iterative framework based on random decision trees and a huge number of features. A major focus of this paper is to integrate different kinds of feature types like color, geometric context, and auto context features in a joint, flexible and fast manner. Furthermore, we perform an in-depth analysis of multiple feature extraction methods and different feature types. Extensive experiments are performed on challenging facade recognition datasets, where we show that our approach significantly outperforms previous approaches with a performance gain of more than 15% on the most difficult dataset.

The method itself is also suitable for anytime classification scenarios, where the challenge is to estimate a label for each pixel in an image while allowing an interruption of the estimation at any time. This makes the introduced method applicable to time-critical tasks, like automotive applications, where the available computational resources are limited and not known in advance. Label estimation is done in an iterative manner and includes spatial context right from the beginning.
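The anytime property can be pictured as a refinement loop that always holds a complete labeling and can be stopped between iterations. The following is only a schematic sketch under that reading: `anytime_labeling`, `refine_steps`, and `should_stop` are hypothetical names, and the refinement steps stand in for whatever one iteration of the actual method computes.

```python
def anytime_labeling(initial_labels, refine_steps, should_stop):
    """Refine a per-pixel labeling step by step until interrupted.

    initial_labels: a first, coarse label estimate for every pixel.
    refine_steps:   callables, each mapping a labeling to a refined one.
    should_stop:    callable returning True once the caller wants the
                    result (for example, when a time budget is used up).
    Returns the latest complete labeling and the number of finished steps.
    """
    labels = list(initial_labels)
    completed = 0
    for step in refine_steps:
        if should_stop():
            break  # interrupted: the current labeling is still valid
        labels = step(labels)
        completed += 1
    return labels, completed


# Toy usage: each step adds 1 to every label, and the caller's budget
# allows only two steps (placeholders, not real classifier output).
budget = {"calls": 0}

def out_of_budget():
    budget["calls"] += 1
    return budget["calls"] > 2

steps = [lambda ls: [l + 1 for l in ls]] * 3
print(anytime_labeling([0, 0], steps, out_of_budget))
```

The point of the design is that an interruption never leaves pixels unlabeled; it only limits how far the estimate has been refined.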

Large-scale Gaussian Process Inference for Semantic Segmentation

Semantic interpretation and understanding of images is an important goal of visual recognition research and offers a large variety of possible applications. One step towards this goal is semantic segmentation, which aims for automatic labeling of image regions and pixels with category names. Since typical images contain several million pixels, the use of kernel-based methods for the task of semantic segmentation is limited due to the computation time involved. In this paper, we overcome this drawback by exploiting efficient kernel calculations using the histogram intersection kernel for fast and exact Gaussian process classification. Our results show that non-parametric Bayesian methods can be utilized for semantic segmentation without sparse approximation techniques. Furthermore, in experiments, we show a significant benefit in terms of classification accuracy compared to state-of-the-art methods.
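For reference, the histogram intersection kernel between two feature histograms x and y is the sum over all bins of min(x_i, y_i). The sketch below evaluates it naively and builds a full kernel matrix. It only illustrates the kernel itself and does not reproduce the efficient calculations the abstract refers to; the function names are chosen for this example.

```python
def histogram_intersection(x, y):
    """Histogram intersection kernel: sum of bin-wise minima."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(x, y))


def kernel_matrix(histograms):
    """Naive kernel matrix over a list of histograms.

    The cost grows quadratically with the number of examples, which is
    the bottleneck for images with millions of pixels.
    """
    return [[histogram_intersection(x, y) for y in histograms]
            for x in histograms]


# Toy histograms with integer bin counts.
features = [[2, 5, 3], [1, 6, 3]]
print(histogram_intersection(features[0], features[1]))
print(kernel_matrix(features))
```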

Clemens-Alexander Brust, Sven Sickert, Marcel Simon, Erik Rodner, Joachim Denzler:
Convolutional Patch Networks with Spatial Prior for Road Detection and Urban Scene Understanding.
International Conference on Computer Vision Theory and Applications (VISAPP). Pages 510-517. 2015.
Clemens-Alexander Brust, Sven Sickert, Marcel Simon, Erik Rodner, Joachim Denzler:
Efficient Convolutional Patch Networks for Scene Understanding.
CVPR Workshop on Scene Understanding (CVPR-WS). 2015. Poster presentation and extended abstract.
Björn Fröhlich, Erik Rodner, Michael Kemmler, Joachim Denzler:
Large-Scale Gaussian Process Multi-Class Classification for Semantic Segmentation and Facade Recognition.
Machine Vision and Applications. 24(5). Pages 1043-1053. 2013.
Alexander Freytag, Björn Fröhlich, Erik Rodner, Joachim Denzler:
Efficient Semantic Segmentation with Gaussian Processes and Histogram Intersection Kernels.
International Conference on Pattern Recognition (ICPR). Pages 3313-3316. 2012.
Björn Fröhlich, Erik Rodner, Joachim Denzler:
As Time Goes By: Anytime Semantic Segmentation with Iterative Context Forests.
Symposium of the German Association for Pattern Recognition (DAGM). Pages 1-10. 2012.
Björn Fröhlich, Erik Rodner, Joachim Denzler:
Semantic Segmentation with Millions of Features: Integrating Multiple Cues in a Combined Random Forest Approach.
Asian Conference on Computer Vision (ACCV). Pages 218-231. 2012.
Björn Fröhlich, Erik Rodner, Joachim Denzler:
A Fast Approach for Pixelwise Labeling of Facade Images.
International Conference on Pattern Recognition (ICPR). Pages 3029-3032. 2010.