Constellation Models Revisited

Team

Marcel Simon and Erik Rodner

Motivation

The algorithm we developed automatically distinguishes between very similar object categories, such as different types of birds, flowers, and dog breeds. It builds on computer vision and machine learning techniques that learn the appearance of object categories from a given set of images together with their annotations. Recent ideas from deep learning make it possible to estimate very complex visual models and boost recognition accuracy up to 82% on a dataset with 200 different bird categories. Would you be able to distinguish them?

Method

Part models of object categories are essential for challenging recognition tasks in which the differences between categories are subtle and only reflected in the appearance of small parts of the object. We present an approach that learns part models in a completely unsupervised manner, without part annotations and even without bounding boxes during learning. The key idea is to find constellations of neural activation patterns computed using convolutional neural networks. In our experiments, we outperform existing approaches for fine-grained recognition on the CUB200-2011, Oxford PETS, and Oxford Flowers datasets when no part or bounding box annotations are available, and we achieve state-of-the-art performance on the Stanford Dog dataset. We also show the benefits of neural constellation models as a data augmentation technique for fine-tuning. Furthermore, our paper unites the areas of generic and fine-grained classification, since our approach is suitable for both scenarios.
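
As a rough illustration of what a "neural activation pattern" can provide, the following Python sketch treats each channel of a convolutional feature map as a candidate part detector and takes the location of its strongest response as a part proposal. This is a simplified assumption for illustration only: the page does not state how parts are localized or selected, and the function names (part_proposal, constellation, to_image_coords) are hypothetical and not taken from the released Matlab code.

```python
from typing import List, Tuple

# One channel of a convolutional feature map: rows x cols of activations.
FeatureMap = List[List[float]]


def part_proposal(channel: FeatureMap) -> Tuple[int, int]:
    """Return (row, col) of the strongest activation in one channel.

    Assumption: the peak response marks the candidate part location.
    """
    best_loc = (0, 0)
    best_val = float("-inf")
    for r, row in enumerate(channel):
        for c, val in enumerate(row):
            if val > best_val:
                best_val = val
                best_loc = (r, c)
    return best_loc


def constellation(feature_maps: List[FeatureMap],
                  channels: List[int]) -> List[Tuple[int, int]]:
    """Collect one part proposal per selected channel.

    Which channels to select is left to the caller; a real system
    would learn this selection from training images.
    """
    return [part_proposal(feature_maps[k]) for k in channels]


def to_image_coords(loc: Tuple[int, int],
                    map_size: Tuple[int, int],
                    image_size: Tuple[int, int]) -> Tuple[int, int]:
    """Map a feature-map cell to the pixel at the center of that cell."""
    r, c = loc
    map_h, map_w = map_size
    img_h, img_w = image_size
    y = int((r + 0.5) * img_h / map_h)
    x = int((c + 0.5) * img_w / map_w)
    return (y, x)


# Toy example: two channels of a 2 x 3 feature map.
maps = [
    [[0.1, 0.9, 0.2], [0.0, 0.3, 0.4]],
    [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]],
]
parts = constellation(maps, channels=[0, 1])
pixels = [to_image_coords(p, (2, 3), (200, 300)) for p in parts]
```

Image patches cropped around such locations could then be described by CNN features and passed to a classifier, or used as additional training crops for fine-tuning.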

Selected Results

Results on CUB200-2011:

Training annotation   Test annotation   Approach           Accuracy
None                  None              Xiao et al. [1]    77.9%
None                  None              No parts           71.9%
None                  None              Ours (rand.)       79.4%
None                  None              Ours (const.)      81.0%

Results on NABirds:

Training annotation   Test annotation   Approach           Accuracy
Parts                 Parts             Horn et al. [2]    75.0%
None                  None              No parts           63.9%
None                  None              Ours (const.)      76.3%

Code

Easy-to-use Matlab code for the approach: GitHub

Fine-tuned models that were used in the paper: GoogleDrive

Further Reading

Further work on fine-grained recognition, including more methods and projects, can be found on the fine-grained project page. The current project is funded by the DFG. The Computer Vision Group of Prof. Joachim Denzler is currently also working on applying these methods to biodiversity monitoring. Information about preliminary work can be found here.

References

[1] T. Xiao, Y. Xu, K. Yang, J. Zhang, Y. Peng, and Z. Zhang. The application of two-level attention models in deep convolutional neural network for fine-grained image classification. In CVPR, 2015.

[2] G. Van Horn, S. Branson, R. Farrell, S. Haber, J. Barry, P. Ipeirotis, P. Perona, and S. Belongie. Building a bird recognition app and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection. In CVPR, pages 595-604, 2015.
