Active Learning
Team
Niklas Penzel, Clemens-Alexander Brust, Paul Bodesheim
Motivation
Although labeled data lies at the very core of most computer vision systems, obtaining labeled data that is useful and reliable is commonly a crucial problem. To reduce the amount of manual labeling, active learning techniques aim at explicitly picking samples that are actually worth labeling with respect to the problem at hand. In this area of research, we are interested in modeling the “worthiness” of an unlabeled sample and in applying our algorithms to human-in-the-loop recognition systems.
Publications
2022
Paul Bodesheim, Jan Blunk, Matthias Körschens, Clemens-Alexander Brust, Christoph Käding, Joachim Denzler:
Pre-trained models are not enough: active and lifelong learning is important for long-term visual monitoring of mammals in biodiversity research. Individual identification and attribute prediction with image features from deep neural networks and decoupled decision models applied to elephants and great apes.
Mammalian Biology. 102 : pp. 875-897. 2022.
[bibtex] [web] [doi] [abstract]
Animal re-identification based on image data, either recorded manually by photographers or automatically with camera traps, is an important task for ecological studies about biodiversity and conservation that can be highly automatized with algorithms from computer vision and machine learning. However, fixed identification models only trained with standard datasets before their application will quickly reach their limits, especially for long-term monitoring with changing environmental conditions, varying visual appearances of individuals over time that differ a lot from those in the training data, and new occurring individuals that have not been observed before. Hence, we believe that active learning with human-in-the-loop and continuous lifelong learning is important to tackle these challenges and to obtain high-performance recognition systems when dealing with huge amounts of additional data that become available during the application. Our general approach with image features from deep neural networks and decoupled decision models can be applied to many different mammalian species and is perfectly suited for continuous improvements of the recognition systems via lifelong learning. In our identification experiments, we consider four different taxa, namely two elephant species: African forest elephants and Asian elephants, as well as two species of great apes: gorillas and chimpanzees. Going beyond classical re-identification, our decoupled approach can also be used for predicting attributes of individuals such as gender or age using classification or regression methods. Although applicable for small datasets of individuals as well, we argue that even better recognition performance will be achieved by improving decision models gradually via lifelong learning to exploit huge datasets and continuous recordings from long-term applications. We highlight that algorithms for deploying lifelong learning in real observational studies exist and are ready for use. 
Hence, lifelong learning might become a valuable concept that supports practitioners when analyzing large-scale image data during long-term monitoring of mammals.
2021
Clemens-Alexander Brust, Björn Barz, Joachim Denzler:
Self-Supervised Learning from Semantically Imprecise Data.
arXiv preprint arXiv:2104.10901. 2021.
[bibtex] [pdf] [abstract]
Learning from imprecise labels such as "animal" or "bird", but making precise predictions like "snow bunting" at test time is an important capability when expertly labeled training data is scarce. Contributions by volunteers or results of web crawling lack precision in this manner, but are still valuable. And crucially, these weakly labeled examples are available in larger quantities for lower cost than high-quality bespoke training data. CHILLAX, a recently proposed method to tackle this task, leverages a hierarchical classifier to learn from imprecise labels. However, it has two major limitations. First, it is not capable of learning from effectively unlabeled examples at the root of the hierarchy, e.g. "object". Second, an extrapolation of annotations to precise labels is only performed at test time, where confident extrapolations could be already used as training data. In this work, we extend CHILLAX with a self-supervised scheme using constrained extrapolation to generate pseudo-labels. This addresses the second concern, which in turn solves the first problem, enabling an even weaker supervision requirement than CHILLAX. We evaluate our approach empirically and show that our method allows for a consistent accuracy improvement of 0.84 to 1.19 percent points over CHILLAX and is suitable as a drop-in replacement without any negative consequences such as longer training times.
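The idea of turning an imprecise annotation into a precise pseudo-label can be sketched as follows. This is only an illustration under stated assumptions: the toy hierarchy, the function names, the renormalization within the annotated subtree, and the fixed confidence threshold are all hypothetical and are not taken from CHILLAX or from the paper.

```python
# Hypothetical toy hierarchy: child -> parent ("object" is the root).
PARENT = {
    "animal": "object",
    "bird": "animal",
    "dog": "animal",
    "snow bunting": "bird",
    "blue jay": "bird",
}


def is_descendant(label, ancestor):
    """True if `label` equals `ancestor` or lies below it in the hierarchy."""
    while label is not None:
        if label == ancestor:
            return True
        label = PARENT.get(label)
    return False


def extrapolate(imprecise_label, leaf_probs, threshold=0.8):
    """Return a precise pseudo-label, or keep the imprecise label.

    `leaf_probs` maps leaf classes to predicted probabilities.
    The threshold value is an assumption made for this sketch.
    """
    # Constrain candidates to leaves consistent with the given annotation.
    candidates = {leaf: p for leaf, p in leaf_probs.items()
                  if is_descendant(leaf, imprecise_label)}
    total = sum(candidates.values())
    if total == 0:
        return imprecise_label
    best = max(candidates, key=candidates.get)
    # Renormalize within the allowed subtree before thresholding.
    if candidates[best] / total >= threshold:
        return best
    return imprecise_label


probs = {"snow bunting": 0.45, "blue jay": 0.05, "dog": 0.50}
print(extrapolate("bird", probs))    # confident within the "bird" subtree
print(extrapolate("animal", probs))  # not confident enough, label is kept
```

In this toy setting, an example labeled only at the root ("object") is handled by the same rule, because every leaf is a valid candidate.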
Daphne Auer, Paul Bodesheim, Christian Fiderer, Marco Heurich, Joachim Denzler:
Minimizing the Annotation Effort for Detecting Wildlife in Camera Trap Images with Active Learning.
Computer Science for Biodiversity Workshop (CS4Biodiversity), INFORMATIK 2021. Pages 547-564. 2021.
[bibtex] [pdf] [doi] [abstract]
Analyzing camera trap images is a challenging task due to complex scene structures at different locations, heavy occlusions, and varying sizes of animals. One particular problem is the large fraction of images only showing background scenes, which are recorded when a motion detector gets triggered by signals other than animal movements. To identify these background images automatically, an active learning approach is used to train binary classifiers with small amounts of labeled data, keeping the annotation effort of humans minimal. By training classifiers for single sites or small sets of camera traps, we follow a region-based approach and particularly focus on distinct models for daytime and nighttime images. Our approach is evaluated on camera trap images from the Bavarian Forest National Park. Comparable or even superior performances to publicly available detectors trained with millions of labeled images are achieved while requiring significantly smaller amounts of annotated training images.
Niklas Penzel, Christian Reimers, Clemens-Alexander Brust, Joachim Denzler:
Investigating the Consistency of Uncertainty Sampling in Deep Active Learning.
DAGM German Conference on Pattern Recognition (DAGM-GCPR). Pages 159-173. 2021.
[bibtex] [pdf] [web] [doi] [abstract]
Uncertainty sampling is a widely used active learning strategy to select unlabeled examples for annotation. However, previous work hints at weaknesses of uncertainty sampling when combined with deep learning, where the amount of data is even more significant. To investigate these problems, we analyze the properties of the latent statistical estimators of uncertainty sampling in simple scenarios. We prove that uncertainty sampling converges towards some decision boundary. Additionally, we show that it can be inconsistent, leading to incorrect estimates of the optimal latent boundary. The inconsistency depends on the latent class distribution, more specifically on the class overlap. Further, we empirically analyze the variance of the decision boundary and find that the performance of uncertainty sampling is also connected to the class regions overlap. We argue that our findings could be the first step towards explaining the poor performance of uncertainty sampling combined with deep models.
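As a minimal sketch of the strategy analyzed in this paper, uncertainty sampling for a binary classifier can be written as selecting the unlabeled examples whose predicted positive-class probability is closest to 0.5. The function name and the toy probabilities are illustrative assumptions and do not come from the paper.

```python
def uncertainty_sampling(probabilities, batch_size=1):
    """Return indices of the `batch_size` most uncertain examples.

    `probabilities` holds the predicted probability of the positive class
    for each unlabeled example (binary classification).
    """
    # Distance from 0.5 measures confidence; smaller means more uncertain.
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: abs(probabilities[i] - 0.5))
    return ranked[:batch_size]


# Hypothetical model outputs for a pool of five unlabeled examples.
pool_probabilities = [0.95, 0.52, 0.10, 0.40, 0.75]
selected = uncertainty_sampling(pool_probabilities, batch_size=2)
print(selected)
```

Because the selected examples always lie near the current decision boundary, the labeled set is not an i.i.d. sample of the data, which is the kind of setting in which the consistency questions studied above arise.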
2020
Clemens-Alexander Brust, Christoph Käding, Joachim Denzler:
Active and Incremental Learning with Weak Supervision.
Künstliche Intelligenz (KI). 2020.
[bibtex] [pdf] [doi] [abstract]
Large amounts of labeled training data are one of the main contributors to the great success that deep models have achieved in the past. Label acquisition for tasks other than benchmarks can pose a challenge due to requirements of both funding and expertise. By selecting unlabeled examples that are promising in terms of model improvement and only asking for respective labels, active learning can increase the efficiency of the labeling process in terms of time and cost. In this work, we describe combinations of an incremental learning scheme and methods of active learning. These allow for continuous exploration of newly observed unlabeled data. We describe selection criteria based on model uncertainty as well as expected model output change (EMOC). An object detection task is evaluated in a continuous exploration context on the PASCAL VOC dataset. We also validate a weakly supervised system based on active and incremental learning in a real-world biodiversity application where images from camera traps are analyzed. Labeling only 32 images by accepting or rejecting proposals generated by our method yields an increase in accuracy from 25.4% to 42.6%.
2019
Björn Barz, Christoph Käding, Joachim Denzler:
Information-Theoretic Active Learning for Content-Based Image Retrieval.
DAGM German Conference on Pattern Recognition (DAGM-GCPR). Pages 650-666. 2019.
[bibtex] [pdf] [doi] [code] [supplementary] [abstract]
We propose Information-Theoretic Active Learning (ITAL), a novel batch-mode active learning method for binary classification, and apply it for acquiring meaningful user feedback in the context of content-based image retrieval. Instead of combining different heuristics such as uncertainty, diversity, or density, our method is based on maximizing the mutual information between the predicted relevance of the images and the expected user feedback regarding the selected batch. We propose suitable approximations to this computationally demanding problem and also integrate an explicit model of user behavior that accounts for possible incorrect labels and unnameable instances. Furthermore, our approach does not only take the structure of the data but also the expected model output change caused by the user feedback into account. In contrast to other methods, ITAL turns out to be highly flexible and provides state-of-the-art performance across various datasets, such as MIRFLICKR and ImageNet.
Clemens-Alexander Brust, Christoph Käding, Joachim Denzler:
Active Learning for Deep Object Detection.
International Conference on Computer Vision Theory and Applications (VISAPP). Pages 181-190. 2019.
[bibtex] [pdf] [doi] [abstract]
The great success that deep models have achieved in the past is mainly owed to large amounts of labeled training data. However, the acquisition of labeled data for new tasks aside from existing benchmarks is both challenging and costly. Active learning can make the process of labeling new data more efficient by selecting unlabeled samples which, when labeled, are expected to improve the model the most. In this paper, we combine a novel method of active learning for object detection with an incremental learning scheme to enable continuous exploration of new unlabeled datasets. We propose a set of uncertainty-based active learning metrics suitable for most object detectors. Furthermore, we present an approach to leverage class imbalances during sample selection. All methods are evaluated systematically in a continuous exploration context on the PASCAL VOC 2012 dataset.
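One way to obtain an image-level uncertainty score from an object detector is to score each detection individually and then aggregate over all detections in the image. The sketch below uses a 1-vs-2 margin per detection and sum, average, or maximum aggregation as illustrative choices; the function names, the handling of images without detections, and the toy class distributions are assumptions and are not necessarily the exact metrics of the paper.

```python
def margin_uncertainty(class_probs):
    """1-vs-2 margin score for one detection: high when the two most
    probable classes are close together."""
    top1, top2 = sorted(class_probs, reverse=True)[:2]
    return (1.0 - (top1 - top2)) ** 2


def image_score(detections, aggregate="sum"):
    """Aggregate per-detection uncertainties into one score per image."""
    scores = [margin_uncertainty(p) for p in detections]
    if not scores:
        return 0.0  # assumption: no detections means no uncertainty
    if aggregate == "sum":
        return sum(scores)
    if aggregate == "avg":
        return sum(scores) / len(scores)
    if aggregate == "max":
        return max(scores)
    raise ValueError(f"unknown aggregation: {aggregate}")


# Two hypothetical images with one class distribution per detection.
image_a = [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1]]
image_b = [[0.5, 0.4, 0.1]]
ranking = sorted({"a": image_a, "b": image_b}.items(),
                 key=lambda kv: image_score(kv[1], "max"), reverse=True)
print([name for name, _ in ranking])  # images ordered by labeling priority
```

The choice of aggregation matters: summing favors images with many detections, while averaging or taking the maximum does not depend on the number of detections in the same way.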
2017
Clemens-Alexander Brust, Christoph Käding, Joachim Denzler:
You Have To Look More Than Once: Active and Continuous Exploration using YOLO.
CVPR Workshop on Continuous and Open-Set Learning (CVPR-WS). 2017. Poster presentation and extended abstract
[bibtex] [abstract]
Traditionally, most research in the area of object detection builds on models trained once on reliable labeled data for a predefined application. However, in many application scenarios, new data becomes available over time or the distribution underlying the problem changes itself. In this case, models are usually retrained from scratch or refined via fine-tuning or incremental learning. For most applications, acquiring new labels is the limiting factor in terms of effort or costs. Active learning aims to minimize the labeling effort by selecting only valuable samples for annotation. It is widely studied in classification tasks, where different measures of uncertainty are the most common choice for selection. We combine the deep object detector YOLO with active learning and an incremental learning scheme to build an object detection system suitable for active and continuous exploration and open-set problems by querying whole images for annotation rather than single proposals.
Erik Rodner, Alexander Freytag, Paul Bodesheim, Björn Fröhlich, Joachim Denzler:
Large-Scale Gaussian Process Inference with Generalized Histogram Intersection Kernels for Visual Recognition Tasks.
International Journal of Computer Vision (IJCV). 121 (2) : pp. 253-280. 2017.
[bibtex] [pdf] [web] [doi] [abstract]
We present new methods for fast Gaussian process (GP) inference in large-scale scenarios including exact multi-class classification with label regression, hyperparameter optimization, and uncertainty prediction. In contrast to previous approaches, we use a full Gaussian process model without sparse approximation techniques. Our methods are based on exploiting generalized histogram intersection kernels and their fast kernel multiplications. We empirically validate the suitability of our techniques in a wide range of scenarios with tens of thousands of examples. Whereas plain GP models are intractable due to both memory consumption and computation time in these settings, our results show that exact inference can indeed be done efficiently. In consequence, we enable every important piece of the Gaussian process framework - learning, inference, hyperparameter optimization, variance estimation, and online learning - to be used in realistic scenarios with more than a handful of data.
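For orientation, the histogram intersection kernel sums the element-wise minima of two feature vectors, and a generalized variant applies a feature-wise transform before taking the minimum. The sketch below shows only the kernel evaluation, not the fast inference techniques of the paper; the function name, the optional `transform` argument, and the square-root example are assumptions made for illustration.

```python
def hik(x, y, transform=None):
    """Histogram intersection kernel between two non-negative vectors.

    If `transform` is given, it is applied to every feature first, which
    gives a generalized variant (the transform here is a hypothetical choice).
    """
    g = transform if transform is not None else (lambda v: v)
    return sum(min(g(a), g(b)) for a, b in zip(x, y))


# Two toy histograms that each sum to one.
h1 = [0.2, 0.5, 0.3]
h2 = [0.4, 0.4, 0.2]
print(hik(h1, h2))                      # plain intersection
print(hik(h1, h2, lambda v: v ** 0.5))  # with a square-root transform
```

Since the minimum acts on each dimension separately, kernel sums can be organized per feature dimension, which is the structural property that fast kernel multiplications build on.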
2016
Christoph Käding, Alexander Freytag, Erik Rodner, Andrea Perino, Joachim Denzler:
Large-scale Active Learning with Approximated Expected Model Output Changes.
DAGM German Conference on Pattern Recognition (DAGM-GCPR). Pages 179-191. 2016.
[bibtex] [pdf] [web] [doi] [code] [supplementary] [abstract]
Incremental learning of visual concepts is one step towards reaching human capabilities beyond closed-world assumptions. Besides recent progress, it remains one of the fundamental challenges in computer vision and machine learning. Along that path, techniques are needed which allow for actively selecting informative examples from a huge pool of unlabeled images to be annotated by application experts. Whereas a manifold of active learning techniques exists, they commonly suffer from one of two drawbacks: (i) either they do not work reliably on challenging real-world data or (ii) they are kernel-based and not scalable with the magnitudes of data current vision applications need to deal with. Therefore, we present an active learning and discovery approach which can deal with huge collections of unlabeled real-world data. Our approach is based on the expected model output change principle and overcomes previous scalability issues. We present experiments on the large-scale MS-COCO dataset and on a dataset provided by biodiversity researchers. Obtained results reveal that our technique clearly improves accuracy after just a few annotations. At the same time, it outperforms previous active learning approaches in academic and real-world scenarios.
Christoph Käding, Erik Rodner, Alexander Freytag, Joachim Denzler:
Active and Continuous Exploration with Deep Neural Networks and Expected Model Output Changes.
NIPS Workshop on Continual Learning and Deep Networks (NIPS-WS). 2016.
[bibtex] [pdf] [web] [abstract]
The demands on visual recognition systems do not end with the complexity offered by current large-scale image datasets, such as ImageNet. In consequence, we need curious and continuously learning algorithms that actively acquire knowledge about semantic concepts which are present in available unlabeled data. As a step towards this goal, we show how to perform continuous active learning and exploration, where an algorithm actively selects relevant batches of unlabeled examples for annotation. These examples could either belong to already known or to yet undiscovered classes. Our algorithm is based on a new generalization of the Expected Model Output Change principle for deep architectures and is especially tailored to deep neural networks. Furthermore, we show easy-to-implement approximations that yield efficient techniques for active selection. Empirical experiments show that our method outperforms currently used heuristics.
Christoph Käding, Erik Rodner, Alexander Freytag, Joachim Denzler:
Watch, Ask, Learn, and Improve: A Lifelong Learning Cycle for Visual Recognition.
European Symposium on Artificial Neural Networks (ESANN). Pages 381-386. 2016.
[bibtex] [pdf] [code] [presentation] [abstract]
We present WALI, a prototypical system that learns object categories over time by continuously watching online videos. WALI actively asks questions to a human annotator about the visual content of observed video frames. Thereby, WALI is able to receive information about new categories and to simultaneously improve its generalization abilities. The functionality of WALI is driven by scalable active learning, efficient incremental learning, as well as state-of-the-art visual descriptors. In our experiments, we show qualitative and quantitative statistics about WALI's learning process. WALI runs continuously and regularly asks questions.
2015
Christoph Käding, Alexander Freytag, Erik Rodner, Paul Bodesheim, Joachim Denzler:
Active Learning and Discovery of Object Categories in the Presence of Unnameable Instances.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Pages 4343-4352. 2015.
[bibtex] [pdf] [web] [doi] [code] [presentation] [supplementary] [abstract]
Current visual recognition algorithms are "hungry" for data but massive annotation is extremely costly. Therefore, active learning algorithms are required that reduce labeling efforts to a minimum by selecting examples that are most valuable for labeling. In active learning, all categories occurring in collected data are usually assumed to be known in advance and experts should be able to label every requested instance. But do these assumptions really hold in practice? Could you name all categories in every image? Existing algorithms completely ignore the fact that there are certain examples where an oracle can not provide an answer or which even do not belong to the current problem domain. Ideally, active learning techniques should be able to discover new classes and at the same time cope with queries an expert is not able or willing to label. To meet these observations, we present a variant of the expected model output change principle for active learning and discovery in the presence of unnameable instances. Our experiments show that in these realistic scenarios, our approach substantially outperforms previous active learning methods, which are often not even able to improve with respect to the baseline of random query selection.
2014
Alexander Freytag, Erik Rodner, Joachim Denzler:
Selecting Influential Examples: Active Learning with Expected Model Output Changes.
European Conference on Computer Vision (ECCV). Pages 562-577. 2014.
[bibtex] [pdf] [presentation] [supplementary] [abstract]
In this paper, we introduce a new general strategy for active learning. The key idea of our approach is to measure the expected change of model outputs, a concept that generalizes previous methods based on expected model change and incorporates the underlying data distribution. For each example of an unlabeled set, the expected change of model predictions is calculated and marginalized over the unknown label. This results in a score for each unlabeled example that can be used for active learning with a broad range of models and learning algorithms. In particular, we show how to derive very efficient active learning methods for Gaussian process regression, which implement this general strategy, and link them to previous methods. We analyze our algorithms and compare them to a broad range of previous active learning strategies in experiments showing that they outperform state-of-the-art on well-established benchmark datasets in the area of visual object recognition.
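The selection score described in this abstract can be written compactly as follows. The notation is ours and is meant as a sketch of the idea rather than the paper's exact definition: f is the current model, f_{(x', y')} is the model after additionally training on the candidate x' with label y', and the loss L (for example the absolute difference) is an assumed choice.

```latex
\Delta f(x') \;=\;
  \mathbb{E}_{y' \mid x'} \,
  \mathbb{E}_{x} \Big[ \mathcal{L}\big( f(x),\, f_{(x', y')}(x) \big) \Big],
\qquad
\mathcal{L}(a, b) = \lvert a - b \rvert
```

The outer expectation marginalizes over the unknown label of the candidate, and the inner expectation over x accounts for the data distribution; in practice it would typically be approximated by an average over the available examples. The candidate with the largest score is queried next.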
2013
Alexander Freytag, Erik Rodner, Paul Bodesheim, Joachim Denzler:
Labeling examples that matter: Relevance-Based Active Learning with Gaussian Processes.
DAGM German Conference on Pattern Recognition (DAGM-GCPR). Pages 282-291. 2013.
[bibtex] [pdf] [web] [doi] [code] [supplementary] [abstract]
Active learning is an essential tool to reduce manual annotation costs in the presence of large amounts of unsupervised data. In this paper, we introduce new active learning methods based on measuring the impact of a new example on the current model. This is done by deriving model changes of Gaussian process models in closed form. Furthermore, we study typical pitfalls in active learning and show that our methods automatically balance between the exploitation and the exploration trade-off. Experiments are performed with established benchmark datasets for visual object recognition and show that our new active learning techniques are able to outperform state-of-the-art methods.
2012
Alexander Freytag, Erik Rodner, Paul Bodesheim, Joachim Denzler:
Rapid Uncertainty Computation with Gaussian Processes and Histogram Intersection Kernels.
Asian Conference on Computer Vision (ACCV). Pages 511-524. 2012. Best Paper Honorable Mention Award
[bibtex] [pdf] [web] [doi] [presentation] [abstract]
An important advantage of Gaussian processes is the ability to directly estimate classification uncertainties in a Bayesian manner. In this paper, we develop techniques that allow for estimating these uncertainties with a runtime linear or even constant with respect to the number of training examples. Our approach makes use of all training data without any sparse approximation technique while needing only a linear amount of memory. To incorporate new information over time, we further derive online learning methods leading to significant speed-ups and allowing for hyperparameter optimization on-the-fly. We conduct several experiments on public image datasets for the tasks of one-class classification and active learning, where computing the uncertainty is an essential task. The experimental results highlight that we are able to compute classification uncertainties within microseconds even for large-scale datasets with tens of thousands of training examples.
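For reference, the standard predictive equations of Gaussian process regression are given below, assuming a kernel k, training inputs with kernel matrix K, label vector y, and Gaussian noise variance \sigma_n^2. This is the textbook form and not the accelerated computation developed in the paper.

```latex
\mu_*      = \mathbf{k}_*^{\top} \big( \mathbf{K} + \sigma_n^2 \mathbf{I} \big)^{-1} \mathbf{y},
\qquad
\sigma_*^2 = k(x_*, x_*) - \mathbf{k}_*^{\top} \big( \mathbf{K} + \sigma_n^2 \mathbf{I} \big)^{-1} \mathbf{k}_*
```

Here \mathbf{k}_* collects the kernel values between the test input x_* and the n training examples. Evaluated naively, the variance costs O(n^2) per test example once the matrix has been factorized, which is why a runtime that is linear or constant in n is a substantial improvement.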