Kim Bjerge, Paul Bodesheim, Henrik Karstoft:
Few-Shot Learning with Novelty Detection.
International Conference on Deep Learning Theory and Applications (DeLTA).
2024.
Best Paper Award
[bibtex]
[web]
[abstract]
Machine learning has achieved considerable success in data-intensive applications, yet encounters challenges when confronted with small datasets. Recently, few-shot learning (FSL) has emerged as a promising solution to address this limitation. By leveraging prior knowledge, FSL can swiftly generalize to new tasks, even when presented with only a handful of samples in an accompanying support set. This paper extends the scope of few-shot learning by incorporating novelty detection for samples of categories not present in the support set of FSL. This extension holds substantial promise for real-life applications where samples for some classes are sparse or entirely absent. Our approach involves adapting existing FSL methods with a cosine similarity function, complemented by the learning of a probabilistic threshold to distinguish between known and outlier classes. During episodic training with domain generalization, we introduce a scatter loss function designed to disentangle the distribution of similarities between known and outlier classes, thereby enhancing the separation of novel and known classes. The efficacy of the proposed method is evaluated on commonly used FSL datasets and the EU Moths dataset characterized by few samples. Our experimental results showcase accuracies ranging from 95.4% to 96.7% on the Omniglot dataset with few-shot novelty learning (FSNL). This high accuracy is observed across scenarios with 5 to 30 classes and the introduction of novel classes in each query set, underscoring the robustness and versatility of our proposed approach.
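The core decision rule of the paper — assign a query to the nearest class prototype by cosine similarity, and declare it novel when the best similarity falls below a learned threshold — can be sketched as follows (an illustrative simplification, not the paper's implementation; the prototype vectors and threshold value are hypothetical):

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def classify_with_novelty(query, prototypes, threshold):
    """Assign the query to the most similar class prototype, or flag
    it as novel when the best cosine similarity is below threshold."""
    sims = {label: cosine(query, p) for label, p in prototypes.items()}
    best = max(sims, key=sims.get)
    if sims[best] < threshold:
        return "novel", sims[best]
    return best, sims[best]
```

In episodic FSL, the prototypes would be class means of support-set embeddings; here they are just fixed toy vectors.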
Martin Thümmel, Sven Sickert, Joachim Denzler:
Facial Behavior Analysis using 4D Curvature Statistics for Presentation Attack Detection.
IEEE International Workshop on Biometrics and Forensics (IWBF).
Pages 1-6.
2021.
[bibtex]
[web]
[doi]
[code]
[abstract]
The human face has a high potential for biometric identification due to its many individual traits. At the same time, such identification is vulnerable to biometric copies. These presentation attacks pose a great challenge in unsupervised authentication settings. As a countermeasure, we propose a method that automatically analyzes the plausibility of facial behavior based on a sequence of 3D face scans. A compact feature representation measures facial behavior using the temporal curvature change. Finally, we train our method only on genuine faces in an anomaly detection scenario. Our method can detect presentation attacks using elastic 3D masks, bent photographs with eye holes, and monitor replay-attacks. For evaluation, we recorded a challenging database containing such cases using a high-quality 3D sensor. It features 109 4D face scans including eleven different types of presentation attacks. We achieve error rates of 11% and 6% for APCER and BPCER, respectively.
Niklas Penzel, Christian Reimers, Clemens-Alexander Brust, Joachim Denzler:
Investigating the Consistency of Uncertainty Sampling in Deep Active Learning.
DAGM German Conference on Pattern Recognition (DAGM-GCPR).
Pages 159-173.
2021.
[bibtex]
[pdf]
[web]
[doi]
[abstract]
Uncertainty sampling is a widely used active learning strategy to select unlabeled examples for annotation. However, previous work hints at weaknesses of uncertainty sampling when combined with deep learning, where the amount of data is even more significant. To investigate these problems, we analyze the properties of the latent statistical estimators of uncertainty sampling in simple scenarios. We prove that uncertainty sampling converges towards some decision boundary. Additionally, we show that it can be inconsistent, leading to incorrect estimates of the optimal latent boundary. The inconsistency depends on the latent class distribution, more specifically on the class overlap. Further, we empirically analyze the variance of the decision boundary and find that the performance of uncertainty sampling is also connected to the class regions overlap. We argue that our findings could be the first step towards explaining the poor performance of uncertainty sampling combined with deep models.
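For reference, the strategy under analysis — uncertainty sampling, here in its maximum-entropy variant — can be sketched as follows (a minimal illustration with a toy probabilistic classifier, not the paper's experimental setup):

```python
import math

def entropy(probs):
    """Shannon entropy of a discrete predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def uncertainty_sample(pool, predict_proba, k=1):
    """Rank unlabeled examples by predictive entropy and return the
    k most uncertain ones for annotation."""
    ranked = sorted(pool, key=lambda x: entropy(predict_proba(x)), reverse=True)
    return ranked[:k]
```

With a logistic model on 1-D inputs, the example closest to the decision boundary has the highest entropy and is selected first.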
Violeta Teodora Trifunov, Maha Shadaydeh, Björn Barz, Joachim Denzler:
Anomaly Attribution of Multivariate Time Series using Counterfactual Reasoning.
IEEE International Conference on Machine Learning and Applications (ICMLA).
Pages 166-172.
2021.
[bibtex]
[pdf]
[web]
[doi]
[abstract]
There are numerous methods for detecting anomalies in time series, but detection is only the first step toward understanding them. We go beyond detection by explaining those anomalies, developing a novel attribution scheme for multivariate time series that relies on counterfactual reasoning. We aim to answer the counterfactual question: would the anomalous event still have occurred if a subset of the involved variables had been distributed more similarly to the data outside the anomalous interval? Specifically, we detect anomalous intervals using the Maximally Divergent Interval (MDI) algorithm, replace a subset of variables with their in-distribution values within the detected interval, and observe whether the interval has become less anomalous by re-scoring it with MDI. We evaluate our method on multivariate temporal and spatio-temporal data and confirm the accuracy of our anomaly attribution for multiple well-understood extreme climate events such as heatwaves and hurricanes.
Björn Barz, Erik Rodner, Yanira Guanche Garcia, Joachim Denzler:
Detecting Regions of Maximal Divergence for Spatio-Temporal Anomaly Detection.
IEEE Transactions on Pattern Analysis and Machine Intelligence.
41 (5) :
pp. 1088-1101.
2019.
(Pre-print published in 2018.)
[bibtex]
[pdf]
[web]
[doi]
[code]
[abstract]
Automatic detection of anomalies in space- and time-varying measurements is an important tool in several fields, e.g., fraud detection, climate analysis, or healthcare monitoring. We present an algorithm for detecting anomalous regions in multivariate spatio-temporal time-series, which allows for spotting the interesting parts in large amounts of data, including video and text data. In contrast to existing techniques for detecting isolated anomalous data points, we propose the "Maximally Divergent Intervals" (MDI) framework for unsupervised detection of coherent spatial regions and time intervals characterized by a high Kullback-Leibler divergence compared with all other data given. In this regard, we define an unbiased Kullback-Leibler divergence that allows for ranking regions of different size and show how to enable the algorithm to run on large-scale data sets in reasonable time using an interval proposal technique. Experiments on both synthetic and real data from various domains, such as climate analysis, video surveillance, and text forensics, demonstrate that our method is widely applicable and a valuable tool for finding interesting events in different types of data.
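The MDI principle — score every candidate interval by the divergence between the data distribution inside and outside it — can be sketched in a univariate Gaussian setting (a drastic simplification of the paper's framework, which uses an unbiased KL variant, multivariate densities, and interval proposals for scalability):

```python
import math

def gaussian_params(xs):
    """Mean and (regularized) variance of a sample."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs) + 1e-9
    return m, v

def kl_gauss(m1, v1, m2, v2):
    """KL(N(m1,v1) || N(m2,v2)) for univariate Gaussians."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def most_divergent_interval(series, min_len=3, max_len=10):
    """Scan all intervals and return the one whose Gaussian fit
    diverges most (in KL) from the Gaussian fit of the rest."""
    best, best_score = None, -1.0
    n = len(series)
    for a in range(n):
        for b in range(a + min_len, min(a + max_len, n) + 1):
            inside = series[a:b]
            outside = series[:a] + series[b:]
            if len(outside) < 2:
                continue
            score = kl_gauss(*gaussian_params(inside), *gaussian_params(outside))
            if score > best_score:
                best, best_score = (a, b), score
    return best, best_score
```

On a flat series with an injected bump, the scan recovers exactly the bump's interval.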
Violeta Teodora Trifunov, Maha Shadaydeh, Jakob Runge, Veronika Eyring, Markus Reichstein, Joachim Denzler:
Nonlinear Causal Link Estimation under Hidden Confounding with an Application to Time-Series Anomaly Detection.
DAGM German Conference on Pattern Recognition (DAGM-GCPR).
Pages 261-273.
2019.
[bibtex]
[pdf]
[doi]
[abstract]
Causality analysis represents one of the most important tasks when examining dynamical systems such as ecological time series. We propose to mitigate the problem of inferring nonlinear cause-effect dependencies in the presence of a hidden confounder by using deep learning with domain knowledge integration. Moreover, we suggest a time series anomaly detection approach using causal link intensity increase as an indicator of the anomaly. Our proposed method is based on the Causal Effect Variational Autoencoder (CEVAE), which we extend and apply to anomaly detection in time series. We evaluate our method on synthetic data having properties of ecological time series and compare to the vector autoregressive Granger causality (VAR-GC) baseline.
Yanira Guanche, Maha Shadaydeh, Miguel Mahecha, Joachim Denzler:
Attribution of Multivariate Extreme Events.
International Workshop on Climate Informatics (CI).
2019.
[bibtex]
[pdf]
[abstract]
The detection of multivariate extreme events is crucial to monitor the Earth system and to analyze their impacts on ecosystems and society. Once an abnormal event is detected, the natural follow-up question is: what is causing this anomaly? By answering this question, we try to understand these anomalies and explain why they happened. In a previous work, the authors presented a multivariate anomaly detection approach based on the combination of a vector autoregressive model and the Mahalanobis distance metric. In this paper, we present an approach for the attribution of the detected anomalous events based on the decomposition of the Mahalanobis distance. The decomposed form of this metric provides an answer to the question: how much does each variable contribute to this distance metric? The method is applied to the extreme events detected in the land-atmosphere exchange fluxes: Gross Primary Productivity, Latent Energy, Net Ecosystem Exchange, Sensible Heat and Terrestrial Ecosystem Respiration. The attribution results of the proposed method for different known historic events are presented and compared with the univariate Z-score attribution method.
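The attribution idea — splitting the squared Mahalanobis distance into additive per-variable terms — can be sketched as follows (one common decomposition, shown purely as an illustration; the paper applies such a decomposition to the residuals of a vector autoregressive model):

```python
import numpy as np

def mahalanobis_contributions(x, mean, cov):
    """Decompose the squared Mahalanobis distance of x into additive
    per-variable contributions: d^2 = sum_i z_i * (Sigma^{-1} z)_i,
    where z = x - mean. Large contributions point to the variables
    driving the anomaly."""
    z = np.asarray(x, float) - np.asarray(mean, float)
    w = np.linalg.solve(np.asarray(cov, float), z)  # Sigma^{-1} z
    contributions = z * w
    return contributions, float(contributions.sum())
```

With a diagonal covariance, a variable's contribution is simply its squared z-score, so the decomposition generalizes the univariate Z-score attribution mentioned in the abstract.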
Yanira Guanche Garcia, Maha Shadaydeh, Miguel Mahecha, Joachim Denzler:
Extreme anomaly event detection in biosphere using linear regression and a spatiotemporal MRF model.
Natural Hazards.
pp. 1-19.
2018.
[bibtex]
[pdf]
[web]
[doi]
[abstract]
Detecting abnormal events within time series is crucial for analyzing and understanding the dynamics of the system in many research areas. In this paper, we propose a methodology to detect these anomalies in multivariate environmental data. Five biosphere variables from a preliminary version of the Earth System Data Cube have been used in this study: Gross Primary Productivity, Latent Energy, Net Ecosystem Exchange, Sensible Heat and Terrestrial Ecosystem Respiration. To tackle the spatiotemporal dependencies of the biosphere variables, the proposed methodology after preprocessing the data is divided into two steps: a feature extraction step applied to each time series in the grid independently, followed by a spatiotemporal event detection step applied to the obtained novelty scores over the entire study area. The first step is based on the assumption that the time series of each variable can be represented by an autoregressive moving average (ARMA) process, and the anomalies are those time instances that are not well represented by the estimated ARMA model. The Mahalanobis distance of the ARMA models’ multivariate residuals is used as a novelty score. In the second step, the obtained novelty scores of the entire study are treated as time series of images. Markov random fields (MRFs) provide an effective and theoretically well-established methodology for integrating spatiotemporal dependency into the classification of image time series. In this study, the classification of the novelty score images into three classes, intense anomaly, possible anomaly, and normal, is performed using unsupervised K-means clustering followed by multi-temporal MRF segmentation applied recursively on the images of each consecutive L ≥ 1 time steps. The proposed methodology was applied to an area covering Europe and Africa. Experimental results and validation based on known historic events show that the method is able to detect historic events and also provides a useful tool to define sensitive regions.
Björn Barz, Yanira Guanche, Erik Rodner, Joachim Denzler:
Maximally Divergent Intervals for Extreme Weather Event Detection.
MTS/IEEE OCEANS Conference Aberdeen.
Pages 1-9.
2017.
[bibtex]
[pdf]
[doi]
[abstract]
We approach the task of detecting anomalous or extreme events in multivariate spatio-temporal climate data using an unsupervised machine learning algorithm for detection of anomalous intervals in time-series. In contrast to many existing algorithms for outlier and anomaly detection, our method does not search for point-wise anomalies, but for contiguous anomalous intervals. We demonstrate the suitability of our approach through numerous experiments on climate data, including detection of hurricanes, North Sea storms, and low-pressure fields.
Milan Flach, Fabian Gans, Alexander Brenning, Joachim Denzler, Markus Reichstein, Erik Rodner, Sebastian Bathiany, Paul Bodesheim, Yanira Guanche, Sebastian Sippel, Miguel D. Mahecha:
Multivariate anomaly detection for Earth observations: a comparison of algorithms and feature extraction techniques.
Earth System Dynamics.
8 (3) :
pp. 677-696.
2017.
[bibtex]
[pdf]
[web]
[doi]
[abstract]
Today, many processes at the Earth's surface are constantly monitored by multiple data streams. These observations have become central to advancing our understanding of vegetation dynamics in response to climate or land use change. Another set of important applications is monitoring effects of extreme climatic events, other disturbances such as fires, or abrupt land transitions. One important methodological question is how to reliably detect anomalies in an automated and generic way within multivariate data streams, which typically vary seasonally and are interconnected across variables. Although many algorithms have been proposed for detecting anomalies in multivariate data, only a few have been investigated in the context of Earth system science applications. In this study, we systematically combine and compare feature extraction and anomaly detection algorithms for detecting anomalous events. Our aim is to identify suitable workflows for automatically detecting anomalous patterns in multivariate Earth system data streams. We rely on artificial data that mimic typical properties and anomalies in multivariate spatiotemporal Earth observations like sudden changes in basic characteristics of time series such as the sample mean, the variance, changes in the cycle amplitude, and trends. This artificial experiment is needed as there is no "gold standard" for the identification of anomalies in real Earth observations. Our results show that a well-chosen feature extraction step (e.g., subtracting seasonal cycles, or dimensionality reduction) is more important than the choice of a particular anomaly detection algorithm. Nevertheless, we identify three detection algorithms (k-nearest neighbors mean distance, kernel density estimation, a recurrence approach) and their combinations (ensembles) that outperform other multivariate approaches as well as univariate extreme-event detection methods. Our results therefore provide an effective workflow to automatically detect anomalies in Earth system science data.
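One of the three recommended detectors, the k-nearest-neighbours mean distance, can be sketched as follows (a plain brute-force illustration; the study applies such detectors after a feature extraction step such as seasonal-cycle subtraction):

```python
import math

def knn_mean_distance_scores(points, k=3):
    """Anomaly score of each point = mean Euclidean distance to its
    k nearest neighbours; isolated points receive high scores."""
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
    scores = []
    for i, p in enumerate(points):
        ds = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(ds[:k]) / k)
    return scores
```

A point far from a tight cluster ends up with a score an order of magnitude above the cluster members.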
Christoph Käding, Erik Rodner, Alexander Freytag, Joachim Denzler:
Watch, Ask, Learn, and Improve: A Lifelong Learning Cycle for Visual Recognition.
European Symposium on Artificial Neural Networks (ESANN).
Pages 381-386.
2016.
[bibtex]
[pdf]
[code]
[presentation]
[abstract]
We present WALI, a prototypical system that learns object categories over time by continuously watching online videos. WALI actively asks questions to a human annotator about the visual content of observed video frames. Thereby, WALI is able to receive information about new categories and to simultaneously improve its generalization abilities. The functionality of WALI is driven by scalable active learning, efficient incremental learning, as well as state-of-the-art visual descriptors. In our experiments, we show qualitative and quantitative statistics about WALI's learning process. WALI runs continuously and regularly asks questions.
Erik Rodner, Björn Barz, Yanira Guanche, Milan Flach, Miguel Mahecha, Paul Bodesheim, Markus Reichstein, Joachim Denzler:
Maximally Divergent Intervals for Anomaly Detection.
Workshop on Anomaly Detection (ICML-WS).
2016.
Best Paper Award
[bibtex]
[pdf]
[web]
[code]
[abstract]
We present new methods for batch anomaly detection in multivariate time series. Our methods are based on maximizing the Kullback-Leibler divergence between the data distribution within and outside an interval of the time series. An empirical analysis shows the benefits of our algorithms compared to methods that treat each time step independently from each other without optimizing with respect to all possible intervals.
Milan Flach, Miguel Mahecha, Fabian Gans, Erik Rodner, Paul Bodesheim, Yanira Guanche-Garcia, Alexander Brenning, Joachim Denzler, Markus Reichstein:
Using Statistical Process Control for detecting anomalies in multivariate spatiotemporal Earth Observations.
European Geosciences Union General Assembly (EGU): Abstract + Oral Presentation.
2016.
[bibtex]
[pdf]
[web]
[abstract]
The number of available Earth observations (EOs) is currently substantially increasing. Detecting anomalous patterns in these multivariate time series is an important step in identifying changes in the underlying dynamical system. Likewise, data quality issues might result in anomalous multivariate data constellations and have to be identified before corrupting subsequent analyses. In industrial applications, a common strategy is to monitor production chains with several sensors coupled to some statistical process control (SPC) algorithm. The basic idea is to raise an alarm when these sensor data depict some anomalous pattern according to the SPC, i.e. the production chain is considered 'out of control'. In fact, the industrial applications are conceptually similar to the on-line monitoring of EOs. However, algorithms used in the context of SPC or process monitoring are rarely considered for supervising multivariate spatio-temporal Earth observations. The objective of this study is to exploit the potential and transferability of SPC concepts to Earth system applications. We compare a range of different algorithms typically applied by SPC systems and evaluate their capability to detect e.g. known extreme events in land surface processes. Specifically, two main issues are addressed: (1) identifying the most suitable combination of data pre-processing and detection algorithm for a specific type of event and (2) analyzing the limits of the individual approaches with respect to the magnitude, spatio-temporal size of the event as well as the data's signal-to-noise ratio. Extensive artificial data sets that represent the typical properties of Earth observations are used in this study. Our results show that the majority of the algorithms used can be considered for the detection of multivariate spatiotemporal events and directly transferred to real Earth observation data as currently assembled in different projects at the European scale, e.g. http://baci-h2020.eu/index.php/ and http://earthsystemdatacube.net/. Known anomalies such as the Russian heatwave are detected as well as anomalies which are not detectable with univariate methods.
Milan Flach, Sebastian Sippel, Paul Bodesheim, Alexander Brenning, Joachim Denzler, Fabian Gans, Yanira Guanche, Markus Reichstein, Erik Rodner, Miguel D. Mahecha:
Hot spots of multivariate extreme anomalies in Earth observations.
American Geophysical Union Fall Meeting (AGU): Abstract + Oral Presentation.
2016.
[bibtex]
[web]
[abstract]
Anomalies in Earth observations might indicate data quality issues, extremes or the change of underlying processes within a highly multivariate system. Thus, considering the multivariate constellation of variables for extreme detection yields crucial additional information over conventional univariate approaches. We highlight areas in which multivariate extreme anomalies are more likely to occur, i.e. hot spots of extremes in global atmospheric Earth observations that impact the Biosphere. In addition, we present the year of the most unusual multivariate extreme between 2001 and 2013 and show that these coincide with well-known high-impact extremes. Technically speaking, we account for multivariate extremes by using three sophisticated algorithms adapted from computer science applications: an ensemble of the k-nearest neighbours mean distance, a kernel density estimation, and an approach based on recurrences. However, the impact of atmosphere extremes on the Biosphere might largely depend on what is considered to be normal, i.e. the shape of the mean seasonal cycle and its inter-annual variability. We identify regions with similar mean seasonality by means of dimensionality reduction in order to estimate in each region both the 'normal' variance and robust thresholds for detecting the extremes. In addition, we account for challenges like heteroscedasticity in Northern latitudes. Apart from hot spot areas, those anomalies in the atmosphere time series that can only be detected by a multivariate approach, but not by a simple univariate one, are of particular interest. Such an anomalous constellation of atmosphere variables is of interest if it impacts the Biosphere. The multivariate constellation of such an anomalous part of a time series is shown in one case study, indicating that multivariate anomaly detection can provide novel insights into Earth observations.
Yanira Guanche Garcia, Erik Rodner, Milan Flach, Sebastian Sippel, Miguel Mahecha, Joachim Denzler:
Detecting Multivariate Biosphere Extremes.
International Workshop on Climate Informatics (CI).
Pages 9-12.
2016.
[bibtex]
[web]
[doi]
[abstract]
The detection of anomalies in multivariate time series is crucial to identify changes in the ecosystems. We propose an intuitive methodology to assess the occurrence of tail events of multiple biosphere variables.
Christoph Käding, Alexander Freytag, Erik Rodner, Paul Bodesheim, Joachim Denzler:
Active Learning and Discovery of Object Categories in the Presence of Unnameable Instances.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Pages 4343-4352.
2015.
[bibtex]
[pdf]
[web]
[doi]
[code]
[presentation]
[supplementary]
[abstract]
Current visual recognition algorithms are "hungry" for data but massive annotation is extremely costly. Therefore, active learning algorithms are required that reduce labeling efforts to a minimum by selecting examples that are most valuable for labeling. In active learning, all categories occurring in collected data are usually assumed to be known in advance and experts should be able to label every requested instance. But do these assumptions really hold in practice? Could you name all categories in every image? Existing algorithms completely ignore the fact that there are certain examples where an oracle can not provide an answer or which even do not belong to the current problem domain. Ideally, active learning techniques should be able to discover new classes and at the same time cope with queries an expert is not able or willing to label. To meet these observations, we present a variant of the expected model output change principle for active learning and discovery in the presence of unnameable instances. Our experiments show that in these realistic scenarios, our approach substantially outperforms previous active learning methods, which are often not even able to improve with respect to the baseline of random query selection.
Paul Bodesheim, Alexander Freytag, Erik Rodner, Joachim Denzler:
Local Novelty Detection in Multi-class Recognition Problems.
IEEE Winter Conference on Applications of Computer Vision (WACV).
Pages 813-820.
2015.
[bibtex]
[pdf]
[web]
[doi]
[supplementary]
[abstract]
In this paper, we propose using local learning for multi-class novelty detection, a framework that we call local novelty detection. Estimating the novelty of a new sample is an extremely challenging task due to the large variability of known object categories. The features used to judge novelty are often very specific to the object in the image, and therefore we argue that individual novelty models for each test sample are important. Similar to human experts, it seems intuitive to first look for the most related images, thus filtering out unrelated data. Afterwards, the system focuses on discovering similarities and differences to those images only. Therefore, we claim that it is beneficial to solely consider training images most similar to a test sample when deciding about its novelty. Following the principle of local learning, for each test sample a local novelty detection model is learned and evaluated. Our local novelty score turns out to be a valuable indicator for deciding whether the sample belongs to a known category from the training set or to a new, unseen one. With our local novelty detection approach, we achieve state-of-the-art performance in multi-class novelty detection on two popular visual object recognition datasets, Caltech-256 and ImageNet. We further show that our framework: (i) can be successfully applied to unknown face detection using the Labeled-Faces-in-the-Wild dataset and (ii) outperforms recent work on attribute-based unfamiliar class detection in fine-grained recognition of bird species on the challenging CUB-200-2011 dataset.
Paul Bodesheim, Alexander Freytag, Erik Rodner, Joachim Denzler:
Approximations of Gaussian Process Uncertainties for Visual Recognition Problems.
Scandinavian Conference on Image Analysis (SCIA).
Pages 182-194.
2013.
[bibtex]
[pdf]
[web]
[doi]
[abstract]
Gaussian processes offer the advantage of calculating the classification uncertainty in terms of predictive variance associated with the classification result. This is especially useful to select informative samples in active learning and to spot samples of previously unseen classes known as novelty detection. However, the Gaussian process framework suffers from high computational complexity leading to computation times too large for practical applications. Hence, we propose an approximation of the Gaussian process predictive variance leading to rigorous speedups. The complexity of both learning and testing the classification model regarding computational time and memory demand decreases by one order with respect to the number of training samples involved. The benefits of our approximations are verified in experimental evaluations for novelty detection and active learning of visual object categories on the datasets C-Pascal of Pascal VOC 2008, Caltech-256, and ImageNet.
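The quantity being approximated — the exact GP predictive variance k** − k*ᵀ(K + σ²I)⁻¹k* — can be sketched as follows (this is the full computation the paper speeds up, not the proposed approximation; the RBF kernel and noise level are illustrative choices):

```python
import numpy as np

def rbf(a, b, ls=1.0):
    """RBF (squared-exponential) kernel matrix between two point sets."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_predictive_variance(X_train, X_test, noise=1e-2):
    """Exact GP predictive variance: k** - k*^T (K + sigma^2 I)^{-1} k*.
    Low variance = close to training data; high variance can flag
    novel samples or informative candidates for active learning."""
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf(X_train, X_test)
    Kss = rbf(X_test, X_test)
    v = np.linalg.solve(K, Ks)
    return np.diag(Kss - Ks.T @ v)
```

A test point near the training data gets a variance close to the noise level, while a far-away point reverts to the prior variance of 1, which is exactly the behavior exploited for novelty detection.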
Paul Bodesheim, Alexander Freytag, Erik Rodner, Michael Kemmler, Joachim Denzler:
Kernel Null Space Methods for Novelty Detection.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Pages 3374-3381.
2013.
[bibtex]
[pdf]
[web]
[doi]
[code]
[presentation]
[abstract]
Detecting samples from previously unknown classes is a crucial task in object recognition, especially when dealing with real-world applications where the closed-world assumption does not hold. We present how to apply a null space method for novelty detection, which maps all training samples of one class to a single point. Besides the possibility of modeling a single class, we are able to treat multiple known classes jointly and to detect novelties for a set of classes with a single model. In contrast to modeling the support of each known class individually, our approach makes use of a projection in a joint subspace where training samples of all known classes have zero intra-class variance. This subspace is called the null space of the training data. To decide about novelty of a test sample, our null space approach allows for solely relying on a distance measure instead of performing density estimation directly. Therefore, we derive a simple yet powerful method for multi-class novelty detection, an important problem not studied sufficiently so far. Our novelty detection approach is assessed in comprehensive multi-class experiments using the publicly available datasets Caltech-256 and ImageNet. The analysis reveals that our null space approach is perfectly suited for multi-class novelty detection since it outperforms all other methods.
Paul Bodesheim, Erik Rodner, Alexander Freytag, Joachim Denzler:
Divergence-Based One-Class Classification Using Gaussian Processes.
British Machine Vision Conference (BMVC).
Pages 50.1-50.11.
2012.
[bibtex]
[pdf]
[web]
[doi]
[presentation]
[abstract]
We present an information theoretic framework for one-class classification, which allows for deriving several new novelty scores. With these scores, we are able to rank samples according to their novelty and to detect outliers not belonging to a learnt data distribution. The key idea of our approach is to measure the impact of a test sample on the previously learnt model. This is carried out in a probabilistic manner using Jensen-Shannon divergence and reclassification results derived from the Gaussian process regression framework. Our method is evaluated using well-known machine learning datasets as well as large-scale image categorisation experiments showing its ability to achieve state-of-the-art performance.
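The divergence at the heart of this scoring scheme, the Jensen-Shannon divergence between the model's predictive distributions before and after reclassification, can be sketched for discrete distributions (an illustration of the measure itself, not of the Gaussian process reclassification machinery):

```python
import math

def kl(p, q):
    """KL divergence between two discrete distributions (natural log)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded by log(2),
    and zero iff p == q -- a natural impact measure for how much
    a test sample changes the learnt model's predictions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Identical distributions score 0, while completely disjoint ones reach the maximum of log(2), so the score naturally ranks samples by how strongly they perturb the model.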