Matthias Körschens, M.Sc.

Address: | Biodiversität der Pflanzen, Institut für Ökologie und Evolution, Fakultät für Biowissenschaften, Friedrich-Schiller-Universität Jena, Philosophenweg 16, 07743 Jena, Germany |
Phone: | +49 (0) 3641 9 49263 |
E-mail: | Matthias (dot) koerschens (at) uni-jena (dot) de |
Room: | 105 |
Curriculum Vitae
- since April 2019: PhD student and research associate in the group “Biodiversität der Pflanzen” at the Friedrich Schiller University Jena
- August 2018-March 2019: Scientific Assistant at the Computer Vision Group of the Friedrich Schiller University Jena
- May 2018: Master's thesis “Identification in Wildlife Monitoring”
- April 2016-May 2018: Master's student in Computer Science at the Friedrich Schiller University Jena
- February 2016: Bachelor's thesis “Simulation eines kapazitiven Sensors zur Geometriebestimmung und -bewertung eines Katheters” (Simulation of a capacitive sensor for determining and assessing the geometry of a catheter)
- September 2012-February 2016: Bachelor's student in Computer Science at Hochschule Harz in Wernigerode
- July 2012: Abitur at the Domgymnasium Merseburg
Research Interests
- Deep Learning
- Fine-grained Classification & Detection
- Unified Networks
- Weakly Supervised Learning
- Self-Supervised Learning
Publications
2022
Beyond Global Average Pooling: Alternative Feature Aggregations for Weakly Supervised Localization
Matthias Körschens and Paul Bodesheim and Joachim Denzler.
International Conference on Computer Vision Theory and Applications (VISAPP). Pages 180-191. 2022.
[bibtex] [pdf] [abstract]
Weakly supervised object localization (WSOL) enables the detection and segmentation of objects in applications where localization annotations are hard or too expensive to obtain. Nowadays, most relevant WSOL approaches are based on class activation mapping (CAM), where a classification network utilizing global average pooling is trained for object classification. The classification layer that follows the pooling layer is then repurposed to generate segmentations using the unpooled features. The resulting localizations are usually imprecise and primarily focused around the most discriminative areas of the object, making a correct indication of the object location difficult. We argue that this problem is inherent in training with global average pooling due to its averaging operation. Therefore, we investigate two alternative pooling strategies: global max pooling and global log-sum-exp pooling. Furthermore, to increase the crispness and resolution of localization maps, we also investigate the application of Feature Pyramid Networks, which are commonplace in object detection. We confirm the usefulness of both alternative pooling methods as well as the Feature Pyramid Network on the CUB-200-2011 and OpenImages datasets.
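The pooling comparison at the heart of this paper can be illustrated with a toy example. The sketch below (plain Python; function names are my own for illustration, not from the paper's code) shows how global log-sum-exp pooling interpolates between average pooling, which dilutes a single discriminative peak, and max pooling, which keeps only the peak:

```python
import math

def global_avg_pool(fmap):
    # fmap: 2D list (H x W) of activations for one channel
    vals = [v for row in fmap for v in row]
    return sum(vals) / len(vals)

def global_max_pool(fmap):
    return max(v for row in fmap for v in row)

def global_lse_pool(fmap, r=1.0):
    # log-sum-exp pooling: a smooth interpolation between average pooling
    # (r -> 0) and max pooling (r -> inf), controlled by the temperature r;
    # computed in a numerically stable way by factoring out the maximum
    vals = [v for row in fmap for v in row]
    m = max(vals)
    return m + (1.0 / r) * math.log(
        sum(math.exp(r * (v - m)) for v in vals) / len(vals))

# A feature map with one strong discriminative peak and weak support elsewhere
fmap = [[0.1, 0.1, 0.1],
        [0.1, 5.0, 0.1],
        [0.1, 0.1, 0.1]]
```

On this example the average pool almost erases the peak, the max pool reports only the peak, and the LSE pool lands in between, which is the behaviour the paper exploits for less peaky localization maps.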
Occlusion-Robustness of Convolutional Neural Networks via Inverted Cutout
Matthias Körschens and Paul Bodesheim and Joachim Denzler.
International Conference on Pattern Recognition (ICPR). 2022.
[bibtex] [pdf] [supplementary] [abstract]
Convolutional Neural Networks (CNNs) are able to reliably classify objects in images if they are clearly visible and only slightly affected by small occlusions. However, heavy occlusions can strongly deteriorate the performance of CNNs, which is critical for tasks where correct identification is paramount. For many real-world applications, images are taken in unconstrained environments under suboptimal conditions, where occluded objects are inevitable. We propose a novel data augmentation method called Inverted Cutout, which can be used for training a CNN by showing only small patches of the images. Together with this augmentation method, we present several ways of making the network robust against occlusion. On the one hand, we utilize a spatial aggregation module without modifying the base network and on the other hand, we achieve occlusion-robustness with appropriate fine-tuning in conjunction with Inverted Cutout. In our experiments, we compare two different aggregation modules and two loss functions on the Occluded-Vehicles and Occluded-COCO-Vehicles datasets, showing that our approach outperforms existing state-of-the-art methods for object categorization under varying levels of occlusion.
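As a rough illustration of the augmentation idea (my own minimal sketch, not the authors' implementation), Inverted Cutout can be read as the inverse of standard Cutout: instead of erasing one random patch, it keeps only one random patch visible and masks everything else, so the network must classify from small object parts:

```python
import random

def inverted_cutout(image, patch_h, patch_w, fill=0, rng=None):
    # image: 2D list (H x W), e.g. one channel of a training image.
    # Inverse of Cutout: keep only one randomly placed patch_h x patch_w
    # patch and replace all remaining pixels with a fill value.
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    top = rng.randrange(h - patch_h + 1)
    left = rng.randrange(w - patch_w + 1)
    out = [[fill] * w for _ in range(h)]
    for i in range(patch_h):
        for j in range(patch_w):
            out[top + i][left + j] = image[top + i][left + j]
    return out
```

Training on such views simulates heavy occlusion, which is why it pairs naturally with the spatial aggregation modules the paper compares.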
Pre-trained models are not enough: active and lifelong learning is important for long-term visual monitoring of mammals in biodiversity research. Individual identification and attribute prediction with image features from deep neural networks and decoupled decision models applied to elephants and great apes
Paul Bodesheim and Jan Blunk and Matthias Körschens and Clemens-Alexander Brust and Christoph Käding and Joachim Denzler.
Mammalian Biology. 2022.
[bibtex] [web] [abstract]
Animal re-identification based on image data, either recorded manually by photographers or automatically with camera traps, is an important task for ecological studies about biodiversity and conservation that can be highly automatized with algorithms from computer vision and machine learning. However, fixed identification models only trained with standard datasets before their application will quickly reach their limits, especially for long-term monitoring with changing environmental conditions, varying visual appearances of individuals over time that differ a lot from those in the training data, and new occurring individuals that have not been observed before. Hence, we believe that active learning with human-in-the-loop and continuous lifelong learning is important to tackle these challenges and to obtain high-performance recognition systems when dealing with huge amounts of additional data that become available during the application. Our general approach with image features from deep neural networks and decoupled decision models can be applied to many different mammalian species and is perfectly suited for continuous improvements of the recognition systems via lifelong learning. In our identification experiments, we consider four different taxa, namely two elephant species: African forest elephants and Asian elephants, as well as two species of great apes: gorillas and chimpanzees. Going beyond classical re-identification, our decoupled approach can also be used for predicting attributes of individuals such as gender or age using classification or regression methods. Although applicable for small datasets of individuals as well, we argue that even better recognition performance will be achieved by improving decision models gradually via lifelong learning to exploit huge datasets and continuous recordings from long-term applications. We highlight that algorithms for deploying lifelong learning in real observational studies exist and are ready for use. 
Hence, lifelong learning might become a valuable concept that supports practitioners when analyzing large-scale image data during long-term monitoring of mammals.
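The decoupling of fixed deep features from a cheap, updatable decision model can be sketched with a nearest-class-mean classifier (a stand-in I chose for illustration; the paper discusses decoupled classification and regression models more generally). The point of the sketch is that adding a newly labelled image only updates per-individual statistics, without retraining the feature extractor:

```python
class NearestClassMean:
    """Decision model on top of fixed image features (e.g. from a CNN).

    Class means are updated incrementally as new labelled images arrive,
    which is the kind of cheap update that lifelong learning with a
    human-in-the-loop relies on.
    """

    def __init__(self):
        self.sums, self.counts = {}, {}

    def partial_fit(self, feature, label):
        # accumulate feature sums and counts per individual
        if label not in self.sums:
            self.sums[label] = [0.0] * len(feature)
            self.counts[label] = 0
        self.sums[label] = [s + f for s, f in zip(self.sums[label], feature)]
        self.counts[label] += 1

    def predict(self, feature):
        # assign the individual whose mean feature is closest (squared L2)
        def dist(label):
            mean = [s / self.counts[label] for s in self.sums[label]]
            return sum((f - m) ** 2 for f, m in zip(feature, mean))
        return min(self.sums, key=dist)
```

New individuals simply appear as new keys, so the model also covers the "previously unobserved individual" case the abstract highlights.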
2021
Domain Adaptation and Active Learning for Fine-Grained Recognition in the Field of Biodiversity
Bernd Gruner and Matthias Körschens and Björn Barz and Joachim Denzler.
Findings of the CVPR Workshop on Continual Learning in Computer Vision (CLVision). 2021.
[bibtex] [abstract]
Deep-learning methods offer unsurpassed recognition performance in a wide range of domains, including fine-grained recognition tasks. However, in most problem areas there are insufficient annotated training samples. Therefore, the topic of transfer learning respectively domain adaptation is particularly important. In this work, we investigate to what extent unsupervised domain adaptation can be used for fine-grained recognition in a biodiversity context to learn a real-world classifier based on idealized training data, e.g. preserved butterflies and plants. Moreover, we investigate the influence of different normalization layers, such as Group Normalization in combination with Weight Standardization, on the classifier. We discovered that domain adaptation works very well for fine-grained recognition and that the normalization methods have a great influence on the results. Using domain adaptation and Transferable Normalization, the accuracy of the classifier could be increased by up to 12.35 % compared to the baseline. Furthermore, the domain adaptation system is combined with an active learning component to improve the results. We compare different active learning strategies with each other. Surprisingly, we found that more sophisticated strategies provide better results than the random selection baseline for only one of the two datasets. In this case, the distance and diversity strategy performed best. Finally, we present a problem analysis of the datasets.
Automatic Plant Cover Estimation with Convolutional Neural Networks
Matthias Körschens and Paul Bodesheim and Christine Römermann and Solveig Franziska Bucher and Mirco Migliavacca and Josephine Ulrich and Joachim Denzler.
Computer Science for Biodiversity Workshop (CS4Biodiversity), INFORMATIK 2021. Pages 499-516. 2021.
[bibtex] [pdf] [abstract]
Monitoring the responses of plants to environmental changes is essential for plant biodiversity research. This, however, is currently still being done manually by botanists in the field. This work is very laborious, and the data obtained is, though following a standardized method to estimate plant coverage, usually subjective and has a coarse temporal resolution. To remedy these caveats, we investigate approaches using convolutional neural networks (CNNs) to automatically extract the relevant data from images, focusing on plant community composition and species coverages of 9 herbaceous plant species. To this end, we investigate several standard CNN architectures and different pretraining methods. We find that we outperform our previous approach at higher image resolutions using a custom CNN with a mean absolute error of 5.16%. In addition to these investigations, we also conduct an error analysis based on the temporal aspect of the plant cover images. This analysis gives insight into where problems for automatic approaches lie, like occlusion and likely misclassifications caused by temporal changes.
Weakly Supervised Segmentation Pretraining for Plant Cover Prediction
Matthias Körschens and Paul Bodesheim and Christine Römermann and Solveig Franziska Bucher and Mirco Migliavacca and Josephine Ulrich and Joachim Denzler.
DAGM German Conference on Pattern Recognition (DAGM-GCPR). Pages 589-603. 2021.
[bibtex] [pdf] [supplementary] [abstract]
Automated plant cover prediction can be a valuable tool for botanists, as plant cover estimations are a laborious and recurring task in environmental research. Upon examination of the images usually encompassed in this task, it becomes apparent that the task is ill-posed and successful training on such images alone without external data is nearly impossible. While a previous approach includes pretraining on a domain-related dataset containing plants in natural settings, we argue that regular classification training on such data is insufficient. To solve this problem, we propose a novel pretraining pipeline utilizing weakly supervised object localization on images with only class annotations to generate segmentation maps that can be exploited for a second pretraining step. We utilize different pooling methods during classification pretraining, and evaluate and compare their effects on the plant cover prediction. For this evaluation, we focus primarily on the visible parts of the plants. To this end, contrary to previous works, we created a small dataset containing segmentations of plant cover images to be able to evaluate the benefit of our method numerically. We find that our segmentation pretraining approach outperforms classification pretraining and especially aids in the recognition of less prevalent plants in the plant cover dataset.
2020
Towards Confirmable Automated Plant Cover Determination
Matthias Körschens and Paul Bodesheim and Christine Römermann and Solveig Franziska Bucher and Josephine Ulrich and Joachim Denzler.
ECCV Workshop on Computer Vision Problems in Plant Phenotyping (CVPPP). 2020.
[bibtex] [pdf] [web] [supplementary] [abstract]
Changes in plant community composition reflect environmental changes like in land-use and climate. While we have the means to record the changes in composition automatically nowadays, we still lack methods to analyze the generated data masses automatically. We propose a novel approach based on convolutional neural networks for analyzing the plant community composition while making the results explainable for the user. To realize this, our approach generates a semantic segmentation map while predicting the cover percentages of the plants in the community. The segmentation map is learned in a weakly supervised way only based on plant cover data and therefore does not require dedicated segmentation annotations. Our approach achieves a mean absolute error of 5.3% for plant cover prediction on our introduced dataset with 9 herbaceous plant species in an imbalanced distribution, and generates segmentation maps, where the location of the most prevalent plants in the dataset is correctly indicated in many images.
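For intuition on how a segmentation map connects to cover percentages, here is a minimal sketch (my own simplification: it assumes hard per-pixel labels, whereas the actual method works with predicted class scores). The cover percentage of a species is simply the fraction of image pixels assigned to it:

```python
def cover_percentages(seg_map, classes):
    # seg_map: 2D list of predicted class labels, one per pixel.
    # Cover percentage of a class = share of pixels assigned to it,
    # which is what makes the segmentation map confirmable by eye.
    total = len(seg_map) * len(seg_map[0])
    return {c: 100.0 * sum(row.count(c) for row in seg_map) / total
            for c in classes}
```

Because the cover targets supervise only these aggregated percentages, the per-pixel map itself is learned weakly, exactly as described above.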
2019
ELPephants: A Fine-Grained Dataset for Elephant Re-Identification
Matthias Körschens and Joachim Denzler.
ICCV Workshop on Computer Vision for Wildlife Conservation (ICCV-WS). 2019.
[bibtex] [pdf] [abstract]
Despite many possible applications, machine learning and computer vision approaches are very rarely utilized in biodiversity monitoring. One reason for this might be that automatic image analysis in biodiversity research often poses a unique set of challenges, some of which are not commonly found in many popular datasets. Thus, suitable image datasets are necessary for the development of appropriate algorithms tackling these challenges. In this paper we introduce the ELPephants dataset, a re-identification dataset, which contains 276 elephant individuals in 2078 images following a long-tailed distribution. It offers many different challenges, like fine-grained differences between the individuals, inferring a new view on the elephant from only one training side, aging effects on the animals and large differences in skin color. We also present a baseline approach, which is a system using a YOLO object detector, feature extraction of ImageNet features and discrimination using a support vector machine. This system achieves a top-1 accuracy of 56% and top-10 accuracy of 80% on the ELPephants dataset.
2018
Towards Automatic Identification of Elephants in the Wild
Matthias Körschens and Björn Barz and Joachim Denzler.
AI for Wildlife Conservation Workshop (AIWC). 2018.
[bibtex] [pdf] [abstract]
Identifying animals from a large group of possible individuals is very important for biodiversity monitoring and especially for collecting data on a small number of particularly interesting individuals, as these have to be identified first before this can be done. Identifying them can be a very time-consuming task. This is especially true, if the animals look very similar and have only a small number of distinctive features, like elephants do. In most cases the animals stay at one place only for a short period of time during which the animal needs to be identified for knowing whether it is important to collect new data on it. For this reason, a system supporting the researchers in identifying elephants to speed up this process would be of great benefit. In this paper, we present such a system for identifying elephants in the face of a large number of individuals with only few training images per individual. For that purpose, we combine object part localization, off-the-shelf CNN features, and support vector machine classification to provide field researches with proposals of possible individuals given new images of an elephant. The performance of our system is demonstrated on a dataset comprising a total of 2078 images of 276 individual elephants, where we achieve 56% top-1 test accuracy and 80% top-10 accuracy. To deal with occlusion, varying viewpoints, and different poses present in the dataset, we furthermore enable the analysts to provide the system with multiple images of the same elephant to be identified and aggregate confidence values generated by the classifier. With that, our system achieves a top-1 accuracy of 74% and a top-10 accuracy of 88% on the held-out test dataset.
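The aggregation step described at the end, combining classifier confidences from several images of the same elephant into ranked proposals, can be sketched as follows (function name and score format are illustrative, not taken from the paper):

```python
def topk_proposals(confidences_per_image, k=10):
    # confidences_per_image: list of dicts, one per query image,
    # mapping individual id -> classifier confidence for that image.
    # Averaging scores over several images of the same animal smooths
    # out occlusion, viewpoint, and pose effects before ranking.
    totals = {}
    for conf in confidences_per_image:
        for individual, score in conf.items():
            totals[individual] = (totals.get(individual, 0.0)
                                  + score / len(confidences_per_image))
    return sorted(totals, key=totals.get, reverse=True)[:k]
```

Presenting the top-k ranking rather than a single answer matches the system's role as a decision aid: the field researcher confirms the identity from a short list of candidates.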