Matthias Körschens, M.Sc.
Address:
Biodiversität der Pflanzen
Institut für Ökologie und Evolution
Fakultät für Biowissenschaften
Friedrich-Schiller-Universität Jena
Philosophenweg 16
07743 Jena
Germany
Phone: +49 (0) 3641 9 49263
E-mail: matthias (dot) koerschens (at) uni-jena (dot) de
Room: 105
Curriculum Vitae
since 2019: PhD Student & Research Associate
Group “Biodiversität der Pflanzen”, Friedrich Schiller University Jena
Computer Vision Group, Friedrich Schiller University Jena
2018 – 2019: Research Assistant
Computer Vision Group, Friedrich Schiller University Jena
2016 – 2018: M.Sc. Computer Science
Friedrich Schiller University Jena
Master Thesis: “Identification in Wildlife Monitoring”
2012 – 2016: B.Sc. Computer Science
Harz University of Applied Sciences, Wernigerode
Bachelor Thesis: “Simulation eines kapazitiven Sensors zur Geometriebestimmung und -bewertung eines Katheters” (Simulation of a Capacitive Sensor for Determining and Evaluating the Geometry of a Catheter)
Research Interests
- Deep Learning
- Fine-grained Classification & Detection
- Unified Networks
- Weakly Supervised Learning
- Self-Supervised Learning
Publications
2024
Matthias Körschens, Solveig Franziska Bucher, Paul Bodesheim, Josephine Ulrich, Joachim Denzler, Christine Römermann:
Determining the Community Composition of Herbaceous Species from Images using Convolutional Neural Networks.
Ecological Informatics. 80: 102516. 2024.
[bibtex] [web] [doi] [abstract]
Global change has a detrimental impact on the environment and changes biodiversity patterns, which can be observed, among other things, by analyzing changes in the composition of plant communities. Typically, vegetation relevés are done manually, which is time-consuming, laborious, and subjective. Applying an automatic system for such an analysis that can also identify co-occurring species would be beneficial as it is fast, effortless to use, and consistent. Here, we introduce such a system based on Convolutional Neural Networks for automatically predicting the species-wise plant cover. The system is trained on freely available image data of herbaceous plant species from web sources and plant cover estimates done by experts. With a novel extension of our original approach, the system can even be applied directly to vegetation images without requiring such cover estimates. Our extended approach, not utilizing dedicated training data, performs similarly to humans concerning the relative species abundances in the vegetation relevés. When trained on dedicated training annotations, it reflects the original estimates more closely than (independent) human experts, who manually analyzed the same sites. Our method is, with little adaptation, usable in novel domains and could be used to analyze plant community dynamics and responses of different plant species to environmental changes.
2023
Matthias Körschens, Solveig Franziska Bucher, Christine Römermann, Joachim Denzler:
Improving Data Efficiency for Plant Cover Prediction with Label Interpolation and Monte-Carlo Cropping.
DAGM German Conference on Pattern Recognition (DAGM-GCPR). 2023.
[bibtex] [pdf] [web] [supplementary] [abstract]
The plant community composition is an essential indicator of environmental changes and is, for this reason, usually analyzed in ecological field studies in terms of the so-called plant cover. The manual acquisition of this kind of data is time-consuming, laborious, and prone to human error. Automated camera systems can collect high-resolution images of the surveyed vegetation plots at a high frequency. In combination with subsequent algorithmic analysis, it is possible to objectively extract information on plant community composition quickly and with little human effort. An automated camera system can easily collect the large amounts of image data necessary to train a Deep Learning system for automatic analysis. However, due to the amount of work required to annotate vegetation images with plant cover data, only a few labeled samples are available. As automated camera systems can collect many pictures without labels, we introduce an approach to interpolate the sparse labels in the collected vegetation plot time series onto the dense, unlabeled intermediate images to artificially increase our training dataset to seven times its original size. Moreover, we introduce a new method we call Monte-Carlo Cropping. This approach trains on a collection of cropped parts of the training images to deal with high-resolution images efficiently, implicitly augment the training images, and speed up training. We evaluate both approaches on a plant cover dataset containing images of herbaceous plant communities and find that our methods lead to improvements in the species, community, and segmentation metrics investigated.
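The two ideas in this abstract can be sketched in a few lines. This is an illustrative sketch with hypothetical function names, not the authors' implementation: label interpolation assigns an unlabeled intermediate image a weighted mix of the two surrounding annotations, and Monte-Carlo Cropping samples random fixed-size crops from a high-resolution training image.

```python
import numpy as np

def interpolate_labels(t, t0, cover0, t1, cover1):
    """Linearly interpolate species-wise cover labels for an unlabeled
    image taken at time t, between annotations at times t0 and t1."""
    w = (t - t0) / (t1 - t0)
    cover0 = np.asarray(cover0, dtype=float)
    cover1 = np.asarray(cover1, dtype=float)
    return (1.0 - w) * cover0 + w * cover1

def monte_carlo_crops(image, crop_size, n_crops, rng=None):
    """Sample n_crops random square crops from a high-resolution image
    (H x W x C array); training on crops instead of the full image
    implicitly augments the data and reduces memory per step."""
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = image.shape[:2]
    ys = rng.integers(0, h - crop_size + 1, size=n_crops)
    xs = rng.integers(0, w - crop_size + 1, size=n_crops)
    return np.stack([image[y:y + crop_size, x:x + crop_size]
                     for y, x in zip(ys, xs)])
```

Training on such crops rather than full-resolution plots is what makes the seven-fold enlarged dataset tractable.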
Matthias Körschens, Solveig Franziska Bucher, Christine Römermann, Joachim Denzler:
Unified Automatic Plant Cover and Phenology Prediction.
ICCV Workshop on Computer Vision in Plant Phenotyping and Agriculture (CVPPA). 2023.
[bibtex] [pdf] [abstract]
The composition and phenology of plant communities are paramount indicators for environmental changes, especially climate change, and are, due to this, subject to many ecological studies. While species composition and phenology are usually monitored by ecologists directly in the field, this process is slow, laborious, and prone to human error. In contrast, automated camera systems with intelligent image analysis methods can provide fast analyses with a high temporal resolution and therefore are highly advantageous for ecological research. Nowadays, methods already exist that can analyze the plant community composition from images, and others that investigate the phenology of plants. However, there are no automatic approaches that analyze the plant community composition together with the phenology of the same community, which is why we aim to close this gap by combining an existing plant cover prediction method based on convolutional neural networks with a novel phenology prediction module. The module builds on the species- and pixel-wise occurrence probabilities generated during the plant cover prediction process, and by that, significantly improves the quality of phenology predictions compared to isolated training of plant cover and phenology. We evaluate our approach by comparing the time trends of the observed and predicted phenology values on the InsectArmageddon dataset comprising cover and phenology data of eight herbaceous plant species. We find that our method significantly outperforms two dataset-statistics-based prediction baselines as well as a naive baseline that does not integrate any information from the plant cover prediction module.
2022
Matthias Körschens, Paul Bodesheim, Joachim Denzler:
Beyond Global Average Pooling: Alternative Feature Aggregations for Weakly Supervised Localization.
International Conference on Computer Vision Theory and Applications (VISAPP). Pages 180-191. 2022.
[bibtex] [pdf] [doi] [abstract]
Weakly supervised object localization (WSOL) enables the detection and segmentation of objects in applications where localization annotations are hard or too expensive to obtain. Nowadays, most relevant WSOL approaches are based on class activation mapping (CAM), where a classification network utilizing global average pooling is trained for object classification. The classification layer that follows the pooling layer is then repurposed to generate segmentations using the unpooled features. The resulting localizations are usually imprecise and primarily focused around the most discriminative areas of the object, making a correct indication of the object location difficult. We argue that this problem is inherent in training with global average pooling due to its averaging operation. Therefore, we investigate two alternative pooling strategies: global max pooling and global log-sum-exp pooling. Furthermore, to increase the crispness and resolution of localization maps, we also investigate the application of Feature Pyramid Networks, which are commonplace in object detection. We confirm the usefulness of both alternative pooling methods as well as the Feature Pyramid Network on the CUB-200-2011 and OpenImages datasets.
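The three aggregation strategies compared in this abstract can be written down directly. A minimal numpy sketch; the paper works with CNN feature maps, here represented as a plain (C, H, W) array:

```python
import numpy as np

def global_avg_pool(feat):
    """(C, H, W) activation map -> (C,) descriptor; averaging spreads
    credit over all locations, which blurs localization."""
    return feat.mean(axis=(1, 2))

def global_max_pool(feat):
    """Keeps only the single strongest response per channel."""
    return feat.max(axis=(1, 2))

def global_lse_pool(feat, r=5.0):
    """Log-sum-exp pooling: a smooth compromise between average and
    maximum; larger r behaves more like max, smaller r more like average."""
    return np.log(np.exp(r * feat).mean(axis=(1, 2))) / r
```

For any feature map, the log-sum-exp result lies between the average and the maximum, which is what makes it a tunable middle ground for localization maps.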
Matthias Körschens, Paul Bodesheim, Joachim Denzler:
Occlusion-Robustness of Convolutional Neural Networks via Inverted Cutout.
International Conference on Pattern Recognition (ICPR). Pages 2829-2835. 2022.
[bibtex] [pdf] [doi] [supplementary] [abstract]
Convolutional Neural Networks (CNNs) are able to reliably classify objects in images if they are clearly visible and only slightly affected by small occlusions. However, heavy occlusions can strongly deteriorate the performance of CNNs, which is critical for tasks where correct identification is paramount. For many real-world applications, images are taken in unconstrained environments under suboptimal conditions, where occluded objects are inevitable. We propose a novel data augmentation method called Inverted Cutout, which can be used for training a CNN by showing only small patches of the images. Together with this augmentation method, we present several ways of making the network robust against occlusion. On the one hand, we utilize a spatial aggregation module without modifying the base network and on the other hand, we achieve occlusion-robustness with appropriate fine-tuning in conjunction with Inverted Cutout. In our experiments, we compare two different aggregation modules and two loss functions on the Occluded-Vehicles and Occluded-COCO-Vehicles datasets, showing that our approach outperforms existing state-of-the-art methods for object categorization under varying levels of occlusion.
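Inverted Cutout itself is simple to state: instead of masking out a random patch, as standard Cutout does, everything except a randomly placed patch is masked. A minimal illustrative sketch, assuming an (H, W, C) image array:

```python
import numpy as np

def inverted_cutout(image, patch_size, rng=None):
    """Keep only a random patch_size x patch_size window of the image and
    zero out the rest, so the network must classify from a small visible
    region -- the inverse of standard Cutout, which removes the patch."""
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = image.shape[:2]
    y = int(rng.integers(0, h - patch_size + 1))
    x = int(rng.integers(0, w - patch_size + 1))
    out = np.zeros_like(image)
    out[y:y + patch_size, x:x + patch_size] = \
        image[y:y + patch_size, x:x + patch_size]
    return out
```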
Paul Bodesheim, Jan Blunk, Matthias Körschens, Clemens-Alexander Brust, Christoph Käding, Joachim Denzler:
Pre-trained models are not enough: active and lifelong learning is important for long-term visual monitoring of mammals in biodiversity research. Individual identification and attribute prediction with image features from deep neural networks and decoupled decision models applied to elephants and great apes.
Mammalian Biology. 102: 875-897. 2022.
[bibtex] [web] [doi] [abstract]
Animal re-identification based on image data, either recorded manually by photographers or automatically with camera traps, is an important task for ecological studies about biodiversity and conservation that can be highly automatized with algorithms from computer vision and machine learning. However, fixed identification models only trained with standard datasets before their application will quickly reach their limits, especially for long-term monitoring with changing environmental conditions, varying visual appearances of individuals over time that differ a lot from those in the training data, and new occurring individuals that have not been observed before. Hence, we believe that active learning with human-in-the-loop and continuous lifelong learning is important to tackle these challenges and to obtain high-performance recognition systems when dealing with huge amounts of additional data that become available during the application. Our general approach with image features from deep neural networks and decoupled decision models can be applied to many different mammalian species and is perfectly suited for continuous improvements of the recognition systems via lifelong learning. In our identification experiments, we consider four different taxa, namely two elephant species: African forest elephants and Asian elephants, as well as two species of great apes: gorillas and chimpanzees. Going beyond classical re-identification, our decoupled approach can also be used for predicting attributes of individuals such as gender or age using classification or regression methods. Although applicable for small datasets of individuals as well, we argue that even better recognition performance will be achieved by improving decision models gradually via lifelong learning to exploit huge datasets and continuous recordings from long-term applications. We highlight that algorithms for deploying lifelong learning in real observational studies exist and are ready for use. Hence, lifelong learning might become a valuable concept that supports practitioners when analyzing large-scale image data during long-term monitoring of mammals.
2021
Bernd Gruner, Matthias Körschens, Björn Barz, Joachim Denzler:
Domain Adaptation and Active Learning for Fine-Grained Recognition in the Field of Biodiversity.
Findings of the CVPR Workshop on Continual Learning in Computer Vision (CLVision). 2021.
[bibtex] [abstract]
Deep-learning methods offer unsurpassed recognition performance in a wide range of domains, including fine-grained recognition tasks. However, in most problem areas there are insufficient annotated training samples. Therefore, the topic of transfer learning, or more specifically domain adaptation, is particularly important. In this work, we investigate to what extent unsupervised domain adaptation can be used for fine-grained recognition in a biodiversity context to learn a real-world classifier based on idealized training data, e.g. preserved butterflies and plants. Moreover, we investigate the influence of different normalization layers, such as Group Normalization in combination with Weight Standardization, on the classifier. We discovered that domain adaptation works very well for fine-grained recognition and that the normalization methods have a great influence on the results. Using domain adaptation and Transferable Normalization, the accuracy of the classifier could be increased by up to 12.35% compared to the baseline. Furthermore, the domain adaptation system is combined with an active learning component to improve the results. We compare different active learning strategies with each other. Surprisingly, we found that more sophisticated strategies provide better results than the random selection baseline for only one of the two datasets. In this case, the distance and diversity strategy performed best. Finally, we present a problem analysis of the datasets.
Matthias Körschens, Paul Bodesheim, Christine Römermann, Solveig Franziska Bucher, Mirco Migliavacca, Josephine Ulrich, Joachim Denzler:
Automatic Plant Cover Estimation with Convolutional Neural Networks.
Computer Science for Biodiversity Workshop (CS4Biodiversity), INFORMATIK 2021. Pages 499-516. 2021.
[bibtex] [pdf] [doi] [abstract]
Monitoring the responses of plants to environmental changes is essential for plant biodiversity research. This, however, is currently still being done manually by botanists in the field. This work is very laborious, and the data obtained is, though following a standardized method to estimate plant coverage, usually subjective and has a coarse temporal resolution. To remedy these caveats, we investigate approaches using convolutional neural networks (CNNs) to automatically extract the relevant data from images, focusing on plant community composition and species coverages of 9 herbaceous plant species. To this end, we investigate several standard CNN architectures and different pretraining methods. We find that we outperform our previous approach at higher image resolutions using a custom CNN with a mean absolute error of 5.16%. In addition to these investigations, we also conduct an error analysis based on the temporal aspect of the plant cover images. This analysis gives insight into where problems for automatic approaches lie, like occlusion and likely misclassifications caused by temporal changes.
Matthias Körschens, Paul Bodesheim, Christine Römermann, Solveig Franziska Bucher, Mirco Migliavacca, Josephine Ulrich, Joachim Denzler:
Weakly Supervised Segmentation Pretraining for Plant Cover Prediction.
DAGM German Conference on Pattern Recognition (DAGM-GCPR). Pages 589-603. 2021.
[bibtex] [pdf] [doi] [supplementary] [abstract]
Automated plant cover prediction can be a valuable tool for botanists, as plant cover estimations are a laborious and recurring task in environmental research. Upon examination of the images usually encompassed in this task, it becomes apparent that the task is ill-posed and successful training on such images alone without external data is nearly impossible. While a previous approach includes pretraining on a domain-related dataset containing plants in natural settings, we argue that regular classification training on such data is insufficient. To solve this problem, we propose a novel pretraining pipeline utilizing weakly supervised object localization on images with only class annotations to generate segmentation maps that can be exploited for a second pretraining step. We utilize different pooling methods during classification pretraining, and evaluate and compare their effects on the plant cover prediction. For this evaluation, we focus primarily on the visible parts of the plants. To this end, contrary to previous works, we created a small dataset containing segmentations of plant cover images to be able to evaluate the benefit of our method numerically. We find that our segmentation pretraining approach outperforms classification pretraining and especially aids in the recognition of less prevalent plants in the plant cover dataset.
2020
Matthias Körschens, Paul Bodesheim, Christine Römermann, Solveig Franziska Bucher, Josephine Ulrich, Joachim Denzler:
Towards Confirmable Automated Plant Cover Determination.
ECCV Workshop on Computer Vision Problems in Plant Phenotyping (CVPPP). 2020.
[bibtex] [pdf] [web] [doi] [supplementary] [abstract]
Changes in plant community composition reflect environmental changes, such as changes in land use and climate. While we have the means to record the changes in composition automatically nowadays, we still lack methods to analyze the generated data masses automatically. We propose a novel approach based on convolutional neural networks for analyzing the plant community composition while making the results explainable for the user. To realize this, our approach generates a semantic segmentation map while predicting the cover percentages of the plants in the community. The segmentation map is learned in a weakly supervised way only based on plant cover data and therefore does not require dedicated segmentation annotations. Our approach achieves a mean absolute error of 5.3% for plant cover prediction on our introduced dataset with 9 herbaceous plant species in an imbalanced distribution, and generates segmentation maps, where the location of the most prevalent plants in the dataset is correctly indicated in many images.
2019
Matthias Körschens, Joachim Denzler:
ELPephants: A Fine-Grained Dataset for Elephant Re-Identification.
ICCV Workshop on Computer Vision for Wildlife Conservation (ICCV-WS). 2019.
[bibtex] [pdf] [abstract]
Despite many possible applications, machine learning and computer vision approaches are very rarely utilized in biodiversity monitoring. One reason for this might be that automatic image analysis in biodiversity research often poses a unique set of challenges, some of which are not commonly found in many popular datasets. Thus, suitable image datasets are necessary for the development of appropriate algorithms tackling these challenges. In this paper we introduce the ELPephants dataset, a re-identification dataset, which contains 276 elephant individuals in 2078 images following a long-tailed distribution. It offers many different challenges, like fine-grained differences between the individuals, inferring a new view on the elephant from only one training side, aging effects on the animals and large differences in skin color. We also present a baseline approach, which is a system using a YOLO object detector, feature extraction of ImageNet features and discrimination using a support vector machine. This system achieves a top-1 accuracy of 56% and top-10 accuracy of 80% on the ELPephants dataset.
2018
Matthias Körschens, Björn Barz, Joachim Denzler:
Towards Automatic Identification of Elephants in the Wild.
AI for Wildlife Conservation Workshop (AIWC). 2018.
[bibtex] [pdf] [abstract]
Identifying animals from a large group of possible individuals is very important for biodiversity monitoring and especially for collecting data on a small number of particularly interesting individuals, as these have to be identified first before this can be done. Identifying them can be a very time-consuming task. This is especially true, if the animals look very similar and have only a small number of distinctive features, like elephants do. In most cases the animals stay at one place only for a short period of time during which the animal needs to be identified to know whether it is important to collect new data on it. For this reason, a system supporting the researchers in identifying elephants to speed up this process would be of great benefit. In this paper, we present such a system for identifying elephants in the face of a large number of individuals with only few training images per individual. For that purpose, we combine object part localization, off-the-shelf CNN features, and support vector machine classification to provide field researchers with proposals of possible individuals given new images of an elephant. The performance of our system is demonstrated on a dataset comprising a total of 2078 images of 276 individual elephants, where we achieve 56% top-1 test accuracy and 80% top-10 accuracy. To deal with occlusion, varying viewpoints, and different poses present in the dataset, we furthermore enable the analysts to provide the system with multiple images of the same elephant to be identified and aggregate confidence values generated by the classifier. With that, our system achieves a top-1 accuracy of 74% and a top-10 accuracy of 88% on the held-out test dataset.
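The multi-image aggregation step described at the end of this abstract can be sketched as averaging classifier confidences over all query photos of the same animal before ranking candidate individuals. Function and variable names here are hypothetical, not taken from the authors' code:

```python
import numpy as np

def rank_individuals(score_matrix, top_k=10):
    """score_matrix: (n_images, n_individuals) classifier confidences for
    several photos of one unknown elephant; averaging across the photos
    yields a single, more robust ranking of candidate individuals."""
    mean_scores = np.asarray(score_matrix, dtype=float).mean(axis=0)
    order = np.argsort(mean_scores)[::-1]  # best candidate first
    return order[:top_k].tolist()
```

Averaging over views is what lifts the reported top-1 accuracy from 56% for single images to 74% for multi-image queries.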