Theses

Learning Spatio-temporal Representations of Satellite Time Series for Large-scale Crop Mapping

Abstract : Understanding and monitoring the agricultural activity of a territory requires the production of accurate crop type maps. Such maps identify the boundaries of each agricultural parcel along with the cultivated crop type. This information is valuable for a variety of stakeholders and has applications ranging from food supply prediction to subsidy allocation and environmental monitoring. While early crop type maps required tedious in situ data collection, the advent of automated analysis of remote sensing data enabled large-scale mapping efforts. In this dissertation, we consider the problem of crop type mapping from multispectral satellite image time series. In most of the literature of the past decade, this problem is typically addressed with traditional machine learning models trained on hand-engineered descriptors. Meanwhile, in Computer Vision (CV) and Natural Language Processing (NLP), the ability to learn representations directly from raw data provoked a paradigm shift leading to unprecedented levels of performance on a variety of problems. Similarly, the application of deep learning models to remote sensing data significantly improved the state-of-the-art for crop type mapping as well as other tasks. In this thesis, we hold that the direct application of CV and NLP methods to remote sensing tasks tends to ignore crucial particularities of the data at hand. Instead, we argue for the design of bespoke methods leveraging the complex spatial, spectral, and temporal structures of satellite time series. We successively formulate crop type mapping as parcel-based classification, semantic segmentation, and panoptic segmentation, three increasingly difficult tasks. For each of these tasks, we propose a novel deep learning architecture adapted to the task’s specificities and inspired by recent advances in the deep learning literature.
Our methods set a new state-of-the-art for each task while being more computationally efficient than competing approaches. Specifically, we introduce (i) the Pixel-Set Encoder, an efficient spatial parcel-based encoder, (ii) the Temporal Attention Encoder (TAE), a self-attention temporal encoder, (iii) U-net with TAE, a variation of the TAE for segmentation problems, and (iv) Parcel-as-Point, a lightweight instance segmentation module for the panoptic segmentation of parcels. We also explore how these architectures can be adapted to multimodal image time series combining optical and radar information through well-chosen fusion schemes. Multimodality improves the mapping performance as well as the robustness to cloud obstruction. Lastly, we focus on the hierarchical tree that encapsulates the semantic relationships between crop classes. We introduce a method to include such structure in the learning process. For crop classification as well as other classification problems, we show that our method reduces the rate of errors between semantically distant classes. Along with these methods, we introduce PASTIS, the first large-scale open-access dataset of multimodal satellite image time series with panoptic annotations of agricultural parcels. We hope that this dataset, along with the promising results presented in this dissertation, will encourage further research in this direction and help produce ever more accurate agricultural maps.

https://hal.archives-ouvertes.fr/tel-03524429
Contributor : Vivien Sainte Fare Garnot
Submitted on : Thursday, January 13, 2022 - 11:29:39 AM
Last modification on : Tuesday, January 25, 2022 - 3:53:06 AM
Long-term archiving on : Thursday, April 14, 2022 - 6:36:29 PM

File

manuscript_these_vsfg_2021-4.p...
Files produced by the author(s)

Identifiers

  • HAL Id : tel-03524429, version 1

Citation

Vivien Sainte Fare Garnot. Learning Spatio-temporal Representations of Satellite Time Series for Large-scale Crop Mapping. Computer Vision and Pattern Recognition [cs.CV]. University Gustave Eiffel, 2022. English. ⟨tel-03524429v1⟩

Metrics

Record views : 283
File downloads : 5