Processing oceanographic data by Python libraries NumPy, SciPy and Pandas - HAL open archive
Journal article, Aquatic Research, Year: 2019

Processing oceanographic data by Python libraries NumPy, SciPy and Pandas

Polina Lemenkova

Abstract

The study area is the Mariana Trench in the western Pacific Ocean. The analysis aims to assess how various geological and tectonic factors may affect the geomorphological shape of the trench. Statistical analysis of datasets in marine geology and oceanography requires an adequate strategy for big data processing. In this context, the present research proposes a methodology that couples GIS-based geospatial data analysis with Python-based statistical processing. The Quantum GIS part of the methodology produces an optimized, representative sampling dataset consisting of 25 cross-section profiles with 12,590 bathymetric observation points in total. The sampled profiles are located across the Mariana Trench. The second part of the methodology consists of statistical data processing in the high-level programming language Python, using the libraries Pandas, NumPy and SciPy. The data processing also involves subsampling two auxiliary masked data frames from the initial large dataset; these contain only the target variables: sediment thickness, slope angle in degrees, and bathymetric observation points across four tectonic plates: Pacific, Philippine, Mariana, and Caroline. Finally, the data were analyzed by several approaches: 1) Kernel Density Estimation (KDE) for analysis of the probability of data distribution; 2) stacked area chart for visualization of the data range across various segments of the trench; 3) spatial series of radar charts; 4) stacked bar plots showing the data distribution by tectonic plates; 5) stacked bar charts for correlation of sediment thickness by profiles, versus distance from the igneous volcanic areas; 6) circular pie plots visualizing data distribution by 25 profiles; 7) scatterplot matrices for correlation analysis between marine geologic variables. The results show a distinct correlation between the geologic, tectonic and oceanographic variables.
Six Python scripts are provided in full so that this research can be reproduced.
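
The masking and KDE steps named in the abstract can be illustrated with a minimal sketch. It is not one of the six scripts from the paper: the data are synthetic, and the column names (`profile`, `plate`, `depth_m`, `sediment_thickness_m`, `slope_angle_deg`) are assumptions chosen for illustration only. It shows how a masked data frame restricted to the target variables might be built with Pandas, and how SciPy's `gaussian_kde` could estimate the depth distribution on that subset.

```python
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

# Synthetic stand-in for the bathymetric table. Values and column names are
# illustrative assumptions, not the data or schema used in the paper.
rng = np.random.default_rng(seed=0)
n_points = 500
df = pd.DataFrame({
    "profile": rng.integers(1, 26, size=n_points),  # profile numbers 1..25
    "plate": rng.choice(
        ["Pacific", "Philippine", "Mariana", "Caroline"], size=n_points
    ),
    "depth_m": rng.normal(-5000.0, 1500.0, size=n_points),
    "sediment_thickness_m": rng.gamma(2.0, 50.0, size=n_points),
    "slope_angle_deg": rng.uniform(0.0, 45.0, size=n_points),
})

# Masked data frame: rows for one tectonic plate, target variables only.
target_columns = ["sediment_thickness_m", "slope_angle_deg", "depth_m"]
mariana_mask = df["plate"] == "Mariana"
mariana_df = df.loc[mariana_mask, target_columns]

# Kernel Density Estimation of the depth distribution on the masked subset,
# evaluated on a regular grid spanning the full depth range.
kde = gaussian_kde(mariana_df["depth_m"].to_numpy())
depth_grid = np.linspace(df["depth_m"].min(), df["depth_m"].max(), 200)
density = kde(depth_grid)

# Pairwise Pearson correlation between the target variables.
correlation = mariana_df.corr()
```

Plotting `density` against `depth_grid` would give a KDE curve of the kind listed as approach 1, and `correlation` is a numeric counterpart to the scatterplot matrices listed as approach 7.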
Main file
Lemenkova-DOI:10.3153:AR19009.pdf (5.86 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-02093491 , version 1 (09-04-2019)

Licence

Copyright (All rights reserved)

Cite

Polina Lemenkova. Processing oceanographic data by Python libraries NumPy, SciPy and Pandas. Aquatic Research, 2019, 2 (2), pp.73-91. ⟨10.3153/AR19009⟩. ⟨hal-02093491⟩
715 views
1259 downloads
