Conference paper, Year: 2014

Declarative queries on large astronomy databases: Experiments with Hive and HadoopDB

Amin Mesmoudi
Mohand-Said Hacid
Farouk Toumani

Abstract

The volume of data produced in many application domains makes large data repositories increasingly difficult to manage and query. Within the PetaSky project (http://com.isima.fr/Petasky), we focus on the problem of managing scientific data in the field of cosmology. The data we consider are those of the LSST project (http://www.lsst.org/). The database it will produce is expected to exceed 60 PB overall (http://www.lsst.org/lsst/science/concept_data). To evaluate the performance of existing SQL-on-MapReduce data management systems, we conducted experiments using data and queries from the areas of corpuscular physics and cosmology. The goal of this work is to report on the ability of such systems to support large-scale declarative queries. We mainly investigated the impact of data partitioning, indexing, and compression on query execution performance.
No file deposited

Dates and versions

hal-01499278, version 1 (31-03-2017)

Identifiers

  • HAL Id: hal-01499278, version 1

Cite

Amin Mesmoudi, Mohand-Said Hacid, Farouk Toumani. Declarative queries on large astronomy databases: Experiments with Hive and HadoopDB. Conférence Base de données avancées (BDA), Oct 2014, Grenoble-Autrans, France. pp. 1-20. ⟨hal-01499278⟩