Conference paper, Year: 2018

Efficient Graph-Oriented Summary for Optimized Resource Description Framework Streams Processing Using Extended Centrality Measures

Abstract

Existing RDF Stream Processing (RSP) systems allow continuous processing of RDF data issued from different application domains, such as weather stations measuring phenomena, geolocation, IoT applications, drinking water distribution management and so on. However, the processing window often expires before the entire session has finished, and RSP systems delete data streams immediately after each window is processed. Such a mechanism does not allow optimized exploitation of RDF data streams, because the most relevant and pertinent information in the data is often not used in due time and is almost impossible to exploit for further analyses. It would be better to keep the most informative part of the data within streams while minimizing memory storage space. In this work, we propose an RDF graph summarization system based on explicitly and implicitly expressed needs through three main approaches: (1) an approach that analyzes user queries (SPARQL) in order to extract their needs and group them into a more global query, (2) an extension of the closeness centrality measure from Social Network Analysis (SNA) to determine the most informative parts of the graph, and (3) an RDF graph summarization technique combining the extracted user query needs and the extended centrality measure. Experiments and evaluations show efficient results in terms of memory storage space, with approximate query results on the summarized graphs that are close to those expected from the source graphs.
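
As a rough illustration of approach (2), the sketch below computes the standard closeness centrality over the undirected view of a small RDF graph and keeps only the triples whose endpoints are among the highest-ranked nodes. It is not the extended measure proposed in the paper, which the abstract does not define; the function names, the `keep_ratio` parameter and the `ex:` example triples are all hypothetical.

```python
from collections import deque

def closeness_centrality(triples):
    """Standard closeness centrality on the undirected view of an RDF graph.

    Assumption: each (subject, predicate, object) triple is treated as an
    undirected, unweighted edge. This is NOT the paper's extended measure.
    """
    adj = {}
    for s, _p, o in triples:
        adj.setdefault(s, set()).add(o)
        adj.setdefault(o, set()).add(s)

    scores = {}
    for source in adj:
        # Breadth-first search gives shortest hop distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        reachable = len(dist) - 1
        # Closeness = reachable nodes / sum of distances to them.
        scores[source] = reachable / total if total > 0 else 0.0
    return scores

def summarize(triples, keep_ratio=0.5):
    """Keep triples whose subject and object are both among the top-ranked nodes."""
    scores = closeness_centrality(triples)
    # Sort by descending score; break ties by node name for determinism.
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    k = max(1, int(len(ranked) * keep_ratio))
    kept = set(ranked[:k])
    return [t for t in triples if t[0] in kept and t[2] in kept]

# Hypothetical weather-station triples, for illustration only.
EXAMPLE_TRIPLES = [
    ("ex:station1", "ex:hasSensor", "ex:sensorA"),
    ("ex:station1", "ex:hasSensor", "ex:sensorB"),
    ("ex:station1", "ex:locatedIn", "ex:paris"),
    ("ex:sensorA", "ex:observes", "ex:temperature"),
]

if __name__ == "__main__":
    for node, score in sorted(closeness_centrality(EXAMPLE_TRIPLES).items()):
        print(f"{node}: {score:.3f}")
    print(summarize(EXAMPLE_TRIPLES, keep_ratio=0.5))
```

In this toy graph the station node is the most central, so a summary at `keep_ratio=0.5` retains only the triple linking it to its best-connected sensor. A real summarizer along the lines of the abstract would also weight nodes by the needs extracted from user SPARQL queries.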

No file deposited

Dates and versions

hal-02468605, version 1 (06-02-2020)

Identifiers

  • HAL Id: hal-02468605, version 1

Cite

Amadou Fall Dia, Maurras Ulbricht Togbe, Aliou Boly, Zakia Kasi-Aoul, Elisabeth Metais. Efficient Graph-Oriented Summary for Optimized Resource Description Framework Streams Processing Using Extended Centrality Measures. ICDM 2018: 20th International Conference on Data Mining, Jul 2018, Istanbul, Turkey. pp.1430-1441. ⟨hal-02468605⟩