Hierarchical Multi-Label Classification Using Web Reasoning for Large Datasets

Abstract : Extracting valuable data from large volumes of data is one of the main challenges in Big Data. In this paper, a Hierarchical Multi-Label Classification process called Semantic HMC is presented. This process aims to extract valuable data from very large data sources by automatically learning a label hierarchy and classifying data items. The Semantic HMC process is composed of five scalable steps, namely Indexation, Vectorization, Hierarchization, Resolution and Realization. The first three steps automatically construct a label hierarchy from statistical analysis of the data. This paper focuses on the last two steps, which perform item classification according to the label hierarchy. The process is implemented as a scalable and distributed application and deployed on a Big Data platform. A quality evaluation is described, which compares the approach with state-of-the-art multi-label classification algorithms dedicated to the same goal. The Semantic HMC approach outperforms state-of-the-art approaches in some areas.

https://hal.archives-ouvertes.fr/hal-01356375
Contributor : Thomas Hassan
Submitted on : Thursday, August 25, 2016 - 3:43:56 PM
Last modification on : Thursday, February 7, 2019 - 3:56:19 PM

Citation

Rafael Peixoto, Thomas Hassan, Christophe Cruz, Aurélie Bertaux, Nuno Silva. Hierarchical Multi-Label Classification Using Web Reasoning for Large Datasets. Open Journal Of Semantic Web, Research Online Publishing (RonPub), 2016, ⟨10.19210/1006.3.1.1⟩. ⟨hal-01356375⟩
