Journal articles

Hierarchical Multi-Label Classification Using Web Reasoning for Large Datasets

Abstract: Extracting valuable data from large volumes of data is one of the main challenges in Big Data. This paper presents a Hierarchical Multi-Label Classification process called Semantic HMC. The process aims to extract valuable data from very large data sources by automatically learning a label hierarchy and classifying data items. The Semantic HMC process is composed of five scalable steps: Indexation, Vectorization, Hierarchization, Resolution and Realization. The first three steps automatically construct a label hierarchy from statistical analysis of the data. This paper focuses on the last two steps, which classify items according to the label hierarchy. The process is implemented as a scalable, distributed application and deployed on a Big Data platform. A quality evaluation compares the approach with state-of-the-art multi-label classification algorithms dedicated to the same goal. The Semantic HMC approach outperforms state-of-the-art approaches in some areas.
Contributor: Thomas Hassan
Submitted on: Thursday, August 25, 2016 - 3:43:56 PM
Last modification on: Friday, July 17, 2020 - 2:54:07 PM



Rafael Peixoto, Thomas Hassan, Christophe Cruz, Aurélie Bertaux, Nuno Silva. Hierarchical Multi-Label Classification Using Web Reasoning for Large Datasets. Open Journal Of Semantic Web, Research Online Publishing (RonPub), 2016, ⟨10.19210/1006.3.1.1⟩. ⟨hal-01356375⟩


