An unsupervised classification process for large datasets using web reasoning

Abstract: Identifying valuable data within large volumes of data is one of the main challenges in Big Data. We aim to extract knowledge from these sources using a Hierarchical Multi-Label Classification process called Semantic HMC. This process automatically learns a label hierarchy and classifies items from very large data sources. The Semantic HMC process comprises five steps: Indexation, Vectorization, Hierarchization, Resolution and Realization. The first three steps automatically construct the label hierarchy from the data sources. The last two steps classify new items according to the label hierarchy. This paper focuses on the last two steps and presents a new, highly scalable process to classify items from huge sets of unstructured text by using ontologies and rule-based reasoning. The process is implemented on a scalable, distributed platform for processing Big Data, and some results are discussed.
Document type:
Conference paper
The International Workshop on Semantic Big Data (SBD 2016), Jul 2016, San Francisco, United States. pp.1 - 6, 2016, Proceedings of the International Workshop on Semantic Big Data 〈https://www.ifis.uni-luebeck.de/~groppe/sbd/2016〉. 〈10.1145/2928294.2928301〉

https://hal.archives-ouvertes.fr/hal-01452220
Contributor: Thomas Hassan
Submitted on: Tuesday, February 7, 2017 - 14:04:11
Last modified on: Wednesday, September 12, 2018 - 01:26:42

Citation

Rafael Peixoto, Thomas Hassan, Christophe Cruz, Aurélie Bertaux, Nuno Silva. An unsupervised classification process for large datasets using web reasoning. The International Workshop on Semantic Big Data (SBD 2016), Jul 2016, San Francisco, United States. pp.1 - 6, 2016, Proceedings of the International Workshop on Semantic Big Data 〈https://www.ifis.uni-luebeck.de/~groppe/sbd/2016〉. 〈10.1145/2928294.2928301〉. 〈hal-01452220〉
