Possibilistic Similarity Measures for Data Science and Machine Learning Applications - HAL open archive
Journal article, IEEE Access. Year: 2020

Possibilistic Similarity Measures for Data Science and Machine Learning Applications

Abstract

Measuring similarity is of great interest in many research areas, such as data science, machine learning, pattern recognition, text analysis and information retrieval, to name a few. The literature has shown that possibility is an attractive notion in the context of distinguishability assessment and can lead to very efficient and computationally inexpensive learning schemes. This paper focuses on determining the similarity between two possibility distributions. A review of existing similarity measures within the possibilistic framework is presented first. Then, the similarity measures are analyzed with respect to their capacity to satisfy a set of properties that a similarity measure should have. Most of the existing possibilistic similarity measures produce undesirable outcomes since they generally depend on the application context. A new similarity measure, called InfoSpecificity, is introduced, and the similarity measures are categorized into three main methods: morphic-based, amorphic-based and hybrid. Two experiments are conducted using four benchmark databases. The aim of the experiments is to compare the efficiency of the possibilistic similarity measures when applied to real data. The experiments show good results for the hybrid methods, particularly with the InfoSpecificity measure. In general, the hybrid methods outperform the other two categories when evaluated on small-size samples, i.e., in a poor-data context (or poorly informed environment), where possibility theory can be used to the greatest benefit.
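To make "similarity between two possibility distributions" concrete, here is a minimal Python sketch. It assumes each distribution is a list of degrees in [0, 1] over the same finite domain. The two functions, `manhattan_similarity` and `consistency`, are generic textbook-style quantities with hypothetical names chosen for illustration. They are not the InfoSpecificity measure and are not taken from the paper.

```python
from typing import Sequence


def manhattan_similarity(pi1: Sequence[float], pi2: Sequence[float]) -> float:
    """Return 1 minus the mean absolute difference between two distributions.

    Illustrative distance-based similarity (an assumption, not the paper's
    measure). Both inputs must list degrees in [0, 1] over the same domain.
    """
    if len(pi1) != len(pi2) or len(pi1) == 0:
        raise ValueError("distributions must be non-empty and share a domain")
    return 1.0 - sum(abs(a - b) for a, b in zip(pi1, pi2)) / len(pi1)


def consistency(pi1: Sequence[float], pi2: Sequence[float]) -> float:
    """Return the height of the pointwise minimum of the two distributions.

    A value of 1 means some element is fully possible under both
    distributions; a value of 0 means the two are in total conflict.
    """
    if len(pi1) != len(pi2) or len(pi1) == 0:
        raise ValueError("distributions must be non-empty and share a domain")
    return max(min(a, b) for a, b in zip(pi1, pi2))


# Hypothetical example distributions over a three-element domain.
pi_a = [1.0, 0.5, 0.0]
pi_b = [1.0, 0.0, 0.5]
pi_c = [0.0, 0.5, 1.0]

print(manhattan_similarity(pi_a, pi_a))  # identical distributions
print(manhattan_similarity(pi_a, pi_b))  # same peak, different tails
print(consistency(pi_a, pi_c))           # partially conflicting peaks
```

The first function compares the degrees element by element. The second only asks how far the two distributions agree on at least one element.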
Main file
Possibilistic_Similarity_Measures_for_Data_Science_and_Machine_Learning_Applications.pdf (2.02 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-02890097 , version 1 (06-07-2020)

Cite

Asma Charfi, Sonda Ammar Bouhamed, Eloi Bosse, Imene Khanfir Kallel, Wassim Bouchaala, et al. Possibilistic Similarity Measures for Data Science and Machine Learning Applications. IEEE Access, 2020, ⟨10.1109/ACCESS.2020.2979553⟩. ⟨hal-02890097⟩
