Journal article. Software and Systems Modeling. Year: 2021

Efficient model similarity estimation with robust hashing

Abstract

As model-driven engineering (MDE) is increasingly adopted in complex industrial scenarios, modeling artefacts become a key strategic asset for companies. As such, any MDE ecosystem must provide mechanisms to protect and exploit them. Current approaches depend on calculating the relative similarity between pairs of models. Unfortunately, model similarity calculation mechanisms are computationally expensive, which prevents their use in large repositories or on very large models. This paper therefore explores the adaptation of the robust hashing technique to the MDE domain as an efficient estimation method for model similarity. Robust hashing algorithms (i.e., hashing algorithms that generate similar outputs from similar input data) have proved useful as a key building block in intellectual property protection, authenticity assessment, and fast comparison and retrieval solutions in different application domains. We present a detailed method for the generation of robust hashes for different types of models. Our approach is based on the translation to the MDE domain of diverse techniques such as summary extraction, minhash generation and locality-sensitive hash function families, originally developed for the comparison and classification of large datasets. We validate our approach with a prototype implementation and show that: (1) our approach can deal with any graph-based model representation; (2) a strong correlation exists between the similarity calculated directly on the robust hashes and a distance metric calculated over the original models; and (3) our approach scales well on large models and greatly reduces the time required to find similar models in large repositories.
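The techniques named in the abstract (summary extraction, minhash generation, locality-sensitive hashing) can be illustrated with a minimal generic sketch. This is not the authors' algorithm or prototype: the edge-triple encoding of a model, the function names (`fragments`, `minhash`, `estimate_similarity`, `lsh_keys`) and the parameter values are all assumptions chosen for illustration. The sketch summarises a graph-based model as a set of string fragments, builds a minhash signature, estimates Jaccard similarity from two signatures, and derives banded keys that could be used to bucket similar models.

```python
import hashlib
from typing import Iterable, List, Set, Tuple

# Hypothetical model encoding: (source element, reference name, target element).
Edge = Tuple[str, str, str]


def fragments(edges: Iterable[Edge]) -> Set[str]:
    """Summarise a graph-based model as a set of string fragments, one per edge."""
    return {f"{src}-{ref}->{dst}" for src, ref, dst in edges}


def _hash(seed: int, item: str) -> int:
    """Deterministic 64-bit hash of an item, salted with the given seed."""
    digest = hashlib.blake2b(
        item.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash(items: Set[str], num_hashes: int = 128) -> List[int]:
    """Minhash signature: the minimum hash value of the set under each seed."""
    if not items:
        raise ValueError("cannot compute a minhash signature of an empty set")
    return [min(_hash(seed, it) for it in items) for seed in range(num_hashes)]


def estimate_similarity(sig_a: List[int], sig_b: List[int]) -> float:
    """Estimate Jaccard similarity as the fraction of matching signature slots."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def lsh_keys(sig: List[int], bands: int = 32) -> Set[Tuple[int, Tuple[int, ...]]]:
    """Split a signature into bands; two models sharing any key are candidates."""
    rows = len(sig) // bands
    return {(b, tuple(sig[b * rows:(b + 1) * rows])) for b in range(bands)}


if __name__ == "__main__":
    # Two toy models that differ in a single edge (made-up data).
    model_a = fragments([
        ("Package", "owns", "Class"),
        ("Class", "attr", "Attribute"),
        ("Class", "super", "Class"),
    ])
    model_b = fragments([
        ("Package", "owns", "Class"),
        ("Class", "attr", "Attribute"),
        ("Class", "op", "Operation"),
    ])
    sig_a, sig_b = minhash(model_a), minhash(model_b)
    print("estimated similarity:", estimate_similarity(sig_a, sig_b))
    print("share an LSH bucket:", bool(lsh_keys(sig_a) & lsh_keys(sig_b)))
```

In a sketch like this, comparing two models costs time proportional to the signature length rather than to the model size, and the banded keys allow candidate lookup in a repository without pairwise comparison. More hash functions give a tighter similarity estimate at the cost of longer signatures.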
File not deposited

Dates and versions

hal-03316737 , version 1 (06-08-2021)

Cite

Salvador Martínez, Sébastien Gérard, Jordi Cabot. Efficient model similarity estimation with robust hashing. Software and Systems Modeling, 2021, ⟨10.1007/s10270-021-00915-9⟩. ⟨hal-03316737⟩