Probabilistic Relational Model Benchmark Generation

Abstract: Validating any database mining methodology requires an evaluation process in which the availability of benchmarks is essential. In this paper, we aim to randomly generate relational database benchmarks that make it possible to check probabilistic dependencies among the attributes. We are particularly interested in Probabilistic Relational Models (PRMs), which extend Bayesian Networks (BNs) to a relational data mining context and enable effective and robust reasoning over relational data. Although many works have focused, separately, on generating random Bayesian networks and relational databases, we have identified no such work for PRMs. To fill this gap, this paper provides an algorithmic approach for generating random PRMs from scratch. The proposed method generates PRMs as well as synthetic relational data from a randomly generated relational schema and a random set of probabilistic dependencies. This can be of interest not only to machine learning researchers, who can evaluate their proposals in a common framework, but also to database designers, who can evaluate the effectiveness of the components of a database management system.
Document type:
Report
[Technical Report] LARODEC Laboratory, ISG, Université de Tunis, Tunisia; DUKe research group, LINA Laboratory UMR 6241, University of Nantes, France; DataForPeople, Nantes, France. 2016
https://hal.archives-ouvertes.fr/hal-01273307
Contributor: Mouna Ben Ishak
Submitted on: Sunday, 14 February 2016 - 21:56:43
Last modified on: Monday, 23 October 2017 - 17:44:02
Document(s) archived on: Saturday, 12 November 2016 - 19:11:20

Files

TRGeneration.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01273307, version 1
  • arXiv: 1603.00709

Citation

Mouna Ben Ishak, Rajani Chulyadyo, Philippe Leray. Probabilistic Relational Model Benchmark Generation. [Technical Report] LARODEC Laboratory, ISG, Université de Tunis, Tunisia; DUKe research group, LINA Laboratory UMR 6241, University of Nantes, France; DataForPeople, Nantes, France. 2016. 〈hal-01273307〉
