A General Top-k Algorithm for Web Data Sources - HAL open archive
Conference paper. Year: 2011

A General Top-k Algorithm for Web Data Sources

Abstract

Several algorithms for top-k query processing over web data sources have been proposed, where sources return relevance scores for some query predicate and these scores are aggregated through a composition function. These algorithms assume specific conditions for the type of source access (sorted and/or random) and for the access cost, and propose various heuristics for choosing the next source to probe, while generally trying to refine the score of the most promising candidate. We present BreadthRefine (BR), a generic top-k algorithm that works for any combination of source access types and any cost settings. It proposes a new heuristic strategy, based on refining all the current top-k candidates, not only the best one. We present a rich panel of experiments comparing BR with state-of-the-art algorithms and show that BR adapts to the specific settings of these algorithms, with lower cost.
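The setting described in the abstract can be illustrated with a minimal Python sketch. It is not the BR algorithm from the paper, whose details the abstract does not give; it only shows the general idea of keeping score intervals for candidates and refining every current top-k candidate rather than just the best one. The names `top_k` and `Source` are hypothetical, and the sketch assumes that every source supports both sorted and random access at unit cost, that an object absent from a source scores 0 there, that the aggregation function is monotone, and that k is at least 1.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical source model: (object, score) pairs sorted by score, descending.
Source = List[Tuple[str, float]]


def top_k(sources: List[Source], k: int,
          agg: Callable[[Iterable[float]], float] = sum):
    """Illustrative top-k over scored sources (assumes k >= 1, monotone agg)."""
    m = len(sources)
    lookup = [dict(s) for s in sources]      # simulates random access
    pos = [0] * m                            # next sorted-access position
    last = [1.0] * m                         # last score seen by sorted access
    known: Dict[str, Dict[int, float]] = {}  # object -> {source index: score}
    accesses = 0

    def lower(obj: str) -> float:
        # Unknown scores count as 0: the worst case for this object.
        return agg([known[obj].get(i, 0.0) for i in range(m)])

    def upper(obj: str) -> float:
        # Unknown scores are bounded by the last sorted-access score.
        return agg([known[obj].get(i, last[i]) for i in range(m)])

    while True:
        # One sorted access on every source that still has entries.
        for i in range(m):
            if pos[i] < len(sources[i]):
                obj, score = sources[i][pos[i]]
                pos[i] += 1
                last[i] = score
                known.setdefault(obj, {})[i] = score
                accesses += 1
            else:
                last[i] = 0.0  # exhausted: unseen objects score 0 here

        # Breadth-style refinement: complete ALL current top-k candidates
        # (ranked by upper bound) through random access, not only the best.
        for obj in sorted(known, key=upper, reverse=True)[:k]:
            for i in range(m):
                if i not in known[obj]:
                    known[obj][i] = lookup[i].get(obj, 0.0)
                    accesses += 1

        ranked = sorted(known, key=lower, reverse=True)
        top, rest = ranked[:k], ranked[k:]
        exhausted = all(pos[i] >= len(sources[i]) for i in range(m))
        if len(top) == k:
            kth = lower(top[-1])
            # Stop when neither an unseen object (bounded by agg(last))
            # nor any other candidate can overtake the k-th result.
            if kth >= agg(last) and all(kth >= upper(o) for o in rest):
                return [(o, lower(o)) for o in top], accesses
        if exhausted:
            return [(o, lower(o)) for o in top], accesses
```

For example, with `s1 = [("a", 0.875), ("b", 0.75), ("c", 0.25), ("d", 0.125)]` and `s2 = [("b", 1.0), ("c", 0.625), ("a", 0.25), ("d", 0.125)]`, the call `top_k([s1, s2], 2)` returns `[("b", 1.75), ("a", 1.125)]` together with the number of accesses performed, without ever reading object `d`.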
Main file
topk-hal.pdf (1.47 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-00624406, version 1 (16-09-2011)

Identifiers

  • HAL Id: hal-00624406, version 1

Cite

Mehdi Badr, Dan Vodislav. A General Top-k Algorithm for Web Data Sources. DEXA - Database and Expert Systems Applications, 2011, Toulouse, France. pp.379-393. ⟨hal-00624406⟩
139 views
187 downloads
