Journal article: Journal of Parallel and Distributed Computing. Year: 2024

Stab-FD: a cooperative and adaptive failure detector for wide area networks

Abstract

Failure detectors (FDs) are a fundamental abstraction that plays a central role in the design of distributed systems. FDs are distributed oracles that provide processes with unreliable information about process failures, often in the form of a list of trusted or suspected process identities. In this article, we propose a timer-based FD which assesses the quality of its input links, and exchanges its local estimations with other nodes. Nodes use this information to adjust their timers dynamically. Capturing the variations in the quality of each link reduces the number of false suspicions without degrading failure detection time. We present experiments on a dataset of real traces collected on PlanetLab, and compare our approach to well-known state-of-the-art algorithms. Our results show that our new algorithms yield a good trade-off in terms of failure detection speed and accuracy in real scenarios.
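To make the idea of a timer-based detector that adapts to link quality more concrete, here is a minimal sketch in Python. It is not the Stab-FD algorithm from the article: the class name `LinkMonitor`, the smoothing factor `alpha`, the fixed `base_margin`, and the rule that widens the margin with the number of false suspicions (local or reported by peers) are all assumptions chosen for illustration. Timestamps are passed in explicitly so the behaviour is deterministic.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class LinkMonitor:
    """Hypothetical per-link timeout estimator (illustrative, not Stab-FD)."""

    period: float                # heartbeat sending period, in seconds
    base_margin: float = 0.1     # assumed safety margin, in seconds
    alpha: float = 0.125         # assumed smoothing factor for the delay
    delay: float = 0.0           # smoothed lateness beyond the period
    last_arrival: Optional[float] = None
    false_suspicions: int = 0    # local estimate of how unstable the link is
    suspected: bool = False

    def on_heartbeat(self, now: float) -> None:
        """Record a heartbeat and refresh the local link estimate."""
        if self.suspected:
            # The sender was alive after all: the suspicion was a mistake.
            self.false_suspicions += 1
            self.suspected = False
        if self.last_arrival is not None:
            late = max(0.0, (now - self.last_arrival) - self.period)
            self.delay = (1 - self.alpha) * self.delay + self.alpha * late
        self.last_arrival = now

    def margin(self, peer_mistakes: Iterable[int] = ()) -> float:
        """Widen the margin using the worst mistake count known for the link."""
        worst = max([self.false_suspicions, *peer_mistakes])
        return self.base_margin * (1 + worst)

    def deadline(self, peer_mistakes: Iterable[int] = ()) -> float:
        """Time after which the sender becomes suspected."""
        if self.last_arrival is None:
            raise ValueError("no heartbeat received yet")
        return (self.last_arrival + self.period + self.delay
                + self.margin(peer_mistakes))

    def check(self, now: float, peer_mistakes: Iterable[int] = ()) -> bool:
        """Return True if the sender is suspected at time `now`."""
        if self.last_arrival is not None and now > self.deadline(peer_mistakes):
            self.suspected = True
        return self.suspected


if __name__ == "__main__":
    link = LinkMonitor(period=1.0)
    link.on_heartbeat(0.0)
    link.on_heartbeat(1.0)
    print(link.check(2.2))   # heartbeat is late: the sender gets suspected
    link.on_heartbeat(2.3)   # late heartbeat arrives: one false suspicion
    print(link.false_suspicions, round(link.deadline(), 4))
```

In this sketch, a false suspicion makes the next timeout more conservative, and `peer_mistakes` stands in for estimates received from other nodes. How Stab-FD actually measures link quality and combines the exchanged estimates is described in the article itself.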
Main file
JPDC-2024.pdf (935.14 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04389132, version 1 (11-01-2024)

Identifiers

HAL Id: hal-04389132
DOI: 10.1016/j.jpdc.2023.104803

Cite

Pierre Sens, Luciana Arantes, Anubis Graciela de Moraes Rossetto, Olivier Marin. Stab-FD: a cooperative and adaptive failure detector for wide area networks. Journal of Parallel and Distributed Computing, 2024, 186, pp.104803. ⟨10.1016/j.jpdc.2023.104803⟩. ⟨hal-04389132⟩
