Adaptive Failure Detection Timers for IGP Networks

Abstract: Guaranteeing high availability of communication services to customers is a day-to-day challenge for carrier networks. In IGP routing, a failure triggers a re-convergence process that re-establishes a consistent view of the network. During this process, the latency of failure detection, which is performed by the Hello protocol, accounts for a significant share of the unavailability. Detecting failures quickly would require fast Hello exchanges, which in turn would cause false detections and instability. However, there are often forewarning signs that a network device is about to stop working properly. Based on an embedded, real-time risk-level assessment, the Hello message frequency of at-risk ("sick") nodes can be adapted in real time, reducing unavailability while maintaining routing stability. This paper details and evaluates a mechanism for adaptive failure detection timers in IGP networks. The impact on availability and on the number of Hello messages was estimated with an analytical model and then simulated to measure the benefits of the proposed proactive self-healing function.
Bruno Vidalenc, Ludovic Noirie, Samir Ghamri-Doudane, Eric Renault. Adaptive Failure Detection Timers for IGP Networks. IFIP Networking Conference 2013, May 2013, Brooklyn, NY, United States. pp.1-9. ⟨hal-00904729⟩
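
The idea summarized in the abstract, shortening the Hello interval only for nodes whose assessed risk is high, can be illustrated with a minimal sketch. This is not the paper's mechanism: the function names, the linear mapping from risk to interval, and the parameter values are all assumptions made for illustration. The 10 s Hello interval and the dead interval of four Hello intervals follow commonly used OSPF defaults.

```python
def hello_interval(risk: float, normal_s: float = 10.0, fast_s: float = 1.0) -> float:
    """Map a risk level in [0, 1] to a Hello interval in seconds.

    Hypothetical linear mapping (an assumption, not taken from the paper):
    risk 0 keeps the normal interval, risk 1 uses the fastest interval.
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be in [0, 1]")
    if fast_s > normal_s:
        raise ValueError("fast_s must not exceed normal_s")
    return normal_s - risk * (normal_s - fast_s)


def dead_interval(risk: float, multiplier: int = 4) -> float:
    """Time without Hellos after which a neighbor is declared down.

    Assumes the dead interval is a fixed multiple of the Hello interval.
    """
    return multiplier * hello_interval(risk)


if __name__ == "__main__":
    # Healthy nodes keep slow, stable timers; at-risk nodes are probed faster.
    for risk in (0.0, 0.5, 1.0):
        print(f"risk={risk:.1f}  hello={hello_interval(risk):.1f}s  "
              f"dead={dead_interval(risk):.1f}s")
```

Under these assumptions, a healthy node keeps the slow timers that preserve routing stability, while a node flagged as at-risk is monitored more aggressively, so only a small fraction of adjacencies pay the extra Hello traffic.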



