Conference papers

Towards Internet-Scale Convolutional Root-Cause Analysis with DiagNet

Abstract: Diagnosing problems in Internet-scale services remains particularly difficult and costly for both content providers and ISPs. Because the Internet is decentralized, the cause of such problems might lie anywhere between a user’s device and the datacenters hosting the service. Further, the set of possible problems and causes is not known in advance, making it impossible in practice to train a classifier with all combinations of problems, causes and locations. In this paper, we explore how machine learning techniques can be used for Internet-scale root cause analysis based on measurements taken from end-user devices. Using convolutional neural networks, we show how to build generic models that (i) are agnostic to the underlying network topology, (ii) do not require the full set of possible causes to be defined during training, and (iii) can be quickly adapted to diagnose new services. We evaluate our proposal, DiagNet, on a geodistributed multi-cloud deployment of online services, using a combination of fault injection and emulated clients running within automated browsers. Our experiments demonstrate the promising capabilities of our technique, which delivers a recall of 73.9%, including on causes that were unknown at training time.
Contributor: Loïck Bonniot
Submitted on: Tuesday, May 18, 2021 - 6:00:38 PM
Last modification on: Saturday, August 6, 2022 - 3:32:32 AM







Loïck Bonniot, Christoph Neumann, François Taïani. Towards Internet-Scale Convolutional Root-Cause Analysis with DiagNet. IPDPS 2021 - The 35th IEEE International Parallel and Distributed Processing Symposium, May 2021, Portland / Virtual, United States. ⟨10.1109/IPDPS49936.2021.00084⟩. ⟨hal-02534888v2⟩


