Practical Online Debugging of Spark-like applications

Matteo Marra; Guillermo Polito; Elisa Gonzalez Boix

Conference paper, Year: 2021

Matteo Marra

Role: Author
PersonId: 1084703

Software Languages Lab

Guillermo Polito

Role: Author
PersonId: 13017
IdHAL: guillermo-polito
ORCID: 0000-0003-0813-8584
IdRef: 188347836

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Analyses and Languages Constructs for Object-Oriented Application Evolution

Elisa Gonzalez Boix

Role: Author

Software Languages Lab

Abstract

Apache Spark is a framework widely used for writing Big Data analytics applications. It offers a scalable and fault-tolerant model based on rescheduling failing tasks on other nodes. While this model is well suited to hardware and infrastructure errors, it is not suited to application errors, because they reappear in the rescheduled tasks. As a result, applications are killed, losing all progress and forcing developers to restart them from scratch. Despite the popularity of this failure-recovery model, understanding and debugging Spark-like applications remains challenging. When an error occurs, developers need to analyze huge log files or undergo time-consuming replays to find the bug. To address these concerns, we present an online debugging approach tailored to Big Data analytics applications. Our approach includes local debugging of remote parallel exceptions through dynamic local checkpoints, extended with domain-specific debugging operations and live code updating functionality. To deal with data-cleaning errors, we extend our model so that developers can easily and automatically ignore exceptions that happen at runtime. We validate our solution through performance benchmarks, which show that our debugging approach is comparable to or better than state-of-the-art debugging solutions for Big Data. Furthermore, we conduct a user study comparing our approach with another state-of-the-art debugging approach. The results show a lower time to find the solution to a bug using our approach, as well as a generally good perception of the debugger's features.

Domains

Programming Languages [cs.PL]

Main file

paper.pdf (1.73 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-03398772

Submitted on: Saturday, October 23, 2021, 11:24:49

Last modified on: Wednesday, January 24, 2024, 09:54:23

Dates and versions

hal-03398772, version 1 (23-10-2021)

Identifiers

HAL Id: hal-03398772, version 1

Cite

Matteo Marra, Guillermo Polito, Elisa Gonzalez Boix. Practical Online Debugging of Spark-like applications. IEEE QRS 2021: International Conference on Software Security and Reliability, Dec 2021, Hainan Island, China. ⟨hal-03398772⟩

Collections

CNRS INRIA CRISTAL INRIA2 CRISTAL-RMOD UNIV-LILLE
