Contributions to large-scale data processing systems

Abstract : This thesis covers the topic of large-scale data processing systems, and more precisely three complementary approaches: the design of a system to perform predictions about computer failures through the analysis of monitoring data; the routing of data in a real-time system, looking at correlations between message fields to favor locality; and finally a novel framework to design data transformations using directed graphs of blocks.

Through the lens of the Smart Support Center project, we design a scalable architecture to store time series reported by monitoring engines, which constantly check the health of computer systems. We use this data to perform predictions and detect potential problems before they arise.

We then dive into routing algorithms for stream processing systems, and develop a layer to route messages more efficiently by avoiding hops between machines. For that purpose, we identify in real time the correlations which appear in the fields of these messages, such as hashtags and their geolocation, for example in the case of tweets. We use these correlations to create routing tables which favor the co-location of actors handling these messages.

Finally, we present λ-blocks, a novel programming framework to compute data processing jobs without writing code, but rather by creating graphs of blocks of code. The framework is fast, and comes with batteries included: block libraries, plugins, and APIs to extend it. It is also able to manipulate computation graphs, for optimization, analysis, verification, or any other purposes.

Document type : Theses

Cited literature : 119 references

https://tel.archives-ouvertes.fr/tel-01891825
Contributor : Abes Star
Submitted on : Wednesday, October 10, 2018 - 8:59:05 AM
Last modification on : Thursday, May 16, 2019 - 1:44:06 AM
Long-term archiving on : Friday, January 11, 2019 - 1:04:12 PM

File

CANEILL_2018__-_archivage.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01891825, version 1

Citation

Matthieu Caneill. Contributions to large-scale data processing systems. Other [cs.OH]. Université Grenoble Alpes, 2018. English. ⟨NNT : 2018GREAM006⟩. ⟨tel-01891825⟩
