Theses

Incremental Bayesian network structure learning from data streams

Abstract : In the last decade, data stream mining has become an active area of research, owing to the importance of its applications and the growing volume of streaming data. The major challenges in data stream analysis are the unbounded size of the stream, the need to adapt to data that change over time, and limited access to past data. Traditional data mining techniques therefore cannot be applied directly to data streams. The problem is aggravated when the incoming data come from high-dimensional domains such as social networks, bioinformatics, and telecommunications, which involve several hundreds or thousands of variables. This poses a serious challenge for existing Bayesian network structure learning algorithms. To keep up with the latest trends in the data, learning algorithms need to incorporate new data continuously. State-of-the-art incremental structure learning methods have been applied to only a few tens of variables and do not scale well beyond a few tens to hundreds of variables. This work investigates the Bayesian network structure learning problem in high-dimensional domains and makes several contributions toward solving it. First, we propose an incremental local search approach, iMMPC, to learn a local skeleton for each variable. Second, we propose an incremental version of the Max-Min Hill-Climbing (MMHC) algorithm to learn the whole structure of the network. We also propose guidelines for adapting it to sliding and damped window environments. Finally, theoretical justifications and extensive experiments on synthetic datasets demonstrate the feasibility of our approach.
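The sliding and damped window environments mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the class names `SlidingWindowCounts` and `DampedWindowCounts`, the `decay` parameter, and the choice of joint-configuration counts as the maintained statistic are assumptions made for illustration only. A sliding window keeps counts over the most recent records and forgets older ones entirely, whereas a damped window keeps every record but down-weights older ones exponentially.

```python
from collections import Counter, deque


class SlidingWindowCounts:
    """Counts of joint configurations over the most recent `size` records.

    Hypothetical helper for illustration; not the thesis's implementation.
    """

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.counts = Counter()

    def update(self, record):
        key = tuple(record)
        self.window.append(key)
        self.counts[key] += 1
        # Once the window is full, the oldest record leaves and its count drops.
        if len(self.window) > self.size:
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]


class DampedWindowCounts:
    """Exponentially decayed counts: older records weigh less, none is dropped.

    `decay` (assumed to lie in (0, 1]) is an illustrative parameter.
    """

    def __init__(self, decay):
        self.decay = decay
        self.counts = {}

    def update(self, record):
        key = tuple(record)
        # Fade all existing counts, then add the new record with weight 1.
        for k in self.counts:
            self.counts[k] *= self.decay
        self.counts[key] = self.counts.get(key, 0.0) + 1.0


stream = [(0, 1), (0, 1), (1, 1)]

sliding = SlidingWindowCounts(size=2)
damped = DampedWindowCounts(decay=0.5)
for rec in stream:
    sliding.update(rec)
    damped.update(rec)
```

With this stream, the sliding window of size 2 retains only the last two records, while the damped window still reflects all three with decreasing weight. Either set of counts could then feed the statistical tests or scores used by a structure learning algorithm.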
Document type : Theses

Cited literature [136 references]

https://tel.archives-ouvertes.fr/tel-01284332
Contributor : Lina Duke
Submitted on : Monday, March 7, 2016 - 2:46:35 PM
Last modification on : Wednesday, April 27, 2022 - 4:42:18 AM
Long-term archiving on : Wednesday, June 8, 2016 - 2:41:56 PM

Identifiers

  • HAL Id : tel-01284332, version 1

Citation

Amanullah Yasin. Incremental Bayesian network structure learning from data streams. Machine Learning [cs.LG]. Université de Nantes, 2013. English. ⟨tel-01284332⟩

Metrics

Record views : 303
Files downloads : 1633