An Approach for Fast Score Computation in Bayesian Network Structure Learning Over Large-Scale Distributed Data

Format

Thesis

Abstract

On a fundamental level, Bayes’ theorem enables us to use prior knowledge to determine the probability of an event. Consequently, its suitability for probabilistic reasoning has led to its use in probabilistic graphical modeling and the inception of Bayesian networks. The field is saturated with techniques for learning the structure of a Bayesian network (also known as a Bayes network). Nevertheless, most of these techniques struggle when the number of variables (network nodes) and the volume of input data grow drastically. At that point, parallel distributed processing is the best alternative for alleviating the computational complexity of this problem. To this end, we propose DiSC, a gossip-based distributed score computation approach that computes the sufficient statistics of families of variables in order to accelerate the structure learning of Bayesian networks. We show that DiSC can significantly outperform map-reduce-style score computation executed by the distributed computation framework Apache Spark on a variety of synthetic and real datasets, with a low accuracy trade-off.
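For discrete data, the sufficient statistics of a family (a node together with its parents) that decomposable scores such as BDeu or BIC consume are simply the counts of each (parent configuration, child value) pair. The sketch below illustrates this counting step only; the function name, data layout, and toy dataset are illustrative assumptions, not the thesis' DiSC implementation.

```python
from collections import Counter

def family_counts(data, child, parents):
    """Sufficient statistics for one family: counts of each
    (parent configuration, child value) pair in the data.
    Decomposable scores (e.g. BDeu, BIC) are computed from
    exactly these counts, one family at a time."""
    counts = Counter()
    for row in data:
        parent_config = tuple(row[p] for p in parents)
        counts[(parent_config, row[child])] += 1
    return counts

# Toy dataset: each row maps a variable name to a discrete value.
data = [
    {"A": 0, "B": 0, "C": 0},
    {"A": 0, "B": 1, "C": 1},
    {"A": 1, "B": 1, "C": 1},
    {"A": 1, "B": 1, "C": 1},
]
stats = family_counts(data, child="C", parents=["A", "B"])
# e.g. stats[((1, 1), 1)] == 2
```

In a distributed setting, each node would compute such counts over its local shard; because counts are additive, partial results can then be combined — whether by map-reduce aggregation or, as the abstract describes for DiSC, by gossip-style exchange.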

Table of Contents

Introduction -- Related work -- Distributed score computation -- Evaluation -- Conclusion

Degree

Ph.D.
