NCC-EM: A hybrid framework for decision making with missing information
Accounting for uncertainty is important in any data-driven decision making. The most common treatment of uncertainty is classical probability theory, in which variables are expressed as random variables or random processes with precise probability distributions. This precise approach runs into difficulty and can lead to deceptive predictions when the uncertainty is epistemic, that is, when it arises from a lack of knowledge in the form of incomplete (missing), conflicting, or erroneous information. Many frameworks have been developed as alternatives to the precise probability formalism; one such framework is Imprecise Probability (IP) based modeling. In this thesis, we develop a novel hybrid framework, the Naïve Credal Classifier with Expectation-Maximization data imputation (NCC-EM), for decision making with missing information. The IP-based credal set concept is first introduced to model uncertainty in data with missing information. The Naïve Credal Classifier (NCC), as provided by the latest JNCC2 package, is then employed. The key idea of this research is to model missing data using advanced imputation techniques in order to minimize the loss of performance (accuracy) in NCC. The resulting NCC-EM framework is hybrid in that the EM imputation technique is used as a preprocessing step. To verify and validate the hybrid framework, NCC-EM is extensively tested on open machine learning datasets with simulated missing values, and it is shown that NCC-EM outperforms the existing NCC framework and traditional supervised classification methods.
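As a rough illustration of why missing values lead to a set of probabilities rather than a single one, the following Python sketch bounds the relative frequency of a class label over every possible completion of the missing entries. It is not taken from the thesis or from JNCC2; the function name `frequency_bounds` and the convention of marking a missing value with `None` are assumptions made only for this example.

```python
from typing import Hashable, Optional, Sequence, Tuple


def frequency_bounds(
    values: Sequence[Optional[Hashable]], target: Hashable
) -> Tuple[float, float]:
    """Bound the relative frequency of `target` over all completions.

    Hypothetical helper: `None` marks a missing entry. The lower bound
    assumes no missing entry equals `target`; the upper bound assumes
    every missing entry does.
    """
    if target is None:
        raise ValueError("target must be an observed value, not None")
    total = len(values)
    if total == 0:
        raise ValueError("need at least one record")

    observed_hits = sum(1 for v in values if v == target)
    missing = sum(1 for v in values if v is None)

    lower = observed_hits / total
    upper = (observed_hits + missing) / total
    return lower, upper


# Five records, two of them missing.
labels = ["a", "b", None, "a", None]
low, high = frequency_bounds(labels, "a")
print(f"P(a) lies in [{low:.2f}, {high:.2f}]")
```

With two of five labels missing, the frequency of "a" can only be bounded between 0.4 and 0.8. An imputation step such as EM replaces each missing entry with an estimate before classification, which narrows this interval to a single value at the cost of trusting the imputation model.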
Table of Contents
Introduction -- Introduction to imprecise probability -- Naïve Bayes Classifier and Naïve Credal Classifier -- NCC-EM: a novel credal-based framework -- Conclusion and future work