dc.contributor.advisor | Shyu, Chi-Ren | eng |
dc.contributor.author | Mahamaneerat, Wannapa Kay, 1974- | eng |
dc.date.issued | 2008 | eng |
dc.date.submitted | 2008 Summer | eng |
dc.description | Title from PDF of title page (University of Missouri--Columbia, viewed on February 24, 2010). | eng |
dc.description | The entire thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file; a non-technical public abstract appears in the public.pdf file. | eng |
dc.description | Dissertation advisor: Dr. Chi-Ren Shyu. | eng |
dc.description | Vita. | eng |
dc.description | Ph.D. University of Missouri--Columbia 2008. | eng |
dc.description.abstract | Traditional brute-force association mining approaches, when applied to large datasets, are thorough but inefficient due to computational complexity. A low global minimum probability threshold can worsen this complexity by producing an overwhelming number of associations; however, a high threshold may not uncover valuable associations, especially from underrepresented groups within the population. Regardless, the uncovered associations are not systematically organized. To solve these problems, a novel Domain-Concept Mining (DCM) approach with Partition Aggregation (DCM-PA) has been developed. DCM organizes data by grouping transactions with common characteristics, such as a certain age group, into "domain-concepts" (dc). DCM granulizes partitioning criteria by pairing each attribute with its values. Criteria may include underrepresented groups as well as spatial, temporal, and incremental dimensionalities. Then, a statistical power analysis is used to determine whether multiple criteria of the same attribute, such as age groups 18-24 and 25-34, should be combined to form a broader partition. Doing so maintains the tradeoff between statistically significant findings and computational resource consumption, while preserving data organization. Associations can be extracted from each partition independently because a partition contains all of its qualified transactions. Moreover, the partition size proportionally adjusts the global threshold to be more specific and sensitive. After the initial phase is complete, DCM-PA efficiently reuses DCM's associations to compute results from multiple-partition aggregation (union or intersection) using Bayes' theorem and a pipelining technique. DCM-PA offers the flexibility to perform association mining that is expected to uncover more valuable knowledge through means such as trends and comparisons across various dc partitions and their aggregations. | eng |
dc.description.bibref | Includes bibliographical references. | eng |
dc.format.extent | xvi, 211 pages | eng |
dc.identifier.oclc | 609888853 | eng |
dc.identifier.uri | https://hdl.handle.net/10355/7195 | |
dc.identifier.uri | https://doi.org/10.32469/10355/7195 | eng |
dc.language | English | eng |
dc.publisher | University of Missouri--Columbia | eng |
dc.relation.ispartofcommunity | University of Missouri--Columbia. Graduate School. Theses and Dissertations | eng |
dc.rights | OpenAccess. | eng |
dc.rights.license | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License. | |
dc.subject.lcsh | Data mining | eng |
dc.subject.lcsh | Association rule mining | eng |
dc.subject.lcsh | Data structures (Computer science) | eng |
dc.subject.lcsh | Database searching | eng |
dc.title | Domain-concept mining : an efficient on-demand data mining approach | eng |
dc.type | Thesis | eng |
thesis.degree.discipline | Computer science (MU) | eng |
thesis.degree.grantor | University of Missouri--Columbia | eng |
thesis.degree.level | Doctoral | eng |
thesis.degree.name | Ph. D. | eng |