
dc.contributor.advisor: Shyu, Chi-Ren [eng]
dc.contributor.author: Mahamaneerat, Wannapa Kay, 1974- [eng]
dc.date.issued: 2008 [eng]
dc.date.submitted: 2008 Summer [eng]
dc.description: Title from PDF of title page (University of Missouri--Columbia, viewed on February 24, 2010). [eng]
dc.description: The entire thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file; a non-technical public abstract appears in the public.pdf file. [eng]
dc.description: Dissertation advisor: Dr. Chi-Ren Shyu. [eng]
dc.description: Vita. [eng]
dc.description: Ph.D. University of Missouri--Columbia 2008. [eng]
dc.description.abstract: Traditional brute-force association mining approaches, when applied to large datasets, are thorough but inefficient due to computational complexity. A low global minimum probability threshold can worsen this complexity by producing an overwhelming number of associations; however, a high threshold may not uncover valuable associations, especially from underrepresented groups within the population. In either case, the uncovered associations are not systematically organized. To solve these problems, a novel approach, Domain-Concept Mining (DCM) with Partition Aggregation (DCM-PA), has been developed. DCM organizes data by grouping transactions with common characteristics, such as a certain age group, into "domain-concepts" (dc). DCM granulizes partitioning criteria by pairing each attribute with its values. Criteria may include underrepresented groups as well as spatial, temporal, and incremental dimensions. Then, a statistical power analysis is used to determine whether multiple criteria of the same attribute, such as age groups 18-24 and 25-34, should be combined to form a broader partition. Doing so balances the tradeoff between statistically significant findings and computational resource consumption, while preserving data organization. Associations can be extracted from each partition independently because a partition contains all of its qualified transactions. Moreover, the partition size proportionally adjusts the global threshold to be more specific and sensitive. After the initial phase is complete, DCM-PA efficiently reuses DCM's associations to compute results from multiple-partition aggregation (union or intersection) using Bayes' theorem and a pipelining technique. DCM-PA offers the flexibility to perform association mining that is expected to uncover more valuable knowledge through means such as trends and comparisons across various dc partitions and their aggregations. [eng]
dc.description.bibref: Includes bibliographical references. [eng]
dc.format.extent: xvi, 211 pages [eng]
dc.identifier.oclc: 609888853 [eng]
dc.identifier.uri: https://hdl.handle.net/10355/7195
dc.identifier.uri: https://doi.org/10.32469/10355/7195 [eng]
dc.language: English [eng]
dc.publisher: University of Missouri--Columbia [eng]
dc.relation.ispartofcommunity: University of Missouri--Columbia. Graduate School. Theses and Dissertations [eng]
dc.rights: OpenAccess. [eng]
dc.rights.license: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.
dc.subject.lcsh: Data mining [eng]
dc.subject.lcsh: Association rule mining [eng]
dc.subject.lcsh: Data structures (Computer science) [eng]
dc.subject.lcsh: Database searching [eng]
dc.title: Domain-concept mining : an efficient on-demand data mining approach [eng]
dc.type: Thesis [eng]
thesis.degree.discipline: Computer science (MU) [eng]
thesis.degree.grantor: University of Missouri--Columbia [eng]
thesis.degree.level: Doctoral [eng]
thesis.degree.name: Ph. D. [eng]
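
Illustrative note on the abstract: the partition-then-mine idea it describes can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the thesis's implementation. The toy transactions, the function names (`partition`, `frequent_itemsets`, `union_support`), and the choice to scale a relative threshold by partition size are all hypothetical. The union step uses a plain size-weighted average for disjoint partitions (law of total probability), standing in for the Bayes-based aggregation the abstract mentions, and the power-analysis merging step is not modeled.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import ceil

# Hypothetical toy data: each transaction is (attributes, items).
TRANSACTIONS = [
    ({"age": "18-24"}, {"a", "b"}),
    ({"age": "18-24"}, {"a", "b", "c"}),
    ({"age": "18-24"}, {"a", "c"}),
    ({"age": "25-34"}, {"b", "c"}),
    ({"age": "25-34"}, {"b", "c"}),
    ({"age": "25-34"}, {"a", "b", "c"}),
    ({"age": "25-34"}, {"c"}),
]


def partition(transactions):
    """Group baskets by (attribute, value) pair, one partition per pair."""
    parts = defaultdict(list)
    for attrs, items in transactions:
        for key in attrs.items():
            parts[key].append(frozenset(items))
    return dict(parts)


def itemset_counts(baskets, max_size=2):
    """Count every itemset of up to max_size items (brute force)."""
    counts = Counter()
    for basket in baskets:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(basket), size):
                counts[combo] += 1
    return counts


def frequent_itemsets(baskets, min_support, max_size=2):
    """Return {itemset: support} within one partition.

    The relative threshold becomes an absolute count that scales with
    the partition size, so small partitions are not drowned out.
    """
    min_count = ceil(min_support * len(baskets))
    counts = itemset_counts(baskets, max_size)
    return {k: c / len(baskets) for k, c in counts.items() if c >= min_count}


def union_support(support_a, size_a, support_b, size_b):
    """Support in the union of two DISJOINT partitions (weighted average)."""
    return (support_a * size_a + support_b * size_b) / (size_a + size_b)


parts = partition(TRANSACTIONS)
young = frequent_itemsets(parts[("age", "18-24")], min_support=0.6)
older = frequent_itemsets(parts[("age", "25-34")], min_support=0.6)
```

In this toy data, the pair `("a", "b")` is frequent only in the 18-24 partition and `("b", "c")` only in the 25-34 partition, which is the kind of group-specific association a single global threshold could miss.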

