dc.contributor.advisor | Lee, Yugyung, 1960- | eng |
dc.contributor.author | Bavirisetty, Venkata Pramod Gupta | eng |
dc.date.issued | 2014-07-30 | eng |
dc.date.submitted | 2014 Spring | eng |
dc.description | Title from PDF of title page, viewed on July 30, 2014 | eng |
dc.description | Thesis advisor: Yugyung Lee | eng |
dc.description | Vita | eng |
dc.description | Includes bibliographical references (pages 62-65) | eng |
dc.description | Thesis (M. S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2014 | eng |
dc.description.abstract | As huge amounts of data are created rapidly, the demand for the integration and analysis of such data has been growing steadily. It is especially essential to retrieve relevant and accurate evidence in healthcare and biomedical research. Even though query systems based on ontologies, Medical Subject Headings (MeSH), or keyword searches are available, query systems based on evidence and effective retrieval of data from large collections of clinical data are not sufficiently available. This thesis proposes a novel approach to analyze big data sets collected from clinical trials research and discover significant evidence and association patterns with respect to conditions, treatment, and medication side effects. Our approach makes use of machine learning techniques in the Apache Hadoop framework with support from MetaMap and RxNorm. In this thesis, a new heuristic measure of empirical evidence was designed that considers the degree of association among conditions, treatment, and medication side effects and the percentage of people affected. The Apriori algorithm was used to discover strong positive association rules with various measures, including support and confidence. We have examined a large and complex data set (12,327 study results) from clinicaltrials.gov and identified 8,291 strong association rules and 59,228 combinations with 432,841 subjects, 1,761 conditions, 2,836 drugs, and 27 side effects. The significance of these association patterns was evaluated in terms of the impact factor representing the percentage of the population with a high rate of side effects. Using these association rules and combination strengths, an evidence-based query system was implemented to answer some integral questions. This query system also provided an interface to retrieve relevant publications from PubMed. The search results from this query system are compared with those from a PubMed search based on Medical Subject Headings. | eng |
dc.description.tableofcontents | Abstract -- Illustrations -- Tables -- Introductions -- Related work -- Evidence based medical query model -- Implementation -- Results & Evaluation -- Conclusion and future work -- References | eng |
dc.format.extent | viii, 66 pages | eng |
dc.identifier.uri | http://hdl.handle.net/10355/43577 | eng |
dc.subject.lcsh | Evidence-based medicine -- Data processing | eng |
dc.subject.lcsh | Management information systems | eng |
dc.subject.other | Thesis -- University of Missouri--Kansas City -- Computer science | eng |
dc.title | Evidence based medical query system on large scale data | eng |
dc.type | Thesis | eng |
thesis.degree.discipline | Computer Science (UMKC) | eng |
thesis.degree.grantor | University of Missouri--Kansas City | eng |
thesis.degree.level | Masters | eng |
thesis.degree.name | M. S. | eng |