dc.contributor.advisor | Lee, Yugyung, 1960- | |
dc.contributor.author | Gaikwad, Priyanka V. | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020 Spring | |
dc.description | Title from PDF of title page viewed June 24, 2020 | |
dc.description | Thesis advisor: Yugyung Lee | |
dc.description | Vita | |
dc.description | Includes bibliographical references (pages 44-46) | |
dc.description | Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2020 | |
dc.description.abstract | Deep learning benefits from big data but becomes computationally expensive as data size increases. Severe data issues, such as highly skewed, sparse, and imbalanced data, can substantially influence the findings of machine learning. Because such data are complex, the ability to assess and evaluate the data is central to cost-effective deep learning. More specifically, in deep learning, choosing the right validation method is vital to ensure the accuracy and control the biases of the validation process. Current validation techniques, such as k-fold cross-validation and random splitting of training and testing datasets, are hampered by the lack of systematic sampling informed by a comprehensive understanding of the data.
In this thesis, we propose a sampling technique called DeepSampling that aims to achieve cost-effective deep learning for a given application. For the proposed DeepSampling framework, two sampling schemes are designed: (1) to resolve imbalanced data issues using Generative Adversarial Networks (GANs), and (2) to develop an effective sampling technique based on clustering. The clustering techniques are based on the Mahalanobis distance metric and use t-SNE (t-distributed Stochastic Neighbor Embedding) to overcome data skewness and sparseness issues.
The proposed DeepSampling technique has been evaluated with three deep learning models and four benchmark datasets: MNIST, Breast Histology, Malaria cell images, and Stanford Dogs. The results show that DeepSampling improves image classification accuracy by approximately 2-3% compared to traditional evaluation techniques on the same datasets. | |
dc.description.tableofcontents | Introduction -- Background and related work -- Proposed framework -- Results and evaluations -- Conclusion and future work | |
dc.format.extent | xi, 47 pages | |
dc.identifier.uri | https://hdl.handle.net/10355/74350 | |
dc.subject.lcsh | Machine learning | |
dc.subject.lcsh | Data mining | |
dc.subject.lcsh | Image data mining | |
dc.subject.other | Thesis -- University of Missouri--Kansas City -- Computer science | |
dc.title | DeepSampling: Image Sampling Technique for Cost-Effective Deep Learning | |
thesis.degree.discipline | Computer Science (UMKC) | |
thesis.degree.grantor | University of Missouri--Kansas City | |
thesis.degree.level | Masters | |
thesis.degree.name | M.S. (Master of Science) | |