DeepSampling: Image Sampling Technique for Cost-Effective Deep Learning

Gaikwad, Priyanka V.

URI

https://hdl.handle.net/10355/74350

dc.contributor.advisor	Lee, Yugyung, 1960-
dc.contributor.author	Gaikwad, Priyanka V.
dc.date.issued	2020
dc.date.submitted	2020 Spring
dc.description	Title from PDF of title page viewed June 24, 2020
dc.description	Thesis advisor: Yugyung Lee
dc.description	Vita
dc.description	Includes bibliographical references (pages 44-46)
dc.description	Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2020
dc.description.abstract	Deep learning benefits from big data but becomes computationally expensive as data size increases. Severe data issues, such as highly skewed, sparse, and imbalanced data, can substantially influence the findings of machine learning. Given the complexity of such data, the ability to assess and evaluate the data is central to cost-effective deep learning. More specifically, in deep learning, choosing the right validation method is vital to ensure the accuracy and limit the bias of the validation process. Current validation techniques, including k-fold cross-validation and random splitting into training and testing datasets, are hampered by the lack of systematic sampling informed by a comprehensive understanding of the data. In this thesis, we propose a sampling technique called DeepSampling that aims to achieve cost-effective deep learning for a given application. The proposed DeepSampling framework comprises two sampling schemes: (1) resolving imbalanced data issues using Generative Adversarial Networks (GANs), and (2) an effective sampling technique based on clustering. The clustering techniques are based on the Mahalanobis distance metric and use t-SNE (t-distributed Stochastic Neighbor Embedding) to overcome data skewness and sparseness. DeepSampling has been evaluated with three deep learning models and four benchmark datasets: MNIST, Breast Histology, Malaria cell images, and Stanford Dogs. The results show that DeepSampling improves image classification accuracy by approximately 2-3% compared to traditional evaluation techniques on the same dataset.
dc.description.tableofcontents	Introduction -- Background and related work -- Proposed framework -- Results and evaluations -- Conclusion and future work
dc.format.extent	xi, 47 pages
dc.identifier.uri	https://hdl.handle.net/10355/74350
dc.subject.lcsh	Machine learning
dc.subject.lcsh	Data mining
dc.subject.lcsh	Image data mining
dc.subject.other	Thesis -- University of Missouri--Kansas City -- Computer science
dc.title	DeepSampling: Image Sampling Technique for Cost-Effective Deep Learning
thesis.degree.discipline	Computer Science (UMKC)
thesis.degree.grantor	University of Missouri--Kansas City
thesis.degree.level	Masters
thesis.degree.name	M.S. (Master of Science)

Files in this item

Name: Gaikwad_umkc_0134P_11595.pdf
Size: 1.780Mb
Format: PDF
Description: DeepSampling: Image Sampling ...

This item appears in the following Collection(s)

2020 UMKC Theses - Freely Available Online
Computer Science and Electrical Engineering Electronic Theses and Dissertations (UMKC)
The items in this collection are the scholarly output of UMKC graduate students.
