dc.contributor.advisor | Lee, Yugyung, 1960- | |
dc.contributor.author | Chilukuri, Nagababu | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020 Fall | |
dc.description | Title from PDF of title page viewed February 25, 2021 | |
dc.description | Vita | |
dc.description | Includes bibliographical references (pages 42-44) | |
dc.description | Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2020 | |
dc.description | Thesis advisor: Yugyung Lee | |
dc.description.abstract | In recent years, there has been growing interest in environmental sound classification, which has a plethora of real-world applications, especially in audio fields such as speech and music. Recent research has shown that spectral-image representations combined with deep learning models outperform standard methods. This thesis designs a fusion system that combines various audio features, including the Spectrogram (SG), Chromagram (CG), and Mel Frequency Cepstral Coefficients (MFCC), for effective environmental sound classification. We propose the AudioCNN model, a fusion network consisting of multiple Convolutional Neural Networks (CNNs) with aggregation methods over the various spectral image features, together with audio-specific data augmentation techniques. We conducted extensive experiments on benchmark datasets, including UrbanSound8K, ESC-50, and ESC-10, as well as emotion datasets, and obtained state-of-the-art results, outperforming previous solutions. The experimental results show that the combined features with lighter CNN models outperform baseline environmental sound classification methods. The proposed multi-channel fusion network with data augmentation achieved competitive results on the UrbanSound8K dataset compared to existing models. | |
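The abstract describes extracting spectral-image features (spectrogram, chromagram, MFCC) as CNN inputs. As a minimal illustrative sketch only (the thesis itself presumably uses a dedicated audio library such as librosa; this NumPy-only version is an assumption for illustration), a magnitude spectrogram can be computed with a windowed short-time Fourier transform:

```python
import numpy as np

def spectrogram(signal, n_fft=512, hop=256):
    """Magnitude spectrogram: Hann-windowed frames -> real FFT per frame."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    # Rows are time frames; columns are n_fft // 2 + 1 frequency bins
    return np.abs(np.fft.rfft(np.asarray(frames), axis=1))

# Example: a 1-second 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
sg = spectrogram(np.sin(2 * np.pi * 440 * t))
```

The resulting 2-D array can be treated as a single-channel image; chromagram and MFCC features would form additional channels for the multi-channel fusion the thesis proposes.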
dc.description.tableofcontents | Introduction -- Background -- Related work -- Methodology -- Results and evaluation -- Conclusion | |
dc.format.extent | ix, 45 pages | |
dc.identifier.uri | https://hdl.handle.net/10355/80791 | |
dc.subject.lcsh | Machine learning | |
dc.subject.lcsh | Computer sound processing | |
dc.subject.lcsh | Sounds -- Classification | |
dc.subject.other | Thesis -- University of Missouri--Kansas City -- Computer science | |
dc.title | AudioCNN: Audio Event Classification With Deep Learning Based Multi-Channel Fusion Networks | |
thesis.degree.discipline | Computer Science (UMKC) | |
thesis.degree.grantor | University of Missouri--Kansas City | |
thesis.degree.level | Masters | |
thesis.degree.name | M.S. (Master of Science) | |