Indoor human activity recognition using audio signals based on support vector machines and convolutional neural networks
Indoor human activity recognition can be of great importance in daily life, especially for surveillance and security purposes, such as alerting elderly people with impaired hearing to environmental sounds (doorbells, alarm signals, etc.). This thesis presents two approaches based on audio signal processing: a support vector machine method and a convolutional neural network method. Both methods require audio feature extraction, since the original audio signal contains undesired information that can make recognition difficult. For the support vector machine method, mel-frequency cepstral coefficients are extracted from the original audio signals. From these coefficients, histogram features are generated using the bag-of-words concept and passed to the support vector machine algorithm, which produces the recognition results. For the convolutional neural network method, multiple audio features are extracted from the original audio signal, including mel-frequency cepstral coefficients, the mel-scaled spectrogram, chroma features, and spectral contrast. These audio features are fed into a 5-layer convolutional neural network, which produces the recognition results. The support vector machine method focuses on implementing the algorithm on a single-chip machine and achieves a good accuracy of 90%. The convolutional neural network method yields a much higher accuracy of around 97%.
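The bag-of-words step mentioned above can be illustrated with a minimal sketch. It assumes that a codebook of representative feature vectors has already been learned (for example by clustering training frames) and that each frame is assigned to its nearest codeword by Euclidean distance; the abstract does not state how the thesis builds its codebook or histogram. The function name `bow_histogram` and the two-dimensional toy values are hypothetical and stand in for real mel-frequency cepstral coefficient frames.

```python
from math import dist  # Euclidean distance, Python 3.8+

def bow_histogram(frames, codebook):
    """Turn a variable-length list of feature frames into a fixed-length,
    normalized histogram of nearest-codeword counts (bag of words)."""
    counts = [0] * len(codebook)
    for frame in frames:
        # Index of the codeword closest to this frame.
        nearest = min(range(len(codebook)),
                      key=lambda k: dist(frame, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    # Normalize so clips of different durations are comparable.
    return [c / total for c in counts] if total else [0.0] * len(codebook)

# Toy data (assumption): 3 codewords and 4 frames in a 2-D feature space.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
frames = [[0.1, -0.1], [0.9, 1.2], [1.1, 0.8], [3.9, 4.2]]

hist = bow_histogram(frames, codebook)
print(hist)  # [0.25, 0.5, 0.25]
```

The resulting histogram has one entry per codeword regardless of clip length, which is what makes it usable as a fixed-size input vector for a support vector machine classifier.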
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.