Large scale image classification and object detection

Sun, Miao (Engineer)

Date

2016

Format

Thesis

Abstract

Research on image classification and object detection has advanced significantly in the past decade. Deep convolutional neural networks (CNNs) have exhibited superior performance in many visual recognition tasks, including image classification, object detection, and scene labeling, due to their large learning capacity and resistance to overfitting. However, learning a robust deep CNN model for object recognition is still quite challenging, because image classification and object detection are severely unbalanced, large-scale problems. In this dissertation, we aim to improve the performance of image classification and object detection algorithms by taking advantage of deep convolutional neural networks through the following strategies. We introduce the Deep Neural Pattern, a local feature densely extracted from an image of arbitrary resolution using a well-trained deep convolutional neural network. We propose a latent CNN framework, which automatically selects the most discriminative region in the image to reduce the effect of irrelevant regions. We also develop a new combination scheme for multiple CNNs, the Latent Model Ensemble, to overcome the local minima problem of CNNs. In addition, a weakly supervised CNN framework, referred to as Multiple Instance Learning Convolutional Neural Networks, is developed to alleviate strict label requirements. Finally, a novel residual-network architecture, Residual networks of Residual networks, is constructed to improve the optimization ability of very deep convolutional neural networks. All the proposed algorithms are validated by thorough experiments and have shown solid accuracy on large-scale object detection and recognition benchmarks.

URI

https://hdl.handle.net/10355/59786
https://doi.org/10.32469/10355/59786

Degree

Ph. D.

Thesis Department

Electrical and computer engineering (MU)

Rights

OpenAccess.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.