Deep heterogeneous superpixel neural networks for image analysis and feature extraction
Metadata[+] Show full item record
Lately, deep convolutional neural networks are rapidly transforming and enhancing computer vision accuracy and performance, and pursuing higher-level and interpretable object recognition. Superpixel-based methodologies have been used in conventional computer vision research where their efficient representation has superior effects. In contemporary computer vision research driven by deep neural networks, superpixel-based approaches mainly rely on oversegmentation to provide a more efficient representation of the imagery data, especially when the computation is too expensive in time or memory to perform in pairwise similarity regularization or complex graphical probabilistic inference. In this dissertation, we proposed a novel superpixel-enabled deep neural network paradigm by relaxing some of the prior assumptions in the conventional superpixel-based methodologies and exploring its capabilities in the context of advanced deep convolutional neural networks. This produces novel neural network architectures that can achieve higher-level object relation modeling, weakly supervised segmentation, high explainability, and facilitate insightful visualizations. This approach has the advantage of being an efficient representation of the visual signal and has the capability to dissect out relevant object components from other background noise by spatially re-organizing visual features. Specifically, we have created superpixel models that join graphical neural network techniques and multiple-instance learning to achieve weakly supervised object detection and generate precise object bounding without pixel-level training labels. This dissection and the subsequent learning by the architecture promotes explainable models, whereby the human users of the models can see the parts of the objects that have led to recognition. Most importantly, this neural design's natural result goes beyond abstract rectangular bounds of an object occurrence (e.g., bounding box or image chip), but instead approaches efficient parts-based segmented recognition. It has been tested on commercial remote sensing satellite imagery and achieved success. Additionally, We have developed highly efficient monocular indoor depth estimation based on superpixel feature extraction. Furthermore, we have demonstrated state-of-theart weakly supervised object detection performance on two contemporary benchmark data sets, MS-COCO and VOC 2012. In the future, deep learning techniques based on superpixel-enabled image analysis can be further optimized in accuracy and computational performance; and it will also be interesting to evaluate in other research domains, such as those involving medical imagery, infrared imagery, or hyperspectral imagery.