Deep learning and DCT-based hand-crafted features for computer vision tasks
Abstract
Feature extraction and matching are critical components of many computer vision tasks, including camera pose estimation, 3D reconstruction, simultaneous localization and mapping (SLAM), and object tracking. Features are image patterns that are distinctive within their immediate neighborhood and are generally unique structures such as keypoints, lines, and shapes. The effectiveness of a feature depends heavily on the data and the problem. For instance, aerial video presents challenges due to oblique viewing angles, perspective shape distortions, and numerous repetitive objects and textures. Consequently, it is important to design features optimized for specific applications such as Structure-from-Motion (SfM) and scene perception. In this dissertation, we developed several feature detection, matching, and tracking pipelines for various computer vision tasks. First, we proposed two novel discrete cosine transform (DCT)-based feature descriptors for robustly matching and tracking feature keypoints across oblique views in city-scale aerial video. Second, we introduced a similar frequency-domain descriptor for line segment matching to complement keypoint features, particularly for image sequences with low-texture scenes. Third, we evaluated local features for camera pose estimation accuracy on simulated aerial imagery within 3D reconstruction pipelines and demonstrated that the proposed hand-crafted features with optimized parameters still outperform state-of-the-art deep-learning-based features. In addition, we developed a fusion-based single-object tracker suitable for the hover-and-stare mode of unmanned aerial vehicles such as drones. Finally, we applied the feature detection and tracking techniques to plant images and proposed a seam-carving-inspired root tracing and growth analysis pipeline.
Degree
Ph. D.