Accurate, fast, and robust 3D city-scale reconstruction using wide area motion imagery

Yao, Shizeng

Date

2021

Format

Thesis

Abstract

Multi-view stereopsis (MVS) is a core problem in computer vision: it takes a set of scene views together with known camera poses and produces a geometric representation of the underlying 3D model. Using 3D reconstruction, one can determine any object's 3D profile, as well as the 3D coordinates of any point on that profile. The 3D reconstruction of objects is a general scientific problem and a core technology in a wide variety of fields, such as Computer Aided Geometric Design (CAGD), computer graphics, computer animation, computer vision, medical imaging, computational science, virtual reality, and digital media. However, although MVS problems have been studied for decades, many challenges remain in current state-of-the-art algorithms. For example, many algorithms still lack accuracy and completeness when tested on large city-scale datasets, and most available MVS algorithms require a large amount of execution time and/or specialized hardware and software, which results in high cost. This dissertation addresses these challenges and proposes multiple solutions. More specifically, it proposes multiple novel MVS algorithms to automatically and accurately reconstruct the underlying 3D scenes. By using a novel volumetric voxel-based method, one of our algorithms achieves near real-time speed, does not require any special hardware or software, and can be deployed on power-constrained embedded systems. By developing a new camera clustering module and a novel weighted voting-based surface likelihood estimation module, our algorithm generalizes to different datasets and achieves the best performance in terms of accuracy and completeness when compared with existing algorithms. This dissertation also performs the very first quantitative evaluation in terms of precision, recall, and F-score using real-world LiDAR ground-truth data.
Last but not least, this dissertation proposes an automatic workflow that stitches multiple point cloud models with limited overlapping areas into one larger 3D model for better geographical coverage. All results presented in this dissertation have been evaluated on our wide area motion imagery (WAMI) dataset and improve on state-of-the-art performance by a large margin. The generated results have been successfully used in many applications, including city digitization, improving detection and tracking performance, real-time dynamic shadow detection, 3D change detection, visibility map generation, VR environments, and visualization combined with other information, such as building footprints and roads.
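
As an illustration of the precision, recall, and F-score evaluation mentioned in the abstract, the sketch below uses a definition commonly applied to point clouds: precision is the fraction of reconstructed points that lie within a distance threshold of some ground-truth point, recall is the fraction of ground-truth points that lie within that threshold of some reconstructed point, and the F-score is their harmonic mean. The abstract does not state the exact protocol used in the dissertation, so this definition, the function names, the threshold `tau`, and the toy point sets are all assumptions made for illustration only.

```python
import numpy as np


def nearest_distances(src, dst):
    """Distance from each point in src to its nearest neighbor in dst.

    Brute force, O(len(src) * len(dst)) memory; fine for a toy example only.
    """
    diff = src[:, None, :] - dst[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)


def precision_recall_fscore(reconstructed, ground_truth, tau):
    """Threshold-based point cloud metrics (assumed definition, see lead-in)."""
    # Precision: share of reconstructed points close to the ground truth.
    precision = float((nearest_distances(reconstructed, ground_truth) <= tau).mean())
    # Recall: share of ground-truth points covered by the reconstruction.
    recall = float((nearest_distances(ground_truth, reconstructed) <= tau).mean())
    denom = precision + recall
    fscore = 2.0 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, fscore


# Hypothetical toy data standing in for LiDAR ground truth and an MVS result.
ground_truth = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                         [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
reconstructed = np.array([[0.0, 0.0, 0.05], [1.0, 0.0, 0.05], [5.0, 0.0, 0.0]])

p, r, f = precision_recall_fscore(reconstructed, ground_truth, tau=0.1)
print(f"precision={p:.3f} recall={r:.3f} f-score={f:.3f}")
```

In this toy case two of the three reconstructed points fall within the threshold (precision 2/3) and two of the four ground-truth points are covered (recall 1/2). For city-scale point clouds, the brute-force nearest-neighbor search would need to be replaced with a spatial index such as a k-d tree.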

URI

https://hdl.handle.net/10355/93247
https://doi.org/10.32469/10355/93247

Degree

Ph. D.

Thesis Department

Computer science (MU)