Video summarization using mosaicing and activity maps for aerial and biomedical imagery
Format
Thesis
Abstract
The objective of this research is to develop a methodology for summarizing the content of long videos using a few geospatial coverage maps or mini-mosaics, for both aerial and biomedical imagery. Automatic mosaic and coverage map generation is a challenging computer vision problem, especially in scenes with significant structural change, motion-induced parallax, sudden viewpoint changes, self-occlusion by the camera, and corrupted frame transmission. In the traditional approach, mosaic generation can be very time consuming, since its cost is quadratic in the number of images in a sequence. Moreover, frame-to-frame matching suffers from error accumulation, or drift. In contrast, we develop a linear-time approach that assembles frames into groups and matches each frame directly to the first frame of its group (defined as the reference). The reference of each group is then mapped to the reference of the previous group to establish global alignment. This shortens the long chain of homography matrix multiplications across a video sequence, reducing error accumulation (drift) in aerial imagery mosaicing. The approach is purely image based and uses no additional metadata.

In medical imaging, it is crucial to capture certain structures, or even cellular features, at microscopic resolution. Unfortunately, the field of view is inherently limited by the capabilities of the capturing instrument. Mosaicing of such micro-structures is therefore essential to restore the original visual information and establish broad structural morphology. Mosaicing becomes challenging, however, when the sequence contains deformable, motion-blurred, texture-less, or feature-poor frames. Discrete feature-based methods perform poorly in such cases for lack of distinctive keypoints, and standard single-block correlation matching strategies may not provide robust registration when the content is deformable.
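The group-based alignment described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name, the input homography lists, and the fixed group size are assumptions, and the (quadratic-cost) homography estimation itself is taken as given. The point it shows is that each frame reaches the global coordinate system in two hops — frame to group reference, then a short chain of reference-to-reference transforms — so the multiplication chain grows with the number of groups, not the number of frames.

```python
import numpy as np

def global_homographies(frame_to_ref, ref_to_prev_ref, group_size):
    """Compose per-frame homographies into one global coordinate frame.

    frame_to_ref[i]   : 3x3 homography mapping frame i to its group reference
    ref_to_prev_ref[g]: 3x3 homography mapping group g's reference to group
                        g-1's reference (identity for g = 0)
    group_size        : number of frames per group (illustrative assumption)
    """
    # Cumulative reference-to-global transforms: one multiplication per
    # group, so the chain stays short regardless of sequence length.
    ref_to_global = [np.eye(3)]
    for H in ref_to_prev_ref[1:]:
        ref_to_global.append(ref_to_global[-1] @ H)
    # Each frame needs only two hops: frame -> group reference -> global.
    out = []
    for i, H in enumerate(frame_to_ref):
        g = i // group_size
        out.append(ref_to_global[g] @ H)
    return out
```

Because every frame composes at most one short reference chain with one frame-level homography, numerical error no longer accumulates over the full frame-to-frame chain.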
In addition, the panorama suffers if motion blur is present in the sequence. To handle these challenges, we propose a novel algorithm, Deformable Normalized Cross-Correlation (DNCC) image matching with Random Sample Consensus (RANSAC), to establish robust registration and mosaicing of frog mesentery sequences. To produce seamless panoramas from motion-blurred frames, we present a novel technique, gradient blending, based on image gradient information. We also implemented a pipeline for burnt-in label identification (deep learning based), segmentation, and interpolation for mosaicing of burnt-in videos. Once mosaicing is done, the next step is to extend it to activity recognition, the process of understanding and interpreting what is happening in a video stream. For example, activity maps, pattern-of-life maps, social maps, and other types of geospatial data can be overlaid on the mosaics generated in the previous step. Applications of recognition include improved target discrimination, identification, event recognition, pattern-of-life detection, and tracking. While mosaicing summarizes the static content of a sequence, activity maps summarize its dynamic events. We generate a spatio-temporal volume for mesentery images to model the dynamics of the full vessel network at once, at high resolution. Here we propose a new technique, Deformable Spatio-Temporal Interpolation (DSTI), for visualizing the flow in the entire network by fusing all of the spot videos captured with the scan, stop, and image technique, which images a large field of view at high resolution using a small-field-of-view lens. The synthesized Living Mosaic, or video mosaic, thus provides dynamic information on the full vessel network for modeling and analysis.
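The single-block correlation matching that DNCC extends can be illustrated with a plain normalized cross-correlation (NCC) block search. This is a hedged sketch under simplifying assumptions — integer shifts, an exhaustive search window, and made-up function names — not the thesis code; the deformable variant would score many blocks independently and keep only the shifts that a RANSAC model fit deems consistent.

```python
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation between two same-sized blocks."""
    a = patch - patch.mean()
    b = template - template.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_shift(prev, curr, block, search):
    """Find the integer shift of one block from `prev` inside `curr`.

    block  : (y0, x0, h, w) of the template block in `prev`
    search : exhaustive search radius, i.e. shifts in [-search, search]^2
    Returns the best (dy, dx) and its NCC score.
    """
    y0, x0, h, w = block
    template = prev[y0:y0 + h, x0:x0 + w]
    best, best_dyx = -1.0, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = y0 + dy, x0 + dx
            # Skip shifts that would fall outside the current frame.
            if y < 0 or x < 0 or y + h > curr.shape[0] or x + w > curr.shape[1]:
                continue
            score = ncc(curr[y:y + h, x:x + w], template)
            if score > best:
                best, best_dyx = score, (dy, dx)
    return best_dyx, best
```

Because NCC normalizes out local brightness and contrast, per-block matches of this kind stay usable on low-texture tissue where distinctive keypoints are scarce, which is why a correlation-based scheme is preferred over discrete feature matching here.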
Degree
Ph. D.
