In-memory distributed indexing for large-scale media data retrieval

Format

Thesis

Abstract

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] Multimedia data spans media types such as text, images, and video. Recent research has shown that media data retrieval plays a critical role in the development of multimedia applications. However, with the exponential growth of multimedia data, fast and efficient indexing is becoming harder than ever. In this thesis, we propose a novel approach that speeds up retrieval by adopting a distributed computing paradigm built on the Apache Spark framework. Using search trees within the Apache Spark ecosystem yields fast, cost-effective media database retrieval: indexing structures are cached in memory, and ranked results are aggregated in a way that lets users specify the importance of each search cue. We conducted computational experiments on large-scale biomedical images and protein 3D structures to demonstrate the effectiveness and scalability of our system, which maintains reasonably high accuracy.
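
The aggregation idea in the abstract (per-partition search followed by a merge of ranked results, with user-chosen weights per search cue) can be illustrated with a minimal sketch. This is not the thesis implementation: plain Python lists stand in for Spark partitions cached in memory, a linear scan stands in for the search trees, and the cue names (`shape`, `texture`), item identifiers, and function names are all hypothetical.

```python
import heapq
import math

def weighted_distance(query, feats, weights):
    """Sum of per-cue Euclidean distances, scaled by user-supplied weights."""
    return sum(w * math.dist(query[cue], feats[cue]) for cue, w in weights.items())

def search_partition(partition, query, weights, k):
    """Local top-k for one partition (a linear scan stands in for a search tree)."""
    scored = ((weighted_distance(query, feats, weights), item_id)
              for item_id, feats in partition)
    return heapq.nsmallest(k, scored)  # sorted ascending by (distance, id)

def search(partitions, query, weights, k):
    """Merge the sorted local top-k lists into a global top-k."""
    local_results = [search_partition(p, query, weights, k) for p in partitions]
    return heapq.nsmallest(k, heapq.merge(*local_results))

# Hypothetical toy data: two partitions, two cues per item.
partitions = [
    [("img-a", {"shape": (0.0, 0.0), "texture": (1.0, 1.0)}),
     ("img-b", {"shape": (3.0, 4.0), "texture": (1.0, 1.0)})],
    [("img-c", {"shape": (0.0, 0.0), "texture": (4.0, 5.0)}),
     ("img-d", {"shape": (1.0, 0.0), "texture": (1.0, 1.0)})],
]
query = {"shape": (0.0, 0.0), "texture": (1.0, 1.0)}

shape_only = search(partitions, query, {"shape": 1.0, "texture": 0.0}, k=2)
balanced = search(partitions, query, {"shape": 0.5, "texture": 0.5}, k=2)
```

Changing the weights changes the ranking: with only `shape` weighted, items that match the query's shape rank first regardless of texture, while the balanced weighting penalizes a texture mismatch. Because each partition returns only its own top `k`, the merge step handles at most `k` candidates per partition.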

Degree

M.S.

Rights

Access is limited to the campuses of the University of Missouri.