In-memory distributed indexing for large-scale media data retrieval

Authors

Ma, Yinmiao

Date

2017

Format

Thesis

Abstract

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] Multimedia data includes various media types such as text, images, and video. Recent research has shown that media data retrieval plays a critical role in the development of multimedia applications. However, with the exponential growth of multimedia data, fast and efficient indexing is becoming more difficult than ever. In this thesis, we propose a novel approach that speeds up the retrieval process by adopting a distributed computing paradigm built on the Apache Spark framework. Using search trees in the Apache Spark ecosystem yields fast and cost-effective media database retrieval: indexing structures are cached in memory, and ranked results are aggregated in a way that lets users specify the importance of each search cue. We conducted computational experiments on large-scale biomedical images and protein 3D structures to demonstrate the effectiveness and scalability of our system, which maintains reasonably high accuracy.
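
The abstract mentions aggregating ranked results while letting users weight individual search cues. The full text is access restricted, so the following is only a minimal illustrative sketch of one common way to do this (a normalized weighted sum of per-cue scores), not the thesis's actual method. The function name, the cue names ("shape", "texture"), the item identifiers, and the scores are all hypothetical, and plain Python stands in for a Spark job.

```python
from collections import defaultdict


def aggregate_ranked_results(results_by_cue, cue_weights, top_k=3):
    """Merge per-cue (item_id, score) lists into one ranking.

    Hypothetical helper: each cue's scores are scaled by the user's
    weight for that cue (normalized to sum to 1) and then summed.
    """
    total_weight = sum(cue_weights.get(cue, 0.0) for cue in results_by_cue)
    if total_weight <= 0:
        raise ValueError("at least one cue needs a positive weight")

    combined = defaultdict(float)
    for cue, hits in results_by_cue.items():
        weight = cue_weights.get(cue, 0.0) / total_weight
        for item_id, score in hits:
            combined[item_id] += weight * score

    # Highest combined score first; ties are broken by item id.
    ordered = sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:top_k]


# Made-up example data: two cues, each with its own ranked hits.
results_by_cue = {
    "shape": [("img_a", 0.9), ("img_b", 0.6)],
    "texture": [("img_b", 0.8), ("img_c", 0.7)],
}
cue_weights = {"shape": 0.75, "texture": 0.25}

ranking = aggregate_ranked_results(results_by_cue, cue_weights)
for item_id, score in ranking:
    print(item_id, round(score, 3))
```

Raising the weight of one cue shifts the combined ranking toward items that score well on that cue. In a distributed setting, the per-cue result lists would presumably come from index partitions cached in memory, but that part is not modeled here.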

URI

https://hdl.handle.net/10355/63362

Degree

M.S.

Thesis Department

Computer science (MU)

Rights

Access is limited to the campuses of the University of Missouri.

Collections

2017 MU theses - Access restricted to UM
Computer Science electronic theses and dissertations (MU)
