MDRED: Multi-Modal Multi-Task Distributed Recognition for Event Detection
Understanding users’ context is essential in emerging mobile sensing applications such as Metal Detector, Glint Finder, and Facefirst. Over the last decade, Machine Learning (ML) techniques have evolved dramatically for real-world applications. In particular, Deep Learning (DL) has attracted tremendous attention for diverse applications, including speech recognition and computer vision. However, ML requires extensive computing resources, so ML applications are not well suited to devices with limited computing capabilities. Furthermore, customizing ML applications to a user’s context is not easy. These constraints present real challenges for mobile ML applications. We are motivated to address this problem by designing a distributed and collaborative computing framework for ML edge computing and applications. In this thesis, we propose the Multi-Modal Multi-Task Distributed Recognition for Event Detection (MDRED) framework for complex event recognition from images. The MDRED framework is based on a hybrid ML model composed of Deep Learning and Shallow Learning (SL). The lower level of the MDRED framework uses DL models for (1) object detection, (2) color recognition, (3) emotion recognition, (4) face detection, and (5) text detection on event images. The higher level uses SL-based fusion techniques for event detection based on the outcomes of the lower-level DL models. The fusion model is a weighted feature vector generated by a modified Term Frequency and Inverse Document Frequency (TF-IDF) algorithm that accounts for both common and unique multi-modal features recognized for event detection. A prototype of the MDRED framework has been implemented: a master-slave architecture coordinates the distributed computing among multiple mobile devices at the edge while connecting the edge devices to cloud ML servers.
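To illustrate the idea behind the higher-level fusion, the sketch below computes standard TF-IDF weights where each "document" is the set of multi-modal labels that the lower-level DL models recognize in one event image. This is a minimal illustration only: the function and label names are hypothetical, and the thesis's modified TF-IDF, which additionally distinguishes common from unique features, is not reproduced here.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Standard TF-IDF over label 'documents'.

    Each doc is the list of labels (objects, colors, emotions, faces,
    detected text) recognized in one event image. Returns one
    {label: weight} dict per image; a shallow learner would then
    classify the event from these weighted feature vectors.
    Illustrative sketch only, not the thesis's modified algorithm.
    """
    n = len(docs)
    df = Counter()                      # document frequency per label
    for doc in docs:
        df.update(set(doc))
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        vecs.append({t: (tf[t] / total) * math.log(n / df[t]) for t in tf})
    return vecs

# Hypothetical outputs of the lower-level DL models for three images.
docs = [
    ["ball", "grass", "player", "happy"],
    ["ball", "net", "player", "crowd"],
    ["cake", "candle", "happy", "text:birthday"],
]
vecs = tfidf_vectors(docs)
```

Labels shared across many images (e.g. "ball") receive low weights, while labels unique to one event type (e.g. "cake") receive high weights, which is the intuition the modified TF-IDF builds on.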
The MDRED model has been evaluated on benchmark event datasets and compared with state-of-the-art event detection models. MDRED achieved accuracies of 90.5%, 98.8%, and 78% on the SocEID, UIUC Sports, and RED Events datasets, respectively, outperforming the baseline models AlexNet-fc7, WEBLY-fc7, WIDER-fc7, and Event concepts. We also demonstrate the MDRED application running on Android devices for real-time event detection.
Table of Contents
Introduction -- Background and related work -- Proposed work -- Results and evaluation -- Conclusion and future work