MDRED: Multi-Modal Multi-Task Distributed Recognition for Event Detection
Understanding users’ context is essential in emerging mobile sensing applications such as Metal Detector, Glint Finder, and Facefirst. Over the last decade, Machine Learning (ML) techniques have evolved dramatically for real-world applications. In particular, Deep Learning (DL) has attracted tremendous attention for diverse applications, including speech recognition and computer vision. However, ML requires extensive computing resources, so ML applications are not well suited to devices with limited computing capabilities. Furthermore, customizing ML applications to a user’s context is not easy. These constraints present real challenges for mobile ML applications. We are motivated to address this problem by designing a distributed and collaborative computing framework for ML edge computing and applications. In this thesis, we propose the Multi-Modal Multi-Task Distributed Recognition for Event Detection (MDRED) framework for complex event recognition from images. The MDRED framework is based on a hybrid ML model composed of Deep Learning and Shallow Learning (SL). The lower level of the MDRED framework uses DL models for (1) object detection, (2) color recognition, (3) emotion recognition, (4) face detection, and (5) text detection on event images. The higher level uses SL-based fusion techniques for event detection based on the outcomes of the lower-level DL models. The fusion model is a weighted feature vector generated by a modified Term Frequency and Inverse Document Frequency (TF-IDF) algorithm that accounts for both common and unique multi-modal features recognized for event detection. A prototype of the MDRED framework has been implemented: a master-slave architecture coordinates the distributed computing among multiple mobile devices at the edge while connecting the edge devices to cloud ML servers.
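To illustrate the idea behind the higher-level fusion, the sketch below computes standard TF-IDF weights where each "document" is the set of multi-modal labels that the lower-level DL models recognize in one event image. This is a minimal illustration only: the function and label names are hypothetical, and the thesis's modified TF-IDF, which additionally distinguishes common from unique features, is not reproduced here.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Standard TF-IDF over label 'documents'.

    Each doc is the list of labels (objects, colors, emotions, faces,
    detected text) recognized in one event image. Returns one
    {label: weight} dict per image; a shallow learner would then
    classify the event from these weighted feature vectors.
    Illustrative sketch only, not the thesis's modified algorithm.
    """
    n = len(docs)
    df = Counter()                      # document frequency per label
    for doc in docs:
        df.update(set(doc))
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        vecs.append({t: (tf[t] / total) * math.log(n / df[t]) for t in tf})
    return vecs

# Hypothetical outputs of the lower-level DL models for three images.
docs = [
    ["ball", "grass", "player", "happy"],
    ["ball", "net", "player", "crowd"],
    ["cake", "candle", "happy", "text:birthday"],
]
vecs = tfidf_vectors(docs)
```

Labels shared across many images (e.g. "ball") receive low weights, while labels unique to one event type (e.g. "cake") receive high weights, which is the intuition the modified TF-IDF builds on.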
The MDRED model has been evaluated on benchmark event datasets and compared with state-of-the-art event detection models. MDRED achieved accuracies of 90.5%, 98.8%, and 78% on the SocEID, UIUC Sports, and RED Events datasets, respectively, outperforming the baseline models AlexNet-fc7, WEBLY-fc7, WIDER-fc7, and Event concepts. We also demonstrate the MDRED application running on Android devices for real-time event detection.
Table of Contents
Introduction -- Background and related work -- Proposed work -- Results and evaluation -- Conclusion and future work