IntelliEdgent: device-server collaborative deep learning model composition for resource-efficient edge intelligence

Format

Thesis

Abstract

Deep learning models have lately achieved tremendous success in analysing high-dimensional data such as images, text, and audio. Despite their phenomenal predictive performance, the high computation demands associated with these models (e.g., memory requirements on the order of hundreds of megabytes for a single inference on a typical deep learning model with millions of parameters) hinder their widespread deployment, particularly on resource-constrained devices. Hence, user devices cannot execute model inference and maintenance in a standalone manner. The straightforward, or Baseline, solution to this issue is to deploy the models at a server: the devices communicate with the server to run the models on user inputs and to update the models based on user feedback. However, the latency involved in this Baseline setup is typically too high, on the order of hundreds of milliseconds per round of communication. Since neither the device-only nor the server-only setup is efficient enough, we propose a collaborative model composition approach for deep learning execution, based on edge computing, called IntelliEdgent, which intelligently splits the computation workload across the device and the server so that the resulting execution latency is minimized. First, we study the problem of detecting Out-Of-Distribution (OOD) samples on the device using a shallow detector, so that for such samples the server is not invoked for classification (since the results would be wrong anyway) in our device-server collaborative IntelliEdgent setup. Next, we study the problem of deploying multitasking models in our IntelliEdgent setup. We call this approach Chimera; it can dynamically generate optimal multitasking deployments in response to dynamically changing deployment configurations.
Thirdly, we study a device-server collaborative Vision Transformer (ViT) classifier called FactionFormer, based on the idea of a dynamically changing, narrower deployment context compared to the off-the-shelf pretrained classifier. Besides these three works, we have two further works in the pipeline, in which we study problems such as personalized recommender model generation under our proposed IntelliEdgent framework.
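The OOD-gated device-server flow described above can be sketched as follows. This is a minimal illustration only: the `shallow_ood_score` heuristic, the threshold value, and the `classify_on_server` placeholder are hypothetical stand-ins, not the detectors or models proposed in the thesis.

```python
# Illustrative sketch of OOD-gated device-server collaborative inference.
# A shallow on-device detector screens each input; only in-distribution
# samples are forwarded to the server-hosted classifier, saving a costly
# round trip for samples whose predictions would be unreliable anyway.

def shallow_ood_score(sample):
    # Hypothetical stand-in for a shallow OOD detector: a simple
    # feature-norm heuristic (large norms treated as out-of-distribution).
    return sum(x * x for x in sample) ** 0.5

def classify_on_server(sample):
    # Placeholder for the device-to-server round trip; here a toy rule
    # substitutes for the server-side deep model.
    return "cat" if sample[0] > 0 else "dog"

def collaborative_infer(sample, ood_threshold=10.0):
    # Gate on the device: skip the server call for OOD inputs.
    if shallow_ood_score(sample) > ood_threshold:
        return None  # flagged as OOD on device, no server invocation
    return classify_on_server(sample)
```

For an in-distribution input such as `[0.5, -0.2]`, the gate passes and the server returns a label; for a far-out input such as `[50.0, 80.0]`, the device short-circuits and no server call is made.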

Table of Contents

Introduction -- Background -- Earlin: early out-of-distribution detection for resource-efficient collaborative inference -- Chimera: context-aware splittable deep multitasking models for edge intelligence -- FactionFormer: context-driven collaborative vision transformer models for edge intelligence -- Context-driven device-edge collaborative vision transformer models for edge AI -- Percolator: device-cloud collaborative model composition for personalized recommendation -- Conclusion and future directions

Degree

Ph.D. (Doctor of Philosophy)
