Shared Context through Multi-Level Attention Transformers for Text Classification

Abstract

Natural language processing (NLP) has seen explosive growth in recent years, driven by the pursuit of artificial intelligence with human-level language understanding. Contextual understanding through attention mechanisms can be further improved by fine-tuning model composition for tasks such as classification, question answering, and topic modeling. Real-world datasets are far more complex and tend to require multi-fold models, which are larger, deeper, and more complicated; for example, BERT has 340 million parameters, Turing NLG has 17 billion, and GPT-3 has about 175 billion. Processing a text corpus with such models demands immense computational resources during both training and inference. This thesis proposes a novel deep learning architecture for scalable multi-fold text classification that extends BERT by sharing context across abstraction levels of a domain. Four types of deep learning models (BERT flat, BERT hierarchical, BERT hierarchical tuned, and BERT feature-extracted) are proposed as multi-level attention transformers on this architecture. The proposed models overcome competing limitations by training concurrently and providing predictions for an additional level of classes simultaneously. Our work also overcomes the limitations of knowledge distillation and transfer learning, which are neither scalable nor sustainable and are costly. We performed experiments to validate the reliability of the models using both benchmark and real-world data (KCMO 311 data). Quantitative results confirm that the proposed models reduce computational requirements while providing competitive accuracy.
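To illustrate the shared-context idea described above, the following minimal sketch (not taken from the thesis) shows a single BERT encoder feeding two classification heads, one per abstraction level, so that both levels of classes are predicted concurrently from shared context. The encoder name, class counts, and equal loss weighting are illustrative assumptions only.

# Minimal sketch, assuming PyTorch and the Hugging Face transformers library.
import torch.nn as nn
from transformers import AutoModel

class MultiLevelClassifier(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased",
                 num_coarse_classes=10, num_fine_classes=50):
        super().__init__()
        # One shared encoder provides context for both levels of labels.
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        self.coarse_head = nn.Linear(hidden, num_coarse_classes)
        self.fine_head = nn.Linear(hidden, num_fine_classes)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]  # [CLS] token representation
        return self.coarse_head(cls), self.fine_head(cls)

def multi_level_loss(coarse_logits, fine_logits, coarse_labels, fine_labels):
    # Joint objective: both levels are trained concurrently from the shared encoder.
    ce = nn.CrossEntropyLoss()
    return ce(coarse_logits, coarse_labels) + ce(fine_logits, fine_labels)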

Table of Contents

Introduction -- Related work -- Multi-Level Attention Transformers -- Evaluation and Results -- Conclusion and Future Work

Degree

M.S. (Master of Science)
