Conversation understanding and realistic artificial crash data generation with deep learning

Format

Thesis

Abstract

This dissertation addresses conversation understanding and realistic crash data generation with deep learning. Conversation understanding covers the prediction of conversational outcome, formality, and politeness.

For conversation outcome prediction, we use recorded audio calls, collected from a partnering Fortune 500 firm, that capture conversations between inside salespeople and business customers. To analyze communication effectiveness, we transcribe the audio files and segment each conversation into customer and salesperson speaker turns, which allows audio features and text embeddings to be extracted for each turn. We propose that a multimodal transformer network (MTN) can capture the importance of different speaker turns and can be used to predict the outcome of a call from both audio and text features. The proposed model outperforms current state-of-the-art results, and the results reveal that text features offer better outcome prediction than audio features.

Formality and politeness analysis provides a better understanding of conversations. We propose Formality-BERT, a BERT-based model for sentence formality prediction. Formality-BERT is trained on a public dataset that contains four genres of text and outperforms existing models by 14 points on the Spearman correlation between predicted and human-labeled formality across the four genres. We also propose Politeness-BERT for sentence politeness prediction. Politeness-BERT is trained on the Stanford Politeness Corpus and achieves human-level results. ChatGPT, evaluated on politeness prediction with zero-shot learning, performs 6 percent worse than Politeness-BERT. Beyond sentence-level formality and politeness, we propose a novel approach for predicting the formality and politeness of voice data. A deep neural network is trained on both audio features and text embeddings extracted from the pre-trained Formality-BERT or Politeness-BERT, and it achieved the best results.

Understanding the relationship between roadway characteristics and traffic crashes can help researchers better predict crashes and prevent some of them from happening. We propose a transformer-based tabular generative adversarial network (TTGAN) for generating realistic artificial crash data. We compare the synthetic crash data generated by TTGAN with the real crash data on the distribution of single variables, the pairwise correlation between variables, and the crash prediction performance of various machine learning models. The results show that TTGAN can capture the distribution, correlation, and cause-effect relationships in real crash data.
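
The first two evaluation criteria named in the abstract, single-variable distribution and pairwise correlation, can be illustrated with a minimal sketch. This is not the dissertation's evaluation code. It assumes numeric columns, compares column means as a crude stand-in for a full distribution comparison, and uses Pearson correlation, since the abstract does not say which correlation measure is used. The function names (`pearson`, `columns`, `mean_gaps`, `max_correlation_gap`) and the toy records are hypothetical and invented for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise this divides by zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def columns(rows):
    """Transpose a list of records into a list of columns."""
    return [list(col) for col in zip(*rows)]

def mean_gaps(real_rows, synth_rows):
    """Absolute difference in column means (a crude single-variable check)."""
    return [abs(sum(r) / len(r) - sum(s) / len(s))
            for r, s in zip(columns(real_rows), columns(synth_rows))]

def max_correlation_gap(real_rows, synth_rows):
    """Largest absolute difference between matching pairwise correlations."""
    real, synth = columns(real_rows), columns(synth_rows)
    gaps = [abs(pearson(real[i], real[j]) - pearson(synth[i], synth[j]))
            for i in range(len(real)) for j in range(i + 1, len(real))]
    return max(gaps)

# Toy records invented for illustration: (traffic volume, lane count, crashes).
real = [(10, 2, 1), (20, 2, 2), (30, 4, 4), (40, 4, 5)]
synthetic = [(12, 2, 1), (19, 2, 2), (31, 4, 3), (41, 4, 5)]

print(mean_gaps(real, synthetic))            # per-column mean differences
print(max_correlation_gap(real, synthetic))  # worst pairwise correlation mismatch
```

Smaller values on both checks suggest the synthetic table is closer to the real one. The third criterion in the abstract, crash prediction performance of machine learning models, would additionally require training models on each table and comparing their scores, which this sketch does not attempt.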

Degree

Ph. D.
