
dc.contributor.advisor: Zeng, Wenjun, 1967- (eng)
dc.contributor.author: Shah, Abhishek (eng)
dc.date.issued: 2015 (eng)
dc.date.submitted: 2015 Fall (eng)
dc.description.abstract: Twitter handles 500 million tweets per day and has 316 million monthly active users. The majority of tweets are written in natural language, which makes it difficult to understand Twitter's data programmatically. In our research, we attempt to solve this challenge using various machine learning techniques. This thesis presents a new approach to classifying Twitter trends that adds a layer of feature selection and feature ranking. A variety of feature ranking algorithms, such as TF-IDF and bag-of-words, are used to facilitate the feature selection process. This helps surface the important features while reducing the feature space and making the classification process more efficient. Four Naïve Bayes text classifiers (one for each class), backed by these feature ranking and feature selection techniques, are used to categorize Twitter trends. Using the bag-of-words and TF-IDF rankings, our research provides an average class precision improvement over the current methodologies of 33.14% and 28.67%, respectively. (eng)
dc.identifier.uri: https://hdl.handle.net/10355/48617
dc.language: English (eng)
dc.publisher: University of Missouri--Columbia (eng)
dc.relation.ispartofcommunity: University of Missouri--Columbia. Graduate School. Theses and Dissertations (eng)
dc.source: Submitted to MOspace by University of Missouri--Columbia Graduate Studies. (eng)
dc.title: Classification of Twitter trends using feature ranking and forward feature selection (eng)
dc.type: Thesis (eng)
thesis.degree.discipline: Computer science (MU) (eng)
thesis.degree.grantor: University of Missouri--Columbia (eng)
thesis.degree.level: Masters (eng)
thesis.degree.name: M.S. (eng)
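
The pipeline summarized in the abstract (rank features, keep the strongest ones, then train one Naïve Bayes classifier per class) can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy documents, the class names `sports` and `politics`, the helper names `tfidf_rank`, `BinaryNB`, `train`, and `predict`, and the plain top-k cutoff (used here in place of forward feature selection) are all assumptions made for illustration. The thesis uses four classes; the sketch uses two to stay short.

```python
import math
from collections import Counter


def tfidf_rank(docs):
    """Return vocabulary terms ordered by summed TF-IDF, highest first."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            scores[term] += (count / len(doc)) * math.log(n / df[term])
    # Ties are broken alphabetically so the ranking is deterministic.
    return [t for t, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


class BinaryNB:
    """Multinomial Naive Bayes for 'in class' vs 'not in class' (Laplace smoothing)."""

    def __init__(self, vocab):
        self.vocab = set(vocab)

    def fit(self, docs, labels):
        self.counts = {True: Counter(), False: Counter()}
        self.ndocs = Counter(labels)
        for doc, y in zip(docs, labels):
            # Only selected features contribute to the term counts.
            self.counts[y].update(t for t in doc if t in self.vocab)
        return self

    def log_odds(self, doc):
        """Log-probability of 'in class' minus log-probability of 'not in class'."""
        total_docs = sum(self.ndocs.values())
        score = {}
        for y in (True, False):
            denom = sum(self.counts[y].values()) + len(self.vocab)
            lp = math.log((self.ndocs[y] + 1) / (total_docs + 2))
            for t in doc:
                if t in self.vocab:
                    lp += math.log((self.counts[y][t] + 1) / denom)
            score[y] = lp
        return score[True] - score[False]


def train(docs, labels, k):
    """Keep the k top-ranked terms, then fit one binary classifier per class."""
    vocab = tfidf_rank(docs)[:k]
    return {c: BinaryNB(vocab).fit(docs, [l == c for l in labels])
            for c in sorted(set(labels))}


def predict(models, doc):
    """Pick the class whose classifier reports the highest log-odds."""
    return max(models, key=lambda c: models[c].log_odds(doc))


# Hypothetical toy data: tokenized "trend" texts with made-up class labels.
docs = [
    ["the", "team", "won", "the", "match"],
    ["the", "match", "goal", "team"],
    ["the", "senate", "vote", "election"],
    ["the", "election", "poll", "vote"],
]
labels = ["sports", "sports", "politics", "politics"]

# Nine distinct terms; keeping eight drops the lowest-ranked one.
models = train(docs, labels, k=8)
```

In this toy corpus the word "the" occurs in every document, so its inverse document frequency is zero and it ranks last; cutting the vocabulary to eight terms removes it before any classifier is trained. That is the effect the abstract attributes to ranking: uninformative features leave the feature space and the classifiers work with fewer, more discriminative terms.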

