On cross-domain social semantic learning

Deb Roy, Suman

URI

https://hdl.handle.net/10355/42943
https://doi.org/10.32469/10355/42943

dc.contributor.advisor	Zeng, Wenjun, 1967-	eng
dc.contributor.author	Deb Roy, Suman	eng
dc.date.issued	2013	eng
dc.date.submitted	2013 Fall	eng
dc.description.abstract	Approximately 2.4 billion people are now connected to the Internet, generating massive amounts of data through laptops, mobile phones, sensors and other electronic devices or gadgets. Not surprisingly, then, ninety percent of the world's digital data was created in the last two years. This massive explosion of data provides a tremendous opportunity to study, model and improve the conceptual and physical systems from which the data is produced. It also permits scientists to test pre-existing hypotheses in various fields with large-scale experimental evidence. Thus, developing computational algorithms that automatically explore this data is the holy grail of the current generation of computer scientists. Making sense of this data algorithmically can be a complex process, for several reasons. Firstly, the data is generated by different devices, captures different aspects of information, and resides in different web resources/platforms on the Internet. Therefore, even if two pieces of data bear singular conceptual similarity, their generation, format and domain of existence on the web can make them seem considerably dissimilar. Secondly, since humans are social creatures, the data often possesses inherent but murky correlations, primarily caused by the causal nature of direct or indirect social interactions. This drastically alters what algorithms must now achieve, necessitating intelligent comprehension of the underlying social nature and semantic contexts within the disparate domain data, and a quantifiable way of transferring knowledge gained from one domain to another. Finally, the data is often encountered as a stream and not as static pages on the Internet. Therefore, we must learn, and re-learn, as the stream propagates. The main objective of this dissertation is to develop learning algorithms that can identify specific patterns in one domain of data which can consequently augment predictive performance in another domain.
The research explores the existence of specific data domains which can function in synergy with one another and, more importantly, proposes models to quantify the synergetic information transfer among such domains. We include large-scale data from various domains in our study: social media data from Twitter, multimedia video data from YouTube, video search query data from Bing Videos, natural language search queries from the web, Internet resources in the form of web logs (blogs), and spatio-temporal social trends from Twitter. Our work presents a series of solutions to address the key challenges in cross-domain learning, particularly in the field of social and semantic data. We propose the concept of bridging media from disparate sources by building a common latent topic space, which represents one of the first attempts toward answering sociological problems using cross-domain (social) media. This allows information transfer between social and non-social domains, fostering real-time socially relevant applications. We also engineer a concept network from the semantic web, called semNet, that can assist in identifying concept relations and modeling information granularity for robust natural language search. Further, by studying spatio-temporal patterns in this data, we can discover categorical concepts that stimulate collective attention within user groups.	eng
dc.description.bibref	Includes bibliographical references (pages 210-214).	eng
dc.format.extent	1 online resource (215 pages) : illustrations (chiefly color)	eng
dc.identifier.oclc	895002983	eng
dc.identifier.uri	https://hdl.handle.net/10355/42943
dc.identifier.uri	https://doi.org/10.32469/10355/42943	eng
dc.language	English	eng
dc.publisher	University of Missouri--Columbia	eng
dc.relation.ispartofcommunity	University of Missouri--Columbia. Graduate School. Theses and Dissertations	eng
dc.rights	OpenAccess.	eng
dc.rights.license	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License.
dc.source	Submitted by the University of Missouri--Columbia Graduate School.	eng
dc.subject	Learning algorithms	eng
dc.subject	Domains of data	eng
dc.subject	Synergetic information transfer	eng
dc.subject	Intelligent learning	eng
dc.subject.lcsh	Big data	eng
dc.subject.lcsh	Data mining	eng
dc.subject.lcsh	Semantic Web	eng
dc.subject.lcsh	Social media	eng
dc.subject.lcsh	Learning	eng
dc.title	On cross-domain social semantic learning	eng
dc.type	Thesis	eng
thesis.degree.discipline	Computer science (MU)	eng
thesis.degree.grantor	University of Missouri--Columbia	eng
thesis.degree.level	Doctoral	eng
thesis.degree.name	Ph. D.	eng

Files in this item

Name: research.pdf
Size: 6.865Mb
Format: PDF
Description: Full dissertation

Name: public.pdf
Size: 7.856Kb
Format: PDF
Description: Brief abstract

Name: short.pdf
Size: 10.88Kb
Format: PDF
Description: Brief abstract

This item appears in the following Collection(s)

2013 MU dissertations - Freely available online
These dissertations are accessible by the general public.
Computer Science electronic theses and dissertations (MU)
The electronic theses and dissertations of the Department of Computer Science.
