dc.contributor.advisor | Rao, Praveen R. | eng |
dc.contributor.author | Slavov, Vasil Georgiev | eng |
dc.date.issued | 2012-10-03 | eng |
dc.date.submitted | 2012 Summer | eng |
dc.description | Title from PDF of title page, viewed on October 3, 2012 | eng |
dc.description | Thesis advisor: Praveen R. Rao | eng |
dc.description | Vita | eng |
dc.description | Includes bibliographic references (p. 58-69) | eng |
dc.description | Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2012 | eng |
dc.description.abstract | After more than a decade of active research and development, the peer-to-peer
(P2P) computing model continues to be successful. We have witnessed the deployment
of commercial P2P applications in large, Internet-scale environments. With
the rise and growth of P2P, indexing and querying data stored in large-scale sharing
systems has become increasingly di cult. Computing statistics over data stored in
Internet-scale P2P systems is an important component of query optimization. Decentralized
gossip-based protocols are very popular in networking, and in particular,
in sensor networks. The simplicity and scalability of gossip protocols render them
perfect for quickly computing accurate estimates of aggregates (sums, averages, etc.)
in Internet-scale systems where node and link failures are the norm. In this thesis, we present the problem of cardinality estimation of XPath
queries over XML data stored in a distributed, Internet-scale environment. We focus
our work on three objectives: implementing gossip in an Internet-scale environment,
conducting a comprehensive performance evaluation in a wide-area network, and analyzing the experimental results. We implement two gossip-based algorithms (VanillaXGossip and XGossip)
which, given an XPath query, estimate the number of XML documents in the network
that contain a match for the query. XGossip employs a new, divide-and-conquer
strategy for load-balancing and reducing the bandwidth consumption. We conduct a
comprehensive performance evaluation of both gossip algorithms on Amazon Elastic
Compute Cloud (Amazon EC2) web service using a heterogeneous collection of XML
documents. The goal of the performance evaluation is to nd if the results we obtain
are consistent with the theoretical analysis of VanillaXGossip and XGossip. | eng |
dc.description.tableofcontents | Introduction -- Background and motivations -- The design of VanillaXGossip and XGossip -- Implementation of VanillaXGossip and XGossip -- Evaluation -- Conclusion and future work -- Appendix A. Algorithms -- Appendix B. XPath grammar | eng |
dc.format.extent | xi, 70 pages | eng |
dc.identifier.uri | http://hdl.handle.net/10355/15636 | eng |
dc.publisher | University of Missouri--Kansas City | eng |
dc.subject.lcsh | Peer-to-peer architecture (Computer networks) | eng |
dc.subject.lcsh | Querying (Computer science) | eng |
dc.subject.other | Thesis -- University of Missouri--Kansas City -- Computer science | eng |
dc.title | A study of gossip algorithms for internet-scale cardinality estimation of distributed XML data | eng |
dc.type | Thesis | eng |
thesis.degree.discipline | Computer Science (UMKC) | eng |
thesis.degree.grantor | University of Missouri--Kansas City | eng |
thesis.degree.level | Masters | eng |
thesis.degree.name | M.S. | eng |