
dc.contributor.advisor: Rao, Praveen R. (eng)
dc.contributor.author: Paturi, Srivenu (eng)
dc.date.issued: 2013 (eng)
dc.date.submitted: 2013 Summer (eng)
dc.description: Title from PDF of title page, viewed on October 21, 2013 (eng)
dc.description: Vita (eng)
dc.description: Thesis advisor: Praveen Rao (eng)
dc.description: Includes bibliographic references (pages 78-82) (eng)
dc.description: Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2013 (eng)
dc.description.abstract: The Resource Description Framework (RDF) has become a popular data model for representing data on the Web. Using RDF, any assertion can be represented as a (subject, predicate, object) triple. Essentially, RDF datasets can be viewed as directed, labeled graphs. Queries on RDF data are written using the SPARQL query language and contain basic graph patterns (BGPs). We present a new filtering index and query processing technique for processing large BGPs in SPARQL queries. Our approach called RIS treats RDF graphs as "first-class citizens." Unlike previous scalable approaches that store RDF data as triples in an RDBMS and process SPARQL queries by executing appropriate SQL queries, RIS aims to speed up query processing by reducing the processing cost of join operations. In RIS, RDF graphs are mapped into signatures, which are multisets. These signatures are grouped based on a similarity metric and indexed using Counting Bloom Filters. During query processing, the Counting Bloom Filters are checked to filter out non-matches, and finally the candidates are verified using Apache Jena. The filtering step prunes away a large portion of the dataset and results in faster processing of queries. We have conducted an in-depth performance evaluation using the Lehigh University Benchmark (LUBM) dataset and SPARQL queries containing large BGPs. We compared RIS with RDF-3X, which is a state-of-the-art scalable RDF querying engine that uses an RDBMS. RIS can significantly outperform RDF-3X in terms of total execution time for the tested dataset and queries. (eng)
dc.description.tableofcontents: Introduction -- Motivation and related work -- Background -- Bloom filters and Bloom counters -- System architecture -- Signature tree generation -- Querying the signature tree -- Evaluation -- Experiments -- Conclusion (eng)
dc.format.extent: xiv, 83 pages (eng)
dc.identifier.uri: http://hdl.handle.net/10355/39099 (eng)
dc.subject.lcsh: Web sites -- Indexing and abstracting (eng)
dc.subject.lcsh: Query languages (Computer science) (eng)
dc.subject.other: Thesis -- University of Missouri--Kansas City -- Computer science (eng)
dc.title: A new filtering index for fast processing of SPARQL queries (eng)
dc.type: Thesis (eng)
thesis.degree.discipline: Computer Science (UMKC) (eng)
thesis.degree.grantor: University of Missouri--Kansas City (eng)
thesis.degree.level: Masters (eng)
thesis.degree.name: M.S. (eng)
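
Illustrative note on the abstract: the filter-then-verify idea it describes (graphs mapped to multiset signatures, signatures indexed with Counting Bloom Filters, non-matches pruned before verification) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the thesis's implementation. The record does not say how RIS builds its signatures, so the `signature` function here (a multiset of predicate labels) is a hypothetical stand-in, and the class name, counter count, and hashing scheme are likewise assumptions. The similarity-based grouping and the Apache Jena verification step are omitted.

```python
import hashlib
from collections import Counter


class CountingBloomFilter:
    """Small counting Bloom filter: an array of integer counters."""

    def __init__(self, num_counters=1024, num_hashes=3):
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _positions(self, item):
        # Derive num_hashes counter positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_counters

    def add(self, item, count=1):
        # set() avoids incrementing the same counter twice for one item.
        for pos in set(self._positions(item)):
            self.counters[pos] += count

    def estimate(self, item):
        # Never underestimates the true multiplicity of an added item.
        return min(self.counters[pos] for pos in self._positions(item))

    def may_contain(self, sig):
        # True if every element of the multiset could be present often enough.
        return all(self.estimate(item) >= need for item, need in sig.items())


def signature(triples):
    # Hypothetical signature (an assumption): multiset of predicate labels.
    return Counter(predicate for _, predicate, _ in triples)


def build_filter(triples):
    cbf = CountingBloomFilter()
    for item, count in signature(triples).items():
        cbf.add(item, count)
    return cbf


# Toy data: two small RDF graphs and one basic graph pattern.
graphs = {
    "graph_a": [("s1", "advisor", "p1"),
                ("s1", "takesCourse", "c1"),
                ("s1", "takesCourse", "c2")],
    "graph_b": [("s2", "worksFor", "d1")],
}
query = [("?x", "takesCourse", "?c1"), ("?x", "takesCourse", "?c2")]

index = {name: build_filter(triples) for name, triples in graphs.items()}
query_sig = signature(query)

# Filtering step: keep only graphs whose filter cannot rule out a match.
candidates = [name for name, cbf in index.items() if cbf.may_contain(query_sig)]
print(candidates)
```

Because counters only overestimate, a filter of this kind can admit false positives but never rejects a graph that really contains the required elements, which is why the abstract describes a separate verification step on the surviving candidates.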

