RIQ: Fast Processing of SPARQL Queries on RDF Quadruples

21 Pages. Posted: 18 Jul 2018. Publication Status: Accepted.

Anas Katib

University of Missouri at Kansas City - School of Computing and Engineering

Vasil Slavov

University of Missouri at Kansas City - School of Computing and Engineering

Praveen Rao

University of Missouri at Kansas City - School of Computing and Engineering

Abstract

In this paper, we address the problem of fast processing of SPARQL queries on a large RDF dataset, where the RDF statements are quadruples (or quads). Quads can capture provenance or other relevant information about facts. This is especially powerful in modeling knowledge graphs, which are becoming increasingly important on the Web to provide high quality search results to users. We propose a new approach called RIQ that employs a decrease-and-conquer strategy for fast SPARQL query processing. Rather than indexing the entire RDF dataset, RIQ identifies groups of similar RDF graphs and creates indexes on each group separately. It employs a new vector representation for RDF graphs and locality sensitive hashing to construct the groups efficiently. It constructs a novel filtering index on the groups and compactly represents the index as a combination of Bloom and Counting Bloom Filters. During query processing, RIQ employs a streamlined approach. It constructs a query plan for a SPARQL query (containing one or more graph patterns), searches the filtering index to quickly identify candidate groups that may contain matches for the query, and rewrites the original query to produce an optimized query for each candidate. The optimized queries are then executed using an existing SPARQL processor that supports quads to produce the final results. We conducted a comprehensive evaluation of RIQ using a real dataset and a synthetic dataset, each containing about 1.4 billion quads. Our results show that RIQ can outperform its competitors designed to support named graph queries on RDF quads (e.g., Jena TDB and Virtuoso) for a variety of queries.
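
The filtering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: RIQ builds its index from a vector representation of RDF graphs and combines Bloom and Counting Bloom Filters, whereas the sketch below assumes a much simpler scheme in which each group gets one plain Bloom filter over the terms of its quads. The names `BloomFilter`, `build_group_filters`, and `candidate_groups`, as well as the filter sizes, are hypothetical and chosen only for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: num_hashes bit positions over a num_bits bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as the bit array

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def build_group_filters(groups):
    """Map each group id to a Bloom filter over the terms of its quads.

    `groups` is assumed to be {group_id: [(subject, predicate, object, graph), ...]}.
    """
    filters = {}
    for group_id, quads in groups.items():
        bloom = BloomFilter()
        for quad in quads:
            for term in quad:
                bloom.add(term)
        filters[group_id] = bloom
    return filters


def candidate_groups(filters, query_constants):
    """Return ids of groups that may contain every constant term of the query."""
    return [
        group_id
        for group_id, bloom in filters.items()
        if all(bloom.might_contain(term) for term in query_constants)
    ]


if __name__ == "__main__":
    example_groups = {
        "g1": [("ex:alice", "ex:knows", "ex:bob", "ex:graph1")],
        "g2": [("ex:carol", "ex:worksAt", "ex:acme", "ex:graph2")],
    }
    filters = build_group_filters(example_groups)
    print(candidate_groups(filters, ["ex:alice", "ex:knows"]))
```

Because a Bloom filter has no false negatives, a group that really contains all of the query's constants is always returned. False positives are possible, so each candidate would still need to be checked by running the query on that group with a SPARQL processor, as the abstract describes.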

Keywords: RDF, Quadruples, SPARQL, Query processing, Knowledge graphs

Suggested Citation

Katib, Anas and Slavov, Vasil and Rao, Praveen, RIQ: Fast Processing of SPARQL Queries on RDF Quadruples (2016). Available at SSRN: https://ssrn.com/abstract=3199230 or http://dx.doi.org/10.2139/ssrn.3199230

Anas Katib

University of Missouri at Kansas City - School of Computing and Engineering

546 Flarsheim Hall
Kansas City, MO 64110
United States

Vasil Slavov

University of Missouri at Kansas City - School of Computing and Engineering

546 Flarsheim Hall
Kansas City, MO 64110
United States

Praveen Rao (Contact Author)

University of Missouri at Kansas City - School of Computing and Engineering

546 Flarsheim Hall
Kansas City, MO 64110
United States
