Purdue University Graduate School

Efficient Query Processing Over Web-Scale RDF Data

thesis
posted on 2019-01-17, 14:37, authored by Amgad M. Madkour
The Semantic Web, or the Web of Data, promotes common data formats for representing structured data and their links over the web. RDF is the de facto standard for semantic data, providing a flexible, semi-structured model for describing concepts and relationships. RDF datasets consist of entries (i.e., triples) and range in size from thousands to billions of triples. The astronomical growth of RDF data calls for scalable RDF management and query processing strategies. This dissertation addresses efficient query processing over web-scale RDF data. The first contribution is WORQ, an online, workload-driven RDF query processing technique. Based on the query workload, reduced sets of intermediate results (or reductions, for short) that are common to specific join patterns are computed in an online fashion. We also introduce an efficient solution for RDF queries with unbound properties. The second contribution is SPARTI, a scalable technique for computing the reductions offline. SPARTI utilizes a partitioning schema, termed SemVP, that enables efficient management of the reductions. SPARTI uses a budgeting mechanism with a cost model to determine whether partitioning is worthwhile. The third contribution is KC, an efficient RDF data management system for the cloud. KC uses generalized filtering, which encompasses both exact and approximate set membership structures, to filter out irrelevant data. KC defines a set of common operations and introduces an efficient method for managing and constructing filters. The final contribution is semantic filtering, where data can be reduced based on the spatial, temporal, or ontological aspects of a query. We present a set of encoding techniques and demonstrate how to use semantic filters to reduce irrelevant data in a distributed setting.
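
To make the idea of a reduction more concrete, the following Python sketch shows one way a set membership structure could be used to drop triples that cannot take part in a join. It is an illustration only and is not taken from the dissertation: the `BloomFilter` class, the `vertical_partitions` and `reduce_by_subject` helpers, the property names, and the sample triples are all hypothetical, and a simple per-property grouping stands in for whatever partitioning the thesis actually uses.

```python
import hashlib
from collections import defaultdict


class BloomFilter:
    """Approximate set membership: no false negatives, occasional false positives.

    Hypothetical helper for illustration; not the structure used in the thesis.
    """

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def vertical_partitions(triples):
    """Group (subject, property, object) triples by property."""
    parts = defaultdict(list)
    for s, p, o in triples:
        parts[p].append((s, o))
    return parts


def reduce_by_subject(target_rows, other_rows):
    """Keep rows of target_rows whose subject may also be a subject in other_rows."""
    bf = BloomFilter()
    for s, _ in other_rows:
        bf.add(s)
    return [(s, o) for s, o in target_rows if s in bf]


# Made-up sample data (assumption, for illustration only).
triples = [
    ("alice", "worksAt", "purdue"),
    ("bob", "worksAt", "acme"),
    ("carol", "worksAt", "acme"),
    ("alice", "authorOf", "thesis1"),
    ("carol", "authorOf", "paper7"),
]
parts = vertical_partitions(triples)

# Approximate reduction: worksAt rows that may join with authorOf on subject.
reduced_works_at = reduce_by_subject(parts["worksAt"], parts["authorOf"])

# Exact reduction using a plain set as the exact membership structure.
author_subjects = {s for s, _ in parts["authorOf"]}
exact_works_at = [(s, o) for s, o in parts["worksAt"] if s in author_subjects]
```

Because a Bloom filter never reports a false negative, every row in `exact_works_at` also appears in `reduced_works_at`; the approximate version may keep a few extra rows, which a later exact join would discard. The trade-off is that the filter is compact and cheap to ship between nodes, which is the usual motivation for approximate membership structures in a distributed setting.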

Degree Type

  • Doctor of Philosophy

Department

  • Computer Science

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Walid G. Aref

Additional Committee Member 2

Sunil Prabhakar

Additional Committee Member 3

Tiark Rompf

Additional Committee Member 4

Sonia Fahmy
