14.6. RDF Inference in Virtuoso

14.6.1. Introduction

Virtuoso SPARQL can use an inference context for inferring triples that are not physically stored. Such an inference context can be built from one or more graphs containing RDF Schema triples. The supported RDF Schema or OWL constraints are imported from these graphs and are grouped together into rule bases. A rule base is a persistent entity that can be referenced by a SPARQL query or endpoint. Queries running with a given rule base work as if the triples asserted by this rule base were included in the graph or graphs accessed by the query.

As of version 5.0, Virtuoso recognizes rdfs:subClassOf and rdfs:subPropertyOf. Other RDF Schema or OWL information is not taken into account.


14.6.2. Making Rule Sets

Since RDF Schema and OWL schemas are RDF graphs, these can be loaded into the triple store. Thus, in order to use such a schema as query context, one first loads the corresponding document into the triple store using ttlp or rdf_load_rdfxml or related functions. After the schema document is loaded, one can add the assertions therein into an inference context with the rdfs_rule_set function. This function specifies a logical name for the rule set plus a graph URI. It is possible to combine multiple schema graphs into a single rule set. A single schema graph may also independently participate in multiple rule sets.

rdfs_rule_set (in name varchar, in uri varchar, in remove int := 0)

This function adds the applicable facts of the graph into a rule set. The graph URI must correspond to the graph IRI of a graph stored in the triple store of the Virtuoso instance. If the remove argument is true, the specified graph is removed from the rule set instead.


14.6.3. Changing Rule Sets

Changing a rule set affects queries made after the change. Some queries may have been previously compiled and will not be changed as a result of modifying the rule set. When a rule set is changed, i.e. when rdfs_rule_set is called with the first argument set to a pre-existing rule set's name, all the graphs associated with this name are read and the relevant facts are added to a new empty rule set. Thus, if triples are deleted from or added to the graphs comprising the rule set, calling rdfs_rule_set will refresh the rule set to correspond to the state of the stored graphs.
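The refresh behavior described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the dictionaries and the function toy_rdfs_rule_set are hypothetical stand-ins written for illustration, not Virtuoso internals, and the model only mirrors the documented rule that each call re-reads all graphs associated with the rule set name.

```python
# Toy in-memory model; every name here is hypothetical, not a Virtuoso API.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
RECOGNIZED = {RDFS + "subClassOf", RDFS + "subPropertyOf"}

stored_graphs = {}    # graph IRI -> set of (s, p, o) triples in the "triple store"
rule_set_graphs = {}  # rule set name -> list of graph IRIs it is built from
rule_set_facts = {}   # rule set name -> set of recognized schema triples


def toy_rdfs_rule_set(name, uri, remove=False):
    """Add (or remove) a graph for a rule set, then rebuild the set from scratch."""
    graphs = rule_set_graphs.setdefault(name, [])
    if remove:
        if uri in graphs:
            graphs.remove(uri)
    elif uri not in graphs:
        graphs.append(uri)
    # Re-read every associated graph into a new, empty fact set.
    facts = set()
    for graph in graphs:
        for triple in stored_graphs.get(graph, set()):
            if triple[1] in RECOGNIZED:
                facts.add(triple)
    rule_set_facts[name] = facts
    return facts


stored_graphs["sc"] = {("c2", RDFS + "subClassOf", "c1")}
toy_rdfs_rule_set("inft", "sc")

# A later change to the stored graph is not visible in the rule set ...
stored_graphs["sc"].add(("c3", RDFS + "subClassOf", "c2"))
stale = len(rule_set_facts["inft"])

# ... until the function is called again with the same rule set name.
toy_rdfs_rule_set("inft", "sc")
fresh = len(rule_set_facts["inft"])
print(stale, fresh)
```

In this model, calling the function with remove set to true drops the graph from the rule set and rebuilds the facts from whatever graphs remain.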


14.6.4. Subclasses and Subproperties

Virtuoso SPARQL supports RDF Schema subclasses and subproperties.

The predicates rdfs:subClassOf and rdfs:subPropertyOf are recognized when they appear in graphs included in a rule set. When such a rule set is specified as a context for a SPARQL query, the following extra triples are generated as needed.

For every ?s rdf:type ?class, a triple ?s rdf:type ?superclass is considered to exist for each ?superclass that is a direct or indirect superclass of ?class. Direct superclasses are declared with the rdfs:subClassOf predicate in the rule set graph. Transitivity of superclasses is taken into account automatically: if a is a superclass of b and b is a superclass of c, then a is also a superclass of c. Cyclic superclass relations are not allowed. If they occur in the rule set data, the behavior is undefined but will not involve non-terminating recursion.
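The superclass rule can be sketched as a transitive walk over the direct rdfs:subClassOf facts. The following is an illustration only, not Virtuoso's implementation: the function names are hypothetical, the class names are borrowed from the examples in this section, and the visited set is this sketch's own way of guaranteeing termination if the data contains a cycle.

```python
# Direct rdfs:subClassOf facts from a rule set: subclass -> direct superclasses.
direct_super = {"c2": {"c1"}, "c3": {"c2"}, "c5": {"c4"}}


def superclasses(cls):
    """Return all direct and indirect superclasses of cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in direct_super.get(current, ()):
            if parent not in seen:  # each class is expanded at most once
                seen.add(parent)
                stack.append(parent)
    return seen


def implied_types(subject, cls):
    """Triples considered to exist for a stored (subject, rdf:type, cls)."""
    return {(subject, "rdf:type", sup) for sup in superclasses(cls)}


print(sorted(superclasses("c3")))
print(sorted(implied_types("ic3", "c3")))
```

Here c3 reaches c1 only through c2, which is the transitivity described above; c4 and c5 form a separate hierarchy and are never visited.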

For every ?s ?subpredicate ?o, a triple ?s ?superpredicate ?o is considered to exist if the rule context declares ?superpredicate to be a superpredicate of ?subpredicate. This is done by having the triple ?subpredicate rdfs:subPropertyOf ?superpredicate as part of the graphs making up the rule context. Transitivity is observed: if a is a subpredicate of b and b is a subpredicate of c, then a is also a subpredicate of c.
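The subproperty rule has the same shape. The sketch below lists the triples that are considered to exist for one stored triple. It only illustrates the semantics; it builds the implied triples explicitly for display, which is not how the engine evaluates them. The names p1 and p0 come from the examples in this section, while ptop and the function names are hypothetical additions to show a two-step chain.

```python
# Rule set facts: subproperty -> direct superproperties.
# 'ptop' is a hypothetical extra level added to show transitivity.
direct_superprop = {"p1": {"p0"}, "p0": {"ptop"}}


def superproperties(pred):
    """Return all direct and indirect superproperties of pred."""
    seen = set()
    stack = [pred]
    while stack:
        current = stack.pop()
        for parent in direct_superprop.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def expand(triples):
    """Return the stored triples plus those considered to exist via subproperties."""
    result = set(triples)
    for s, p, o in triples:
        for sup in superproperties(p):
            result.add((s, sup, o))
    return result


stored = [("ic2", "p1", "ic2p1")]
for triple in sorted(expand(stored)):
    print(triple)
```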


14.6.5. Implementation

Triples entailed by subclass or subproperty statements in an inference context are not physically stored. Such triples are added to the result set by the query run time as needed. Also, queries involving subclass or subproperty rules are not rewritten into unions of all the possible triple patterns that might imply the requested pattern. Instead, the SQL compiler adds special nodes that iterate over subclasses or subproperties at run time. The cost model also takes subclasses and subproperties into account when determining the approximate cardinality of triple patterns.

In essence, Virtuoso's support for subclasses and subproperties is backward chaining, i.e. it does not materialize all implied triples but rather looks for the basic facts implying these triples at query evaluation time.
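The backward-chaining idea can be illustrated with a toy query function that iterates over a class and its subclasses at query time and probes only the stored triples. This is a conceptual sketch, not the engine's code: the data mirrors the examples in this section, and the function names are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Stored instance data; nothing implied is ever added to this set.
stored = {
    ("ic1", RDF_TYPE, "c1"),
    ("ic2", RDF_TYPE, "c2"),
    ("ic3", RDF_TYPE, "c3"),
}
# Rule set facts, inverted for lookup: class -> direct subclasses.
direct_sub = {"c1": {"c2"}, "c2": {"c3"}, "c4": {"c5"}}


def class_and_subclasses(cls):
    """Yield cls and every direct or indirect subclass, visiting each once."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        yield current
        for child in direct_sub.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)


def instances_of(cls):
    """Answer '?s rdf:type cls' by probing stored triples at query time."""
    found = set()
    for candidate in class_and_subclasses(cls):
        for s, p, o in stored:
            if p == RDF_TYPE and o == candidate:
                found.add(s)
    return found


result = sorted(instances_of("c1"))
print(result)       # subjects typed c1, c2 or c3
print(len(stored))  # the store still holds only the original triples
```

A forward-chaining design would instead insert the implied rdf:type triples into the store ahead of time; the sketch leaves the stored set untouched and does the extra lookups per query.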


14.6.6. Enabling Inference

In a SPARQL query, the define input:inference clause is used to instruct the compiler to use the rules in the named rule set. For example:

SQL> rdfs_rule_set ('sample', 'rule_graph');

SQL> sparql define input:inference "sample" select * from <g> where {?s ?p ?o};

will include all the implied triples in the result set, using the rules in the sample rule set.

Inference can be enabled triple pattern by triple pattern. This is done with the option (inference 'rule_set') clause after the triple pattern concerned. Specifying option (inference none) will disable inference for the pattern concerned while the default inference context applies to the rest of the patterns. Note that the keyword is input:inference in the query header and simply inference in the option clause. See the examples section below for examples.

In SQL, if RDF_QUAD occurs in a select from clause, inference can be added with the table option WITH, as follows:

select * from rdf_quad table option (with 'sample') where g = iri_to_id ('xx', 0);

This is about the same as:

define input:inference "sample" select * from <xx> where {?s ?p ?o}

14.6.7. Examples

ttlp ('
  <ic1> a <c1> .
  <ic2> a <c2> .
  <ic3> a <c3> .
  <ic1> <p1> <ic1p1> .
  <ic2> <p1> <ic2p1>.
  <ic3> <p1> <ic3p1> .
  <ic1> <cl2> <c2> .
  ', '', 'inft');

This loads a graph with some sample triples. The graph is called inft.

ttlp (' @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  <c2> rdfs:subClassOf <c1> .
  <c3> rdfs:subClassOf <c2> .
  <c5> rdfs:subClassOf <c4> .
  <p1> rdfs:subPropertyOf <p0> .
  ', '', 'sc');

This loads a graph called sc that contains assertions about subclasses and subproperties.

rdfs_rule_set ('inft', 'sc');

This defines the rule context inft that is initialized from the contents of graph sc.

sparql define input:inference 'inft' select ?s from <inft> where {?s a <c1> }

This returns the instances of c1. Since c2 and c3 are subclasses of c1, instances of c1, c2 and c3 are all returned. This results in the subjects ic1, ic2 and ic3.

select id_to_iri (s)
from rdf_quad table option (with 'inft')
where g = iri_to_id ('inft',0)
  and p = iri_to_id ('http://www.w3.org/1999/02/22-rdf-syntax-ns#type', 0)
  and o = iri_to_id ('c1', 0);

This is the corresponding SQL query, internally generated by the SPARQL query.

Below we first look for all instances of c1 with some property set to ic2p1. We get the subject ic2 and the properties p1 and p0. The join involves both subclass and subproperty inference. Then we turn off the inference for the second pattern and only get the property p1. Then we do the same but now specify that inference should apply only to the first triple pattern.


SQL> sparql define input:inference 'inft' select * from <inft> where { ?s ?p <c1> . ?s ?p1 <ic2p1> . };

s        p                                                p1
VARCHAR  VARCHAR                                          VARCHAR
_______________________________________________________________________________

ic2      http://www.w3.org/1999/02/22-rdf-syntax-ns#type  p1
ic2      http://www.w3.org/1999/02/22-rdf-syntax-ns#type  p0

2 Rows. -- 0 msec.

SQL> sparql define input:inference 'inft' select * from <inft> where { ?s ?p <c1> . ?s ?p1 <ic2p1> option (inference 'none') . };

s        p                                                p1
VARCHAR  VARCHAR                                          VARCHAR
_______________________________________________________________________________

ic2      http://www.w3.org/1999/02/22-rdf-syntax-ns#type  p1

1 Rows. -- 0 msec.

SQL> sparql select * from <inft> where { ?s ?p <c1> option (inference 'inft') . ?s ?p1 <ic2p1> . };

s        p                                                p1
VARCHAR  VARCHAR                                          VARCHAR
_______________________________________________________________________________

ic2      http://www.w3.org/1999/02/22-rdf-syntax-ns#type  p1

1 Rows. -- 0 msec.