www.openlinksw.com
docs.openlinksw.com

Book Home

Contents
Preface

RDF Database and SPARQL

Data Representation
RDF and SPARQL API and SQL
IRI Dereferencing
RDF Views -- Mapping Relational Data to RDF
Introduction Rationale Quad Map Patterns, Value and IRI Classes Configuring RDF Storages Translation Of SPARQL Triple Patterns To Quad Map Patterns Describing Source Relational Tables
SPARQL Implementation
RDF Inference in Virtuoso
Using Full Text Search in SPARQL
Aggregates in SPARQL
Virtuoso SPARQL Query Service

14.4. RDF Views -- Mapping Relational Data to RDF

RDF Views map relational data into RDF and allow customizing RDF representation of locally stored RDF data. To let SPARQL clients access relational data as well as physical RDF graphs in a single query, we introduce a declarative Meta Schema Language for mapping SQL Data to RDF Ontologies. As a result, all types of clients can efficiently access all data stored on the server. The mapping functionality dynamically generates RDF Data Sets for popular ontologies such as SIOC, SKOS, FOAF, and ATOM/OWL without disruption to the existing database infrastructure of Web 1.0 or Web 2.0 solutions. RDF views are also suitable for declaring custom representation for RDF triples, e.g. property tables, where one row holds many single-valued properties.

14.4.1. Introduction

The Virtuoso RDF Views meta schema is a built-in feature of Virtuoso's SPARQL to SQL translator. It recognizes triple patterns that refer to graphs for which an alternate representation is declared and translates these into SQL accordingly. The main purpose of this is evaluating SPARQL queries against existing relational databases. There exists previous work from many parties for rendering relational data as RDF and opening it to SPARQL access. We can mention D2RQ, SPASQL, Squirrel RDF, DBLP and others. The Virtuoso effort differs from these mainly in the following:


14.4.2. Rationale

Since most of the data that is of likely use for the emerging semantic web is stored in relational databases, the argument for exposing this to SPARQL access is clear. We note that historically, SQL access to relational data has essentially never been given to the public outside of the organization. If programmatic access to corporate IS has been available to partners or the public, it has been through dynamic web pages or more recently web services. There are reasons of performance, security, maintainability and so forth for this.

The culture of the emerging semantic web is however taking a different turn. Since RDF and OWL offer a mergeable and queriable model for heterogeneous data, it is more meaningful and maintainable to expose selected data for outside query than it would be with SQL. Advances in hardware make this also less of a performance issue than it would have been in the client-server database era.

In the context of Virtuoso, since Virtuoso is originally a virtual/federated database, incorporating SPARQL to relational mapping is an evident extension of the product's mission as a multi-protocol, multi-platform connector between information systems.


14.4.3. Quad Map Patterns, Value and IRI Classes

In the simplest sense, any relational schema can be rendered into RDF by converting all primary keys and foreign keys into IRI's, assigning a predicate IRI to each column, and an rdf:type predicate for each row linking it to a RDF class IRI corresponding to the table. Then a triple with the primary key IRI as subject, the column IRI as predicate and the column's value as object is considered to exist for each column that is neither part of a primary or foreign key.

Strictly equating a subject value to a row and each column to a predicate is often good but is too restrictive for the general case.

Thus in the most common case the RDF meta schema should consist of independent transformations; the domain of each transformation is a result-set of some SQL SELECT statement and range is a set of triples. The SELECT that produce the domain is quite simple: it does not use aggregate functions, joins and sorting, only inner joins and WHERE conditions. There is no need to support outer joins in the RDF meta schema because NULLs are usually bad inputs for functions that produce IRIs. In the rare cases when NULLs are OK for functions, outer joins can be encapsulated in SQL views. The range of mapping can be described by a SPARQL triple pattern: a pattern field is a variable if it depends on table columns, otherwise it is a constant. Values of variables in the pattern may have additional restrictions on datatypes, when datatypes of columns are known.

This common case of an RDF meta schema is implemented in Virtuoso, with one adjustment. Virtuoso stores quads, not triples, using the graph field (G) to indicate that a triple belongs to some particular application or resource. A SPARQL query may use quads from different graphs without large difference between G and the other three fields of a quad. E.g., variable ?g in expression GRAPH ?g {...} can be unbound. SPARQL has special syntax for "graph group patterns" that is convenient for sets of triple patterns with a common graph, but it also has shorthands for common subject and predicate, so the difference is no more than in syntax. There is only one feature that is specific for graphs but not for other fields: the SPARQL compiler can create restrictions on graphs according to FROM and FROM NAMED clauses.

Virtuoso RDF Views should offer the same flexibility with the graphs as SPARQL addressing physical triples. A transformation cannot always be identified by the graph used for ranges because graph may be composed from SQL data. The key element of the meta schema is a "quad map pattern". A simple quad map pattern fully defines one particular transformation from one set of relational columns into triples that match one SPARQL graph pattern. The main part of quad map pattern is four declarations of "quad map values", each declaration specifies how to calculate the value of the corresponding triple field from the SQL data. The pattern also lists boolean SQL expressions that should be used to filter out unwanted rows of source data (and to join multiple tables if source columns belong to different tables). There are also quad map patterns that group together similar quad patterns but do not specify any real transformation or even prevent unwanted transformations from being used, they are described in "Grouping Map Patterns" below.

Quad map values refer to schema elements of two further types: "IRI classes" and "literal classes".

14.4.3.1. IRI Classes

An IRI class declares that a column or set of columns gets converted into a IRI in a certain way. The conversion of this sort can be declared revertible (bijection) so an IRI can be parsed into original SQL values; this is useful when some equality of an IRI constant and a calculated IRI can be replaced with an equality of a parse result of a constant and an SQL column that is index criteria or simply faster. In addition, the SPARQL optimizer will eliminate redundand conversions if one IRI class is explicitly declared as a subclass of another. The most flexible declaration for conversion consists of specifying functions that assemble and disassemble from IRI into its constituent parts. This is overkill for typical conversions so it is possible to specify only one sprintf-style format string such that sprintf() SQL function will print an IRI using this format and sprintf_inverse() will be able to parse it back.

The use of sprintf_inverse() assumes that the format does not contain fragments like '%s%s' that make it impossible to separate parts of IRI from each other.

In the following, we shall map the Virtuoso users and user roles system tables into the SIOC ontology.

create iri class oplsioc:user_iri "http://myhost/sys/user?id=%d"
  (in uid integer not null) .
create iri class oplsioc:group_iri "http://myhost/sys/group?id=%d"
  (in gid integer not null) .
create iri class oplsioc:membership_iri
  "http://myhost/sys/membersip?super=%d&sub=%d"
  (in super integer not null, in sub integer not null) .
create iri class oplsioc:dav_iri "http://myhost%s"
  (in path varchar) .

These IRI classes are used for mapping data from the DB.DBA.SYS_USERS and DB.DBA.SYS_ROLE_GRANTS system tables that are defined in Virtuoso as follows:

create table DB.DBA.SYS_USERS (
  U_ID                integer not null unique,
  U_NAME              char (128) not null primary key,
  U_IS_ROLE           integer default 0,
  U_FULL_NAME         char (128),
  U_E_MAIL            char (128) default ",
  U_ACCOUNT_DISABLED  integer default 1,
  U_DAV_ENABLE        integer default 0,
  U_SQL_ENABLE        integer default 1,
  U_HOME              varchar (128),
. . .
 );

Single record in DB.DBA.SYS_USERS corresponds to a plain user or a group (role). Users and roles are collectively named "grantees". Thus a role may be granted to another role or to a user account. A role grant may be direct (explicit) or assigned by recursion.

create table SYS_ROLE_GRANTS (
  GI_SUPER   integer,
  GI_SUB     integer,
  GI_DIRECT  integer default 1,
. . .
  primary key (GI_SUPER, GI_SUB, GI_DIRECT));

The following example demonstrates two things: function-based IRI class and the idea of IRI subclasses. Classes oplsioc:user_iri and oplsioc:group_iri work fine for quad maps of U_ID if and only if the value of U_IS_ROLE is accordingly restricted to FALSE or TRUE, otherwise one may occasionally generate, say, user IRI for a group. To create and parse IRIs that correspond to any U_IDs, two functions should be created:

create function DB.DBA.GRANTEE_URI (in id integer)
returns varchar
{
  declare isrole integer;
  isrole := coalesce ((select top 1 U_IS_ROLE
      from DB.DBA.SYS_USERS where U_ID = id ) );
  if (isrole is null)
    return NULL;
  else if (isrole)
    return sprintf ('http://%s/sys/group?id=%d', id);
  else
    return sprintf ('http://%s/sys/user?id=%d', id);
};
create function DB.DBA.GRANTEE_URI_INVERSE (in id_iri varchar)
returns integer
{
  declare parts any;
  parts := sprintf_inverse (id_iri,
      'http://myhost/sys/user?id=%d', 1 );
  if (parts is not null)
    {
      if (exists (select top 1 1 from DB.DBA.SYS_USERS
          where U_ID = parts[0] and not U_IS_ROLE ) )
        return parts[0];
    }
  parts := sprintf_inverse (id_iri,
      'http://myhost/sys/group?id=%d', 1 );
  if (parts is not null)
    {
      if (exists (select top 1 1 from DB.DBA.SYS_USERS
          where U_ID = parts[0] and U_IS_ROLE ) )
        return parts[0];
    }
  return NULL;
};

The next declaration creates an IRI class based on these two functions:

create iri class oplsioc:grantee_iri using
  function DB.DBA.GRANTEE_URI (in id integer)
    returns varchar,
function DB.DBA.GRANTEE_URI_INVERSE (in id_iri varchar)
    returns integer .

In common case, IRI class declaration contains an N-ary function that composes IRIs and N inverse functions that gets an IRI as an argument and extracts the Nth SQL value. IRI composing function should silently return NULL on incorrect arguments instead of error signal. Inverse functions should return NULL if the argument has an incorrect type or value.

It is possible to specify only composing function without any of inverse functions. However option (bijection) can not be used in that case, obviously.


14.4.3.2. Literal Classes

A "literal class" declares that a column or set of columns gets converted into a literal instead of an IRI. More precisely, the result of conversion can be IRI_ID so it represents an IRI, but in current version of Virtuoso this is supported only for some internal built-in literal classes, not for classes declared by the user. So for user-defined literal class the result of the conversion is an RDF literal even if it is a string representation of a valid IRI.

In any case, a literal class can be used only in quad map values of O fields, because Virtuoso does not support literal values as subjects.

A special case of literal class is the identity class that converts a value from varchar column into an untyped literal and value from column of any other SQL datatype into a typed literal with type from XMLSchema set, i.e. xsd:integer, xsd:dateTime and so on. Columns of types ANY and IRI_ID are not supported.

The SPARQL optimizer knows that RDF literal types are pairwise disjoint so literal classes that produce literals of different types are known to be pairwise disjoint. The optimizer will replace a join on two disjoint literal classes with an empty statement, to simplify the resulting query.


14.4.3.3. BIJECTION and RETURNS Options

There is one subtle problem with IRI class declarations. To get benefit from a relational index, SPARQL optimizer should compose equality between table column and some known SQL value, not between return value of IRI class and a known composed IRI. In addition, redundant calculations of IRIs takes time. To enable this optimization, an IRI class declaration should end with option (bijection) clause.

The SPARQL compiler may produce big amounts of SQL code when the query contains equality of two calculated IRIs and these IRIs may come from many different IRI classes. It is possible to provide hints that will let the compiler check if two IRI classes form disjoint sets of possible IRI values. The more disjoint sets are found the less possible combinations remain so the resulting SQL query will contain fewer unions of joins. The SPARQL compiler can prove some properties of sprintf format strings. E.g., it can prove that set of all strings printed by "http://example.com/item%d" and the set of strings printed by "http://example.com/item%d/" are disjoint. It can prove some more complicated statements about unions and intersections of sets of strings. The IRI or literal class declaration may contain option (returns ...) clause that will specify one or more sprintf patterns that cover the set of generated values. Consider a better version of IRI class declaration listed above:

create iri class oplsioc:grantee_iri using
  function DB.DBA.GRANTEE_URI (in id integer)
    returns varchar ,
  function DB.DBA.GRANTEE_URI_INVERSE (in id_iri varchar)
    returns integer
  option ( bijection,
    returns "http://myhost/sys/group?id=%d"
    union   "http://myhost/sys/user?id=%d" ) .

It is very important to keep IRI classes easily distinguishable by the text of IRI string and easy to parse.

In some cases option (returns ...) can be used for IRI classes that are declared using sprintf format, but actual data have more specific format. Consider a literal class declaration that is used to output strings and the application knows that all these strings are ISBN numbers:

create literal class example:isbn_ref "%s" (in isbn varchar not null)
  option ( bijection, returns "%u-%u-%u-%u" union "%u-%u-%u-X" )

Sometimes interoperability restrictions will force you to violate these rules but please try to follow them as often as possible.


14.4.3.4. Subclasses

Additional problem appears when the equality is between two IRIs of two different IRI classes. Even if both of them are bijections, the compiler does not know if these IRI classes behave identically on the intersection of their domains. To let the optimizer know this fact, one IRI class can be explicitly declared as a subclass of another:

make oplsioc:user_iri subclass of oplsioc:grantee_iri .
make oplsioc:group_iri subclass of oplsioc:grantee_iri .

The SPARQL compiler can not check the validity of a subclass declaration. The developer should carefully test functions to ensure that transformations are really subclasses, as well as to ensure that functions of an IRI class declarations are really inverse to each other.

When declaring that a table's primary key is converted into a IRI according to one IRI class, one usually declares that all foreign keys referring to this class also get converted into an IRI as per this same class, or subclass of same class.

Subclasses can be declared for literal classes as well as for IRI classes, but this case is rare. The reason is that most of literals are made by identity literal classes that are disjoint to each other even if values may be equal in SQL sense, such as "2" of type xsd:integer and "2.0" of type xsd:double.


14.4.3.5. Simple Quad Map Patterns

The following declaration of quad map pattern is self-explanatory. The line for object uses identity literal class so there's no need to specify its name.

graph      <http://myhost/sys>
subject    oplsioc:user_iri (DB.DBA.SYS_USERS.U_ID)
predicate  foaf:email
object     DB.DBA.SYS_USERS.U_E_MAIL

The description language also supports SPARQL-style notation that contains less keywords and eliminates duplicate graphs, subjects and predicates. The following add two patterns with constant graph IRI <http://myhost/sys> and subjects are made from column DB.DBA.SYS_USERS.U_ID by oplsioc:user_iri.

graph <http://myhost/sys>
  {
    oplsioc:user_iri (DB.DBA.SYS_USERS.U_ID)
      a sioc:user ;
      oplsioc:name DB.DBA.SYS_USERS.U_FULL_NAME .
  }

14.4.3.6. Assigning Names To Quad Map Patterns

In real applications, quad map patterns should be named, for schema manipulation and keeping debug info readable. Thus it is much better to rewrite the previous example as

create virtrdf:SysUsers as graph <http://myhost/sys>
  {
    oplsioc:user_iri (DB.DBA.SYS_USERS.U_ID)
      a sioc:user
          as virtrdf:SysUserType-User;
      oplsioc:name DB.DBA.SYS_USERS.U_FULL_NAME
          as virtrdf:SysUsersFullName .
  }

Using these names, one may later write, say, drop quad map virtrdf:SysUserType-User.

One name, virtrdf:DefaultQuadMap is reserved. It is an internal quad map pattern used to access "native-form" quads from DB.DBA.RDF_QUAD:

create virtrdf:DefaultQuadMap as
graph rdfdf:default-iid-nonblank (DB.DBA.RDF_QUAD.G)
subject rdfdf:default-iid (DB.DBA.RDF_QUAD.S)
predicate rdfdf:default-iid-nonblank (DB.DBA.RDF_QUAD.P)
object rdfdf:default (DB.DBA.RDF_QUAD.O)

IRI classes from rdfdf:... namespace are also reserved.


14.4.3.7. Grouping Map Patterns

The previous example actually contains three map patterns, not two. The name virtrdf:SysUsers refers to a "group map pattern" that does not define any real transformation of relational data into RDF but helps organize quad map patterns into a tree. Group may contain both quad map patterns and other groups. A group can be manipulated as a whole, e.g. drop quad map virtrdf:SysUsers will remove all three map patterns.



14.4.4. Configuring RDF Storages

"Quad Storage" is a named set of quad patterns. The declaration define input:storage storage-name states that a SPARQL query will be executed using only quad patterns of the given quad storage. Declarations of IRI classes, literal classes and quad patterns are shared between all quad storages of an RDF meta schema but every quad storage contains only a subset of all available quad patterns. Two quad storages are always defined:

Three statements for manipulating storages are

A map pattern can be created only as a part of create quad storage or alter quad storage statement, so initially it is used by exactly one storage. It can be imported to some other storage using directive create map-id using storage source-storage. E.g., declarations of many storages create virtrdf:DefaultQuadMap using storage virtrdf:DefaultQuadStorage.

Only a "top-level" quad map pattern (standalone or a whole group with descendants) can be imported, member of a group can not. The import directive also can not be a part of some group declaration.

The directive drop quad map map-name removes a map from one storage when it appears inside alter quad storage statement. Otherwise it removes the map from all storages. There exists garbage collection for quad map patterns, so any unused map is immediately deleted. A group is deleted with all its descendants.


14.4.5. Translation Of SPARQL Triple Patterns To Quad Map Patterns

When a SPARQL query is compiled into SQL using a quad storage, every triple pattern should become a subquery that retrieves data from relational tables. This subquery is an UNION ALL of joins generated from appropriate quad map patterns. The complete SQL query is composed from these basic subqueries. Thus the first operation of the SQL generation for a triple pattern is searching for quad map patterns that may in principle produce triples that match the triple pattern.

The more restrictions contained in the triple pattern the fewer quad map patterns will be used. A triple pattern graph ?g { ?s ?p ?o } is common enough to invoke all data transformations of the storage. A triple pattern graph <g> { ?s <p> <o> } will usually intersect with the range of only one quad map. Sometimes it is possible to prove that the storage can not contain any data that matches the given triple pattern, hence zero number of members of UNION ALL will result in constantly empty result-set.

The search for quad maps for a given pair of triple pattern and quad map storage is quite simple. The storage is treated as a tree of map patterns where quad map patterns are leafs, grouping patterns are inner nodes and the whole storage is also treated as a grouping pattern that specify no fields and contains all top-level map patterns of the storage.

The tree is traversed from the root, left to right, non-leaf vertex are checked before their children. The check of a vertex consists of up to four field checks, for G, S, P and O. Every field check compares the field definition in the vertex and the corresponding field in the triple pattern, G and G, S and S and so on. Note that a non-leaf vertex defines less than four of its fields, e.g., the root vertex does not define any of its fields and top-level graph map { ... } defines only graph. Checks are performed only for defined fields and return one of three values: "failed", "passed", "full match", according to the following rules:

Table: 14.4.5.1. Matching Triple Field and Vertex Field
Field of vertex Field in triple pattern Result
constant same constant full match
constant different constant failed
constant variable of same type passed
constant variable of different type failed
quad map value constant of same type full match
quad map value constant of different type failed
quad map value of type X variable, X or subtype of X full match
quad map value of type X variable, supertype of X passed
quad map value of type X variable, type does not intersect with X failed

If any of the checks fails, the vertex and all its children are excluded from the rest of processing. Otherwise, if all four fields are defined for the quad map pattern, the map is added to the list of matching map patterns. The difference between "passed" and "full match" is significant only if the map is declared with option (exclusive) If all performed checks return "full match" and option (exclusive) is set then the traverse of the tree is stopped as soon as all children of the vertex are traversed. The most typical use of this option is when the application developer is sure that all triples of a graph belong to his application and they come from his own quad map patterns, not from DB.DBA.RDF_QUAD. This is to prevent the SPARQL compiler from generating redundant subqueries accessing DB.DBA.RDF_QUAD. The declaration may look like

create quad storage <mystorage>
  {
    graph <mygraph> option (exclusive) { . . . }
    create virtrdf:DefaultQuadMap
      using storage virtrdf:DefaultQuadStorage .
  }

Exclusive patterns make the order of declarations important, because an exclusive declaration may "throw a shadow" on declarations after it. Consider a database that have a special table RDF_TYPE that caches all RDF types of all subjects in all graphs. Consider two declarations: all triples from graph <http://myhost/sys> and all triples with rdf:type predicate, both exclusive:

graph <http://myhost/sys> option (exclusive)
  {
    . . . # mapping of DB.DBA.SYS_USERS as in previous examples.
  }
graph rdfdf:default-iid-nonblank (DB.DBA.RDF_TYPE.G)
subject rdfdf:default-iid (DB.DBA.RDF_TYPE.S)
predicate rdf:type
object rdfdf:default (DB.DBA.RDF_TYPE.O)
option (exclusive)

The order of these declarations dictates that triple pattern

graph <http://myhost/sys> {?s rdf:type ?o}

is compiled using only quad map patterns of the graph declaration, ignoring second declaration (and of course ignoring default mapping rule, if any). An explicit option (order N) at the end of quad map pattern will tweak the priority. By default, order will grow from 1000 for the first declaration in the statement to 1999 for the last, explicit configuration is especially useful to make order persistent to alter storage statements.

The option (exclusive) trick is ugly, low-level and prone to cause compilation errors after altering storage declarations. When misused, it is as bad as "red cut" in PROLOG, but one must use this trick to build scalable storages.

There is one exception from the rules described above. This exception is for virtrdf:DefaultQuadStorage only. If a graph variable of a quad map pattern is not bound and no source graph specified by FROM clauses then quad maps for specific constant graphs are ignored. In other words, if a default quad storage contains quad maps for specific graphs then the query in that storage should explicitely specify the graph in order to use a map for graph. This rule will not work if the default quad map is removed from the virtrdf:DefaultQuadStorage. This rule relates to the default storage itself, not to the containing patterns; copying some or all patterns into other storage will not reproduce there this special effect.


14.4.6. Describing Source Relational Tables

Quad map patterns of an application usually share a common set of source tables and quad map values of one pattern usually share either a single table or very small number of joined tables. Join and filtering conditions are also usually repeated in different patterns. It is not necessary to type table descriptions multiple times, they are declare once in the beginning of storage declaration statement and shared between all quad map declarations inside the statement. Names of aliases can be used instead of table names in quad map values.

from DB.DBA.SYS_USERS as user where (^{user.}^.U_IS_ROLE = 0)
from DB.DBA.SYS_USERS as group where (^{group.}^.U_IS_ROLE = 1)
from DB.DBA.SYS_USERS as account
from user as active_user
  where (^{active_user.}^.U_ACCOUNT_DISABLED = 0)
from DB.DBA.SYS_ROLE_GRANTS as grant
  where (^{grant.}^.GI_SUPER = ^{account.}^.U_ID)
  where (^{grant.}^.GI_SUB = ^{group.}^.U_ID)
  where (^{grant.}^.GI_SUPER = ^{user.}^.U_ID)

This declares five distinct aliases for two distinct tables, and six filtering conditions. Every condition is an SQL expression with placeholders where a reference to the table should be printed. The SPARQL compiler will not try to parse texts of these expressions (except dummy search for placeholders), so any logical expressions are acceptable. When a quad map pattern declaration refers to some aliases, the WHERE clause of the generated SQL code will contain a conjunction of all distinct texts of "relevant" conditions. A condition is relevant if every alias inside the condition is used in some quad map value of the map pattern, either directly or via clause like from user as active_user. (user is a "base alias" for active_user).

Consider a group of four declarations.

graph <http://myhost/sys>
  {
    oplsioc:user_iri (active_user.U_ID)
        a oplsioc:active-user .
    oplsioc:membership_iri (grant.GI_SUPER, grant.GI_SUB).
        oplsioc:is_direct
            grant.GI_DIRECT ;
        oplsioc:member-e-mail
            active_user.U_E_MAIL
               where (^{active_user.}^.U_E_MAIL like 'mailto:%').
    ldap:account-ref (account.U_NAME)
        ldap:belongs-to
            ldap:account-ref (group.U_NAME) option (using grant).
  }

The first declaration will extend <http://myhost/sys> graph with one imaginary triples { user a oplsioc:active-user } for every account record that is not a role and not disabled. The second declaration deals with membership records. A membership is a pair of a grantee ("super") and a granted role ("sub") stored as a row in DB.DBA.SYS_ROLE_GRANTS).

The second declaration states that every membership has oplsioc:is_direct property with value from GI_DIRECT column of that table (roles may be granted to other roles and users, so permissions are "direct" or "recursive").

The third declaration declares oplsioc:member-e-mail property of memberships. The value is a literal string from DB.DBA.SYS_USERS.U_E_MAIL, if the grantee is active (not disabled) and is not a role and its e-mail address starts with 'mailto:'. The join between DB.DBA.SYS_ROLE_GRANTS and DB.DBA.SYS_USERS is made by equality (GI_SUPER = U_ID) because the alias active_user in the declaration "inherits" all conditions specified for user. In addition, the SPARQL compiler will add one more condition to check if the U_E_MAIL is not null because the NULL value is not a valid object and it knows that U_E_MAIL is not declared as NOT NULL.

The last declaration contains an option clause. As usual, this indicates that the basic functionality is good for many tasks but not for all. In this declaration, the ldap:belongs-to property establishes a relation between grantee (subject) and a granted role (object). Both subject and object IRIs are based on account name, DB.DBA.SYS_USERS.U_NAME, so the quad map pattern contains two references to different aliases of DB.DBA.SYS_USERS but no alias for DB.DBA.SYS_ROLE_GRANTS. Hence the declaration could produce a triple for every row of the Cartesian product of the DB.DBA.SYS_USERS. To fix the problem, option (using alias-name) tells the compiler to process the alias-name as if it's used in some quad map value of the pattern.

It is an error to use an alias only in where clause of the quad map pattern but neither in values or in option (using alias-name). To detect more typos, an alias used in quad map values can not appear in option (using alias-name) clause.