14.8. Aggregates in SPARQL

Virtuoso extends SPARQL with SQL-like aggregate and group by functionality. The same functionality is available by embedding SPARQL text inside SQL, but the SPARQL extension syntax has the advantage of also working over the SPARQL protocol and of looking more SPARQL-like.
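As a rough illustration of sending such a query over the SPARQL protocol, the Python sketch below builds a GET request URL that carries the query text in the standard query parameter. The endpoint URL and the graph IRI <g> are placeholders, not values taken from this manual, and the sketch assumes the leading sparql keyword is only needed when the query goes through a SQL client. It only constructs the URL and does not contact a server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint: adjust host, port and path for your installation.
ENDPOINT = "http://localhost:8890/sparql"

# Aggregate query in the extension syntax described in this section.
# <g> is a placeholder graph IRI.
QUERY = "select ?p count (?o) from <g> where {?s ?p ?o}"


def build_query_url(endpoint: str, sparql_text: str) -> str:
    """Return a SPARQL protocol GET URL with the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": sparql_text})


url = build_query_url(ENDPOINT, QUERY)

# Decoding the URL again shows that the query text survives the round trip.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```

Any HTTP client can then fetch the resulting URL. The result format depends on the server and on the Accept header sent with the request.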

The supported aggregates are count, min, max, avg and sum. Each can take an optional distinct keyword. Aggregates are permitted only in the selection part of a select query. If a selection list mixes variables and aggregates, the non-aggregate selected items are treated as grouping columns, and a group by over them is implicitly added at the end of the generated SQL query. There is no explicit syntax for group by or having in Virtuoso SPARQL.

If a selection consists exclusively of aggregates, the result set has one row with the values of the aggregates. If the selection contains both aggregates and variables, the result set has as many rows as there are distinct combinations of the variables, and the aggregates are calculated over each such distinct combination, as if there were a SQL group by over all non-aggregates.
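The implicit grouping rule can be modeled outside the database. The following Python sketch is not Virtuoso code: it treats a result set as a list of dictionaries and imitates select ?g1 ... count (?x) by grouping on the non-aggregate variables. The function name and the sample rows are invented for illustration.

```python
from collections import defaultdict


def select_with_count(rows, group_vars, counted_var):
    """Model `select ?g1 ... ?gn count (?x)` with an implicit group by.

    rows        : list of dicts, one per solution (None models an unbound variable)
    group_vars  : the non-aggregate variables in the selection
    counted_var : the variable inside count (...)
    """
    groups = defaultdict(int)
    for row in rows:
        key = tuple(row.get(v) for v in group_vars)
        # count (?x) counts only the rows where ?x is bound.
        groups[key] += 1 if row.get(counted_var) is not None else 0
    if not group_vars and not groups:
        # Aggregates only, no input rows: still exactly one result row.
        return {(): 0}
    return dict(groups)


rows = [
    {"s": "a", "p": "knows", "o": "b"},
    {"s": "a", "p": "knows", "o": "c"},
    {"s": "b", "p": "name", "o": "Bob"},
]

per_predicate = select_with_count(rows, ["p"], "o")  # like: select ?p count (?o)
overall = select_with_count(rows, [], "o")           # like: select count (?o)
print(per_predicate)  # {('knows',): 2, ('name',): 1}
print(overall)        # {(): 3}
```

With a grouping variable there is one output row per distinct value of ?p. With aggregates only, the single empty key stands for the one-row result.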

With the count aggregate, the argument may be either *, meaning count all rows, or a variable name, meaning count all the rows where this variable is bound. If there is no implicit group by, an optional distinct keyword may precede the variable that is the argument of an aggregate.
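The difference between the three forms of count is easiest to see on a small result set in which one variable is sometimes unbound. The sketch below is a plain Python model of the semantics described above, not Virtuoso code. The helper names are hypothetical and None stands for an unbound variable.

```python
def count_star(rows):
    """Model count (*): every row counts."""
    return len(rows)


def count_var(rows, var, distinct=False):
    """Model count (?var) and count (distinct ?var): only bound values count."""
    bound = [row[var] for row in rows if row.get(var) is not None]
    return len(set(bound)) if distinct else len(bound)


rows = [
    {"s": "a", "o": "x"},
    {"s": "b", "o": None},  # ?o is unbound in this row
    {"s": "c", "o": "x"},   # same ?o value as the first row
]

print(count_star(rows))                     # 3
print(count_var(rows, "o"))                 # 2
print(count_var(rows, "o", distinct=True))  # 1
```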

Because SPARQL does not have derived tables, there is a special syntax for counting distinct combinations of selected variables. This is:

select count distinct ?v1 ... ?vn
  from ....

14.8.1. Examples

sparql
select count (*)
  from <g>
 where {?s ?p ?o}

-- Returns the count of physical triples in g.

sparql define input:inference 'inft'
select ?p count (?o)
  from <inft>
 where {?s ?p ?o};

-- Returns the count of O's for each distinct P.

sparql define input:inference 'inft'
select count (?p) count (?o) count (distinct ?o)
  from <inft>
 where {?s ?p ?o};

-- Returns the count of triples, including inferred triples, and the count of distinct O values.

select count distinct ?s ?p ?o
  from <g>
 where {?s ?p ?o}

-- Returns the number of distinct bindings of ?s ?p ?o.

14.8.2. Note on Aggregates and Inference

Inference is added to a SPARQL query only for the variables whose values are actually used. Thus,

select count (*)
 from <g>
where {?s ?p ?o}

will not produce inferred values, since s, p, and o are not actually used. In contrast,

select count (?s) count (?p) count (?o)
 from <g>
where {?s ?p ?o}

will also include all the inferred triples.

This may be confusing and will likely be corrected in the future.