© 2001 The dbXML Group L.L.C.
This documentation is a work in progress. Links to the most current version can be found at http://www.dbxml.org/docs/
$Id: UsersGuide.xml,v 1.15 2001/09/19 04:31:30 kstaken Exp $
The dbXML Core server is a database server designed from the ground up to store XML data. It is what the XML:DB Initiative terms a Native XML Database. You could also call it a seamless XML database, which may be an easier description to understand.
What this means is that, to the largest extent possible, you work with XML tools and technologies when working with the data in the server. All data that goes into and out of the server is XML. The query language is XPath, and the programming APIs support DOM and SAX, all of which should be familiar to a developer used to using XML in applications. When working with XML data in dbXML there is no mapping between different data models: you simply design your data as XML and store it as XML.
What this gives you can be summed up in one word: flexibility. XML provides an extremely flexible mechanism for modeling application data and in many cases will allow you to model constructs that are difficult or impossible to model in more traditional systems. By using a native XML database such as dbXML to store this data, you can focus on building your applications and not worry about how complex XML constructs map to the underlying data store.
Native XML database technology is a very new area, and dbXML is very much a project still in development. The server currently supports storing well-formed XML documents. This means there is no schema that constrains what can be placed into a document collection. This makes dbXML a semi-structured database and provides tremendous flexibility in how you store your data, but it also means you give up some common database functionality such as data types.[1] In its current state dbXML is already a powerful tool for managing XML data. However, there is still much that needs to be done. Feedback and contributions are actively encouraged.
This document attempts to describe those features that are working and can be used today. You should review the README file that is part of the dbXML distribution for the most current status on the project.
NOTE: Both the dbXML server and this document are works in progress. Any comments are welcome and encouraged.
Document Collections:
XPath Query Engine:
XML Indexing:
XML:DB XUpdate Implementation:
Java XML:DB API Implementation:
XMLObjects:
Command Line Management Tools:
CORBA Network API:
Modular Architecture:
The dbXML server is designed to store collections of XML documents. Collections can be arranged in a hierarchy similar to that of a typical UNIX or Windows file system.
In dbXML, the data store is rooted in a database instance that can also be used as a document collection. This database instance can then contain any number of child collections. In a default install of dbXML the database instance is called 'db' and all collection paths begin with /db. The database instance can be renamed if desired, though it is not necessary to do so.
Collections in dbXML are referenced in much the same way as directories in a hierarchical file system.
If you had a collection under 'db' called my-collection, and a collection under that called my-child-collection, the path used to access my-child-collection would be:
/db/my-collection/my-child-collection
Within collections there are several types of objects that can be stored. You can store XML documents, XMLObjects and other collections. Each of these objects can also be referenced via a path.
1.2. Collection Path Referencing a Document
Extending the previous example, a document named my-document added to my-child-collection could also be referenced via a path:
/db/my-collection/my-child-collection/my-document
There is one catch to this, however. Since you can have more than one object in a collection with the same name [2], an order of precedence is applied when evaluating a path. The order of precedence is collection, followed by XMLObject, and then document. What this means is that if you have a document and a collection with the same name, you will not be able to retrieve the document by its path. [3]
The dbXML server includes two command line programs that can be used to work with the server: the dbxml command and the dbxmladmin command. The dbxml command is intended for use by end users and the dbxmladmin command is for use by administrators. All commands that are available through the dbxml command are also available through the dbxmladmin command. A complete list of available commands and more detail about each command can be found in the Command Line Tools Reference Guide.
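The precedence rule can be illustrated with a small Python model. This is only a sketch of the rule as described above, not the dbXML implementation: the Collection class, its three dictionaries, and the resolve function are all hypothetical names invented for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """Hypothetical in-memory stand-in for a dbXML collection."""
    collections: dict = field(default_factory=dict)  # name -> Collection
    xmlobjects: dict = field(default_factory=dict)   # name -> any object
    documents: dict = field(default_factory=dict)    # name -> XML text


def resolve(root_name, root, path):
    """Resolve a path such as /db/my-collection/my-document.

    Returns a (kind, object) tuple. At every step the lookup order is
    collection, then XMLObject, then document.
    """
    segments = [s for s in path.split("/") if s]
    if not segments or segments[0] != root_name:
        raise KeyError(f"path must start with /{root_name}")
    kind, current = "collection", root
    for name in segments[1:]:
        if kind != "collection":
            raise KeyError(f"cannot descend into a {kind}")
        if name in current.collections:
            kind, current = "collection", current.collections[name]
        elif name in current.xmlobjects:
            kind, current = "xmlobject", current.xmlobjects[name]
        elif name in current.documents:
            kind, current = "document", current.documents[name]
        else:
            raise KeyError(f"no object named {name!r}")
    return kind, current


# my-collection holds a child collection AND a document with the same name.
child = Collection(documents={"my-document": "<doc/>"})
parent = Collection(
    collections={"my-child-collection": child},
    documents={"my-child-collection": "<shadowed/>"},
)
db = Collection(collections={"my-collection": parent})

kind_doc, doc = resolve("db", db, "/db/my-collection/my-child-collection/my-document")
kind_clash, _ = resolve("db", db, "/db/my-collection/my-child-collection")
print(kind_doc, doc)  # the document inside the child collection
print(kind_clash)     # the collection wins over the same-named document
```

In this model the document stored as my-child-collection can never be reached by path, which mirrors the restriction described above.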
The commands are located in the dbXML-Core/bin directory and it is probably a good idea to add this directory to your PATH environment variable. All examples in this manual will assume that the dbXML-Core/bin directory is on the operating system path.
[1] XML Schema support will be added to a later version of dbXML. A schema will always be optional, but it will be possible to use one to constrain a collection to particular document types, and it will also enable data type support.
[2] e.g. a child collection and a document
[3] This restriction will be fixed in a later release of dbXML.
In many ways the dbXML database can be viewed as a simple file store. This is of course a highly simplified view of things, but it is a useful place to start in learning the functionality of the server.
The dbXML server provides facilities to store, retrieve and delete well-formed XML documents.
2.1. Adding a Document With a Given Key
The document fx102.xml will be added to the collection /db/data/products and will be stored under the key fx102.
dbxml add_document -c /db/data/products -f fx102.xml -n fx102
2.2. Adding a Document Without a Key
The document fx102.xml will be added to the collection /db/data/products. No key is provided, so one will be generated automatically by the server. The generated key will look similar to this: 0625df6b0001a5d4000bc49d0060b6f5
dbxml add_document -c /db/data/products -f fx102.xml
Documents can be retrieved from the database using the key under which they were inserted.
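The "simple file store" view can be sketched as a small in-memory model in Python. This is an illustration only, not how the server works internally. The DocumentStore class and the retrieve_document and delete_document method names are hypothetical, and the use of uuid4 to produce a 32-character hexadecimal key is an assumption chosen only because it resembles the generated key shown above.

```python
import uuid
import xml.etree.ElementTree as ET


class DocumentStore:
    """Hypothetical in-memory model of the documents in one collection."""

    def __init__(self):
        self._docs = {}  # key -> XML text

    def add_document(self, xml_text, key=None):
        # Only well-formed XML is accepted; ET.ParseError is raised otherwise.
        ET.fromstring(xml_text)
        if key is None:
            # Assumption: any unique 32-character hex string serves as a key.
            key = uuid.uuid4().hex
        self._docs[key] = xml_text
        return key

    def retrieve_document(self, key):
        return self._docs[key]

    def delete_document(self, key):
        del self._docs[key]


store = DocumentStore()
xml_text = '<product product_id="120320"><description>Glazed Ham</description></product>'

named_key = store.add_document(xml_text, key="fx102")  # caller supplies the key
generated_key = store.add_document(xml_text)           # key is generated

print(named_key)
print(len(generated_key))
print(store.retrieve_document("fx102") == xml_text)
```

Note that nothing beyond well-formedness is checked when a document is added, which matches the schema-free behaviour described earlier.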
dbXML currently supports XPath as a query language. In many applications XPath is only applied at the document level but in dbXML XPath queries are executed at the collection level. This means that a query can be run against multiple documents and the result set will contain all matching nodes from all documents in the collection. The dbXML server also supports the creation of indexes on XML documents to speed up commonly used XPath queries. Please refer to the Administrators Guide for more detail about configuring indexes.
You can execute XPath queries against the database using the command line tools and the result of the query will be displayed.
3.1. Executing an XPath query against a collection of XML documents
Here we assume we have a collection /db/data/products that contains documents similar to the following.
<?xml version="1.0"?>
<product product_id="120320">
   <description>Glazed Ham</description>
</product>
The XPath /product[@product_id="120320"] will be executed against the collection /db/data/products and all matching product entries will be returned.
dbxml xpath_query -c /db/data/products -q /product[@product_id="120320"]
The result of the query is an XPath node-set that contains one node for each result. In this particular example there is only one result, and the node that matches is the root element, so you get back essentially the whole document.
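Collection-level querying can be imitated in a few lines of Python to show what "one query, many documents" means. This is a sketch under stated assumptions and does not use dbXML: the documents dictionary and the query_collection function are hypothetical, and Python's xml.etree.ElementTree supports only a subset of XPath, so the relative path ./product stands in for the absolute path /product.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection contents: key -> document text.
documents = {
    "120320": '<product product_id="120320"><description>Glazed Ham</description></product>',
    "120321": '<product product_id="120321"><description>Boiled Ham</description></product>',
    "120322": '<product product_id="120322"><description>Honey Ham</description></product>',
}


def query_collection(docs, path):
    """Run one path expression against every document and pool the matches."""
    matches = []
    for text in docs.values():
        # Wrap the document root in a synthetic parent so that the first
        # step of the path can match the root element itself.
        wrapper = ET.Element("wrapper")
        wrapper.append(ET.fromstring(text))
        matches.extend(wrapper.findall(path))
    return matches


one = query_collection(documents, "./product[@product_id='120320']")
everything = query_collection(documents, "./product")

print([p.findtext("description") for p in one])
print([p.findtext("description") for p in everything])
```

The first query yields a single node from a single document, while the second yields one node per document, all returned in the same result list.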
To make it easy to link the result node back to the originating document, dbXML adds a few attributes to the result. These attributes are added in the NodeSource namespace that has the URI http://www.dbxml.org/NodeSource. The col attribute specifies the collection where the document can be found and the key attribute provides the key of the original document. Using this information it is possible to retrieve the original document that this node was selected from for further processing.
<product product_id="120320"
         xmlns:src="http://www.dbxml.org/NodeSource"
         src:col="/db/data/products"
         src:key="120320">
   <description>Glazed Ham</description>
</product>
If more than one result is found, the results look something like this. This could be the result of the query /product.
<product product_id="120320"
         xmlns:src="http://www.dbxml.org/NodeSource"
         src:col="/db/data/products"
         src:key="120320">
   <description>Glazed Ham</description>
</product>
<product product_id="120321"
         xmlns:src="http://www.dbxml.org/NodeSource"
         src:col="/db/data/products"
         src:key="120321">
   <description>Boiled Ham</description>
</product>
<product product_id="120322"
         xmlns:src="http://www.dbxml.org/NodeSource"
         src:col="/db/data/products"
         src:key="120322">
   <description>Honey Ham</description>
</product>
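As an illustration of how an application might use the NodeSource attributes, the following Python sketch reads the col and key values from a captured result. It assumes the query output is already available as a string. The node_sources function and the synthetic results wrapper element are hypothetical; the wrapper is needed only because a result with several nodes has no single root element.

```python
import xml.etree.ElementTree as ET

NODE_SOURCE = "http://www.dbxml.org/NodeSource"

# Query output captured as text (shortened to two result nodes).
result_text = """
<product product_id="120320" xmlns:src="http://www.dbxml.org/NodeSource"
         src:col="/db/data/products" src:key="120320">
   <description>Glazed Ham</description>
</product>
<product product_id="120321" xmlns:src="http://www.dbxml.org/NodeSource"
         src:col="/db/data/products" src:key="120321">
   <description>Boiled Ham</description>
</product>
"""


def node_sources(text):
    """Return a (collection, key) pair for every result node in the text."""
    # Wrap the nodes in a synthetic root so the text parses as one document.
    root = ET.fromstring(f"<results>{text}</results>")
    return [
        # ElementTree addresses namespaced attributes as {uri}localname.
        (node.get(f"{{{NODE_SOURCE}}}col"), node.get(f"{{{NODE_SOURCE}}}key"))
        for node in root
    ]


sources = node_sources(result_text)
for col, key in sources:
    # Collection path plus key gives the path of the originating document.
    print(f"{col}/{key}")
```

Looking the attributes up by namespace URI rather than by the src prefix keeps the code working even if a different prefix is bound to the same namespace.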
While it is certainly useful to be able to query from the command line, it is probably more useful to be able to use the results of a query in an application. For more information on building applications for dbXML, please refer to the Developers Guide.