org.opencyc.xml
Class OpenDirectoryToDaml

java.lang.Object
  |
  +--org.opencyc.xml.OpenDirectoryToDaml

public class OpenDirectoryToDaml
extends java.lang.Object

Translates a non-compliant OpenDirectory RDF structure file into a DAML-compliant format.

Author:
Stephen L. Reed

Copyright 2001 Cycorp, Inc., license is open source GNU LGPL.

the license

www.opencyc.org

OpenCyc at SourceForge

THIS SOFTWARE AND KNOWLEDGE BASE CONTENT ARE PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE OPENCYC ORGANIZATION OR ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE AND KNOWLEDGE BASE CONTENT, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Field Summary
protected  java.util.HashMap categoryIds
          dictionary of Open Directory topics to category ids
protected  java.io.PrintWriter damlOutput
          DAML output stream
 java.lang.String damlOutputPathName
          the name of the DAML output file.
protected static int DEFAULT_VERBOSITY
          The default verbosity of the DAML export output.
protected  long nbrOfTopics
          number of topics translated
protected  long nbrOfTriples
          number of RDF triples translated
protected  java.io.LineNumberReader openDirectoryInput
          Open Directory RDF structure input stream
 java.lang.String openDirectoryURLString
          the name of the ODP input URL.
 boolean sample
          When true, translates only a sample of the Open Directory RDF structure file, which is useful for testing with a small editable file.
 int verbosity
          Sets verbosity of the DAML export output.
 
Constructor Summary
OpenDirectoryToDaml()
          Constructs a new OpenDirectoryToDaml object.
 
Method Summary
protected  void BypassInputRDFHeader()
          Bypasses the non-compliant RDF header information in the Open Directory input file.
protected  java.lang.String escape(java.lang.String text)
          Escapes characters in XML names.
protected  java.lang.String getCategoryId(java.lang.String topic)
          Substitutes category id for topic in resource references, as Open Directory resource ids are not valid XML names.
protected  void indexCategoryIds()
          Opens the input URL and indexes the input OpenDirectory RDF category ids.
protected  void indexRDF()
          Indexes the input OpenDirectory RDF category ids.
static void main(java.lang.String[] args)
          Provides the main method (entry point) for this application.
protected  void translate()
          Translates the input OpenDirectory RDF (non-compliant) structure file into a DAML-compliant format.
protected  void TranslateRDF()
          Translates the Open Directory RDF content to DAML.
protected  void WriteDAMLHeader()
          Writes the XML header information to the output DAML file.
protected  void WriteRDFClosingTag()
          Writes the RDF closing tag to the output DAML file.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_VERBOSITY

protected static final int DEFAULT_VERBOSITY
The default verbosity of the DAML export output. 0 --> quiet ... 9 --> maximum diagnostic output.

verbosity

public int verbosity
Sets verbosity of the DAML export output. 0 --> quiet ... 9 --> maximum diagnostic output.

sample

public boolean sample
When true, translates only a sample of the Open Directory RDF structure file, which is useful for testing with a small editable file.

openDirectoryURLString

public java.lang.String openDirectoryURLString
the name of the ODP input URL.

damlOutputPathName

public java.lang.String damlOutputPathName
the name of the DAML output file.

openDirectoryInput

protected java.io.LineNumberReader openDirectoryInput
Open Directory RDF structure input stream

damlOutput

protected java.io.PrintWriter damlOutput
DAML output stream

categoryIds

protected java.util.HashMap categoryIds
dictionary of Open Directory topics to category ids

nbrOfTopics

protected long nbrOfTopics
number of topics translated

nbrOfTriples

protected long nbrOfTriples
number of RDF triples translated

Constructor Detail

OpenDirectoryToDaml

public OpenDirectoryToDaml()
Constructs a new OpenDirectoryToDaml object.

Method Detail

main

public static void main(java.lang.String[] args)
Provides the main method (entry point) for this application.
Parameters:
args - the command line arguments (ignored)

indexCategoryIds

protected void indexCategoryIds()
Opens the input URL and indexes the input OpenDirectory RDF category ids.

indexRDF

protected void indexRDF()
                 throws java.io.IOException
Indexes the input OpenDirectory RDF category ids.

translate

protected void translate()
Translates the input OpenDirectory RDF (non-compliant) structure file into a DAML-compliant format. UTF-8 character encoding is used by Open Directory for alternate language strings.
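
Because the input carries UTF-8 encoded alternate language strings, the reader should be opened with an explicit charset rather than the platform default. The following is a minimal sketch of that idea, not the class's actual code: Utf8InputSketch, openUtf8 and firstLine are hypothetical names, and only the use of a java.io.LineNumberReader comes from the field documentation above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.LineNumberReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of org.opencyc.xml.
final class Utf8InputSketch {

    // Wraps a raw byte stream in a line-counting reader that decodes UTF-8.
    static LineNumberReader openUtf8(InputStream in) {
        return new LineNumberReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Convenience for experiments: decodes the bytes and returns the first line.
    static String firstLine(byte[] utf8Bytes) {
        try (LineNumberReader reader = openUtf8(new ByteArrayInputStream(utf8Bytes))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a URL input such as openDirectoryURLString, the same wrapper could presumably be applied to the stream returned by java.net.URL.openStream().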

WriteDAMLHeader

protected void WriteDAMLHeader()
                        throws java.io.IOException
Writes the XML header information to the output DAML file.

BypassInputRDFHeader

protected void BypassInputRDFHeader()
                             throws java.io.IOException
Bypasses the non-compliant RDF header information in the Open Directory input file.
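
One way to picture the header bypass is a loop that consumes input lines until the first content element appears. The sketch below assumes, without confirmation from this page, that the header ends where the first line beginning with "<Topic" starts. HeaderBypassSketch and bypassHeader are hypothetical names, and unlike the documented method the sketch wraps IOException in an unchecked exception to keep the example short.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of org.opencyc.xml.
final class HeaderBypassSketch {

    // Consumes lines until one starts with "<Topic" (an assumed marker for the
    // end of the header) and returns that line, or null if the input ends first.
    static String bypassHeader(LineNumberReader in) {
        try {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().startsWith("<Topic")) {
                    return line;
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

After the call, in.getLineNumber() reports how many lines were consumed, which can be useful in diagnostic output at higher verbosity levels.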

TranslateRDF

protected void TranslateRDF()
                     throws java.io.IOException
Translates the Open Directory RDF content to DAML.

getCategoryId

protected java.lang.String getCategoryId(java.lang.String topic)
Substitutes category id for topic in resource references, as Open Directory resource ids are not valid XML names.
Parameters:
topic - the Open Directory topic
Returns:
the corresponding category identifier
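
The substitution can be pictured as a dictionary lookup against the categoryIds map built during indexing. The sketch below is an illustration under assumptions: CategoryIdLookupSketch and its index method are hypothetical, and the "cat" prefix is an invented convention chosen only so that the result starts with a letter, as XML names require. The actual identifier format is not documented on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of org.opencyc.xml.
final class CategoryIdLookupSketch {

    // Maps an Open Directory topic path to its category id.
    private final Map<String, String> categoryIds = new HashMap<>();

    // Records the category id seen for a topic during the indexing pass.
    void index(String topic, String categoryId) {
        categoryIds.put(topic, categoryId);
    }

    // Returns an identifier usable as an XML name in place of the topic path.
    String getCategoryId(String topic) {
        String id = categoryIds.get(topic);
        if (id == null) {
            throw new IllegalArgumentException("No category id indexed for topic: " + topic);
        }
        return "cat" + id; // assumed prefix; a bare number is not a valid XML name
    }
}
```

A topic such as Top/Arts contains a slash and therefore cannot serve as an XML name directly, which is why a lookup of this kind is needed at all.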

escape

protected java.lang.String escape(java.lang.String text)
Escapes characters in XML names.
Parameters:
text - the text to escape
Returns:
the escaped text
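
The exact escaping rules are not stated here. As a rough sketch of what escaping for XML names can look like, the hypothetical XmlNameEscapeSketch below replaces every character outside a conservative ASCII subset with an underscore and prefixes an underscore when the result could not otherwise start an XML name. This policy is an assumption made for illustration, not the behavior of OpenDirectoryToDaml.escape.

```java
// Hypothetical helper, not part of org.opencyc.xml.
final class XmlNameEscapeSketch {

    // True for the conservative subset kept as-is: ASCII letters, digits, '_', '-', '.'.
    private static boolean isKept(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9') || c == '_' || c == '-' || c == '.';
    }

    // Replaces every other character with '_' and ensures a valid first character.
    static String escape(String text) {
        StringBuilder escaped = new StringBuilder(text.length() + 1);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            escaped.append(isKept(c) ? c : '_');
        }
        // An XML name must start with a letter or underscore (in this ASCII subset).
        if (escaped.length() == 0
                || !(Character.isLetter(escaped.charAt(0)) || escaped.charAt(0) == '_')) {
            escaped.insert(0, '_');
        }
        return escaped.toString();
    }
}
```

Note that this mapping is lossy: distinct inputs such as "a b" and "a/b" produce the same output, so a real implementation that needs unique names would require a reversible encoding.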

WriteRDFClosingTag

protected void WriteRDFClosingTag()
                           throws java.io.IOException
Writes the RDF closing tag to the output DAML file.
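
WriteDAMLHeader and WriteRDFClosingTag together bracket the translated content in an RDF envelope. The sketch below shows one plausible shape for that envelope using a java.io.PrintWriter, the type of the damlOutput field. DamlEnvelopeSketch and its methods are hypothetical, and the exact header text written by the real class is not documented on this page. The two namespace URIs are the standard RDF syntax namespace and the March 2001 DAML+OIL namespace.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical helper, not part of org.opencyc.xml.
final class DamlEnvelopeSketch {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String DAML_NS = "http://www.daml.org/2001/03/daml+oil#";

    // Writes the XML declaration and the opening rdf:RDF tag with namespaces.
    static void writeDamlHeader(PrintWriter out) {
        out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
        out.println("<rdf:RDF");
        out.println("  xmlns:rdf=\"" + RDF_NS + "\"");
        out.println("  xmlns:daml=\"" + DAML_NS + "\">");
    }

    // Writes the closing tag that matches the header and flushes the writer.
    static void writeRdfClosingTag(PrintWriter out) {
        out.println("</rdf:RDF>");
        out.flush();
    }

    // Builds an envelope with no content, to show how the two calls pair up.
    static String emptyDocument() {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        writeDamlHeader(out);
        writeRdfClosingTag(out);
        return buffer.toString();
    }
}
```

In a full translation run, the translated topic elements would be written between the two calls.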