4XPath supports user-defined extension functions as specified by the XPath (and XSLT) Recommendations. It comes with a library of convenient extension functions for a range of purposes. These are listed here.
All built-in extension functions have the namespace URI 'http://namespaces.4suite.org/xpath/extensions'.
node-set(rtf)
Convert a result-tree fragment to a node-set
Parameters
rtf
of type result tree fragment
A result tree fragment such as generated by the body of an XSLT variable or parameter.
Return Value
- node set
A node set consisting of all of the top-level nodes in the result tree fragment, including all recursive tree elements.
match(pattern, arg)
Match a Python regular expression against a string
Parameters
pattern
of type string
A string representing a regular expression such as used by the Python re module.
arg
of type string
A string to be matched against the pattern.
Return Value
- boolean
true if the string matches, otherwise false
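The description above does not say whether the pattern has to match from the start of the string or may match anywhere inside it. The Python 3 sketch below assumes the anchored behaviour of `re.match`; the helper name `match` and that choice are assumptions for illustration, not taken from the 4XPath source.

```python
import re

def match(pattern, arg):
    # Assumption: the pattern is applied at the start of the string,
    # as re.match does. re.search would also accept a match further in.
    return re.match(pattern, arg) is not None

print(match(r"\d+", "123abc"))  # digits at the start of the string
print(match(r"\d+", "abc123"))  # digits only later in the string
```

Under this assumption the first call yields True and the second False. If the function behaves like `re.search` instead, both would be true.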
escape-url(url)
Escape illegal characters in a URL
Parameters
url
of type string
The URL to be escaped
Return Value
- string
URL with all illegal characters escaped according to RFC 1738
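To give a feel for this kind of escaping, the following Python 3 sketch uses the standard `urllib.parse.quote` function. It is not the 4XPath implementation: the helper name `escape_url` and the set of characters left unescaped are assumptions chosen for illustration.

```python
from urllib.parse import quote

def escape_url(url):
    # Hypothetical stand-in for escape-url(). The "safe" set below keeps
    # common URL delimiters intact; the exact set is an assumption.
    return quote(url, safe=":/?&=#%")

print(escape_url("http://example.com/a b?q=x y"))
```

With these assumptions, each space in the input is replaced by `%20` while the scheme, slashes, and query delimiters are left alone.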
iso-time()
Get the current time in ISO 8601 format
Parameters
None
Return Value
- string
Current time in the format YYYY-MM-DD HH:MM:SS
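A string in the format given above can be produced in Python 3 as sketched below. The helper name `iso_time` is hypothetical, and the use of local time rather than UTC is an assumption, since the description does not say which one iso-time() returns.

```python
import time

def iso_time():
    # Hypothetical equivalent of iso-time(): YYYY-MM-DD HH:MM:SS.
    # Assumption: local time is used.
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())

print(iso_time())
```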
evaluate(expr)
Evaluate an XPath expression at XSLT run time
Parameters
expr
of type string
XPath expression to be parsed and evaluated using the current context.
Return Value
- boolean, number, string, or node set
The result of evaluating the expression
distinct(nodeset)
Eliminates duplicates from a node set according to the string value of each node
Parameters
nodeset
of type node set
The node set to be processed
Return Value
- node set
A node set from which all duplicates have been removed. Two nodes in a node set are considered duplicates if their string values are equal. The last node in each distinct group is the one that is retained in the final list, and the order of the node set may be disrupted.
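The "last node wins" rule can be sketched in Python 3 with a dictionary keyed by string value. Here nodes are modelled as plain tuples and `string_value` is a caller-supplied function; both are assumptions made for illustration, not the 4XPath data model.

```python
def distinct(nodes, string_value):
    # Later nodes overwrite earlier ones that share a string value,
    # so the last node of each group is the one that survives.
    last_seen = {}
    for node in nodes:
        last_seen[string_value(node)] = node
    return list(last_seen.values())

# Each "node" is a hypothetical (id, string value) pair.
nodes = [("n1", "red"), ("n2", "blue"), ("n3", "red")]
print(distinct(nodes, lambda node: node[1]))
```

Of the two nodes with the string value "red", only `n3` remains in the result. The order of the result in this sketch is a property of Python dictionaries and should not be read as a guarantee about distinct() itself.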
split(arg, delim)
Split a string into a node set of text nodes.
Parameters
arg
of type string
The string to be split
delim
of type string
The delimiter by which the string is to be split.
Return Value
- node set
A node set of text nodes, each of which represents a segment of the split string.
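The segments can be pictured with Python's own string splitting, as in this Python 3 sketch. The helper name `split` is hypothetical, and plain strings stand in for the text nodes that the real function returns. How 4XPath treats edge cases such as an empty delimiter is not covered here.

```python
def split(arg, delim):
    # Plain strings stand in for the text nodes of the result.
    return arg.split(delim)

print(split("a,b,c", ","))
```

Each element of the resulting list corresponds to one text node in the returned node set.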
range(lo, hi)
Generate a node set of text nodes containing numbers ascending from the low value to the high value.
Parameters
lo
of type number
The starting point for the sequence of numbers.
hi
of type number
The ending point for the sequence of numbers.
Return Value
- node set
A node set of text nodes, each of which represents a number value, starting from the low value to the high value, incrementing by one.
if(cond, v1, v2)
Select from two values based on a condition
Parameters
cond
of type boolean
The condition to be checked
v1
of type boolean, number, string, or node set
The first choice
v2
of type boolean, number, string, or node set
The second choice
Return Value
- boolean, number, string, or node set
The first value if the condition is true, otherwise the second value.
find(outer, inner)
Return the index of a substring within a string
Parameters
outer
of type string
The string to be searched
inner
of type string
The substring to seek
Return Value
- number
The zero-based index at which the inner string is first located within the outer string. -1 if the inner string is not found.
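The return convention described here is the same as that of Python's `str.find`, which the following Python 3 sketch uses directly. The helper name `find` is hypothetical, and the real function returns an XPath number rather than a Python integer.

```python
def find(outer, inner):
    # Zero-based index of the first occurrence, or -1 if absent.
    return outer.find(inner)

print(find("hello world", "world"))  # 6
print(find("hello world", "xyz"))    # -1
```

Because the index is zero-based, it differs by one from the positions used by the core XPath string functions such as substring(), which count from 1.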
To define your own extension functions, write them as ordinary Python functions. The module in which they are defined must have a global dictionary named "ExtFunctions" that maps function names to function objects. Each function name is a tuple of two strings: the first is the namespace URI of the function, and the second is its local name. As the example that follows shows, each function receives the XPath context as its first argument, followed by the arguments given in the XPath expression.
Note that if you are using the extension function from within 4XSLT, the namespace URI must be a valid, identifying (but not necessarily addressable) URI, and in particular, it cannot be an empty string. If you are using the extension function directly from 4XPath, the namespace URI can be the empty string.
Finally, modules containing any extension functions used must be made known to the processor in one of two ways. (1) They are listed in the environment variable "EXTMODULES", which is a colon-separated list of module names. (2) They are registered with 4XPath using the xml.xpath.Util.RegisterExtensionFunctions() function, which takes a list of module names. In either case, all extension modules must be on the "PYTHONPATH".
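The colon-separated format of "EXTMODULES" can be illustrated with a short Python 3 sketch. The helper name `extension_module_names` is hypothetical, and the code only shows how such a value breaks down into module names. It does not reproduce what 4XPath does internally.

```python
import os

def extension_module_names(environ=os.environ):
    # Split the colon-separated value and drop empty entries, so that
    # an unset or empty variable yields an empty list.
    raw = environ.get("EXTMODULES", "")
    return [name for name in raw.split(":") if name]

# A plain dictionary stands in for the process environment here.
print(extension_module_names({"EXTMODULES": "demo:other.module"}))
```

Each name in the resulting list must be importable, which is why the modules have to be reachable through "PYTHONPATH".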
For example:
    # demo.py
    import time
    from xml.xpath import Conversions

    def GetCurrentTime(context):
        '''available in XPath as get-curent-time()'''
        return time.asctime(time.localtime())

    def HashContextName(context, maxkey):
        '''
        available in XPath as hash-context-name(maxkey),
        where maxkey is a numeric expression
        '''
        # It is a good idea to use the appropriate core function to coerce
        # arguments to the expected type
        maxkey = Conversions.NumberValue(maxkey)
        # Sum the character codes of the context node's name
        key = reduce(lambda a, b: a + ord(b), context.node.nodeName, 0)
        return key % maxkey

    # Keys are (namespace URI, local name) tuples
    ExtFunctions = {
        ('http://spam.com', 'get-curent-time'): GetCurrentTime,
        ('http://spam.com', 'hash-context-name'): HashContextName
    }
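The purpose of hash-context-name appears to be to reduce a node name to a bucket number below maxkey. A minimal Python 3 sketch of that idea follows, independent of 4XPath. The helper name `hash_name` and the choice of summing character codes are assumptions.

```python
def hash_name(name, maxkey):
    # Sum the code points of the name, then reduce modulo maxkey so the
    # result always falls in the range 0 .. maxkey - 1.
    return sum(ord(ch) for ch in name) % maxkey

print(hash_name("item", 10))  # 105 + 116 + 101 + 109 = 431, and 431 % 10 = 1
```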
In order to use these functions, be sure that "demo" (the module name) is in the EXTMODULES environment variable, or that you call xml.xpath.Util.RegisterExtensionFunctions(). If you are using them directly from 4XPath, however, you need to do one more thing: you need to set up a prefix that maps to the namespace of the functions you've defined ("http://spam.com", in this case).
You can do this by setting the "processorNss" attribute on the context you pass to the appropriate XPath method. For instance:
    from xml.dom import ext
    from xml.dom.ext.reader import Sax2
    from xml.xpath import Evaluate, Util
    from xml.xpath.Context import Context

    try:
        doc = Sax2.FromXmlFile('myfile.xml', validate=0)
    except Sax2.saxlib.SAXParseException, msg:
        # The more specific exception has to be caught first
        print "SAXParseException caught:", msg
        raise
    except Sax2.saxlib.SAXException, msg:
        print "SAXException caught:", msg
        raise

    Util.IndexDocument(doc)
    # Map the prefix "ext" to the namespace of the extension functions
    context = Context(doc, 1, 1, processorNss={'ext': 'http://spam.com'})
    # Pass the context so that the "ext" prefix can be resolved
    result = Evaluate("/transaction[@timestamp=ext:get-curent-time()]", context=context)
    Util.FreeDocumentIndex(doc)
    ext.ReleaseNode(doc)
Note that you might choose to use the empty string as the namespace for your extension functions. If so, you don't need to specify the processorNss context attribute, but you should watch out for clashes with other extension function names, including those of the built-in library. Again, if you plan to use an extension function from within XSLT, it must have a non-empty namespace URI.
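The role of the prefix mapping and of the (namespace URI, local name) keys can be pictured with the Python 3 sketch below. It is only a model of the lookup: the helper `resolve`, the placeholder functions, and the unprefixed name `my-func` are hypothetical, and the way 4XPath performs the lookup internally is not described in this document.

```python
def resolve(qname, processor_nss, functions):
    # Split "prefix:local" into its parts; an unprefixed name is looked
    # up under the empty-string namespace.
    prefix, sep, local = qname.rpartition(":")
    uri = processor_nss[prefix] if sep else ""
    return functions[(uri, local)]

# Placeholder registry shaped like an ExtFunctions dictionary.
functions = {
    ("http://spam.com", "hash-context-name"): lambda context, maxkey: 0,
    ("", "my-func"): lambda context: "no namespace",
}
processor_nss = {"ext": "http://spam.com"}

print(resolve("my-func", processor_nss, functions)(None))
print(resolve("ext:hash-context-name", processor_nss, functions)(None, 10))
```

Because all functions in the empty namespace share the same first key component, two modules that both define an unprefixed function with the same local name would compete for a single entry in such a table.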