Registering Protocol Headers
The Library provides a few powerful mechanisms for handling document metainformation and for
generating and parsing additional header information sent across the network. This section describes
how to handle metainformation and headers, and how this can be used to experiment with existing
protocols by means of additional headers.
Header Generation
Outgoing metainformation describing preferences in requests or entities to be sent to a remote
server is handled in two ways: The Library supports a "native" set of headers (called known headers)
which can be manipulated directly, but it also provides support for header extensions
defined by the application. This section describes how both the existing set of headers and the
extensions can be handled.
Generating Known Headers
The Library manages a "native" set of protocol headers which we will introduce in this section. The
default behavior for the Library is to use a representative set of headers on each request, but all
headers can be explicitly enabled or disabled on a per-request basis by the application. Here we
will mainly describe the set of native headers but leave the description of how to manipulate them
for the section on managing Request objects. The native set of headers
falls into the following three categories:
- General Headers
- There are a few header fields which have general applicability for both request and response
messages, but which do not apply to the communication parties or the entity being transferred. This
mask enables and disables these headers. If the bit is not turned on they are not sent. All headers
are optional and the default value is not to use any of these headers at all.
- Request Headers
- The request header fields allow the client to pass additional information about the request (and
about the client itself) to the server. All headers are optional but the default behavior is to use
all request headers except From and Pragma. The reason is that
the former in general requires permission by the user and the latter has special meanings for proxy
servers.
- Entity Headers
- The entity headers contain information about the object sent in the HTTP transaction. See the anchor module for the storage of entity headers. This
flag defines which headers are to be sent in a request together with an entity body. All headers are
optional but the default value is to use as many as possible.
As mentioned, the set of native headers is equivalent to the set of headers defined by the HTTP/1.1 protocol specification. The Library also provides
functionality for registering additional headers which we will have a look at in the next section.
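The enable and disable masks described for the three categories behave like ordinary C bit flags. The following is a minimal sketch of that idea for the request headers. The enum names and bit values are hypothetical stand-ins and not the Library's own identifiers; only the default (everything except From and Pragma) is taken from the text above.

```c
#include <assert.h>

/* Hypothetical request-header bits; the Library's real names and values may differ. */
enum demo_request_header {
    DEMO_RQ_ACCEPT     = 1 << 0,
    DEMO_RQ_FROM       = 1 << 1,
    DEMO_RQ_PRAGMA     = 1 << 2,
    DEMO_RQ_USER_AGENT = 1 << 3,
    DEMO_RQ_ALL        = (1 << 4) - 1
};

/* Default described in the text: every request header except From and Pragma. */
#define DEMO_RQ_DEFAULT (DEMO_RQ_ALL & ~(DEMO_RQ_FROM | DEMO_RQ_PRAGMA))

/* Turn one header on or off in a per-request mask. */
static unsigned demo_enable(unsigned mask, unsigned bit)  { return mask | bit; }
static unsigned demo_disable(unsigned mask, unsigned bit) { return mask & ~bit; }

/* A header is only sent when its bit is set. */
static int demo_is_sent(unsigned mask, unsigned bit) { return (mask & bit) != 0; }

/* Usage: start from the default, then adjust it for one request. */
void demo_mask_example(void) {
    unsigned mask = DEMO_RQ_DEFAULT;
    assert(demo_is_sent(mask, DEMO_RQ_ACCEPT));
    assert(!demo_is_sent(mask, DEMO_RQ_FROM));
    mask = demo_enable(mask, DEMO_RQ_FROM);         /* e.g. the user gave permission */
    mask = demo_disable(mask, DEMO_RQ_USER_AGENT);
    assert(demo_is_sent(mask, DEMO_RQ_FROM));
    assert(!demo_is_sent(mask, DEMO_RQ_USER_AGENT));
}
```

Keeping one mask per category means a request can change a single header without touching the defaults used by other requests.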
Generating Additional Headers
The API for handling extra headers is provided by the
Header Manager. The API is built in exactly the same way as the one we saw in the section on request preferences (Prefs.html); that is, it uses lists of objects as the main
component. This time the registered elements are callback functions which the application
defines. Each time a request is to be generated, the Library looks to see if a list of
callback functions has been registered to provide additional metainformation to send along with the
request. If this is the case then each of these callback functions is called in turn, and the
resulting request is the sum of the original request and the information provided by the
callback functions.
It should be mentioned, however, that this API is simple to use only if you have a relatively small amount
of extra metainformation to provide and it fits easily into an existing protocol. It is not
suited for building entirely new protocols, or for providing a massive amount of new information. In that
case you need a more powerful model which the Library also provides: building your own
stream. Actually this is exactly the way the Library implements large parts of itself, but it
normally requires a bit more work before you can put an application together.
Let us jump right into it and have a closer look at the API. Exactly as for the request preferences
you can add and delete an element, which in this case is a callback function. This
function has a special definition which is given by
typedef int HTPostCallback (HTRequest *request, HTStream * target);
We have already seen the Request object before, but the Stream object is new. Or
actually it isn't, it has just not been mentioned explicitly in the previous sections. We will hear
a lot more about the stream object later in this guide. For now it is sufficient to know that a
stream is an object that accepts streams of characters - much like an ANSI file stream object
does. The return value of the callback function is currently not used but is reserved for future
use. We can register a callback function of type HTPostCallback by using the following
function:
extern BOOL HTGenerator_add (HTList * gens, HTPostCallback * callback);
The first argument is the well-known list object and the second is the address of the function that
we want to be called each time a request is generated. When the callback function is called by the
Library it must generate its metainformation and send it down the stream which eventually will end
up on the network as part of the final request. In exactly the same way you can unregister a
callback function at any time by calling the following function:
extern BOOL HTGenerator_delete (HTList * gens, HTPostCallback * callback);
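To make the register, call-in-turn, and unregister cycle concrete, here is a small standalone model of it. It does not use the Library: DemoRequest, DemoStream, DemoGenList, the demo_ functions and the X-Demo-User header are all hypothetical stand-ins that only mirror the shape of HTPostCallback, HTGenerator_add and HTGenerator_delete as declared above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in types: NOT the Library's HTRequest, HTStream or HTList. */
typedef struct { const char *user; } DemoRequest;
typedef struct { char buf[256]; size_t len; } DemoStream;

/* Same shape as HTPostCallback: request in, target stream out. */
typedef int DemoPostCallback(DemoRequest *request, DemoStream *target);

#define DEMO_MAX_GENS 8
typedef struct { DemoPostCallback *cb[DEMO_MAX_GENS]; size_t count; } DemoGenList;

/* Append text to the stream, truncating silently if the buffer is full. */
static void demo_put_string(DemoStream *s, const char *text) {
    size_t n = strlen(text);
    size_t room = sizeof s->buf - 1 - s->len;
    if (n > room) n = room;
    memcpy(s->buf + s->len, text, n);
    s->len += n;
    s->buf[s->len] = '\0';
}

/* Plays the role of HTGenerator_add: returns 1 on success, 0 on failure. */
static int demo_generator_add(DemoGenList *gens, DemoPostCallback *cb) {
    if (!gens || !cb || gens->count == DEMO_MAX_GENS) return 0;
    gens->cb[gens->count++] = cb;
    return 1;
}

/* Plays the role of HTGenerator_delete: removes the first matching callback. */
static int demo_generator_delete(DemoGenList *gens, DemoPostCallback *cb) {
    for (size_t i = 0; gens && i < gens->count; i++) {
        if (gens->cb[i] == cb) {
            memmove(&gens->cb[i], &gens->cb[i + 1],
                    (gens->count - i - 1) * sizeof gens->cb[0]);
            gens->count--;
            return 1;
        }
    }
    return 0;
}

/* Call every registered generator in turn; each appends its own header lines. */
static void demo_generate(const DemoGenList *gens, DemoRequest *req, DemoStream *target) {
    for (size_t i = 0; i < gens->count; i++) gens->cb[i](req, target);
}

/* Example callback: emits one hypothetical extension header. */
static int demo_user_header(DemoRequest *req, DemoStream *target) {
    demo_put_string(target, "X-Demo-User: ");
    demo_put_string(target, req->user);
    demo_put_string(target, "\r\n");
    return 0; /* the text says the return value is currently unused */
}

void demo_generator_example(void) {
    DemoGenList gens = {0};
    DemoRequest req = { "alice" };
    DemoStream out = {0};
    assert(demo_generator_add(&gens, demo_user_header));
    demo_generate(&gens, &req, &out);
    assert(strcmp(out.buf, "X-Demo-User: alice\r\n") == 0);
}
```

In a real application the callback would write to the stream it is handed by the Library rather than to a fixed buffer; the buffer is used here only so that the sketch can be run on its own.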
Header Parsing
The MIME parser stream parses MIME metainformation,
for example as generated by MIME-like protocols such as HTTP, NNTP, and soon SMTP as well. For HTTP it handles all headers defined in the HTTP/1.1
specification. When a MIME header is parsed, the obtained metainformation about the document is
stored in the anchor object where it can be accessed by the application
using the methods of the Anchor module. The
metainformation in an anchor object can also be used to describe a data object that is to be
sent to a remote location, for example using HTTP or NNTP, but we will describe this in
more detail later in this guide. In this case the order is reversed as the application provides the
metainformation and the appropriate headers are generated instead of generating the entries in the
anchor object by parsing the headers.
Parsing Known Headers
For the exact set of headers handled directly by the internal MIME
parser, the reader is referred to the actual implementation.
However, some of the more special headers are:
Allow
- Builds a list of allowed methods for this entity
ContentEncoding
-
ContentLanguage
- Builds a list of natural languages
ContentLength
- This parameter is now passed
ContentType
- The ContentType header now supports the charset parameter and the level parameter; however, neither of them is used by the HTML parser
Date, Expires, RetryAfter, and LastModified
- All date and time headers are parsed understanding the following formats: RFC 1123, RFC 850,
ANSI C's asctime(), and delta time. The latter is a non-negative integer indicating seconds after
the message was received. Note that it is always up to the application to issue a new
request as a function of any of the date and time headers.
DerivedFrom, Version
- For handling version control when managing collaborative works using HTTP.
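As an illustration of the date handling mentioned in the list above, the sketch below tells a delta time apart from an RFC 1123 date and splits the latter into its fields. It is not the Library's parser: the function names are hypothetical, the RFC 850 and asctime() formats are deliberately left out, and no conversion to time_t is attempted.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Returns the number of seconds if value is a pure non-negative integer
   (a delta time), otherwise -1. */
static long demo_delta_seconds(const char *value) {
    char *end = NULL;
    if (!value || *value < '0' || *value > '9') return -1;
    long n = strtol(value, &end, 10);
    return (*end == '\0') ? n : -1;
}

/* Parses "Sun, 06 Nov 1994 08:49:37 GMT" (RFC 1123) into broken-down fields.
   Returns 1 on success and 0 if the value does not have that form. */
static int demo_parse_rfc1123(const char *value, struct tm *out) {
    static const char months[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
    char wday[4], mon[4], zone[4];
    int day, year, hh, mm, ss;
    if (!value || sscanf(value, "%3s, %2d %3s %4d %2d:%2d:%2d %3s",
                         wday, &day, mon, &year, &hh, &mm, &ss, zone) != 8)
        return 0;
    const char *m = strstr(months, mon);
    if (strlen(mon) != 3 || !m || (m - months) % 3 != 0) return 0;
    if (strcmp(zone, "GMT") != 0) return 0;
    memset(out, 0, sizeof *out);
    out->tm_mday = day;
    out->tm_mon  = (int)((m - months) / 3);   /* 0 = January */
    out->tm_year = year - 1900;
    out->tm_hour = hh;
    out->tm_min  = mm;
    out->tm_sec  = ss;
    return 1;
}

void demo_date_example(void) {
    struct tm tm;
    assert(demo_delta_seconds("120") == 120);
    assert(demo_delta_seconds("Sun, 06 Nov 1994 08:49:37 GMT") == -1);
    assert(demo_parse_rfc1123("Sun, 06 Nov 1994 08:49:37 GMT", &tm) == 1);
    assert(tm.tm_mon == 10 && tm.tm_year == 94 && tm.tm_hour == 8);
}
```

An application that receives a delta time would add it to the time at which the message was received before deciding when to issue a new request.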
Parsing Additional Headers
In many cases, if you have registered an extra set of headers to be generated, you are also in a
situation where you would like to handle the result that is returned by the remote server. As we will
describe in this section, the Library provides a very similar interface to the one presented above
for generating extra headers.
Again, the API for handling extra headers is provided by the Header Manager and is based on managing list objects,
just like we have seen many times before. Each time a request is received, and an unknown header is
encountered by the internal MIME parser, the Library
looks to see if a list of callback functions has been registered to parse additional
metainformation. In case a parser is found for this particular header, the callback is called with
the header and all parameters that might follow it. As MIME headers can contain line wrappings, the
MIME parser canonicalizes the header line before the
callback function is called which makes the job easier for the callback function.
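The text does not spell out what canonicalizing a wrapped header line involves. One plausible reading, sketched below, is the usual unfolding step: a line break followed by a space or tab marks a continuation, so the break and the leading whitespace are collapsed into a single space. The function name is hypothetical and the Library's own canonicalization may do more than this.

```c
#include <assert.h>
#include <string.h>

/* Unfold a header in place: a line break followed by a space or tab continues
   the previous line, so the break and the whitespace after it become one space. */
static void demo_unfold(char *line) {
    char *dst = line;
    for (const char *src = line; *src; ) {
        const char *p = src;
        if (*p == '\r') p++;                       /* accept CRLF as well as bare LF */
        if (*p == '\n' && (p[1] == ' ' || p[1] == '\t')) {
            p++;
            while (*p == ' ' || *p == '\t') p++;   /* skip the continuation indent */
            *dst++ = ' ';
            src = p;
        } else {
            *dst++ = *src++;                       /* ordinary character: copy it */
        }
    }
    *dst = '\0';
}

void demo_unfold_example(void) {
    char header[] = "X-Demo: first,\r\n\tsecond";
    demo_unfold(header);
    assert(strcmp(header, "X-Demo: first, second") == 0);
}
```

With the header on a single line, a callback can use simple string functions without having to look for embedded line breaks.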
Exactly as for the header generators you can add and delete an element, which also
in this case is a callback function. This function has a special definition which is given by
typedef int HTParserCallback (HTRequest * request, CONST char * token);
The request object is the current request being handled and the token is the header that was
encountered together with all parameters following it. The callback can return a value to the
Library by using the return code of the callback function. Currently there are two return values
recognized by the Library:
HT_OK
if the token is received and understood
HT_ERROR
if the callback encounters a fatal error and any further parsing should be stopped.
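A callback of this kind might look roughly as follows. This is a standalone sketch: DemoRequest stands in for the request object, DEMO_OK and DEMO_ERROR stand in for HT_OK and HT_ERROR with assumed numeric values, and it is an assumption that the token arrives as "Name: value" on a single canonicalized line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for HT_OK and HT_ERROR; the numeric values here are assumptions. */
#define DEMO_OK     0
#define DEMO_ERROR (-1)

/* Stand-in for the request object: just a place to keep the parsed value. */
typedef struct { char label[64]; } DemoRequest;

/* Same shape as HTParserCallback: the current request plus the header text. */
static int demo_label_parser(DemoRequest *request, const char *token) {
    const char *colon = token ? strchr(token, ':') : NULL;
    if (!request || !colon) return DEMO_ERROR;     /* malformed: stop parsing */
    const char *value = colon + 1;
    while (*value == ' ' || *value == '\t') value++;
    /* snprintf truncates instead of overflowing if the value is too long. */
    snprintf(request->label, sizeof request->label, "%s", value);
    return DEMO_OK;                                /* received and understood */
}

void demo_parser_example(void) {
    DemoRequest req = { "" };
    assert(demo_label_parser(&req, "PICS-Label: (demo 1.0)") == DEMO_OK);
    assert(strcmp(req.label, "(demo 1.0)") == 0);
    assert(demo_label_parser(&req, "no colon here") == DEMO_ERROR);
}
```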
While in the callback function, the application can start other requests or even kill the current
request if required. We can register a callback function by using the following function:
extern BOOL HTParser_add (HTList * parsers,
                          CONST char * token,
                          BOOL case_sensitive,
                          HTParserCallback * callback);
Again, the first argument is a list as we have seen before. The token is a specific token by which
the callback function should be called. This token can contain a wild card (*) which will match zero
or more arbitrary characters. You can also specify whether the token should be matched using a case
sensitive or case insensitive matching algorithm. Let's look at an example of how to register a
parser callback function:
HTParser_add(mylist, "PICS-*", NO, myparser);
This registers the myparser function as being capable of handling all tokens starting
with "PICS", "PiCs", "pics", for example:
PICS-start
pics-Token
PICS
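The matching rule, a wild card standing for zero or more characters with optional case folding, can be sketched in a few lines. The function below is a hypothetical illustration and not the Library's matcher. Note that under this literal reading a bare PICS without the hyphen does not match "PICS-*", so the Library's own rules may be more lenient than this sketch.

```c
#include <assert.h>
#include <ctype.h>

/* Compare two characters, optionally ignoring case. */
static int demo_same(char a, char b, int case_sensitive) {
    if (case_sensitive) return a == b;
    return tolower((unsigned char)a) == tolower((unsigned char)b);
}

/* Match text against pattern, where '*' stands for zero or more characters.
   Returns 1 on a match and 0 otherwise. */
static int demo_match(const char *pattern, const char *text, int case_sensitive) {
    if (*pattern == '\0') return *text == '\0';
    if (*pattern == '*') {
        /* Try every possible length for the wild card, including zero. */
        for (;; text++) {
            if (demo_match(pattern + 1, text, case_sensitive)) return 1;
            if (*text == '\0') return 0;
        }
    }
    if (*text == '\0' || !demo_same(*pattern, *text, case_sensitive)) return 0;
    return demo_match(pattern + 1, text + 1, case_sensitive);
}

void demo_match_example(void) {
    assert(demo_match("PICS-*", "PICS-start", 0));
    assert(demo_match("PICS-*", "pics-Token", 0));
    assert(!demo_match("PICS-*", "pics-Token", 1));   /* case sensitive: no match */
}
```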
As for header generators, you can unregister a callback function by using the following function:
extern BOOL HTParser_delete (HTList * parsers, CONST char * token);
Enabling Preferences
Exactly as for Request Preferences, all we have done until now is to show
how we can register sets of preferences. However, we still need to define where and when to actually
let the Library use the preferences. Again, this can be done in two ways: Globally or locally. When
assigning a set of preferences, for example the set of natural languages, it can either be assigned
to all future requests (globally) or to a specific request (locally). The preferences can also
partly be assigned globally and partly locally so that the most common preferences are registered
globally and only some preferences specific to a single request are then added by registering the
sets locally.
Here we will only show how to handle the global registration as the local registration is part of
the description of the request object.
Additional Header Parsers
extern void HTHeader_setParser (HTList * list);
extern HTList * HTHeader_parser (void);
Additional Header Generators
extern void HTHeader_setGenerator (HTList * list);
extern HTList * HTHeader_generator (void);
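The set and get pairs above amount to a single global slot per kind of list. The model below shows that idea together with a request-local list. Everything in it is a hypothetical stand-in, and the way the global and local lists are combined (a request sees both) is an assumption based on the description of global and local registration, not something the function signatures confirm.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in list type that only counts its entries. Not the Library's HTList. */
typedef struct { int entries; } DemoList;

/* One global slot, in the shape of HTHeader_setParser / HTHeader_parser. */
static DemoList *demo_global_parsers = NULL;
static void demo_set_parser(DemoList *list) { demo_global_parsers = list; }
static DemoList *demo_parser(void) { return demo_global_parsers; }

/* Stand-in request carrying an optional request-local list. */
typedef struct { DemoList *local_parsers; } DemoRequest;

/* Assumption: a request sees the global list plus its own local list. */
static int demo_visible_parsers(const DemoRequest *req) {
    int n = 0;
    if (demo_parser()) n += demo_parser()->entries;
    if (req && req->local_parsers) n += req->local_parsers->entries;
    return n;
}

void demo_global_example(void) {
    DemoList common = { 3 }, special = { 1 };
    DemoRequest plain = { NULL }, tuned = { &special };
    demo_set_parser(&common);                    /* applies to all future requests */
    assert(demo_visible_parsers(&plain) == 3);
    assert(demo_visible_parsers(&tuned) == 4);   /* global plus local */
    demo_set_parser(NULL);
}
```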
Cleaning up Preferences
As for request preferences, the application is responsible for setting up the sets of additional
header generation and parsing, and it is also responsible for deleting them once they are not needed
anymore, for example when the application is closing down, or the user has changed them. The
Library provides two mechanisms for cleaning up old lists: It can either be done by invoking
separate methods on each set of preferences, or it can be done in a batch of all globally registered
preferences or all locally registered preferences relative to a single request. In this context, a
batch is the total set of registered converters, encoders, character sets, and languages. Here we
will only show how to clean up preferences set-wise and as a global batch of preferences. We leave
the local cleanup until we have described the request object later in this guide.
As for the other deletion methods, when they have been called you can no longer use the lists as they
no longer point to valid places in memory. The first mechanism for cleaning up lists is by
calling the cleanup method of each preference as indicated below:
Header Parsers
extern BOOL HTParser_delete (HTList * parsers, CONST char * token);
extern BOOL HTParser_deleteAll (HTList * parsers);
Header Generators
extern BOOL HTGenerator_delete (HTList * gens, HTPostCallback * callback);
extern BOOL HTGenerator_deleteAll (HTList * gens);
The easy way of cleaning up all global lists at once is to call the following function:
extern void HTHeader_deleteAll (void);
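Because a list can no longer be used once it has been deleted, it is good practice to drop the pointer immediately after cleanup. The sketch below models this with a hypothetical heap-allocated list; demo_delete_all only imitates the shape of HTParser_deleteAll and is not the Library's implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a heap-allocated callback list. Not the Library's HTList. */
typedef struct { int entries; } DemoList;

static DemoList *demo_list_new(void) { return calloc(1, sizeof(DemoList)); }

/* In the shape of HTParser_deleteAll: frees the list and reports success. */
static int demo_delete_all(DemoList *list) {
    if (!list) return 0;
    free(list);
    return 1;
}

void demo_cleanup_example(void) {
    DemoList *parsers = demo_list_new();
    assert(parsers != NULL);
    assert(demo_delete_all(parsers) == 1);
    parsers = NULL;                          /* the old pointer is dangling; drop it */
    assert(demo_delete_all(parsers) == 0);   /* a second cleanup is a harmless no-op */
}
```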
Henrik Frystyk, libwww@w3.org, December 1995