The Cache Manager
Caching is a required part of any efficient Internet access
applications as it saves bandwidth and improves access performance
significantly in almost all types of accesses. The Library supports
two different types of cache: The memory cache and the file cache. The
two types differ in several ways which reflects their two main
purposes: The memory cache is for short term storage of graphic
objects whereas the file cache is for intermediate term storage of
data objects. Often it is desirable to have both a memory and a file
version of a cached document, so the two types do not exclude each
other. The following paragraphs explains how the two caches can be
maintained in the Library.
The memory cache is largely managed by the application as it simply
consists of keeping the graphic objects described by the
HyperDoc
object in memory as the user keeps requesting
new documents. The HyperDoc
object is only declared in
the Library - the real definition is left to the application as it is
for the application to handle graphic objects. The Line Mode Browser
has its own definition of the HyperDoc
object called
HText
. Before a request is processed over the net,
the anchor object is searched for a HyperDoc
object
and a new request is issued only if this is not present or the Library
explicitly has been asked to reload the document, which is described
in the section Short Circuiting the Cache
As the management of the graphic object is handled by the application,
it is also for the application to handle the garbage collection of the
memory cache. The Line Mode Browser
has a very simple memory management of how long graphic objects stay
around in memory. It is determined by a constant in the GridText
module and is by default set to 5 documents. This approach can be much
more advanced and the memory garbage collection can be determined by
the size of the graphic objects, when they expire etc., but the API is
the same no matter how the garbage collector is implemented.
File Cache
The file cache is intended for intermediate term storage of documents
or data objects that can not be represented by the
HyperDoc
object which is referenced by the
HTAnchor
object. As the definition of the
HyperDoc
object is done by the application there is no
explicit rule of what graphic objects that can not be described by the
HyperDoc
, but often it is binary objects, like images
etc.
The file cache in the Library is a very simple implementation in the
sense that no intelligent garbage collection has been defined. It has
been the goal to collect experience from the file cache in the W3C
proxy server before an intelligent garbage collector is implemented in
the Library. Currently the following functions can be used to control
the cache, which is disabled by default:
HTCache_enable()
, HTCache_disable()
, and
HTCache_isEnabled()
- Use these functions to enable and disable the cache
HTCache_setRoot()
and HTCache_getRoot()
- Use these functions to set and get the value of the cache root
An important difference between the memory cache and the file cache is
the format of the data. In the memory cache, the cached objects are
graphic objects ready to be displayed to the user. In the file cache
the data objects are stored along with their metainformation so that
important header information like Expires, Last-Modified, Language
etc. is a part of the stored object.
In situations where a cached document is known to be stale it is
desired to flush any existent version of a document in either the
memory cache or the file cache and perform a reload from the
authoritative server. This can for example be the case if an expires
header has been defined for the document when returned from the origin
server. Forcing a refresh from either the memory cache, the file
cache, or both can be done using the following function:
void HTRequest_setReload (HTRequest *request, HTReload mode);
HTReload HTRequest_reload (HTRequest *request);
where HTReload
can be either of the values
- HT_ANY_VERSION
- Use any version available, either from memory cache or from local
file cache
- HT_MEM_REFRESH
- Non-authoritative update of any version stored in
memory. The new version can either come from the local file cache, a
proxy cache or the network. If the request falls through to the
network, the Library issues a conditional GET using a
If-Modified-Since header. There are two main purposes for
this mode:
- If the disk cache is private to exactly one application then a
version stored in the local disk cache does normally not differ in
time from a version in memory - they have been created at the same
time. However, in a shared cache environment, the two versions can
differ and this flag can be used to force an update to the latest
version in the file cache.
- If the application wants to see the metainformation as received
from the network, then the object in the file cache provides this
information whereas the version in memory does not.
- HT_CACHE_REFRESH
- Authoritative update of any version stored in the local
file cache or a proxy cache. The Library issues a conditional GET
using a If-Modified-Since header and a Pragma:
no-cache to ensure that the response is authoritative.
- HT_FORCE_RELOAD
- Unconditinal reload from the network using the Pragma:
no-proxy directive in order to insure that the reload is passed
to any proxy server on the way to the origin server
If the Library receives either an authoritative or non-authoritative
"304 Not Modified" response upon any of the requests above, it
Handling Expired Documents
There are various ways of handling Expires header when met in a
history list. Either it can be ignored all together, the user can be
notified with a warning, or the document can be reloaded
automatically. The Libarry supports either way, as it should be up to
the user to decide. The default action is HT_EXPIRES_IGNORE, but other
modes are to notify the user that a document is stale without
reloading it, and to do an automatic relaod of the document. Th
functions to use are in this case:
void HTAccess_setExpiresMode (HTExpiresMode mode, char * notify);
HTExpiresMode HTAccess_expiresMode ();
where HTExpiresMode
can take any of the values:
HT_EXPIRES_IGNORE
HT_EXPIRES_NOTIFY
HT_EXPIRES_AUTO