xyplot {lattice} | R Documentation |
These are the most commonly used Trellis functions to look at pairs of
variables. By far the most common is xyplot
, designed mainly
for two continuous variates (though factors can be supplied as well,
in which case they will simply be coerced to numeric), which produces
Conditional Scatterplots. The others are useful when one of the
variates is a factor or a shingle. See details below.
Most of the arguments documented here are also applicable for many of the other Trellis functions. These are not described in any detail elsewhere, and this should be considered the canonical documentation for such arguments.
Note that any arguments passed to these functions and not recognized by them will be passed to the panel function. Most predefined panel functions have arguments that customize its output. These arguments are described only in the help pages for these panel functions, but can usually be supplied as arguments to the high level plot.
xyplot(formula, data = parent.frame(), panel = if (is.null(groups)) "panel.xyplot" else "panel.superpose", allow.multiple = TRUE, outer, aspect = "fill", as.table = FALSE, between, groups, key, auto.key = FALSE, layout, main, page, par.strip.text, prepanel, scales, skip, strip = "strip.default", sub, xlab, xlim, ylab, ylim, ..., subscripts, subset) dotplot(formula, data, panel = "panel.dotplot", groups = NULL, ..., subset = TRUE) barchart(formula, data, panel = "panel.barchart", box.ratio = 2, groups = NULL, ..., subset = TRUE) stripplot(formula, data, panel = "panel.stripplot", jitter = FALSE, factor = .5, box.ratio = if (jitter) 1 else 0, groups = NULL, ..., subset = TRUE) bwplot(formula, data, panel = "panel.bwplot", box.ratio = 1, ..., subset = TRUE)
formula |
a formula describing the form of conditioning plot. The formula is
generally of the form y ~ x | g1 * g2 * ... , indicating
that plots of y (on the y axis) versus x (on the x
axis) should be produced conditional on the variables
g1, g2, ... . However, the conditioning variables
g1,g2,... may be omitted. For S-Plus compatibility, the
formula can also be written as y ~ x | g1 + g2 + ... .
The conditioning variables g1, g2, ... must be either
factors or shingles (Shingles are a way of processing numeric
variables for use in conditioning. See documentation of
shingle for details. Like factors, they have a
`levels' attribute, which is used in producing the conditioning
plots). For each unique combination of the levels of the
conditioning variables g1, g2, ... , a separate panel is
produced using the points (x,y) for the subset of the data
(also called packet) defined by that combination.
Numeric conditioning variables are converted to shingles by the function shingle (however, using equal.count
might be more appropriate in many cases) and character vectors are
coerced to factors.
The formula can involve expressions, e.g. sqrt(),log() .
A special case is when the left and/or right sides of the formula (before the conditioning variables) contain a `+' sign, e.g., y1+y2 ~ x | a*b . Normally, this would result in the sum of
the vectors y1 and y2 being plotted against
x . However, if the allow.multiple flag is set to
TRUE , this is treated differently. The formula given above,
for instance, would be taken to mean that the user wants to plot
both y1~x | a*b and y2~x | a*b , but with the
y1~x and y2~x superposed in each panel (this is
slightly more complicated in barchart ). The two parts
would be distinguished by different graphical parameters. This is
essentially what the groups argument would produce, if
y1 and y2 were concatenated to produce a longer
vector, with the groups argument being an indicator of which
rows come from which variable. In fact, this is exactly what is done
internally using the reshape function. This feature
cannot be used in conjunction with the groups argument.
A variation on this feature is when the outer argument is set
to TRUE as well as allow.multiple . In that case, the
plots are not superposed in each panel, but instead separated into
different panels (as if a new conditioning variable has been added).
The x and y variables should both be numeric in
xyplot , and an attempt is made to coerce them if
not. However, if either is a factor, the levels of that factor are
used as axis labels. In the other four functions documented here,
exactly one of x and y should be numeric, and the
other a factor or shingle. Which of these will happen is determined
by the horizontal argument if horizontal=TRUE ,
then y will be coerced to be a factor or shingle, otherwise
x . The default value of horizontal is FALSE if
x is a factor or shingle, TRUE otherwise. (The
functionality provided by horizontal=FALSE is not
S-compatible.)
All points with at least one of its values missing (NA) in any of the variates involved are omitted from the plot. |
data |
a data frame containing values for any variables in the
formula, as well as groups and subset if applicable.
By default the environment where the function was called from is
used.
|
allow.multiple, outer |
logical flags to control what happens with formulas like y1 +
y2 ~ x . See the entry for formula for details.
|
box.ratio |
applicable to bwplot, bachart and
strpplot , specifies the ratio of the width of the rectangles
to the inter rectangle space.
|
horizontal |
logical, applicable to bwplot, dotplot,
barchart and stripplot . Determines which of x and
y is to be a factor or shingle (y if TRUE, x
otherwise). Defaults to FALSE if x is a factor or
shingle, TRUE otherwise. This argument is used to process the
arguments to these high level functions, but more importantly, it is
passed as an argument to the panel function, which is supposed to
use it as approporiate.
A potentially useful component of scales is this case might
be abbreviate = TRUE , in which case long labels which would
usually overlap will be abbreviated. scales could also
contain a minlength argument in this case, which would be
passed to the abbreviate function.
|
jitter |
logical specifying whether the values should be jittered by adding a random noise in stripplot. |
factor |
numeric controlling amount of jitter. Inverse effect compared to S ? |
panel |
Once the subset of rows defined by each unique
combination of the levels of the grouping variables are obtained
(see above), the corresponding x and y variables (or
some other variables, as appropriate, in the case of other
functions) are passed on to be plotted in each panel. The actual
plotting is done by the function specified by the panel
argument. Each high level function has its own default panel
function, which could depend on whether the groups argument
was supplied.
The panel function can be a function object or a character string giving the name of a predefined function. (The latter is preferred when possible, especially when the trellis object returned by the high level function is to be stored and plotted later.) Much of the power of Trellis Graphics comes from the ability to define customized panel functions. A panel function appropriate for the functions described here would usually expect arguments named x and y , which would be provided by the
conditioning process. It can also have other arguments. It might be
useful to know in this context that all arguments passed to a high
level Trellis function (such as xyplot ) that are not
recognized by it are passed through to the panel function. It is
thus generally good practice when defining panel functions to allow
a ... argument. Such extra arguments typically control
graphical parameters, but other uses are also common. See
documentation for individual panel functions for specifics.
Note that unlike in S-Plus, it is not guaranteed that panel functions will be supplied only numeric vectors for the x and
y arguments; they can be factors as well (but not
shingles). panel functions need to handle this case, which to get
the old behaviour could simply coerce them to numeric.
Technically speaking, panel functions must be written using Grid graphics functions. However, knowledge of Grid is usually not necessary to construct new custom panel functions, there are several predefined panel functions which can help; for example, panel.grid , panel.loess etc. There are also some
grid-compatible replacements of base R graphics functions useful for
this purpose, such as llines . (Note that the corresponding
base R graphics functions like lines would not work.) These
are usually sufficient to convert existing custom panel functions
written for S-Plus.
One case where a bit more is required of the panel function is when the groups argument is not null. In that case, the panel
function should also accept arguments named groups and
subscripts (see below for details). A very useful panel
function predefined for use in such cases is panel.superpose ,
which can be combined with different panel.groups
functions. See the examples section for an interaction plot
constructed this way. Several other panel functions can also handle
the groups argument, including the default ones for barchart,
dotplot and stripplot .
panel.xyplot has an argument called type which is
worth mentioning here because it is quite frequently used (and as
mentioned above, can be passed to xyplot directly). panel
functions for bwplot and friends should have an argument
called horizontal to account for the cases when x is
the factor or shingle.
|
panel.groups |
relevant for xyplot and densityplot
only, applies when panel is panel.superpose (which
happens by default in these cases if groups is non-null)
|
aspect |
controls physical aspect ratio of the panels (same for
all the panels). It can be specified as a ratio (vertical
size/horizontal size) or as a character string. Legitimate
values are "fill" (the default) which tries to make the panels as
big as possible to fill the available space, and "xy", which
tries to compute the aspect based on the 45 degree banking
rule (see Visualizing Data by William S. Cleveland for
details).
If a prepanel function is specified, the dx, dy
components returned by it are used to compute the aspect, otherwise
the default prepanel function is used. Currently, only the default
prepanel function for xyplot produces sensible banking
calculations.
The current implementation of banking is not very sophisticated, but is not totally vague either. See banking for details.
|
as.table |
logical that controls the order in which panels should be plotted: if FALSE, panels are drawn left to right, bottom to top (graph), if TRUE, left to right, top to bottom (matrix). |
between |
a list with components x and y (both
usually 0 by default), numeric vectors specifying the space between
the panels (units are character heights). x and y are
repeated to account for all panels in a page and any extra
components are ignored. The result is used for all pages in a
multipage display. (In other words, it is not possible to use
different between values for different pages).
|
groups |
used typically with panel=panel.superpose
to allow display controls (color, lty etc) to vary according
to a grouping variable. Formally, if groups is specified, then
groups along with subscripts is passed to the panel
function, which is expected to handle these arguments.
It is very common to use a key (legend) when a grouping variable is specified. See entries for key, auto.key and
simpleKey for how to draw a key.
|
auto.key |
A logical (indicating whether a key is to be drawn automatically when
a grouping variable is present in the plot), or a list of parameters
to be passed to simpleKey . If auto.key is not
FALSE and no key argument is specified in the call,
simpleKey is called with levels(groups) as the first
argument and the result is added as the key component of the
final ``trellis'' object. (Note: this may not work in all high level
functions.)
simpleKey uses the trellis settings to determine the
graphical parameters in the key, so this will be meaningful only if
the settings are used in the plot as well.
|
key |
A list of arguments that define a legend to be drawn on the plot
(see also the less flexible but usually sufficient shortcut function
simpleKey that can generate such a list).
The position of the legend can be controlled in either of two possible ways. If a component called space is present, the
key is positioned outside the plot region, in one of the four sides,
determined by the value of space , which can be one of
``top'', ``bottom'', ``left'' and ``right''. Alternatively, the key
can be positioned inside the plot region by specifying components
x,y and corner . x and y determine the
location of the corner of the key given by corner , which can
be one of c(0,0), c(1,0), c(1,1),c(0,1) , which denote the
corners of the unit square. x and y must be numbers
between 0 and 1, giving coordinates with respect to the whole
display area.
The key essentially consists of a number of columns, possibly divided into blocks, each containing some rows. The contents of the key are determined by (possibly repeated) components named ``rectangles'', ``lines'', ``points'' or ``text''. Each of these must be lists with relevant graphical parameters (see later) controlling their appearance. The key list itself can contain
graphical parameters, these would be used if relevant graphical
components are omitted from the other components.
The length (number of rows) of each such column is taken to be the largest of the lengths of the graphical components, including the ones specified outside. The ``text'' component has to have a character or expression vector as its first component, and the length of this vector determines the number of rows. The graphical components that can be included in key (and
also in the components named ``text'', ``lines'', ``points'' and
``rectangles'' when appropriate) are cex=1, col="black",
lty=1, lwd=1, font=1, pch=8, adj=0, type="l", size=5, angle=0,
density=-1 . adj, angle, density are
unimplemented. size determines the width of columns of
rectangles and lines in character widths. type is relevant
for lines; `"l"' denotes a line, `"p"' denotes a point, and `"b"'
and `"o"' both denote both together.
Other possible components of key are:
between : numeric vector giving the amount of space (character
widths) surrounding each column (split equally on both sides),
title : string or expression, title of the key,
cex.title
background : defaults to default background
border : color of border, black if TRUE, defaults to FALSE (no
border drawn)
transparent=FALSE : logical, whether key area should be cleared
columns : the number of columns column-blocks the key is to be
divided into, which are drawn side by side.
betwen.columns : Space between column blocks, in addition to
between .
divide Number of point symbols to divide each line when
type is `"b"' or `"o"' in lines .
|
layout |
In general, a Trellis conditioning plot consists of
several panels arranged in a rectangular array, possibly spanning
multiple pages. layout determines this arrangement.
layout is a numeric vector giving the number of columns, rows
and pages in a multipanel display. By default, the number of columns
is determined by the number of levels in the first given variable;
the number of rows is the number of levels of the second given
variable. If there is one given variable, the default layout vector
is c(0,n) , where n is the number of levels of the given vector. Any
time the first value in the layout vector is 0 , the second value is
used as the desired number of panels per page and the actual layout
is computed from this, taking into account the aspect ratio of the
panels and the device dimensions (via par("din") ). The number
of pages is by default set to as many as is required to plot all the
panels. In general, giving a high value of layout[3] is not
wasteful because blank pages are never created.
|
main |
character string or expression or list describing main
title to be placed on top of each page. Defaults to NULL . Can
be a character string or expression, or a list with components
label, cex, col, font . The label tag can be omitted if
it is the first element of the list. Expressions are treated as
specification of LaTeX-like markup as in plotmath
|
page |
a function of one argument (page number) to be called after drawing each page. The function must be `grid-compliant', and is called with the whole display area as the default viewport. |
par.strip.text |
list of graphical parameters to control the
strip text, possible components are col, cex, font, lines .
The first three control graphical parameters while the last is a
means of altering the height of the strips. This can be useful, for
example, if the strip labels (derived from factor levels, say) are
double height (i.e., contains ``\n''-s) or if the default height
seems too small or too large.
|
prepanel |
function that takes the same arguments as the
panel function and returns a list containing four components
xlim, ylim, dx, dy . If xlim and ylim are not
explicitly specified (possibly as components in scales ), then
the actual limits of the panels are guaranteed to include the limits
returned by the prepanel function. This happens globally if the
relation component of scales is "same", and on a panel
by panel basis otherwise. See xlim to see what forms of the
components xlim, ylim are allowed.
The dx and dy components are used for banking
computations in case aspect is specified as "xy". See
documentation for the function banking for details regarding
how this is done.
The return value of the prepanel function need not have all the components named above; in case some are missing, they are replaced by the usual componentwise defaults. The prepanel function is responsible for providing a meaningful return value when the x, y (etc.) variables are zero-length
vectors. When nothing is appropriate, values of NA should be
returned for the xlim and ylim components.
|
scales |
list determining how the x- and y-axes (tick marks and
labels) are drawn. The list contains parameters in name=value form,
and may also contain two other lists called x and y of
the same form (described below). Components of x and y
affect the respective axes only, while those in scales affect
both. (When parameters are specified in both lists, the values in
x or y are used.) The components are :
relation : determines limits of the axis. Possible values are "same" (default), "free" and "sliced". For relation="same", the same limits (determined by xlim, ylim, scales$limits etc) are used for
all the panels. For relation="free", limits for each panel is
determined by the points in that panel. Behaviour for relation =
"sliced" is similar, except that the length (max - min) of the
scales are constrained to remain the same across panels. The values
of xlim etc, even if specified explicitly, are ignored if
relation is different from "same".
tick.number: Suggested number of ticks. draw = TRUE: logical, whether to draw the axis at all. alternating = TRUE/c(1,2): logical specifying whether axes alternate from one side of the group of panels to the other. For more accurate control, alternating can be a vector (replicated to be as long as the number of rows or columns per page) consisting of the possible numbers 0=do not draw, 1=bottom/left, 2=top/right and 3=both. alternating applies only when relation="same". limits: same as xlim and ylim. at: location of tick marks along the axis (in native coordinates), or a list as long as the number of panels describing tick locations for each panel. labels: Labels (strings or expressions) to go along with at . Can be a list like at as well.
cex: factor to control character sizes for axis labels. Can be a vector of length 2, to control left/bottom and right/top separately. font: font face for axis labels (integer 1-4). tck: factor to control length of tick marks. Can be a vector of length 2, to control left/bottom and right/top separately. col: color of ticks and labels. rot: Angle by which the axis labels are to be rotated. Can be a vector of length 2, to control left/bottom and right/top separately. abbreviate: logical, whether to abbreviate the labels using abbreviate . Can be useful for long labels (e.g., in factors),
especially on the x-axis.
minlength: argument to abbreviate if abbreviate=TRUE .
log: Use a log scale. Defaults to FALSE , other possible
values are any number that works as a base for taking logarithm,
TRUE , equivalent to 10, and "e" (for natural
logarithm).
Note: the "axs" component is ignored. Much of the function of scales is accomplished by pscales in splom .
|
skip |
logical vector (default FALSE ), replicated to be as
long as the number of panels in each page. If TRUE , nothing
is plotted in the corresponding panel. Useful for arranging plots in
an informative manner.
|
strip |
logical flag or function. If FALSE , strips are
not drawn. Otherwise, strips are drawn using the strip
function, which defaults to strip.default . See documentation
of strip.default to see the form of a strip function.
|
sub |
character string or expression for a subtitle to be placed
at the bottom of each page. See entry for main for finer
control options.
|
subscripts |
logical specifying whether or not a vector named
subscripts should be passed to the panel function. Defaults to
FALSE, unless groups is specified, or if the panel function
accepts an argument named subscripts . (One should be careful
when defining the panel function on-the-fly.)
|
subset |
logical vector (can be specified in terms of variables
in data ). Everything will be done on the data points for
which subset=TRUE . In case subscripts is TRUE, the
subscripts will correspond to the original observations.
|
xlab |
character string or expression giving label for the
x-axis. Defaults to the expression for x in
formula . Specify as NULL to omit the label
altogether. Fine control is possible, see entry for sub .
|
xlim |
numeric vector of length 2 (possibly a DateTime object)
giving minimum and maximum for x-axis, or, a character vector,
expected to denote the levels of x . The latter form is
interpreted as a range containing c(1, length(xlim)), with the
character vector determining labels at tick positions
1:length(xlim)
|
ylab |
character string or expression giving label for the
y-axis. Defaults to the expression for y in
formula . Fine control possible, see entry for xlab .
|
ylim |
same as xlim , applied to the y-axis.
|
... |
other arguments, passed to the panel function |
These are for the most part decriptions generally applicable to all
high level Lattice functions, with special emphasis on xyplot,
bwplot
etc. For other functions, their individual documentation
should be studied in addition to this.
An object of class ``trellis''. The `update' method can be used to update components of the object and the `print' method (usually called by default) will plot it on an appropriate plotting device.
Deepayan Sarkar deepayan@stat.wisc.edu
Lattice
,
print.trellis
,
shingle
,
banking
,
reshape
panel.xyplot
,
panel.bwplot
,
panel.barchart
,
panel.dotplot
,
panel.stripplot
,
panel.superpose
,
panel.loess
,
panel.linejoin
,
strip.default
,
simpleKey
## Tonga Trench Earthquakes data(quakes) Depth <- equal.count(quakes$depth, number=8, overlap=.1) xyplot(lat ~ long | Depth, data = quakes) ## Examples with data from `Visualizing Data' (Cleveland) ## (obtained from Bill Cleveland's Homepage : ## http://cm.bell-labs.com/cm/ms/departments/sia/wsc/, also ## available at statlib) data(ethanol) EE <- equal.count(ethanol$E, number=9, overlap=1/4) ## Constructing panel functions on the fly; prepanel xyplot(NOx ~ C | EE, data = ethanol, prepanel = function(x, y) prepanel.loess(x, y, span = 1), xlab = "Compression Ratio", ylab = "NOx (micrograms/J)", panel = function(x, y) { panel.grid(h=-1, v= 2) panel.xyplot(x, y) panel.loess(x,y, span=1) }, aspect = "xy") ## banking to 45 degrees data(sunspot) xyplot(sunspot ~ 1:37 ,type = "l", aspect="xy", scales = list(y = list(log = TRUE)), sub = "log scales") ## Multiple variables in formula for grouped displays data(iris) xyplot(Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width | Species, data = iris, allow.multiple = TRUE, scales = "free", auto.key = list(x = .6, y = .7, corner = c(0, 0))) data(state) ## user defined panel functions states <- data.frame(state.x77, state.name = dimnames(state.x77)[[1]], state.region = state.region) xyplot(Murder ~ Population | state.region, data = states, groups = as.character(state.name), panel = function(x, y, subscripts, groups) ltext(x=x, y=y, label=groups[subscripts], cex=.7, font=3)) data(barley) barchart(yield ~ variety | site, data = barley, groups = year, layout = c(1,6), ylab = "Barley Yield (bushels/acre)", scales = list(x = list(abbreviate = TRUE, minlength = 5))) barchart(yield ~ variety | site, data = barley, groups = year, layout = c(1,6), stack = TRUE, auto.key = list(points = FALSE, rectangles = TRUE, space = "right"), ylab = "Barley Yield (bushels/acre)", scales = list(x = list(rot = 45))) data(singer) bwplot(voice.part ~ height, data=singer, xlab="Height (inches)") dotplot(variety ~ yield | year * site, data=barley) dotplot(variety ~ yield | site, data = barley, groups = year, key = simpleKey(levels(barley$year), space = "right"), xlab = "Barley Yield (bushels/acre) ", aspect=0.5, layout = c(1,6), ylab=NULL) stripplot(voice.part ~ jitter(height), data = singer, aspect = 1, jitter = TRUE, xlab = "Height (inches)") ## Interaction Plot data(OrchardSprays) bwplot(decrease ~ treatment, OrchardSprays, groups = rowpos, panel = "panel.superpose", panel.groups = "panel.linejoin", xlab = "treatment", key = list(lines = Rows(trellis.par.get("superpose.line"), c(1:7, 1)), text = list(lab = as.character(unique(OrchardSprays$rowpos))), columns = 4, title = "Row position"))