This help topic is for R version 1.1. For the current version of R, try https://stat.ethz.ch/R-manual/R-patched/library/base/html/data.html
data {base}    R Documentation

Data Sets

Description

data loads a data set or lists (via show.data) the available data sets.

Usage

data(..., list = character(0), package = .packages(),
     lib.loc = .lib.loc)
show.data(package = .packages(), lib.loc = .lib.loc)

Arguments

...

a sequence of names or character strings giving the data sets to load.

list

a character vector of data set names.

package

a name or character vector giving the packages to look into for data sets. By default, all packages in the search path are used.

lib.loc

a character vector of directory names of R libraries. Defaults to all libraries currently known.

Details

Currently, four formats of data files are supported:

  1. files ending ‘.RData’ or ‘.rda’ are load()ed.

  2. files ending ‘.R’ or ‘.r’ are source()d in, with the R working directory changed temporarily to the directory containing the respective file.

  3. files ending ‘.tab’ or ‘.txt’ are read using read.table(..., header = TRUE), and hence result in a data frame.

  4. files ending ‘.csv’ are read using read.table(..., header = TRUE, sep = ";"), and also result in a data frame.
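As a rough sketch of what formats 3 and 4 amount to, the read.table calls named above can be run by hand on two small files. The directory, file names and contents here are invented for illustration and are not part of any package.

```r
# Illustrative only: the directory, file names and contents are made up.
tmp_dir <- tempfile("data_formats_")
dir.create(tmp_dir)

# A whitespace-separated file with a header line, as for '.tab' or '.txt'.
txt_file <- file.path(tmp_dir, "scores.txt")
writeLines(c("id score", "1 3.5", "2 4.0"), txt_file)
txt_df <- read.table(txt_file, header = TRUE)

# A semicolon-separated file with a header line, as for '.csv'.
csv_file <- file.path(tmp_dir, "scores.csv")
writeLines(c("id;score", "1;3.5", "2;4.0"), csv_file)
csv_df <- read.table(csv_file, header = TRUE, sep = ";")

# Both calls give a data frame with columns 'id' and 'score'.
str(txt_df)
str(csv_df)
```

Note that a comma-separated file would not be split into columns by the second call, since the separator is a semicolon.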

The data sets to be loaded can be specified as a sequence of names or character strings, or as the character vector list, or as both. If no data sets are specified or show.data is called directly, the available data sets are displayed.
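The different ways of naming data sets might look as follows. This is a sketch that assumes the USArrests and VADeaths data sets (the ones used in the Examples) are installed; the variable names wanted and loaded are illustrative.

```r
# Assumes the USArrests and VADeaths data sets are available.

# As a sequence of names, or of character strings.
data(USArrests, VADeaths)
data("USArrests", "VADeaths")

# As a character vector passed via 'list', which is convenient when the
# names are computed rather than typed in.
wanted <- c("USArrests", "VADeaths")
loaded <- data(list = wanted)

# As both at once.
data(USArrests, list = "VADeaths")

# The value is the character vector of data set names that were specified.
print(loaded)
```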

If no data sets are specified, data calls show.data. show.data looks for a file ‘00Index’ in the ‘data’ directory of each specified package, and uses these files to prepare a listing. If a package has a ‘data’ directory but no index, a warning is given, since such packages are incomplete.

If lib.loc is not specified, the packages are searched for amongst those already loaded, followed by the ‘data’ directory (if any) of the current working directory. If lib.loc is specified, they are searched for in the specified libraries, even if they are already loaded from another library.

To just look in the ‘data’ directory of the current working directory, set package = NULL.

Value

data() returns a character vector of all data sets specified, or an empty character vector if none were specified.

Note

A package's data may consist of many small files. On some file systems it is desirable to save space, and the files in the ‘data’ directory of an installed package can be zipped up as a zip archive ‘Rdata.zip’. You will then need to provide a single-column file ‘filelist’ of the file names in that directory.

One can take advantage of the search order and the fact that a ‘.R’ file will change directory. If raw data are stored in ‘mydata.txt’ then one can set up ‘mydata.R’ to read ‘mydata.txt’ and preprocess it, e.g. using transform. For instance one can convert numeric vectors to factors with the appropriate labels. Thus, the ‘.R’ file can effectively contain a metadata specification for the plaintext formats.
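A minimal sketch of this arrangement is given below. The directory layout, the column names and the factor labels are all invented for illustration, and source(..., chdir = TRUE) is used only as a stand-in for what data() is described as doing with a ‘.R’ file.

```r
# Hypothetical layout: a 'data' directory holding mydata.txt and mydata.R.
data_dir <- file.path(tempfile("pkg_"), "data")
dir.create(data_dir, recursive = TRUE)

# Raw data: 'group' is stored as a numeric code.
writeLines(c("group value", "1 10.2", "2 11.5", "1 9.8"),
           file.path(data_dir, "mydata.txt"))

# mydata.R refers to the raw file by its bare name, which relies on the
# working directory being the 'data' directory, and recodes 'group' as a
# factor with (made-up) labels.
writeLines(c(
  "mydata <- read.table(\"mydata.txt\", header = TRUE)",
  "mydata <- transform(mydata,",
  "  group = factor(group, levels = c(1, 2),",
  "                 labels = c(\"control\", \"treated\")))"
), file.path(data_dir, "mydata.R"))

# Stand-in for data(mydata): source the '.R' file with the working
# directory temporarily switched to the directory containing it.
source(file.path(data_dir, "mydata.R"), chdir = TRUE)

str(mydata)
```

Keeping the recoding in ‘mydata.R’ leaves the raw text file untouched while still delivering a data frame whose columns have the intended types.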

See Also

help for obtaining documentation on data sets.

Examples

data()                       # list all available data sets
data(package = base)         # list the data sets in the base package
data(USArrests, "VADeaths")  # load the data sets `USArrests' and `VADeaths'
help(USArrests)              # give information on data set `USArrests'

[Package base version 1.1 ]