| data.frame {base} | R Documentation |
Data Frames
Description
This function creates data frames, tightly coupled collections of variables which share many of the properties of matrices and of lists, used as the fundamental data structure by most of R's modeling software.
Usage
data.frame(..., row.names = NULL, check.rows = FALSE,
check.names = TRUE,
stringsAsFactors = default.stringsAsFactors())
default.stringsAsFactors()
Arguments
... |
these arguments are of either the form |
row.names |
|
check.rows |
if |
check.names |
logical. If |
stringsAsFactors |
logical: should character vectors be converted to factors? |
Details
A data frame is a list of variables of the same length with unique row
names, given class "data.frame". If there are zero variables,
the row names determine the number of rows.
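The definition above can be checked directly. This is a minimal sketch; the object names df and empty are arbitrary and chosen only for illustration.

```r
# A data frame is a list of equal-length variables with class "data.frame".
df <- data.frame(x = 1:3, y = c(2.5, 3.5, 4.5))

is.list(df)          # the underlying storage is a list of columns
class(df)            # the class attribute
sapply(df, length)   # every variable has the same length
row.names(df)        # unique row names (here the automatic sequence)

# With zero variables, the row names alone determine the number of rows.
empty <- df[, FALSE]
nrow(empty)
ncol(empty)
```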
data.frame converts each of its arguments to a data frame by
calling as.data.frame(optional=TRUE). As that is a
generic function, methods can be written to change the behaviour of
arguments according to their classes: R comes with many such methods.
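As a sketch of how such a method might look, the example below defines an as.data.frame method for a made-up S3 class. The class name "reading" and its components time and value are hypothetical and exist only for this illustration.

```r
# Hypothetical S3 class "reading": a list holding two equal-length vectors.
r <- structure(list(time = 1:3, value = c(2.5, 3.1, 4.0)), class = "reading")

# Hypothetical method. data.frame() reaches it through the as.data.frame
# generic; extra arguments passed by data.frame() are absorbed by '...'.
as.data.frame.reading <- function(x, row.names = NULL, optional = FALSE, ...) {
  data.frame(time = x$time, value = x$value)
}

df <- data.frame(r)
names(df)
nrow(df)
```

The method is defined at top level here so that S3 dispatch can find it; a package would normally register it instead.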
Character variables passed to data.frame are converted to
factor columns unless protected by I or the argument
stringsAsFactors is false. If a list or data
frame or matrix is passed to data.frame it is as if each
component or column had been passed as a separate argument.
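The two rules in this paragraph can be illustrated as follows. This is a sketch that passes stringsAsFactors = TRUE explicitly, on the assumption that the default may differ between R versions.

```r
# stringsAsFactors is given explicitly so the result does not depend on
# the session default.
df <- data.frame(f = c("a", "b", "a"),
                 s = I(c("a", "b", "a")),
                 stringsAsFactors = TRUE)

is.factor(df$f)      # the unprotected character vector became a factor
is.character(df$s)   # the vector protected by I() is still character
class(df$s)          # and carries the class "AsIs"

# A matrix contributes one variable per column, as if each column had
# been passed as a separate argument.
m <- cbind(u = 1:3, v = 4:6)
names(data.frame(m, w = 7:9))
```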
Objects passed to data.frame should have the same number of
rows, but atomic vectors, factors and character vectors protected by
I will be recycled a whole number of times if necessary.
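A short sketch of the recycling rule, using arbitrary column names: shorter vectors are repeated to the common number of rows, and a length that does not divide that number evenly is an error.

```r
# 'grp' (length 2) and 'flag' (length 1) are recycled to 6 rows.
df <- data.frame(id = 1:6, grp = c("a", "b"), flag = TRUE)
nrow(df)
as.character(df$grp)

# A length of 4 does not go into 6 a whole number of times, so this fails.
res <- try(data.frame(id = 1:6, grp = c("a", "b", "c", "d")), silent = TRUE)
inherits(res, "try-error")
```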
If row names are not supplied in the call to data.frame, the
row names are taken from the first component that has suitable names,
for example a named vector or a matrix with rownames or a data frame.
(If that component is subsequently recycled, the names are discarded
with a warning.) If row.names was supplied as NULL or no
suitable component was found the row names are the integer sequence
starting at one (and such row names are considered to be
‘automatic’, and not preserved by as.matrix).
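The following sketch shows where row names come from in each case described above. The object names are illustrative only.

```r
x <- c(a = 1, b = 2, c = 3)

# No row.names argument: the names of the first suitable component are used.
df1 <- data.frame(x, y = 4:6)
row.names(df1)

# row.names = NULL: the automatic integer sequence is used instead.
df2 <- data.frame(x, y = 4:6, row.names = NULL)
row.names(df2)

# Automatic row names are not carried over by as.matrix().
rownames(as.matrix(df1))
rownames(as.matrix(df2))
```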
If row.names is supplied with length one and the data frame has a
single row, row.names is taken to specify the row names rather than
a column (by name or number).
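A sketch of the two interpretations of a length-one row.names value; the column name id and the label "only" are made up for the example.

```r
# Several rows: a length-one row.names picks out a column, which is then
# used for the row names and dropped from the variables.
df <- data.frame(id = c("r1", "r2", "r3"), x = 1:3, row.names = "id")
row.names(df)
names(df)

# A single row: the same kind of value is taken as the row name itself.
one <- data.frame(x = 1, row.names = "only")
row.names(one)
names(one)
```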
Names are removed from vector inputs not protected by I.
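For illustration, a named vector can be passed with and without I and the names of the resulting column inspected. This is a sketch; plain and kept are arbitrary object names.

```r
x <- c(a = 1, b = 2, c = 3)

# Unprotected: the names move to the row names and are removed from the column.
plain <- data.frame(x)
names(plain$x)
row.names(plain)

# Protected by I(): inspect whether the column still carries its names.
kept <- data.frame(x = I(x))
names(kept$x)
```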
default.stringsAsFactors is a utility that takes
getOption("stringsAsFactors") and ensures the result is
TRUE or FALSE.
Value
A data frame, a matrix-like structure whose columns may be of differing types (numeric, logical, factor, character and so on).
Note
In versions of R prior to 2.4.0 row.names had to be
character: to ensure compatibility with earlier versions of R, supply
a character vector as the row.names argument.
References
Chambers, J. M. (1992) Data for models. Chapter 3 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
See Also
I,
plot.data.frame,
print.data.frame,
row.names, names (for the column names),
[.data.frame for subsetting methods,
Math.data.frame etc, about
Group methods for data.frames;
read.table,
make.names.
Examples
L3 <- LETTERS[1:3]
(d <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace = TRUE)))
## The same with automatic column names:
data.frame(cbind(1, 1:10), sample(L3, 10, replace = TRUE))
is.data.frame(d)
## do not convert to factor, using I() :
(dd <- cbind(d, char = I(letters[1:10])))
rbind(class=sapply(dd, class), mode=sapply(dd, mode))
stopifnot(1:10 == row.names(d)) # compared after coercion to character
(d0 <- d[, FALSE]) # NULL data frame with 10 rows
(d.0 <- d[FALSE, ]) # <0 rows> data frame (3 cols)
(d00 <- d0[FALSE,]) # NULL data frame with 0 rows