split {base} | R Documentation |
split divides the data in the vector x into the groups defined by f. The replacement forms replace values corresponding to such a division. unsplit reverses the effect of split.
split(x, f, drop = FALSE, ...)
split(x, f, drop = FALSE, ...) <- value
unsplit(value, f, drop = FALSE)
x | vector or data frame containing values to be divided into groups. |
f | a ‘factor’ in the sense that |
drop | logical indicating if levels that do not occur should be dropped (if |
value | a list of vectors or data frames compatible with a splitting of |
... | further potential arguments passed to methods. |
split and split<- are generic functions with default and data.frame methods. The data frame method can also be used to split a matrix into a list of matrices, and the replacement form likewise, provided they are invoked explicitly.
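As a minimal sketch of invoking the data frame methods explicitly on a matrix, the following splits the rows of a small matrix and then writes scaled pieces back. The matrix m, the grouping vector grp and the factor of 10 are made up for illustration.

```r
# A 4 x 3 integer matrix and a hypothetical grouping of its rows
m <- matrix(1:12, nrow = 4, dimnames = list(NULL, c("a", "b", "c")))
grp <- c("u", "v", "u", "v")

# Call the data frame method by name; dispatch on a matrix would
# otherwise select the default method and split individual elements
parts <- split.data.frame(m, grp)
parts$u                      # rows 1 and 3 of m, still a matrix

# Replacement form, again invoked explicitly
m2 <- `split<-.data.frame`(m, grp, value = lapply(parts, function(p) p * 10))
all(m2 == m * 10)
```

Calling split(m, grp) without naming the method would treat m as a plain vector and recycle grp over its elements.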
unsplit works with lists of vectors or data frames (assumed to have compatible structure, as if created by split). It puts elements or rows back in the positions given by f. In the data frame case, row names are obtained by unsplitting the row name vectors from the elements of value.
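The row name behaviour can be checked with a small sketch. The data frame df and its row names are invented for illustration.

```r
# Hypothetical data frame with explicit row names
df <- data.frame(id = 1:4,
                 grp = c("a", "b", "a", "b"),
                 row.names = c("r1", "r2", "r3", "r4"))

pieces <- split(df, df$grp)       # list with components "a" and "b"
back <- unsplit(pieces, df$grp)   # rows return to their original positions

rownames(back)
back$id
```

The pieces carry the row names of the rows they hold, so unsplitting those name vectors with the same f restores the original order.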
f is recycled as necessary and if the length of x is not a multiple of the length of f a warning is printed. Any missing values in f are dropped together with the corresponding values of x.
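A short sketch of both behaviours, using made-up vectors: missing values in the grouping remove the matching elements of x, and a length mismatch that is not a whole multiple raises a warning.

```r
x <- 1:6
f_na <- c("a", NA, "b", "a", NA, "b")   # hypothetical grouping with NAs

s <- split(x, f_na)
s                     # a: 1 4, b: 3 6; elements 2 and 5 are dropped
sum(lengths(s))       # fewer values than length(x)

# 5 is not a multiple of 2, so recycling f triggers a warning;
# capture its message instead of printing it
w <- tryCatch(split(1:5, 1:2), warning = function(w) conditionMessage(w))
w
```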
The value returned from split is a list of vectors containing the values for the groups. The components of the list are named by the levels of f (after converting to a factor, or if already a factor and drop=TRUE, dropping unused levels).
The replacement forms return their right hand side. unsplit returns a vector or data frame for which split(x, f) equals value.
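The naming rule and the round trip can be illustrated with a small sketch. The vectors x, f and g below are invented for the purpose.

```r
x <- c(10, 20, 30)
# Factor with an unused level "mid" (illustrative only)
f <- factor(c("lo", "hi", "lo"), levels = c("lo", "mid", "hi"))

names(split(x, f))               # all levels, including the empty "mid"
names(split(x, f, drop = TRUE))  # unused level removed
length(split(x, f)$mid)          # the empty group has length zero

# Round trip: unsplit() undoes split() for the same grouping
g <- c("a", "b", "a")
identical(unsplit(split(x, g), g), x)
```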
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
cut
require(stats); require(graphics)
n <- 10; nn <- 100
g <- factor(round(n * stats::runif(n * nn)))
x <- rnorm(n * nn) + sqrt(as.numeric(g))
xg <- split(x, g)
boxplot(xg, col = "lavender", notch = TRUE, varwidth = TRUE)
sapply(xg, length)
sapply(xg, mean)
### Calculate z-scores by group
z <- unsplit(lapply(split(x, g), scale), g)
tapply(z, g, mean)
# or
z <- x
split(z, g) <- lapply(split(x, g), scale)
tapply(z, g, sd)
### data frame variation
## Notice that assignment form is not used since a variable is being added
g <- airquality$Month
l <- split(airquality, g)
l <- lapply(l, transform, Oz.Z = scale(Ozone))
aq2 <- unsplit(l, g)
head(aq2)
with(aq2, tapply(Oz.Z, Month, sd, na.rm=TRUE))
### Split a matrix into a list by columns
ma <- cbind(x = 1:10, y = (-4:5)^2)
split(ma, col(ma))
### f is recycled: odd values go to group "1", even values to group "2"
split(1:10, 1:2)