This help topic is for R version 1.5.0. For the current version of R, try https://stat.ethz.ch/R-manual/R-patched/library/base/html/reshape.html
reshape {base}    R Documentation

Reshape grouped data

Description

This function reshapes a data frame between ‘wide’ format, with repeated measurements in separate columns of the same record, and ‘long’ format, with the repeated measurements in separate records.

Usage

reshape(data, varying = NULL, v.names = NULL, timevar = "time", 
    idvar = "id", ids = 1:NROW(data),
    times = seq(length = length(varying[[1]])), 
    drop = NULL, direction, fix.row.names = TRUE,
    split=list(regexp="\\.",include=FALSE))

Arguments

data

A data frame

varying

Names of sets of variables in the wide format that correspond to single variables in long format (‘time-varying’). A list of vectors (or optionally a matrix for direction="wide"). See Details for more options.

v.names

Names of variables in the long format that correspond to multiple variables in the wide format.

timevar

The variable in long format that differentiates multiple records from the same group/individual

idvar

The variable in long format that identifies multiple records from the same group/individual. This variable may also be present in wide format

ids

The values to use for a newly created idvar variable in long format

times

The values to use for a newly created timevar variable in long format

drop

A vector of names of variables to drop before reshaping

direction

"wide" to reshape to wide format, "long" to reshape to long format

fix.row.names

If TRUE and direction="wide", create new row names in long format from the values of the id and time variables.

split

Information for guessing the varying, v.names, and times arguments. See Details.

Details

The arguments to this function are described in terms of longitudinal data, as that is the application motivating the function. A ‘wide’ longitudinal dataset will have one record for each individual, with some time-constant variables that occupy single columns and some time-varying variables that occupy a column for each time point. In ‘long’ format there will be multiple records for each individual, with some variables being constant across these records and others varying across the records. A ‘long’ format dataset also needs a ‘time’ variable identifying which time point each record comes from and an ‘id’ variable showing which records refer to the same person.
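To make the two layouts concrete, the following is a minimal sketch using made-up data. The data frame and its column names (sex, weight.1, weight.2) are hypothetical and chosen only for illustration.

```r
# Hypothetical data: two subjects, weight recorded at two time points.
wide <- data.frame(
  id       = 1:2,
  sex      = c("F", "M"),   # time-constant: a single column
  weight.1 = c(60, 80),     # time-varying: one column per time point
  weight.2 = c(62, 79)
)

# Long format: one record per subject per time point, plus a 'time' column.
long <- reshape(wide, direction = "long",
                varying = list(c("weight.1", "weight.2")),
                v.names = "weight", timevar = "time",
                idvar = "id", times = 1:2)

print(wide)
print(long)
```

The wide data frame has one row per subject, while the long one repeats the time-constant sex column on every record for that subject.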

If the data frame resulted from a previous reshape then the operation can be reversed by specifying just the direction argument. The other arguments are stored as attributes on the data frame.
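A round trip might look like the sketch below. The data are made up for illustration; only the second call relies on the stored attributes.

```r
# Hypothetical wide data: one row per subject.
wide <- data.frame(id = 1:2,
                   weight.1 = c(60, 80),
                   weight.2 = c(62, 79))

# First reshape: all arguments spelled out.
long <- reshape(wide, direction = "long",
                varying = list(c("weight.1", "weight.2")),
                v.names = "weight", timevar = "time", idvar = "id")

# Reverse it: only the direction is given, the rest comes from
# the attributes stored on 'long' by the previous call.
back <- reshape(long, direction = "wide")
print(back)
```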

If direction="wide" and no varying or v.names arguments are supplied, it is assumed that all variables except idvar and timevar are time-varying. They are all expanded into multiple variables in wide format.
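As a small illustration of reshaping to wide format without naming the time-varying variables, consider the sketch below. The data frame long_df and its columns are hypothetical.

```r
# Hypothetical long data: two subjects, two visits each.
long_df <- data.frame(id    = rep(1:2, each = 2),
                      visit = rep(c(1, 2), times = 2),
                      x     = c(10, 11, 20, 21),
                      y     = c(0.1, 0.2, 0.3, 0.4))

# Neither varying nor v.names is given, so both x and y are treated
# as time-varying and each gets one column per visit.
wide_df <- reshape(long_df, direction = "wide",
                   idvar = "id", timevar = "visit")
print(names(wide_df))
```

Supplying v.names restricts the expansion to the named variables; the remaining ones are then treated as constant within each id.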

If direction="long" the varying argument can be a vector of column names or column numbers (converted to column names). The function will attempt to guess the v.names and times from these names. The default expects variable names like x.1, x.2, where split=list(regexp="\\.",include=FALSE) specifies to split at the dot and drop it from the name. To have alphabetic followed by numeric times use split=list(regexp="[A-Za-z][0-9]",include=TRUE). This splits between the alphabetic and numeric parts of the name and does not drop the characters matched by the regular expression.
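The name guessing can be sketched as follows for reshaping to long format with a plain vector of column names. The data are made up, and the call relies on the default split at the dot rather than passing split explicitly.

```r
# Hypothetical wide data with names of the form <variable>.<time>.
wide <- data.frame(id = 1:2,
                   weight.1 = c(60, 80),
                   weight.2 = c(62, 79))

# varying is a plain vector of names: "weight" is guessed as the
# long-format variable name and 1, 2 as the times.
long <- reshape(wide, direction = "long",
                varying = c("weight.1", "weight.2"), idvar = "id")
print(long)
```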

Value

The reshaped data frame with added attributes to simplify reshaping back to the original form.

See Also

stack, aperm

Examples

data(Indometh,package="nls")
summary(Indometh)
wide<-reshape(Indometh,v.names="conc",idvar="Subject",
               timevar="time",direction="wide")
wide

reshape(wide, direction="long")
reshape(wide, idvar="Subject",varying=list(names(wide)[2:12]),
          v.names="conc",direction="long")

## times need not be numeric
df<-data.frame(id=rep(1:4,rep(2,4)),visit=I(rep(c("Before","After"),4)),
              x=rnorm(4),y=runif(4))
df
reshape(df,timevar="visit",idvar="id",direction="wide")
## warns that y is really varying
reshape(df,timevar="visit",idvar="id",direction="wide",v.names="x")  


##  unbalanced `long' data leads to NA fill in `wide' form
df2<-df[1:7,]
df2
reshape(df2,timevar="visit",idvar="id",direction="wide")

## Alternative regular expressions for guessing names
df3<-data.frame(id=1:4,age=c(40,50,60,50),dose1=c(1,2,1,2),
                    dose2=c(2,1,2,1),dose4=c(3,3,3,3))
reshape(df3,direction="long",varying=3:5,
         split=list(regexp="[a-z][0-9]",include=TRUE))


## an example that isn't longitudinal data
data(state)
state.x77<-as.data.frame(state.x77)
long<-reshape(state.x77,idvar="state",ids=row.names(state.x77),
       times=names(state.x77),timevar="Characteristic",
       varying=list(names(state.x77)),direction="long")

reshape(long,direction="wide")

reshape(long,direction="wide",new.row.names=unique(long$state))



[Package base version 1.5.0 ]