cor {base} | R Documentation
var, cov and cor compute the variance of x and the covariance or correlation of x and y if these are vectors. If x and y are matrices then the covariances (or correlations) between the columns of x and the columns of y are computed.
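As a minimal sketch of the matrix case (the matrices x and y below are made-up illustrative data, not part of this page), the result has one row per column of x and one column per column of y:

```r
# Illustrative data (hypothetical): 6 observations, 2 columns in x, 3 in y
x <- cbind(a = c(1, 3, 2, 5, 4, 6),
           b = c(2, 1, 4, 3, 6, 5))
y <- cbind(u = c(6, 5, 4, 3, 2, 1),
           v = c(1, 4, 9, 16, 25, 36),
           w = c(3, 1, 4, 1, 5, 9))

cxy <- cov(x, y)
dim(cxy)   # ncol(x) rows and ncol(y) columns

# Each entry is the covariance of one column of x with one column of y
all.equal(cxy["a", "v"], cov(x[, "a"], y[, "v"]))
```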
var(x, y = NULL, na.rm = FALSE, use)
cor(x, y = NULL, use = "all.obs")
cov(x, y = NULL, use = "all.obs")
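x | a numeric vector, matrix or data frame. |
y | NULL (default) or a vector, matrix or data frame with dimensions compatible with x. |
use | an optional character string giving a method for computing covariances in the presence of missing values. This must be (an abbreviation of) one of the strings "all.obs", "complete.obs" or "pairwise.complete.obs". |
na.rm | logical. Should missing values be removed? |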
For cov and cor one must either give a matrix or data frame for x or give both x and y.
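The two accepted calling forms can be sketched as follows. The matrix m is hypothetical example data; the last call is wrapped in tryCatch because a bare vector without y does not satisfy the rule above and is expected to be rejected.

```r
# Hypothetical two-column matrix
m <- cbind(a = c(1, 3, 2, 5, 4, 6),
           b = c(2, 1, 4, 3, 6, 5))

cm  <- cor(m)                      # form 1: matrix (or data frame) x, y omitted
cab <- cor(m[, "a"], m[, "b"])     # form 2: both x and y given as vectors

# Both forms agree on the correlation between columns a and b
all.equal(cm["a", "b"], cab)

# A single vector with no y: capture the failure instead of stopping the script
res <- tryCatch(cor(m[, "a"]), error = function(e) conditionMessage(e))
```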
var is just another interface to cov, where na.rm is used to determine the default for use when that is unspecified. If na.rm is TRUE then the complete observations (rows) are used (use = "complete") to compute the variance. Otherwise (use = "all"), var will give an error if there are missing values.
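A short sketch of how na.rm maps onto use, with made-up data. The call without na.rm is wrapped in tryCatch: as described above it is expected to fail when values are missing, although some R versions may return NA instead of signalling an error.

```r
# Hypothetical vector with one missing value
x <- c(1, 2, NA, 4, 7)

v_rm     <- var(x, na.rm = TRUE)     # missing value dropped
v_manual <- var(x[!is.na(x)])        # same computation done by hand: 7

# For a data frame, na.rm = TRUE keeps only the complete rows,
# which matches cov() with use = "complete"
d <- data.frame(p = c(1, 2, NA, 4, 7),
                q = c(2, 1, 5, NA, 8))
v_df <- var(d, na.rm = TRUE)
c_df <- cov(d, use = "complete")

# Without na.rm: either an error or NA, depending on the R version
v_na <- tryCatch(var(x), error = function(e) "error")
```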
If use is "all.obs", then the presence of missing observations will produce an error. If use is "complete.obs" then missing values are handled by casewise deletion. Finally, if use has the value "pairwise.complete.obs" then the correlation between each pair of variables is computed using all complete pairs of observations on those variables. This can result in covariance or correlation matrices which are not positive semidefinite.
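The difference between casewise and pairwise deletion can be illustrated with a small made-up data frame (d and its columns are assumptions for this sketch only). The eigenvalues are printed without a claim about their sign, since whether a pairwise matrix is positive semidefinite depends on the data.

```r
# Hypothetical data with missing values in different rows
d <- data.frame(a = c(1, 2, 3, 4, 5, 6),
                b = c(2, 1, 4, NA, 6, 5),
                c = c(NA, 3, 1, 5, 4, 6))

# Casewise deletion: only rows with no NA at all are used
cc <- cov(d, use = "complete.obs")
all.equal(cc, cov(na.omit(d)))

# Pairwise deletion: each entry uses the rows complete for that pair
pw <- cov(d, use = "pairwise.complete.obs")
ok <- complete.cases(d$a, d$b)
all.equal(pw["a", "b"], cov(d$a[ok], d$b[ok]))

# Inspect the eigenvalues to check for positive semidefiniteness
eigen(pw, symmetric = TRUE, only.values = TRUE)$values
```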
The denominator n - 1 is used, which gives an unbiased estimator of the (co)variance for i.i.d. observations.
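To make the n - 1 denominator concrete, the following sketch recomputes the variance, covariance and correlation by hand for two made-up vectors and compares them with the built-in functions:

```r
# Hypothetical sample
x <- c(2, 4, 4, 5, 7, 9)
y <- c(1, 3, 2, 6, 5, 8)
n <- length(x)

# Sums of squared and cross deviations divided by n - 1
manual_var_x <- sum((x - mean(x))^2) / (n - 1)
manual_var_y <- sum((y - mean(y))^2) / (n - 1)
manual_cov   <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# The correlation is the covariance scaled by both standard deviations
manual_cor <- manual_cov / sqrt(manual_var_x * manual_var_y)

all.equal(manual_var_x, var(x))
all.equal(manual_cov, cov(x, y))
all.equal(manual_cor, cor(x, y))
```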
These functions return NA when there is only one observation, and from R 1.2.3 fail if x has length zero.
For r <- cor(*, use = "all.obs"), it is now guaranteed that all(r <= 1).
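A quick check of this guarantee, using made-up columns that are exact linear functions of each other (the case where rounding error would be most likely to push a correlation past 1):

```r
# Hypothetical perfectly collinear columns
x <- 1:10
m <- cbind(x = x, y = 2 * x + 1, z = x / 3)

r <- cor(m, use = "all.obs")
all(r <= 1)   # no entry exceeds 1
```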
See also cov.wt for weighted covariance computation and sd for the standard deviation (of vectors).
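As a rough illustration of how these relate to var and cov (the data are made up, and the comparison assumes cov.wt is called with its default equal weights):

```r
# Hypothetical sample
x <- c(2, 4, 4, 5, 7, 9)
m <- cbind(a = x, b = c(1, 3, 2, 6, 5, 8))

# sd is the square root of the variance
all.equal(sd(x), sqrt(var(x)))

# With default (equal) weights, cov.wt reproduces the ordinary covariance matrix
w_equal <- cov.wt(m)
all.equal(w_equal$cov, cov(m))

# Unequal weights that sum to 1 give a weighted estimate instead
w_custom <- cov.wt(m, wt = c(0.3, 0.2, 0.2, 0.1, 0.1, 0.1))
w_custom$cov
```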
var(1:10)         # 9.166667
var(1:5, 1:5)     # 2.5
## Two simple vectors
cor(1:10, 2:11)   # == 1
## Correlation Matrix of Multivariate sample:
data(longley)
(Cl <- cor(longley))
## Graphical Correlation Matrix:
symnum(Cl) # highly correlated
##--- Missing value treatment:
data(swiss)
C1 <- cov(swiss)
range(eigen(C1, only.values = TRUE)$values) # 6.19 1921
swiss[1,2] <- swiss[7,3] <- swiss[25,5] <- NA # create 3 "missing"
## Not run:
C2 <- cov(swiss) # Error: missing obs...
## End(Not run)
C2 <- cov(swiss, use = "complete")   # casewise deletion
range(eigen(C2, only.values = TRUE)$values) # 6.46 1930
C3 <- cov(swiss, use = "pairwise")   # pairwise deletion
range(eigen(C3, only.values = TRUE)$values) # 6.19 1938