step {base} | R Documentation |
Choose a model by AIC in a Stepwise Algorithm
Description
Select a formula-based model by AIC.
Usage
step(object, scope, scale = 0,
direction = c("both", "backward", "forward"),
trace = 1, keep = NULL, steps = 1000, k = 2, ...)
Arguments
object |
an object representing a model of an appropriate class. This is used as the initial model in the stepwise search. |
scope |
defines the range of models examined in the stepwise search. |
scale |
used in the definition of the AIC statistic for selecting the models,
currently only for |
direction |
the mode of stepwise search, can be one of "both", "backward", or "forward" |
trace |
if positive, information is printed during the running of step |
keep |
a filter function whose input is a fitted model object and the
associated |
steps |
the maximum number of steps to be considered. The default is 1000 (essentially as many as required). It is typically used to stop the process early. |
k |
the multiple of the number of degrees of freedom used for the penalty.
Only |
... |
any additional arguments to |
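A minimal sketch of how scope, direction and k might be combined, assuming the built-in swiss data. The list form of scope (lower and upper formulas), the object names null, upper and fwd, and the choice k = log(n) are illustrative assumptions rather than part of the text above.

```r
# Start from the intercept-only model and search forward only.
null <- lm(Fertility ~ 1, data = swiss)

# Largest model the search is allowed to reach (illustrative choice).
upper <- ~ Agriculture + Examination + Education + Catholic + Infant.Mortality

fwd <- step(null,
            scope = list(lower = ~ 1, upper = upper),
            direction = "forward",
            k = log(nrow(swiss)),  # heavier penalty than the default k = 2
            trace = 0)             # suppress per-step printing

formula(fwd)
```

With k = 2 the criterion is the usual AIC; a larger k penalizes each extra degree of freedom more heavily, so the search tends to stop at a smaller model.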
Details
step uses add1 and drop1 repeatedly; it will work for any method for
which they work, and that is determined by having a valid method for
extractAIC.
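As a rough illustration of one backward move, the sketch below calls drop1 and extractAIC by hand on the swiss data. It is an assumed simplification of what a single iteration looks like, not the actual implementation of step; the names fit, d, best and fit2 are hypothetical.

```r
fit <- lm(Fertility ~ ., data = swiss)

# extractAIC() returns the equivalent degrees of freedom and the criterion.
extractAIC(fit)

# drop1() tabulates the criterion after deleting each term in turn;
# the "<none>" row is the current model.
d <- drop1(fit)
best <- rownames(d)[which.min(d$AIC)]

# If deleting a term lowers the criterion, refit without that term.
fit2 <- if (best != "<none>") {
  update(fit, as.formula(paste(". ~ . -", best)))
} else {
  fit
}
formula(fit2)
```

A search in both directions would also consult add1 for terms that could re-enter, and repeat until no single change improves the criterion.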
When the additive constant can be chosen so that AIC is equal to
Mallows' C_p, this is done and the tables are labelled appropriately.
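One way to see the C_p form, sketched under the assumption that the residual variance of the full linear model is supplied as scale. The names lm1, s2 and cp are illustrative.

```r
lm1 <- lm(Fertility ~ ., data = swiss)

# Residual variance estimate from the full model, used as a fixed scale.
s2 <- summary(lm1)$sigma^2

# With scale > 0 the criterion for a linear model is RSS/scale - n + k * edf.
extractAIC(lm1, scale = s2)

cp <- step(lm1, scale = s2, trace = 0)
cp$anova
```

For the full model RSS/s2 equals the residual degrees of freedom, so the criterion reduces to the number of estimated coefficients, which is the familiar property of Mallows' C_p.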
There is a potential problem in using glm fits with a variable scale,
as in that case the deviance is not simply related to the maximized
log-likelihood. The function extractAIC.glm makes the appropriate
adjustment for a gaussian family, but may need to be amended for other
cases. (The binomial and poisson families have fixed scale by default
and do not correspond to a particular maximum-likelihood problem for
variable scale.)
Value
The stepwise-selected model is returned, with up to two additional
components. There is an "anova" component corresponding to the steps
taken in the search, as well as a "keep" component if the keep=
argument was supplied in the call. The "Resid. Dev" column of the
analysis of deviance table refers to a constant minus twice the
maximized log likelihood: it will be a deviance only in cases where a
saturated model is well-defined (thus excluding lm, aov and survreg
fits, for example).
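A small sketch of inspecting both components, assuming a keep function that takes the fitted model and its criterion value and returns a named list. The function body and the names lm1, slm and aic_path are illustrative assumptions.

```r
lm1 <- lm(Fertility ~ ., data = swiss)

# Record the criterion and the number of terms for each model visited.
slm <- step(lm1, trace = 0,
            keep = function(model, aic) {
              list(aic = aic,
                   nterms = length(attr(terms(model), "term.labels")))
            })

slm$anova   # one row per model on the search path
slm$keep    # one column per model, one row per element returned by keep

# Criterion values along the path, extracted from the "keep" component.
aic_path <- unlist(slm$keep["aic", ])
aic_path
```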
Warning
The model fitting must apply the models to the same dataset. This may
be a problem if there are missing values and R's default of
na.action = na.omit is used. We suggest you remove the missing values
first.
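A minimal sketch of removing missing values before the search, assuming the built-in airquality data (which has missing Ozone and Solar.R values). The choice of columns and the names aq, full and sel are illustrative.

```r
# Keep only the variables of interest, then drop incomplete rows once,
# so every candidate model is fitted to exactly the same observations.
aq <- na.omit(airquality[, c("Ozone", "Solar.R", "Wind", "Temp")])

full <- lm(Ozone ~ ., data = aq)
sel <- step(full, trace = 0)

# Both fits use all rows of the cleaned data.
c(nrow(aq), nobs(full), nobs(sel))
```

Dropping rows up front means the set of observations cannot change when a variable with missing values leaves or enters the model.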
Note
This function differs considerably from the function in S, which uses a number of approximations and does not compute the correct AIC.
Author(s)
B. D. Ripley
See Also
add1, drop1
Examples
example(lm)
step(lm.D9)
data(swiss)
summary(lm1 <- lm(Fertility ~ ., data = swiss))
slm1 <- step(lm1)
summary(slm1)
slm1$anova