Concerns with F-tests on multiple regression models in R

This post is pending

anova(mr_lm)
car::Anova(mr_lm, type = 2)

We can see from the \(P\) values that hp and wt are significant terms, whereas drat is not.

But remember that \(P\) values are derived from \(F\) statistics, and the \(F\) values in this ANOVA table are interesting. Note that the \(F\) value for hp is an order of magnitude greater than that of wt. Does that mean that hp is an order of magnitude more important?

It definitely doesn’t. This example reveals an issue with the default behavior of anova() when it is called on multiple regression models, an issue that is present but irrelevant when only a single term is included. By default, stats::anova() performs what is called Type I (sequential) ANOVA, which is sensitive to the order in which terms are added: each term is tested only after adjusting for the terms that precede it in the formula. Compare:

library(magrittr)  # provides the %>% pipe
lm(mpg ~ hp + wt + drat, data = mtcars) %>% anova()
lm(mpg ~ drat + hp + wt, data = mtcars) %>% anova()
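
The comparison above can also be checked programmatically. The following is a minimal sketch, not a definitive recipe: the names fit_a, fit_b, and term_names are made up for illustration, and it uses drop1() from the stats package to get marginal \(F\) tests (each term tested after all of the others). For a model with no interactions, like this one, those marginal tests should agree with what car::Anova(type = 2) reports.

```r
# Same three predictors, entered in two different orders
fit_a <- lm(mpg ~ hp + wt + drat, data = mtcars)
fit_b <- lm(mpg ~ drat + hp + wt, data = mtcars)

# Illustrative helper: the terms whose F values we want to line up
term_names <- c("hp", "wt", "drat")

# Sequential (Type I) F values: each term is tested after the terms before it
seq_a <- anova(fit_a)[term_names, "F value"]
seq_b <- anova(fit_b)[term_names, "F value"]

# Marginal F values: each term is tested after all of the other terms
mar_a <- drop1(fit_a, test = "F")[term_names, "F value"]
mar_b <- drop1(fit_b, test = "F")[term_names, "F value"]

# Do the F values survive a change in term order?
seq_order_invariant <- isTRUE(all.equal(seq_a, seq_b))
mar_order_invariant <- isTRUE(all.equal(mar_a, mar_b))

print(c(sequential = seq_order_invariant, marginal = mar_order_invariant))
```

If the sequential check comes back FALSE while the marginal check comes back TRUE, that is the order sensitivity described above: only the Type I table changes when the formula is reordered.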