data(foxes, package = "rethinking") # foxes dataset
<- as_tibble(foxes)
foxes # stardardize
<- foxes %>%
foxes mutate(F = as.numeric(scale(avgfood)),
G = as.numeric(scale(groupsize)),
A = as.numeric(scale(area)),
W = as.numeric(scale(weight)))
07 - Fitting Over & Under
Homework
Revisit the urban fox data, data(foxes)
from the last class. Use WAIC or PSIS based model comparison on five different models, each using weight
as the outcome, and containing these sets of predictor variables
- avgfood + groupsize + area
- avgfood + groupsize
- groupsize + area
- avgfood
- area
Can you explain the relative differences in WAIC scores, using the fox DAG from the previous class? Be sure to pay attention to the standard error of the score differences.
<- brm(
mfox1 data = foxes, family = gaussian,
~ F + G + A,
W prior = c(
prior(normal(0, 0.2), class = Intercept),
prior(normal(0, 0.5), class = b),
prior(exponential(1), class = sigma)),
file = here::here(models, "model07_foxhw1"))
<- brm(
mfox2 data = foxes, family = gaussian,
~ F + G,
W prior = c(
prior(normal(0, 0.2), class = Intercept),
prior(normal(0, 0.5), class = b),
prior(exponential(1), class = sigma)),
file = here::here(models, "model07_foxhw2"))
<- brm(
mfox3 data = foxes, family = gaussian,
~ G + A,
W prior = c(
prior(normal(0, 0.2), class = Intercept),
prior(normal(0, 0.5), class = b),
prior(exponential(1), class = sigma)),
file = here::here(models, "model07_foxhw3"))
<- brm(
mfox4 data = foxes, family = gaussian,
~ F,
W prior = c(
prior(normal(0, 0.2), class = Intercept),
prior(normal(0, 0.5), class = b),
prior(exponential(1), class = sigma)),
file = here::here(models, "model07_foxhw4"))
<- brm(
mfox5 data = foxes, family = gaussian,
~ A,
W prior = c(
prior(normal(0, 0.2), class = Intercept),
prior(normal(0, 0.5), class = b),
prior(exponential(1), class = sigma)),
file = here::here(models, "model07_foxhw5"))
<- add_criterion(mfox1, c("waic", "loo"))
mfox1 <- add_criterion(mfox2, c("waic", "loo"))
mfox2 <- add_criterion(mfox3, c("waic", "loo"))
mfox3 <- add_criterion(mfox4, c("waic", "loo"))
mfox4 <- add_criterion(mfox5, c("waic", "loo"))
mfox5
<- loo_compare(
w
mfox1, mfox2, mfox3, mfox4, mfox5, criterion = "waic")
print(w, simplify = F)
elpd_diff se_diff elpd_waic se_elpd_waic p_waic se_p_waic waic se_waic
mfox1 0.0 0.0 -161.4 7.8 4.4 0.8 322.8 15.5
mfox3 -0.5 1.3 -161.9 7.6 3.5 0.6 323.7 15.2
mfox2 -0.5 1.7 -161.9 7.7 3.5 0.6 323.7 15.4
mfox4 -5.2 3.4 -166.6 6.7 2.3 0.3 333.3 13.4
mfox5 -5.3 3.4 -166.7 6.7 2.4 0.4 333.4 13.3
Comment from the textbook
Notice that the top three models are m1, m3, and m2. They have very similar WAIC values. The dif- ferences are small and smaller in all cases than the standard error of the difference. WAIC sees these models are tied. This makes sense, given the DAG, because as long as a model has groupsize in it, we can include either avgfood or area or both and get the same inferences. Another way to think of this is that the influence of good, adjusting for group size, is (according to the DAG) the same as the influence of area, adjusting for group size, because the influence of area is routed entirely through food and group size. There are no backdoor paths.
What about the other two models, m4 and m5? These models are tied with one another, and both omit group size. Again, the influence of area passes entirely through food. So including only food or only area should produce the same inference—the total causal influence of area (or food) is just about zero.
library(dagitty)
library(ggdag)
Attaching package: 'ggdag'
The following object is masked from 'package:stats':
filter
<- dagitty(
fox_dag "dag {
area -> avgfood
avgfood -> groupsize
avgfood -> weight
groupsize -> weight
}")
coordinates(fox_dag) <-
list(x=c(area=2, avgfood=0, groupsize=4, weight=2),
y=c(area=0, avgfood=1, groupsize=1, weight=2))
tidy_dagitty(fox_dag) %>% ggdag()