11E1. If an event has probability 0.35, what are the log-odds of this event?

log(0.35/(1-0.35))
## [1] -0.6190392
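As a quick cross-check (my own addition), base R's qlogis() is the quantile function of the logistic distribution and returns the same log-odds:

qlogis(0.35)   # same value as above, about -0.619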

11E2. If an event has log-odds 3.2, what is the probability of this event?

log(p/(1-p)) = 3.2
p/(1-p) = exp(3.2)
p = exp(3.2)(1 - p) = exp(3.2) - p*exp(3.2)
p + p*exp(3.2) = exp(3.2)
p(1 + exp(3.2)) = exp(3.2)
p = exp(3.2)/(1 + exp(3.2))

exp(3.2)/(1+exp(3.2))
## [1] 0.9608343
library(rethinking)
## Loading required package: rstan
## Loading required package: StanHeaders
## Loading required package: ggplot2
## rstan (Version 2.21.8, GitRev: 2e1f913d3ca3)
## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)
## Loading required package: cmdstanr
## This is cmdstanr version 0.5.3
## - CmdStanR documentation and vignettes: mc-stan.org/cmdstanr
## - CmdStan path: /Users/brianbeckage/.cmdstan/cmdstan-2.31.0
## - CmdStan version: 2.31.0
## 
## A newer version of CmdStan is available. See ?install_cmdstan() to install it.
## To disable this check set option or environment variable CMDSTANR_NO_VER_CHECK=TRUE.
## Loading required package: parallel
## rethinking (Version 2.31)
## 
## Attaching package: 'rethinking'
## The following object is masked from 'package:rstan':
## 
##     stan
## The following object is masked from 'package:stats':
## 
##     rstudent
inv_logit(3.2) # equivalent function 
## [1] 0.9608343

11E3. Suppose that a coefficient in a logistic regression has value 1.7. What does this imply about the proportional change in odds of the outcome?

exp(1.7)
## [1] 5.473947

This means that each unit change in the predictor variable multiplies the odds of the event by about 5.5. To demystify this relationship a little, if the linear model L is the log-odds of the event, then the odds of the event are just exp(L). Now we want to compare the odds before and after increasing a predictor by one unit. We want to know how much the odds increase as a result of the unit increase in the predictor. We can use our dear friend algebra to solve this problem:

exp(α + βx)Z = exp(α + β(x + 1))

The left side is the odds of the event before increasing x. The Z represents the proportional change in odds that we’re going to solve for: its unknown value will make the left side equal to the right side. The right side is the odds of the event after increasing x by 1 unit. So we just solve for Z. The answer is Z = exp(β). And that’s where the formula comes from.
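A quick numerical check (with hypothetical values for the intercept and predictor, chosen only for illustration) confirms that the ratio of odds is exp(β):

a <- 0.5 ; b <- 1.7 ; x <- 1          # hypothetical intercept, slope, predictor value
odds_before <- exp( a + b*x )         # odds at x
odds_after  <- exp( a + b*(x + 1) )   # odds at x + 1
odds_after / odds_before              # equals exp(1.7), about 5.47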

11M3. Explain why the logit link is appropriate for a binomial generalized linear model.

It is conventional to use a logit link for a binomial GLM because we need to map the continuous linear model value to a probability parameter that is bounded between zero and one. The inverse-logit function, often known as the logistic, is one way to do this. There are deeper reasons for using the logistic. It arises naturally when working with multinomial probability densities. There was a hint of this in one of the Overthinking boxes in Chapter 9, in which you saw how to derive the conditions under which the binomial distribution has maximum entropy.
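As a small illustration (a sketch using the rethinking package's inv_logit, loaded above), the inverse-logit squashes any real-valued linear model output into the (0,1) interval:

curve( inv_logit(x) , from=-6 , to=6 ,
    xlab="linear model value (log-odds)" , ylab="probability" )
inv_logit( c(-10, 0, 10) )   # results are always strictly between 0 and 1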

11H3. The data contained in data(salamanders) are counts of salamanders (Plethodon elongatus) from 47 different 49-m² plots in northern California. The column SALAMAN is the count in each plot, and the columns PCTCOVER and FORESTAGE are percent of ground cover and age of trees in the plot, respectively. You will model SALAMAN as a Poisson variable.

  1. Model the relationship between density and percent cover, using a log-link (same as the example in the book and lecture). Use weakly informative priors of your choosing. Check the quadratic approximation again, by comparing quap to ulam. Then plot the expected counts and their 89% interval against percent cover. In which ways does the model do a good job? A bad job?

Loading and standardizing the predictors

data(salamanders)
d <- salamanders
d$C <- standardize(d$PCTCOVER)
d$A <- standardize(d$FORESTAGE)
head(d)
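The priors used below, a ~ Normal(0,1) and bC ~ Normal(0,0.5) on the log scale, can be checked with a quick prior predictive simulation. This sketch is my own addition; the object names are arbitrary:

N <- 1e4
a_sim  <- rnorm( N , 0 , 1 )       # simulated intercepts
bC_sim <- rnorm( N , 0 , 0.5 )     # simulated slopes
dens( exp(a_sim) )                 # implied mean count at average cover (C = 0)
mean( exp(a_sim + bC_sim*2) )      # implied mean count at high cover (C = 2)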
f <- alist(
    SALAMAN ~ dpois( lambda ),
    log(lambda) <- a + bC*C,
    a ~ dnorm(0,1),
    bC ~ dnorm(0,0.5) )
m1 <- ulam( f , data=d , chains=4 )
## Running MCMC with 4 sequential chains, with 1 thread(s) per chain...
## 
## Chain 1 Iteration:   1 / 1000 [  0%]  (Warmup) 
## Chain 1 Iteration: 100 / 1000 [ 10%]  (Warmup) 
## Chain 1 Iteration: 200 / 1000 [ 20%]  (Warmup) 
## Chain 1 Iteration: 300 / 1000 [ 30%]  (Warmup) 
## Chain 1 Iteration: 400 / 1000 [ 40%]  (Warmup) 
## Chain 1 Iteration: 500 / 1000 [ 50%]  (Warmup) 
## Chain 1 Iteration: 501 / 1000 [ 50%]  (Sampling) 
## Chain 1 Iteration: 600 / 1000 [ 60%]  (Sampling) 
## Chain 1 Iteration: 700 / 1000 [ 70%]  (Sampling) 
## Chain 1 Iteration: 800 / 1000 [ 80%]  (Sampling) 
## Chain 1 Iteration: 900 / 1000 [ 90%]  (Sampling) 
## Chain 1 Iteration: 1000 / 1000 [100%]  (Sampling) 
## Chain 1 finished in 0.0 seconds.
## Chain 2 Iteration:   1 / 1000 [  0%]  (Warmup) 
## Chain 2 Iteration: 100 / 1000 [ 10%]  (Warmup) 
## Chain 2 Iteration: 200 / 1000 [ 20%]  (Warmup) 
## Chain 2 Iteration: 300 / 1000 [ 30%]  (Warmup) 
## Chain 2 Iteration: 400 / 1000 [ 40%]  (Warmup) 
## Chain 2 Iteration: 500 / 1000 [ 50%]  (Warmup) 
## Chain 2 Iteration: 501 / 1000 [ 50%]  (Sampling) 
## Chain 2 Iteration: 600 / 1000 [ 60%]  (Sampling) 
## Chain 2 Iteration: 700 / 1000 [ 70%]  (Sampling) 
## Chain 2 Iteration: 800 / 1000 [ 80%]  (Sampling) 
## Chain 2 Iteration: 900 / 1000 [ 90%]  (Sampling) 
## Chain 2 Iteration: 1000 / 1000 [100%]  (Sampling) 
## Chain 2 finished in 0.0 seconds.
## Chain 3 Iteration:   1 / 1000 [  0%]  (Warmup) 
## Chain 3 Iteration: 100 / 1000 [ 10%]  (Warmup) 
## Chain 3 Iteration: 200 / 1000 [ 20%]  (Warmup) 
## Chain 3 Iteration: 300 / 1000 [ 30%]  (Warmup) 
## Chain 3 Iteration: 400 / 1000 [ 40%]  (Warmup) 
## Chain 3 Iteration: 500 / 1000 [ 50%]  (Warmup) 
## Chain 3 Iteration: 501 / 1000 [ 50%]  (Sampling) 
## Chain 3 Iteration: 600 / 1000 [ 60%]  (Sampling) 
## Chain 3 Iteration: 700 / 1000 [ 70%]  (Sampling) 
## Chain 3 Iteration: 800 / 1000 [ 80%]  (Sampling) 
## Chain 3 Iteration: 900 / 1000 [ 90%]  (Sampling) 
## Chain 3 Iteration: 1000 / 1000 [100%]  (Sampling) 
## Chain 3 finished in 0.0 seconds.
## Chain 4 Iteration:   1 / 1000 [  0%]  (Warmup) 
## Chain 4 Iteration: 100 / 1000 [ 10%]  (Warmup) 
## Chain 4 Iteration: 200 / 1000 [ 20%]  (Warmup) 
## Chain 4 Iteration: 300 / 1000 [ 30%]  (Warmup) 
## Chain 4 Iteration: 400 / 1000 [ 40%]  (Warmup) 
## Chain 4 Iteration: 500 / 1000 [ 50%]  (Warmup) 
## Chain 4 Iteration: 501 / 1000 [ 50%]  (Sampling) 
## Chain 4 Iteration: 600 / 1000 [ 60%]  (Sampling) 
## Chain 4 Iteration: 700 / 1000 [ 70%]  (Sampling) 
## Chain 4 Iteration: 800 / 1000 [ 80%]  (Sampling) 
## Chain 4 Iteration: 900 / 1000 [ 90%]  (Sampling) 
## Chain 4 Iteration: 1000 / 1000 [100%]  (Sampling) 
## Chain 4 finished in 0.0 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.0 seconds.
## Total execution time: 0.5 seconds.
precis(m1)
plot( d$C , d$SALAMAN , col=rangi2 , lwd=2 ,
    xlab="cover (standardized)" , ylab="salamanders observed" )
C_seq <- seq( from=-2 , to=2 , length.out=30 )
l <- link( m1 , data=list(C=C_seq) )
lines( C_seq , colMeans( l ) )
shade( apply( l , 2 , PI ) , C_seq )
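The problem also asks to check the quadratic approximation by comparing quap to ulam. A minimal sketch, reusing the same formula list f (the object name m1q is my own):

m1q <- quap( f , data=d )
precis( m1q )   # compare these summaries to precis(m1) above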

The model does a reasonable job of capturing the overall trend of expected counts increasing with percent cover, and it fits well at low cover, where counts are uniformly near zero. It does a poor job at the very high end of percent cover, where the observed counts are far more variable than the expected count, i.e. the variance is much greater than the mean (over-dispersion). This is a case where a negative binomial (gamma-Poisson) model might be appropriate.
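A hedged sketch of such a model, a gamma-Poisson version of the same regression using rethinking's dgampois with a dispersion parameter phi (the names f_nb and m1_nb and the dexp(1) prior are my own choices):

f_nb <- alist(
    SALAMAN ~ dgampois( lambda , phi ),
    log(lambda) <- a + bC*C,
    a ~ dnorm(0,1),
    bC ~ dnorm(0,0.5),
    phi ~ dexp(1) )
m1_nb <- ulam( f_nb , data=d , chains=4 )
precis( m1_nb )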

  2. Can you improve the model by using the other predictor, FORESTAGE? Try any models you think useful. Can you explain why FORESTAGE helps or does not help with prediction?
f2 <- alist(
    SALAMAN ~ dpois( lambda ),
    log(lambda) <- a + bC*C + bA*A,
    a ~ dnorm(0,1),
    c(bC,bA) ~ dnorm(0,0.5) )
m2 <- ulam( f2 , data=d , chains=4 )
## Running MCMC with 4 sequential chains, with 1 thread(s) per chain...
## 
## Chain 1 Iteration:   1 / 1000 [  0%]  (Warmup) 
## Chain 1 Iteration: 100 / 1000 [ 10%]  (Warmup) 
## Chain 1 Iteration: 200 / 1000 [ 20%]  (Warmup) 
## Chain 1 Iteration: 300 / 1000 [ 30%]  (Warmup) 
## Chain 1 Iteration: 400 / 1000 [ 40%]  (Warmup) 
## Chain 1 Iteration: 500 / 1000 [ 50%]  (Warmup) 
## Chain 1 Iteration: 501 / 1000 [ 50%]  (Sampling) 
## Chain 1 Iteration: 600 / 1000 [ 60%]  (Sampling) 
## Chain 1 Iteration: 700 / 1000 [ 70%]  (Sampling) 
## Chain 1 Iteration: 800 / 1000 [ 80%]  (Sampling) 
## Chain 1 Iteration: 900 / 1000 [ 90%]  (Sampling) 
## Chain 1 Iteration: 1000 / 1000 [100%]  (Sampling) 
## Chain 1 finished in 0.0 seconds.
## Chain 2 Iteration:   1 / 1000 [  0%]  (Warmup) 
## Chain 2 Iteration: 100 / 1000 [ 10%]  (Warmup) 
## Chain 2 Iteration: 200 / 1000 [ 20%]  (Warmup) 
## Chain 2 Iteration: 300 / 1000 [ 30%]  (Warmup) 
## Chain 2 Iteration: 400 / 1000 [ 40%]  (Warmup) 
## Chain 2 Iteration: 500 / 1000 [ 50%]  (Warmup) 
## Chain 2 Iteration: 501 / 1000 [ 50%]  (Sampling) 
## Chain 2 Iteration: 600 / 1000 [ 60%]  (Sampling) 
## Chain 2 Iteration: 700 / 1000 [ 70%]  (Sampling) 
## Chain 2 Iteration: 800 / 1000 [ 80%]  (Sampling) 
## Chain 2 Iteration: 900 / 1000 [ 90%]  (Sampling) 
## Chain 2 Iteration: 1000 / 1000 [100%]  (Sampling) 
## Chain 2 finished in 0.0 seconds.
## Chain 3 Iteration:   1 / 1000 [  0%]  (Warmup) 
## Chain 3 Iteration: 100 / 1000 [ 10%]  (Warmup) 
## Chain 3 Iteration: 200 / 1000 [ 20%]  (Warmup) 
## Chain 3 Iteration: 300 / 1000 [ 30%]  (Warmup) 
## Chain 3 Iteration: 400 / 1000 [ 40%]  (Warmup) 
## Chain 3 Iteration: 500 / 1000 [ 50%]  (Warmup) 
## Chain 3 Iteration: 501 / 1000 [ 50%]  (Sampling) 
## Chain 3 Iteration: 600 / 1000 [ 60%]  (Sampling) 
## Chain 3 Iteration: 700 / 1000 [ 70%]  (Sampling) 
## Chain 3 Iteration: 800 / 1000 [ 80%]  (Sampling) 
## Chain 3 Iteration: 900 / 1000 [ 90%]  (Sampling) 
## Chain 3 Iteration: 1000 / 1000 [100%]  (Sampling) 
## Chain 3 finished in 0.0 seconds.
## Chain 4 Iteration:   1 / 1000 [  0%]  (Warmup) 
## Chain 4 Iteration: 100 / 1000 [ 10%]  (Warmup) 
## Chain 4 Iteration: 200 / 1000 [ 20%]  (Warmup) 
## Chain 4 Iteration: 300 / 1000 [ 30%]  (Warmup) 
## Chain 4 Iteration: 400 / 1000 [ 40%]  (Warmup) 
## Chain 4 Iteration: 500 / 1000 [ 50%]  (Warmup) 
## Chain 4 Iteration: 501 / 1000 [ 50%]  (Sampling) 
## Chain 4 Iteration: 600 / 1000 [ 60%]  (Sampling) 
## Chain 4 Iteration: 700 / 1000 [ 70%]  (Sampling) 
## Chain 4 Iteration: 800 / 1000 [ 80%]  (Sampling) 
## Chain 4 Iteration: 900 / 1000 [ 90%]  (Sampling) 
## Chain 4 Iteration: 1000 / 1000 [100%]  (Sampling) 
## Chain 4 finished in 0.0 seconds.
## 
## All 4 chains finished successfully.
## Mean chain execution time: 0.0 seconds.
## Total execution time: 0.5 seconds.
precis(m2)

Notice that the estimate for bA is now nearly zero, with a narrow interval around it. There isn’t much association between forest age and salamander density, once we also condition on percent cover.

Why doesn’t forest age help much? It does improve predictions in the absence of percent cover: if all we knew was forest age, it would be a useful predictor. But once we know percent cover, forest age adds little, plausibly because forest age matters mainly through the ground cover that older forests provide.
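To check that claim, a sketch of an age-only model plus a model comparison (the names f3 and m3 are my own; compare() defaults to WAIC):

f3 <- alist(
    SALAMAN ~ dpois( lambda ),
    log(lambda) <- a + bA*A,
    a ~ dnorm(0,1),
    bA ~ dnorm(0,0.5) )
m3 <- ulam( f3 , data=d , chains=4 )
precis( m3 )                      # check whether bA is positive when cover is omitted
compare( m1 , m2 , m3 )           # compare predictive accuracy across the three models
cor( d$PCTCOVER , d$FORESTAGE )   # how strongly the two predictors covary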
