Bayesian Posterior

DATA: Loading the weight vs height data

library(rethinking)

## Loading required package: rstan

## Loading required package: StanHeaders

## Loading required package: ggplot2

## rstan (Version 2.21.8, GitRev: 2e1f913d3ca3)

## For execution on a local, multicore CPU with excess RAM we recommend calling
## options(mc.cores = parallel::detectCores()).
## To avoid recompilation of unchanged Stan programs, we recommend calling
## rstan_options(auto_write = TRUE)

## Loading required package: cmdstanr

## This is cmdstanr version 0.5.3

## - CmdStanR documentation and vignettes: mc-stan.org/cmdstanr

## - CmdStan path: /Users/brianbeckage/.cmdstan/cmdstan-2.31.0

## - CmdStan version: 2.31.0

## Loading required package: parallel

## rethinking (Version 2.21)

## 
## Attaching package: 'rethinking'

## The following object is masked from 'package:rstan':
## 
##     stan

## The following object is masked from 'package:stats':
## 
##     rstudent

data("Howell1")
myData<-Howell1
myData<-myData[myData$age>18,]
head(myData)

plot(myData$height,myData$weight)

LIKELIHOOD FUNCTION: Here is the likelihood function that we’ve used to fit a linear regession to these data.

nllRegressionLinear<-function(parmVector,weightDat,heightDat,print=F){

  intercept<-parmVector[1]
  B<-parmVector[2]
  mySD<-parmVector[3]
  
  w<-weightDat
  h<-heightDat
  mu<-intercept + B*h
  
  nllik<- -sum(dnorm(x=w,mean=mu,sd=mySD,log=TRUE))

  if (print) cat("nllik= ",nllik,sep=" ",fill=T)
  return(nllik)}

Below is the code that we use to find the Maximum Likelihood Estimate (MLE) for the linear model: w ~ normal(mu, sd) mu<- intercept + B*height

We have three parameters to estimate: intercept, B, and sd

parVecLinear<-c(0.1,0.1,10) # Initial parameter values 
outRegLinear<-optim(par=parVecLinear,fn=nllRegressionLinear,method="L-BFGS-B",lower=c(-Inf,-Inf, 0.01),upper=c(Inf,Inf, Inf), weightDat=myData$weight, heightDat=myData$height,print=F)
outRegLinear$par

## [1] -51.6295921   0.6251449   4.2428686

outRegLinear$value

## [1] 991.0056

We can verify the accuracy of our fit by comparison to the built in lm() function. We can see that the estimates are very similar.

lmFit<-lm(weight~height,data=myData)
summary(lmFit)

## 
## Call:
## lm(formula = weight ~ height, data = myData)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.8344  -3.0747  -0.2775   2.8402  14.6456 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -51.62959    4.56320  -11.31   <2e-16 ***
## height        0.62514    0.02947   21.21   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.255 on 344 degrees of freedom
## Multiple R-squared:  0.5667, Adjusted R-squared:  0.5655 
## F-statistic:   450 on 1 and 344 DF,  p-value: < 2.2e-16

MXIMUM A POSTERIOR (MAP) PROBABILITY ESIMATE: This is what the quap function in the Rethinking library uses to estimate the posterior distribution. In this approach, we just need to modify our likelihood function to include the prior distribution. Recall that the posterior ~ likelihood * prior. Since we are working on the log scale, the likelihood and prior are added together rather than multiplied. So the main change in the function below is that I have added the priors into the calculation, then combine the likelihood and priors to get the posterior density.

prior(intercept) ~ normal(0, 10) prior(B) ~ normal(0, 10) prior(sd) ~ exp(1)

nlpRegressionLinear<-function(parmVector,weightDat,heightDat,priorVect,print=F){

  intercept<-parmVector[1]
  B<-parmVector[2]
  mySD<-parmVector[3]
  
  w<-weightDat
  h<-heightDat
  mu<-intercept + B*h
  
  nllik<- -sum(dnorm(x=w,mean=mu,sd=mySD,log=TRUE))
  
  meanNormalPrior<-priorVect[1]
  sdNormalPrior<-priorVect[2]
  expPrior<-priorVect[3]
  
  nlPriorIntercept<- -dnorm(intercept, mean =  meanNormalPrior, sd = sdNormalPrior, log=TRUE)
  nlPriorB<- -dnorm(B, mean =  meanNormalPrior, sd = sdNormalPrior, log=TRUE)
  nlPriorSD<- -dexp(mySD, rate=expPrior, log=TRUE)
  
  nllPosterior<- nllik + nlPriorIntercept + nlPriorB + nlPriorSD
   # browser() # for debugging
  if (print) cat("nllik= ",nllik,sep=" ", "nllPosterior= ", nllPosterior,fill=T)
  return(nllPosterior)}

Now we use optim to find the MAP just as we did above for the MLE.

parVecLinear<-c(0.1,0.1,10) # Initial parameter values 
priorVectLinear<-c(0,10,1) # these are parameter values for the priors
outBayesRegLinear<-optim(par=parVecLinear,fn=nlpRegressionLinear,method="L-BFGS-B",lower=c(-Inf,-Inf, 0.01),upper=c(Inf,Inf, Inf), weightDat=myData$weight, heightDat=myData$height,priorVect=priorVectLinear,print=F)
outBayesRegLinear$par

## [1] -42.7814838   0.5680739   4.2399743

outBayesRegLinear$value

## [1] 1012.735

Finally, let’s compare our fit above to that produced by the quap function within the Rethinking package.

library(rethinking)
lm.qa <- quap(
    alist(
        weight ~ dnorm(mu,sd) ,  # normal likelihood
        sd ~ dexp(1),     # exponential prior
        mu<-intercept + beta*height,
        intercept~dnorm(0,10),
        beta~dnorm(0,10)
    ) ,
    data=list(weight=myData$weight,height=myData$height) )

precis(lm.qa)

Looking a little closer at the quap fit. quap uses the same optim function that we used above and you can access details of the optim fit by examining components of the fit. Note that we are digging deeper here and these components are not normally visible to users, but we are access the slots of the returned object. (No need to worry about this as it is a bit more advanced computational concept.)

The bottom like is that our estimates of the MAP and the posterior density at the MAP from quap (i.e. ‘value’ below -> 1012) are very similar.

lm.qa

## 
## Quadratic approximate posterior distribution
## 
## Formula:
## weight ~ dnorm(mu, sd)
## sd ~ dexp(1)
## mu <- intercept + beta * height
## intercept ~ dnorm(0, 10)
## beta ~ dnorm(0, 10)
## 
## Posterior means:
##          sd   intercept        beta 
##   4.2399355 -42.7807873   0.5680683 
## 
## Log-likelihood: -992.9

slot(lm.qa, 'optim')

## $par
##          sd   intercept        beta 
##   4.2399355 -42.7807873   0.5680683 
## 
## $value
## [1] 1012.735
## 
## $counts
## function gradient 
##      175       95 
## 
## $convergence
## [1] 0
## 
## $message
## NULL
## 
## $hessian
##                      sd    intercept         beta
## sd        39.2054992631   -0.2018988 1.171969e-04
## intercept -0.2018988425   19.2567589 2.976403e+03
## beta       0.0001171969 2976.4028746 4.614436e+05
## 
## $minuslogl
## [1] 992.8995

Scratch Code #########################################################################################################

Calculating the likelihood and posteior density for different sets of paramaters and priors.

parVecLinearMLE<-c(-51.6295921,0.6251449,4.2428686)

nlpRegressionLinear(parVecLinearMLE,weightDat=myData$weight,heightDat=myData$height,priorVectLinear,print=F)

## [1] 1015.022

nllRegressionLinear(parVecLinearMLE,weightDat=myData$weight,heightDat=myData$height,print=F)

## [1] 991.0056

parVecLinearBayes<-c(-42.79,0.57,4.24)
nlpRegressionLinear(parVecLinearBayes,weightDat=myData$weight,heightDat=myData$height,priorVectLinear,print=F)

## [1] 1013.544

nllRegressionLinear(parVecLinearBayes,weightDat=myData$weight,heightDat=myData$height,print=F)

## [1] 993.7042

This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Cmd+Shift+Enter.

plot(cars)

Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Cmd+Option+I.

When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Cmd+Shift+K to preview the HTML file).

The preview shows you a rendered HTML copy of the contents of the editor. Consequently, unlike Knit, Preview does not run any R code chunks. Instead, the output of the chunk when it was last run in the editor is displayed.