Generating some data with a know probability of germination ->
myP
myP<-0.67
myData<-rbinom(n=10,size=1,prob=myP)
myData
[1] 1 1 1 0 0 0 1 1 1 1
mean(myData)
[1] 0.7
Here is the form of the Likelihood Functions that we have been using
to fit models. This is specific to the Binomial probability
function.
nllBinomial<-function(parmEstimate,seedlingDat){
nllik<- -sum(dbinom(seedlingDat,size=1,parmEstimate,log=TRUE))
cat("nllik= ",nllik,sep=" ",fill=T)
return(nllik)
}
nllBinomial(0.6,myData)
nllik= 6.324652
[1] 6.324652
nllBinomial(0.1,myData)
nllik= 16.43418
[1] 16.43418
We would then minimize the function using optim().
parVec<-c(0.1) # Initial parameter values
outLogReg<-optim(par=parVec,fn=nllBinomial,method="L-BFGS-B",lower=c(0.01),upper=c(0.99),seedlingDat=myData)
nllik= 16.43418
nllik= 16.36786
nllik= 16.5012
nllik= 13.88586
nllik= 13.88586
nllik= 13.60701
nllik= 10.07198
nllik= 10.05039
nllik= 10.09367
nllik= 8.576073
nllik= 8.560991
nllik= 8.591218
nllik= 6.609054
nllik= 6.602835
nllik= 6.61531
nllik= 6.11829
nllik= 6.117376
nllik= 6.119248
nllik= 6.108907
nllik= 6.109091
nllik= 6.108772
nllik= 6.108643
nllik= 6.108661
nllik= 6.108673
nllik= 6.108643
nllik= 6.108667
nllik= 6.108667
nllik= 6.108643
nllik= 6.108667
nllik= 6.108667
outLogReg$par #
[1] 0.6999994
The core portion of the likelihood function is this line of code:
nllik<- -sum(dbinom(seedlingDat,size=1,parmEstimate,log=TRUE))
I want to step through this line of code so that its clear what its
doing.
dataVector<-c(1,0,1)
where 1 indicates a seed produced a seedling and 0 indicates no
seedling emerged.
So the likelihood is the likelihood of observing a 1, 0, and 1 given
some germination probability,G.
Likelihood = P(1|G) * P(0|G) * P(1|G) where the P() is the binomial
distribution. For the binomial distribution, we have to specify the size
of the trials. For each of our ‘trials’, we plant a single seed and
observe if a seedling emerges, So the size of our trial is 1, because we
plant a single seed.
We next choose a value of G, then calculate the likelihood of
observing our data as follows:
G<-0.10
dbinom(1,size=1,G) * dbinom(0,size=1,G) * dbinom(1,size=1,G)
[1] 0.009
G<-0.7
dbinom(1,size=1,G) * dbinom(0,size=1,G) * dbinom(1,size=1,G)
[1] 0.147
G<-0.9
dbinom(1,size=1,G) * dbinom(0,size=1,G) * dbinom(1,size=1,G)
[1] 0.081
By taking the logs of each of these probabilities, we add them
together rather than multiply them. This helps avoid numerical issues
associated with getting very small numbers from multiplying long
sequences of probabilities together. Eventually, the computer will just
say that the final product is 0.
So we modify the above code by setting log=T and then adding the
terms together.
G<-0.10
dbinom(1,size=1,G, log=T) + dbinom(0,size=1,G, log=T) + dbinom(1,size=1,G, log=T)
[1] -4.710531
G<-0.7
dbinom(1,size=1,G, log=T) + dbinom(0,size=1,G, log=T) + dbinom(1,size=1,G, log=T)
[1] -1.917323
G<-0.9
dbinom(1,size=1,G, log=T) + dbinom(0,size=1,G, log=T) + dbinom(1,size=1,G, log=T)
[1] -2.513306
Finally, we want to be able to compute the likelihood over a set of
alternative values of G. The value of G that produces the highest
likelihood is the most likely value or maximum likelihood estimate
(mle).
In the code below, I create a sequence of values of G, then calculate
the likelihood associated with each of these values.
Note that I am using the map_dbl function from the purrr library. The
map_dbl function applies the myLike function to each value of G.
library(purrr)
G<-seq(from=0, to=1,by=0.10)
myLike<-function(G) {dbinom(1,size=1,G, log=T) + dbinom(0,size=1,G, log=T) + dbinom(1,size=1,G, log=T) }
map_dbl(G,myLike)
[1] -Inf -4.710531 -3.442019 -2.764621 -2.343407 -2.079442 -1.937942 -1.917323 -2.055725 -2.513306 -Inf
Since the dbinom function can be vectorized, we can simplify this
calculation as follows
G<-0.1
dataVector<-c(1,0,1)
sum(dbinom(dataVector,size=1, G, log=T))
[1] -4.710531
so that our code above becomes
library(purrr)
dataVector<-c(1,0,1)
G<-seq(from=0, to=1,by=0.10)
myLike2<-function(G) {sum(dbinom(dataVector,size=1, G, log=T)) }
map_dbl(G,myLike2)
[1] -Inf -4.710531 -3.442019 -2.764621 -2.343407 -2.079442 -1.937942 -1.917323 -2.055725 -2.513306 -Inf
Finally, we add a minus sign in front of the sum to calculate the
negative log likelihood, so that we can minimize the function using
optim() rather than maximize it.
library(purrr)
dataVector<-c(1,0,1)
G<-seq(from=0, to=1,by=0.10)
myLike2<-function(G) {-sum(dbinom(dataVector,size=1, G, log=T)) }
nllVector<-map_dbl(G,myLike2)
nllVector
[1] Inf 4.710531 3.442019 2.764621 2.343407 2.079442 1.937942 1.917323 2.055725 2.513306 Inf
Now we can plot the likelihood function. Its not smooth because we
have a small number of values of G that we are using. The more values of
G that we use, the smoother the function will become.
plot(G,nllVector,typ='l')
This is an R Markdown
Notebook. When you execute code within the notebook, the results appear
beneath the code.
Try executing this chunk by clicking the Run button within
the chunk or by placing your cursor inside it and pressing
Cmd+Shift+Enter.
plot(cars)
Add a new chunk by clicking the Insert Chunk button on the
toolbar or by pressing Cmd+Option+I.
When you save the notebook, an HTML file containing the code and
output will be saved alongside it (click the Preview button or
press Cmd+Shift+K to preview the HTML file).
The preview shows you a rendered HTML copy of the contents of the
editor. Consequently, unlike Knit, Preview does not
run any R code chunks. Instead, the output of the chunk when it was last
run in the editor is displayed.
LS0tCnRpdGxlOiAiTGlrZWxpaG9vZCBGdW5jdGlvbiBDb3JlIGNvZGUiCm91dHB1dDoKICBodG1sX2RvY3VtZW50OgogICAgZGZfcHJpbnQ6IHBhZ2VkCi0tLQoKCgpHZW5lcmF0aW5nIHNvbWUgZGF0YSB3aXRoIGEga25vdyBwcm9iYWJpbGl0eSBvZiBnZXJtaW5hdGlvbiAtPiBteVAKYGBge3J9Cm15UDwtMC42NwpteURhdGE8LXJiaW5vbShuPTEwLHNpemU9MSxwcm9iPW15UCkKbXlEYXRhCm1lYW4obXlEYXRhKQpgYGAKCkhlcmUgaXMgdGhlIGZvcm0gb2YgdGhlIExpa2VsaWhvb2QgRnVuY3Rpb25zIHRoYXQgd2UgaGF2ZSBiZWVuIHVzaW5nIHRvIGZpdAptb2RlbHMuICBUaGlzIGlzIHNwZWNpZmljIHRvIHRoZSBCaW5vbWlhbCBwcm9iYWJpbGl0eSBmdW5jdGlvbi4KCmBgYHtyfQoKbmxsQmlub21pYWw8LWZ1bmN0aW9uKHBhcm1Fc3RpbWF0ZSxzZWVkbGluZ0RhdCl7CgogIG5sbGlrPC0gLXN1bShkYmlub20oc2VlZGxpbmdEYXQsc2l6ZT0xLHBhcm1Fc3RpbWF0ZSxsb2c9VFJVRSkpCgogIGNhdCgibmxsaWs9ICIsbmxsaWssc2VwPSIgIixmaWxsPVQpCiAgcmV0dXJuKG5sbGlrKQp9CgpubGxCaW5vbWlhbCgwLjYsbXlEYXRhKQpubGxCaW5vbWlhbCgwLjEsbXlEYXRhKQoKYGBgCgoKV2Ugd291bGQgdGhlbiBtaW5pbWl6ZSB0aGUgZnVuY3Rpb24gdXNpbmcgb3B0aW0oKS4KCmBgYHtyfQpwYXJWZWM8LWMoMC4xKSAjIEluaXRpYWwgcGFyYW1ldGVyIHZhbHVlcyAKb3V0TG9nUmVnPC1vcHRpbShwYXI9cGFyVmVjLGZuPW5sbEJpbm9taWFsLG1ldGhvZD0iTC1CRkdTLUIiLGxvd2VyPWMoMC4wMSksdXBwZXI9YygwLjk5KSxzZWVkbGluZ0RhdD1teURhdGEpCm91dExvZ1JlZyRwYXIgIyAKYGBgCgoKVGhlIGNvcmUgcG9ydGlvbiBvZiB0aGUgbGlrZWxpaG9vZCBmdW5jdGlvbiBpcyB0aGlzIGxpbmUgb2YgY29kZToKCiAgbmxsaWs8LSAtc3VtKGRiaW5vbShzZWVkbGluZ0RhdCxzaXplPTEscGFybUVzdGltYXRlLGxvZz1UUlVFKSkKCkkgd2FudCB0byBzdGVwIHRocm91Z2ggdGhpcyBsaW5lIG9mIGNvZGUgc28gdGhhdCBpdHMgY2xlYXIgd2hhdCBpdHMgZG9pbmcuCgoKZGF0YVZlY3RvcjwtYygxLDAsMSkKCndoZXJlIDEgaW5kaWNhdGVzIGEgc2VlZCBwcm9kdWNlZCBhIHNlZWRsaW5nIGFuZCAwIGluZGljYXRlcyBubyBzZWVkbGluZyBlbWVyZ2VkLgoKClNvIHRoZSBsaWtlbGlob29kIGlzIHRoZSBsaWtlbGlob29kIG9mIG9ic2VydmluZyBhIDEsIDAsIGFuZCAxIGdpdmVuIHNvbWUgZ2VybWluYXRpb24gcHJvYmFiaWxpdHksRy4KCkxpa2VsaWhvb2QgPSBQKDF8RykgKiBQKDB8RykgKiBQKDF8Rykgd2hlcmUgdGhlIFAoKSBpcyB0aGUgYmlub21pYWwgZGlzdHJpYnV0aW9uLgpGb3IgdGhlIGJpbm9taWFsIGRpc3RyaWJ1dGlvbiwgd2UgaGF2ZSB0byBzcGVjaWZ5IHRoZSBzaXplIG9mIHRoZSB0cmlhbHMuCkZvciBlYWNoIG9mIG91ciAndHJpYWxzJywgd2UgcGxhbnQgYSBzaW5nbGUgc2VlZCBhbmQgb2JzZXJ2ZSBpZiBhIHNlZWRsaW5nIGVtZXJnZXMsClNvIHRoZSBzaXplIG9mIG91ciB0cmlhbCBpcyAxLCBiZWNhdXNlIHdlIHBsYW50IGEgc2luZ2xlIHNlZWQuCgpXZSBuZXh0IGNob29zZSBhIHZhbHVlIG9mIEcsIHRoZW4gY2FsY3VsYXRlIHRoZSBsaWtlbGlob29kIG9mIG9ic2VydmluZyBvdXIgZGF0YSAKYXMgZm9sbG93czoKCmBgYHtyfQpHPC0wLjEwCmRiaW5vbSgxLHNpemU9MSxHKSAqIGRiaW5vbSgwLHNpemU9MSxHKSAqIGRiaW5vbSgxLHNpemU9MSxHKSAKCkc8LTAuNwpkYmlub20oMSxzaXplPTEsRykgKiBkYmlub20oMCxzaXplPTEsRykgKiBkYmlub20oMSxzaXplPTEsRykgCgpHPC0wLjkKZGJpbm9tKDEsc2l6ZT0xLEcpICogZGJpbm9tKDAsc2l6ZT0xLEcpICogZGJpbm9tKDEsc2l6ZT0xLEcpIAoKYGBgCgpCeSB0YWtpbmcgdGhlIGxvZ3Mgb2YgZWFjaCBvZiB0aGVzZSBwcm9iYWJpbGl0aWVzLCB3ZSBhZGQgdGhlbSB0b2dldGhlciByYXRoZXIKdGhhbiBtdWx0aXBseSB0aGVtLiAgVGhpcyBoZWxwcyBhdm9pZCBudW1lcmljYWwgaXNzdWVzIGFzc29jaWF0ZWQgd2l0aCBnZXR0aW5nIHZlcnkgc21hbGwKbnVtYmVycyBmcm9tIG11bHRpcGx5aW5nIGxvbmcgc2VxdWVuY2VzIG9mIHByb2JhYmlsaXRpZXMgdG9nZXRoZXIuIEV2ZW50dWFsbHksIHRoZSAKY29tcHV0ZXIgd2lsbCBqdXN0IHNheSB0aGF0IHRoZSBmaW5hbCBwcm9kdWN0IGlzIDAuCgpTbyB3ZSBtb2RpZnkgdGhlIGFib3ZlIGNvZGUgYnkgc2V0dGluZyBsb2c9VCBhbmQgdGhlbiBhZGRpbmcgdGhlIHRlcm1zIHRvZ2V0aGVyLgoKYGBge3J9Ckc8LTAuMTAKZGJpbm9tKDEsc2l6ZT0xLEcsIGxvZz1UKSArIGRiaW5vbSgwLHNpemU9MSxHLCBsb2c9VCkgKyBkYmlub20oMSxzaXplPTEsRywgbG9nPVQpIAoKRzwtMC43CmRiaW5vbSgxLHNpemU9MSxHLCBsb2c9VCkgKyBkYmlub20oMCxzaXplPTEsRywgbG9nPVQpICsgZGJpbm9tKDEsc2l6ZT0xLEcsIGxvZz1UKQoKRzwtMC45CmRiaW5vbSgxLHNpemU9MSxHLCBsb2c9VCkgKyBkYmlub20oMCxzaXplPTEsRywgbG9nPVQpICsgZGJpbm9tKDEsc2l6ZT0xLEcsIGxvZz1UKQpgYGAKCkZpbmFsbHksIHdlIHdhbnQgdG8gYmUgYWJsZSB0byBjb21wdXRlIHRoZSBsaWtlbGlob29kIG92ZXIgYSBzZXQgb2YgCmFsdGVybmF0aXZlIHZhbHVlcyBvZiBHLiBUaGUgdmFsdWUgb2YgRyB0aGF0IHByb2R1Y2VzIHRoZSBoaWdoZXN0IGxpa2VsaWhvb2QKaXMgdGhlIG1vc3QgbGlrZWx5IHZhbHVlIG9yIG1heGltdW0gbGlrZWxpaG9vZCBlc3RpbWF0ZSAobWxlKS4KCkluIHRoZSBjb2RlIGJlbG93LCBJICBjcmVhdGUgYSBzZXF1ZW5jZSBvZiB2YWx1ZXMgb2YgRywgdGhlbiBjYWxjdWxhdGUKdGhlIGxpa2VsaWhvb2QgYXNzb2NpYXRlZCB3aXRoIGVhY2ggb2YgdGhlc2UgdmFsdWVzLgoKTm90ZSB0aGF0IEkgYW0gdXNpbmcgdGhlIG1hcF9kYmwgZnVuY3Rpb24gZnJvbSB0aGUgcHVycnIgbGlicmFyeS4gVGhlIG1hcF9kYmwgZnVuY3Rpb24KYXBwbGllcyB0aGUgbXlMaWtlIGZ1bmN0aW9uIHRvIGVhY2ggdmFsdWUgb2YgRy4gCgpgYGB7cn0KbGlicmFyeShwdXJycikKCkc8LXNlcShmcm9tPTAsIHRvPTEsYnk9MC4xMCkKCm15TGlrZTwtZnVuY3Rpb24oRykge2RiaW5vbSgxLHNpemU9MSxHLCBsb2c9VCkgKyBkYmlub20oMCxzaXplPTEsRywgbG9nPVQpICsgZGJpbm9tKDEsc2l6ZT0xLEcsIGxvZz1UKSB9CgptYXBfZGJsKEcsbXlMaWtlKQpgYGAKCgoKU2luY2UgdGhlIGRiaW5vbSBmdW5jdGlvbiBjYW4gYmUgdmVjdG9yaXplZCwgd2UgY2FuIHNpbXBsaWZ5IHRoaXMgY2FsY3VsYXRpb24gYXMgZm9sbG93cwoKYGBge3J9Ckc8LTAuMQpkYXRhVmVjdG9yPC1jKDEsMCwxKQpzdW0oZGJpbm9tKGRhdGFWZWN0b3Isc2l6ZT0xLCBHLCBsb2c9VCkpCmBgYAoKc28gdGhhdCBvdXIgY29kZSBhYm92ZSBiZWNvbWVzCgpgYGB7cn0KbGlicmFyeShwdXJycikKCmRhdGFWZWN0b3I8LWMoMSwwLDEpCkc8LXNlcShmcm9tPTAsIHRvPTEsYnk9MC4xMCkKCm15TGlrZTI8LWZ1bmN0aW9uKEcpIHtzdW0oZGJpbm9tKGRhdGFWZWN0b3Isc2l6ZT0xLCBHLCBsb2c9VCkpIH0KCm1hcF9kYmwoRyxteUxpa2UyKQoKYGBgCgpGaW5hbGx5LCB3ZSBhZGQgYSBtaW51cyBzaWduIGluIGZyb250IG9mIHRoZSBzdW0gdG8gY2FsY3VsYXRlIHRoZSBuZWdhdGl2ZSBsb2cKbGlrZWxpaG9vZCwgc28gdGhhdCB3ZSBjYW4gbWluaW1pemUgdGhlIGZ1bmN0aW9uIHVzaW5nIG9wdGltKCkgcmF0aGVyIHRoYW4gbWF4aW1pemUgaXQuCgoKYGBge3J9CmxpYnJhcnkocHVycnIpCgpkYXRhVmVjdG9yPC1jKDEsMCwxKQpHPC1zZXEoZnJvbT0wLCB0bz0xLGJ5PTAuMTApCgpteUxpa2UyPC1mdW5jdGlvbihHKSB7LXN1bShkYmlub20oZGF0YVZlY3RvcixzaXplPTEsIEcsIGxvZz1UKSkgfQoKbmxsVmVjdG9yPC1tYXBfZGJsKEcsbXlMaWtlMikKbmxsVmVjdG9yCmBgYAoKCk5vdyB3ZSBjYW4gcGxvdCB0aGUgbGlrZWxpaG9vZCBmdW5jdGlvbi4gIEl0cyBub3Qgc21vb3RoIGJlY2F1c2Ugd2UgaGF2ZQphIHNtYWxsIG51bWJlciBvZiB2YWx1ZXMgb2YgRyB0aGF0IHdlIGFyZSB1c2luZy4gIFRoZSBtb3JlIHZhbHVlcyBvZiBHIHRoYXQKd2UgdXNlLCB0aGUgc21vb3RoZXIgdGhlIGZ1bmN0aW9uIHdpbGwgYmVjb21lLgoKYGBge3J9CgpwbG90KEcsbmxsVmVjdG9yLHR5cD0nbCcpCmBgYAoKCgoKCgoKCgpUaGlzIGlzIGFuIFtSIE1hcmtkb3duXShodHRwOi8vcm1hcmtkb3duLnJzdHVkaW8uY29tKSBOb3RlYm9vay4gV2hlbiB5b3UgZXhlY3V0ZSBjb2RlIHdpdGhpbiB0aGUgbm90ZWJvb2ssIHRoZSByZXN1bHRzIGFwcGVhciBiZW5lYXRoIHRoZSBjb2RlLiAKClRyeSBleGVjdXRpbmcgdGhpcyBjaHVuayBieSBjbGlja2luZyB0aGUgKlJ1biogYnV0dG9uIHdpdGhpbiB0aGUgY2h1bmsgb3IgYnkgcGxhY2luZyB5b3VyIGN1cnNvciBpbnNpZGUgaXQgYW5kIHByZXNzaW5nICpDbWQrU2hpZnQrRW50ZXIqLiAKCmBgYHtyfQpwbG90KGNhcnMpCmBgYAoKQWRkIGEgbmV3IGNodW5rIGJ5IGNsaWNraW5nIHRoZSAqSW5zZXJ0IENodW5rKiBidXR0b24gb24gdGhlIHRvb2xiYXIgb3IgYnkgcHJlc3NpbmcgKkNtZCtPcHRpb24rSSouCgpXaGVuIHlvdSBzYXZlIHRoZSBub3RlYm9vaywgYW4gSFRNTCBmaWxlIGNvbnRhaW5pbmcgdGhlIGNvZGUgYW5kIG91dHB1dCB3aWxsIGJlIHNhdmVkIGFsb25nc2lkZSBpdCAoY2xpY2sgdGhlICpQcmV2aWV3KiBidXR0b24gb3IgcHJlc3MgKkNtZCtTaGlmdCtLKiB0byBwcmV2aWV3IHRoZSBIVE1MIGZpbGUpLiAKClRoZSBwcmV2aWV3IHNob3dzIHlvdSBhIHJlbmRlcmVkIEhUTUwgY29weSBvZiB0aGUgY29udGVudHMgb2YgdGhlIGVkaXRvci4gQ29uc2VxdWVudGx5LCB1bmxpa2UgKktuaXQqLCAqUHJldmlldyogZG9lcyBub3QgcnVuIGFueSBSIGNvZGUgY2h1bmtzLiBJbnN0ZWFkLCB0aGUgb3V0cHV0IG9mIHRoZSBjaHVuayB3aGVuIGl0IHdhcyBsYXN0IHJ1biBpbiB0aGUgZWRpdG9yIGlzIGRpc3BsYXllZC4KCg==