Great Expectations
When fitting statistical models, a number of features are commonly assumed by users. Chief amongst these assumptions is that the expected number of events according to the model will equal the actual number in the data. This strikes most people as a thoroughly reasonable expectation. Reasonable, but often wrong.
For example, in the field of Generalised Linear Models (GLMs), the user has a choice of so-called link functions to specify the model. For binomial data, the default is the canonical link, the logit, which gives the following function for the rate of mortality, q_{x}:
q_{x} = exp(α + βx) / (1 + exp(α + βx))
This is known to actuaries as a simplified version of Perks Law when applied to mortality data. However, there are several other choices of link function. The interesting thing about these link functions is that only one of them guarantees that the GLM produces the same number of expected deaths as were actually observed. The following table summarises the results for five alternative GLMs fitted to the same data with 4739 deaths:
GLM link function       Expected deaths
Logit                   4739
Log                     4733.503
Cauchy                  4797.986
Complementary log-log   4738.649
Probit                  4741.33

We can see that only the logit link produces the same number of expected deaths as in the actual data. What's going on with the rest? Why don't they all produce the same number of expected deaths? The answer lies with the fact that GLMs use the log-likelihood function to determine maximum-likelihood estimates (MLEs) for the model parameters. Only for some forms of the log-likelihood, such as the binomial with its canonical logit link and an intercept term, do the MLEs reproduce the observed number of events. For the other links, the fitted model balances a weighted sum of the differences between actual and expected deaths, not the plain total.
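The reason can be sketched from the score equation for the intercept. The notation here is assumed for illustration and is not taken from the original: d_{x} deaths among n_{x} lives at age x, linear predictor η_{x} = α + βx, and q_{x} = h(η_{x}), where h is the inverse of the link function.

```latex
% Binomial log-likelihood (up to an additive constant)
\ell(\alpha, \beta) = \sum_x \left[ d_x \log q_x + (n_x - d_x) \log(1 - q_x) \right],
\qquad q_x = h(\alpha + \beta x)

% Score equation for the intercept, valid for any link
\frac{\partial \ell}{\partial \alpha}
  = \sum_x \left( d_x - n_x q_x \right) \frac{h'(\eta_x)}{q_x (1 - q_x)} = 0

% Logit link: h'(\eta_x) = q_x (1 - q_x), so the weight is identically 1
\sum_x d_x = \sum_x n_x q_x
```

For the log, Cauchy, complementary log-log and probit links the weight h'(η_{x}) / (q_{x}(1 - q_{x})) varies with age, so the MLE sets a weighted sum of the differences d_{x} - n_{x}q_{x} to zero and the unweighted totals need not agree.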
If you want to experiment with this yourself, you can use the file on the right to reproduce the results in R, a freely available statistical package which fits a variety of GLMs.
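For readers without R to hand, the same effect can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the script referred to above: the data are synthetic (made-up ages, exposures and deaths, unrelated to the 4739 deaths in the table), the function name `expected_deaths` is hypothetical, and the fitting routine is a hand-written Fisher scoring loop for a two-parameter model with three of the five links.

```python
import math
from statistics import NormalDist

# Synthetic illustration data, NOT the data behind the table above:
# ages 60-89 with 1,000 lives each and deaths from a made-up curve.
ages = list(range(60, 90))
lives = [1000] * len(ages)
deaths = [
    round(1000 / (1 + math.exp(4.4 - 0.08 * (x - 60) - 0.001 * (x - 60) ** 2)))
    for x in ages
]

_norm = NormalDist()

# Each entry: (link g, inverse link h, derivative h' of the inverse link).
LINKS = {
    "logit": (
        lambda q: math.log(q / (1 - q)),
        lambda eta: 1 / (1 + math.exp(-eta)),
        lambda eta: math.exp(-eta) / (1 + math.exp(-eta)) ** 2,
    ),
    "cloglog": (
        lambda q: math.log(-math.log(1 - q)),
        lambda eta: 1 - math.exp(-math.exp(eta)),
        lambda eta: math.exp(eta - math.exp(eta)),
    ),
    "probit": (_norm.inv_cdf, _norm.cdf, _norm.pdf),
}


def expected_deaths(link_name, max_iter=200, tol=1e-12):
    """Fit q_x = h(alpha + beta * z) by Fisher scoring; return total fitted deaths."""
    g, h, dh = LINKS[link_name]
    mean_age = sum(ages) / len(ages)
    z = [x - mean_age for x in ages]  # centred age, for numerical stability
    alpha, beta = g(sum(deaths) / sum(lives)), 0.0  # start from the crude rate
    for _ in range(max_iter):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for zi, n, d in zip(z, lives, deaths):
            eta = alpha + beta * zi
            q, slope = h(eta), dh(eta)
            var = q * (1 - q)
            w = slope / var  # score weight; equals 1 for the logit link
            u0 += (d - n * q) * w
            u1 += (d - n * q) * w * zi
            info = n * slope * slope / var  # expected (Fisher) information
            i00 += info
            i01 += info * zi
            i11 += info * zi * zi
        # Solve the 2x2 system (information matrix) * step = score.
        det = i00 * i11 - i01 * i01
        step_a = (i11 * u0 - i01 * u1) / det
        step_b = (i00 * u1 - i01 * u0) / det
        alpha += step_a
        beta += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return sum(n * h(alpha + beta * zi) for zi, n in zip(z, lives))


actual = sum(deaths)
results = {name: expected_deaths(name) for name in LINKS}
for name, total in results.items():
    print(f"{name:8s} expected {total:10.3f}   actual {actual}")
```

On this synthetic data the logit total should agree with the actual number of deaths to within rounding error, while the complementary log-log and probit totals should differ from it, mirroring the pattern in the table. The size of the differences depends entirely on the invented data and says nothing about the figures quoted above.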