### Lost in translation (reprise)

#### (Oct 31, 2011)

Late last year I drew up a table of actuarial terms and their translation for statisticians.  I had thought it a uniquely actuarial trait to use different names for concepts shared with other disciplines.  It turns out that statisticians are almost as guilty.  Table 1 shows some common statistical terms in mortality modelling and their description for non-statisticians.

Table 1. Some statistical terms and their definition for mathematicians and engineers.

| Statistical term | Notation | Description |
|------------------|----------|-------------|
| hazard function | varies | The instantaneous failure rate. |
| observed information matrix | | The curvature of the log-likelihood function, i.e. the negative of the matrix of second partial derivatives.  This is the same… |

### Laying down the law

#### (Dec 14, 2010)

In actuarial terminology, a mortality "law" is simply a parametric formula used to describe the risk.  A major benefit of this is automatic smoothing and in-filling for areas where data is sparse.  A common example in modern annuity portfolios is that there is often plenty of data up to age 75 (say), but relatively little data above age 90.

For example, if we use a parametric formula like the Gompertz law:

log μx = α + βx

then we can use a procedure like the method of maximum likelihood to estimate α and β.  Once we have these values, we can generate mortality rates at any age we require, not just the ages at which we have data.
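As a sketch of how this might look in practice, here is a minimal Poisson maximum-likelihood fit of the Gompertz law. All the data below are simulated, and the Newton-Raphson routine is an illustrative toy, not any production implementation:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated data (not a real portfolio): central exposures and Poisson
# death counts at each age, generated from a true Gompertz law.
ages = np.arange(60, 96)
exposure = np.full(ages.shape, 1000.0)
alpha_true, beta_true = -10.0, 0.1
deaths = rng.poisson(exposure * np.exp(alpha_true + beta_true * ages))

def fit_gompertz(ages, deaths, exposure, n_iter=10):
    """Poisson maximum likelihood for log mu_x = alpha + beta*x."""
    # Start from a least-squares fit to the log crude rates...
    X = np.column_stack([np.ones_like(ages, dtype=float), ages])
    theta, *_ = np.linalg.lstsq(X, np.log(deaths / exposure), rcond=None)
    # ...then polish with Newton-Raphson on the Poisson log-likelihood.
    for _ in range(n_iter):
        mu = np.exp(X @ theta)
        score = X.T @ (deaths - exposure * mu)       # gradient of log-likelihood
        info = X.T @ (X * (exposure * mu)[:, None])  # negative Hessian
        theta += np.linalg.solve(info, score)        # Newton step
    return theta

alpha_hat, beta_hat = fit_gompertz(ages, deaths, exposure)

# With the fitted law we can produce a rate at any age we require,
# even one with little or no data, e.g. age 105:
mu_105 = np.exp(alpha_hat + beta_hat * 105)
```

The key point is the last line: once α and β are estimated, the parametric formula supplies a mortality rate at ages well beyond the range of the data.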

But which mortality law should one use?  In a recent paper (Richards,…

### One small step

#### (Dec 7, 2010)

When fitting mortality models, the foundation of modern statistical inference is the log-likelihood function. The point at which the log-likelihood has its maximum value gives you the maximum-likelihood estimates of your parameters, while the curvature of the log-likelihood tells you about the standard errors of those parameter estimates.

The log-likelihood function is maximised by finding where the gradient is zero. To find this, one requires the first derivatives of the function with respect to each parameter. Similarly, the curvature is measured by the second partial derivatives with respect to each possible pair of parameters. In both cases, one requires either the derivatives themselves, or…
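A minimal numerical sketch of these two ideas, using a toy one-parameter model with invented numbers and finite differences in place of analytical derivatives: for a constant hazard λ with D deaths and E years of exposure, the log-likelihood is ℓ(λ) = D log λ − λE, whose maximum has the closed form λ̂ = D/E, so the numerics can be checked.

```python
import numpy as np

# Invented figures: D observed deaths over E years of central exposure.
D, E = 250.0, 10000.0

def loglik(lam):
    return D * np.log(lam) - lam * E

def d1(f, x, h=1e-6):
    """Central-difference first derivative (the gradient)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    """Central-difference second derivative (the curvature)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

# Newton's method: step towards the point where the gradient is zero.
lam = 0.01
for _ in range(20):
    lam -= d1(loglik, lam) / d2(loglik, lam)

# The curvature at the maximum gives the standard error of the estimate.
se = 1.0 / np.sqrt(-d2(loglik, lam))
```

Here `lam` converges to D/E = 0.025, and `se` matches the analytical value λ̂/√D.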

### A likely story

#### (Nov 27, 2008)

The foundation for most modern statistical inference is the log-likelihood function.  By maximising the value of this function, we find the maximum-likelihood estimate (MLE) for a given parameter, i.e. the most likely value given the model and data.  For models with more than one parameter, we find the set of values which jointly maximise the log-likelihood.

This much is basic statistics.  However, the log-likelihood function can give you more insight than just yielding MLEs.  In particular, the shape and curvature of the log-likelihood tell you how much confidence you can have in a particular MLE.  By way of example, consider fitting a simple Makeham model for the force of mortality, μx:

μx =…
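To see why curvature matters, here is a deliberately simplified sketch. It uses a one-parameter constant-hazard model with invented portfolio sizes, not the Makeham model itself: the log-likelihood ℓ(λ) = D log λ − λE peaks at λ̂ = D/E, and more data make that peak sharper.

```python
import numpy as np

def std_error(D, E):
    """Standard error of the constant-hazard MLE from the curvature
    of the log-likelihood l(lambda) = D*log(lambda) - lambda*E."""
    lam_hat = D / E                   # closed-form MLE
    curvature = -D / lam_hat**2       # second derivative at the maximum
    return np.sqrt(-1.0 / curvature)  # SE from observed information

se_small = std_error(25.0, 1000.0)      # small portfolio
se_large = std_error(2500.0, 100000.0)  # 100 times the data
```

Both portfolios give the identical MLE of 0.025, but the larger portfolio's log-likelihood is far more sharply curved at its peak, so the standard error is ten times smaller: 0.0005 against 0.005.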

Tags: Makeham, log-likelihood

### Choosing between models

#### (Aug 13, 2008)

In any model-fitting exercise you will be faced with choices. What shape of mortality curve to use? Which risk factors to include? How many size bands for benefit amount? In each case there is a balance to be struck between improving the model fit and making the model more complicated.

Our preferred method of measuring model fit is the log-likelihood function, but this on its own does not take account of model complexity. For example, it is usually possible to make a model fit better - i.e. increase the log-likelihood value - by adding extra parameters and risk factors. But is this extra complexity justified? Are those extra parameters and risk factors earning their keep in the model?

There are a number of different…

Tags: AIC, log-likelihood, model fit

### Choosing between models - a business view

#### (Aug 13, 2008)

We discussed how we use the AIC to choose between models. The standard definition of the AIC is:

AIC = -2 * log-likelihood + 2 * number of parameters

However, this is a statistician's view of a model, where the only criterion for including a parameter is whether it is statistically significant. A business view might be different, as each extra parameter in a system will cost you money. IT systems have to be specified, programmed, tested and maintained, for example, and IT staff are not cheap. Each extra parameter might therefore cost you £5,000 in development costs (say), so you might be inclined to include parameters only if they are really significant. One way of doing this is to increase the penalty for the number…
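The idea can be sketched as follows. The log-likelihoods and the stiffer penalty of 6 per parameter are invented purely for illustration:

```python
# Hypothetical log-likelihoods: model B adds three risk factors
# to model A and fits slightly better.
models = {"A": {"loglik": -10250.0, "params": 5},
          "B": {"loglik": -10246.0, "params": 8}}

def information_criterion(loglik, n_params, penalty=2.0):
    """Standard AIC when penalty=2; a larger penalty prices in the
    business cost of carrying each extra parameter."""
    return -2.0 * loglik + penalty * n_params

# Statistician's view: standard AIC, lower is better.
aic = {name: information_criterion(m["loglik"], m["params"])
       for name, m in models.items()}            # A: 20510, B: 20508 -> B wins

# Business view: a stiffer (illustrative) penalty of 6 per parameter.
aic_business = {name: information_criterion(m["loglik"], m["params"], penalty=6.0)
                for name, m in models.items()}   # A: 20530, B: 20540 -> A wins
```

Under the standard AIC the extra risk factors just pay their way; once each parameter carries a development cost, the simpler model is preferred.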

Tags: AIC, log-likelihood, model fit