Functions of a random variable

(May 9, 2018)

Assume we have a random variable, $$X$$, with expected value $$\eta$$ and variance $$\sigma^2$$.  Often we find ourselves wanting to know the expected value and variance of a function of that random variable, $$f(X)$$.  Fortunately there are some workable approximations involving only $$\eta$$, $$\sigma^2$$ and the derivatives of $$f$$.  In both cases we make use of a Taylor-series expansion of $$f(X)$$ around $$\eta$$:

$f(X)=\sum_{n=0}^\infty \frac{f^{(n)}(\eta)}{n!}(X-\eta)^n$

where $$f^{(n)}$$ denotes the $$n^{\rm th}$$ derivative of $$f$$ with respect to $$X$$.  For the expected value of $$f(X)$$ we then have the following second-order approximation:

${\rm E}[f(X)] \approx f(\eta)+\frac{f''(\eta)}{2}\sigma^2\qquad(1)$
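As a quick numerical check of approximation (1), here is a sketch with an assumed normal $$X$$ and $$f(x)=e^x$$, chosen because the exact expectation is then known in closed form (the choice of distribution and function is illustrative, not from the text above):

```python
import math

# Assumed example: X ~ Normal(eta, sigma2) with f(x) = exp(x), so f'' = f
eta, sigma2 = 0.5, 0.04

# Second-order approximation (1): E[f(X)] ~ f(eta) + f''(eta)/2 * sigma^2
approx = math.exp(eta) + math.exp(eta) / 2.0 * sigma2

# For normal X the exact value is the lognormal mean, exp(eta + sigma2/2)
exact = math.exp(eta + sigma2 / 2.0)
```

For this small variance the approximation and the exact value agree closely; the error shrinks as $$\sigma^2$$ shrinks.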

Tags: GLM, log link, logit link

Working with constraints

(Feb 9, 2016)

Regular readers of this blog will be aware of the importance of stochastic mortality models in insurance work.  Of these models, the best-known is that from Lee & Carter (1992):

$\log \mu_{x,y} = \alpha_x + \beta_x\kappa_y\qquad(1)$

where $$\mu_{x,y}$$ is the force of mortality at age $$x$$ in year $$y$$ and $$\alpha_x$$, $$\beta_x$$ and $$\kappa_y$$ are parameters to be estimated.  Lee & Carter used singular value decomposition (SVD) to estimate their parameters, but the modern approach is to use the method of maximum likelihood - by making an explicit distributional assumption for the number of deaths, the fitting process can make proper allowance for the amount of information available…
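A minimal numerical sketch of the Poisson log-likelihood that the maximum-likelihood approach maximises, on a made-up 5-age-by-5-year grid (all parameter values, exposures and constraints here are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed small grid, with the usual identifiability constraints
# (beta_x sums to 1, kappa_y sums to 0)
alpha = np.linspace(-5.0, -4.0, 5)      # alpha_x
beta = np.full(5, 0.2)                  # beta_x
kappa = np.linspace(1.0, -1.0, 5)       # kappa_y

# Lee-Carter predictor (1): log mu_{x,y} = alpha_x + beta_x * kappa_y
log_mu = alpha[:, None] + beta[:, None] * kappa[None, :]
mu = np.exp(log_mu)

# With D_{x,y} ~ Poisson(E_{x,y} mu_{x,y}), the Poisson log-likelihood
# (dropping terms constant in the parameters) is:
exposure = np.full((5, 5), 1000.0)
deaths = rng.poisson(exposure * mu)
loglik = np.sum(deaths * log_mu - exposure * mu)
```

Maximising `loglik` over the parameters, subject to the constraints, is what the distributional assumption buys over SVD: the fit weights each cell by the information it actually carries.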

Out of line

(Aug 20, 2013)

Regular readers of this blog will be in no doubt of the advantages of survival models over models for the annual mortality rate, qx. However, what if an analyst wants to stick to the historical actuarial tradition of modelling annualised mortality rates? Figure 1 shows a GLM for qx fitted to some mortality data for a large UK pension scheme.

Figure 1. Observed mortality rates (•) and fitted values (-) using a binomial GLM with default canonical link (logit scale). Source: Own calculations using the mortality experience of a large UK pension scheme for the single calendar year 2009.

Figure 1 shows that the GLM provides a good approximation of the mortality patterns. A check of the deviance residuals (not shown) yields…
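A sketch of the kind of fit behind Figure 1: a binomial GLM with the default canonical (logit) link, fitted by iteratively reweighted least squares to simulated data, not the real scheme experience (the parameter values and lives counts are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data: lives and deaths by single year of age 60-95
age = np.arange(60.0, 96.0)
lives = np.full(age.size, 2000)
true_q = 1.0 / (1.0 + np.exp(-(-11.0 + 0.11 * age)))
deaths = rng.binomial(lives, true_q)

# Binomial GLM with canonical logit link, fitted by IRLS
X = np.column_stack([np.ones_like(age), age])
beta = np.zeros(2)
for _ in range(25):
    eta = X @ beta
    q = 1.0 / (1.0 + np.exp(-eta))
    W = lives * q * (1.0 - q)              # IRLS weights
    z = eta + (deaths - lives * q) / W     # working response
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))

# beta should lie close to the simulating values (-11.0, 0.11)
```

With age as a continuous covariate, the fitted values trace the same smooth logit-scale line as in Figure 1.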

Tags: GLM, linearity, survival model

Groups v. individuals

(Sep 28, 2012)

We have previously shown how survival models based around the force of mortality, μx, have the ability to use more of your data.  We have also seen that attempting to use fractional years of exposure in a qx model can lead to potential mistakes. However, the Poisson distribution also uses μx, so why don't we use a Poisson model for the grouped count of deaths in each cell?  After all, a model using grouped counts sounds like it might fit faster.  In this article we will show why survival models constructed at the level of the individual are still preferable.

The first step when using the Poisson model is to decide on the width of the age interval.  This is necessary because the Poisson model for grouped counts…
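The grouping step just described can be sketched as follows. The records are hypothetical, and assigning each record wholly to one cell is itself a simplification of the exposure calculation:

```python
from collections import defaultdict

# Hypothetical individual records: (age at entry, years observed, died flag)
records = [(60.2, 1.0, 0), (60.7, 0.5, 1), (61.1, 1.0, 0),
           (62.4, 0.8, 0), (62.9, 1.0, 1), (64.0, 1.0, 0)]

width = 1.0  # the chosen width of the age interval
deaths = defaultdict(int)
exposure = defaultdict(float)
for age, time, died in records:
    cell = int(age // width) * width   # whole record assigned to one cell
    deaths[cell] += died
    exposure[cell] += time

# Crude force of mortality in each cell: deaths / central exposure
mu_hat = {cell: deaths[cell] / exposure[cell] for cell in sorted(exposure)}
```

Note how the very first decision, the value of `width`, already discards information about exactly where within each cell the deaths and exposure fell, which an individual-level survival model retains.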

Part of the story

(Oct 9, 2009)

The Institute of Actuaries' sessional meeting on 28th September 2009 discussed an interesting paper.  It covered similar material to that in Richards (2008), but used different methods and different data.  Nevertheless, some important results were confirmed: geodemographic type codes are important predictors of mortality, and a combination of geodemographic profile and pension size is better than either factor on its own.  The authors also added an important new insight, namely that last-known salary was a much better predictor than pension size.

The models in the paper were GLMs for qx, which require complete years of exposure.  The authors were rightly concerned that just using complete years would…

Out for the count

(Jul 31, 2009)

In an earlier post we described a problem when fitting GLMs for qx over multiple years.  The key mistake is to divide up the period over which the individual was observed in a model for individual mortality.  This violates the independence assumption and leads to parameter bias (amongst other undesirable consequences). If someone has three records aged 60, 61 and 62 initially, then these are not independent trials: the mere existence of the record at age 62 tells you that there was no death at age 60 or 61.

Life-company data often comes as a series of in-force extracts, together with a list of movements.  The usual procedure is to re-assemble the data to create a single record for each policy, using the policy number…
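A sketch of that re-assembly step with hypothetical policy records (the field names and dates are illustrative, not from any real extract format):

```python
# Hypothetical in-force extract rows keyed by policy number; the same
# policy can appear in several annual extracts
extracts = [
    {"policy": "A1", "date": "2007-01-01"},
    {"policy": "A1", "date": "2008-01-01"},
    {"policy": "B2", "date": "2008-01-01"},
]
movements = {"B2": "2008-06-30"}  # movement file: policy -> date of death

# Re-assemble one record per policy: first seen, last seen, death status
records = {}
for row in extracts:
    p = row["policy"]
    rec = records.setdefault(p, {"start": row["date"], "end": row["date"]})
    rec["start"] = min(rec["start"], row["date"])
    rec["end"] = max(rec["end"], row["date"])
for p, d in movements.items():
    if p in records:
        records[p]["end"] = d
        records[p]["died"] = True
```

The point is that each life contributes exactly one record spanning its whole observed period, rather than a separate (and non-independent) record per year.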

Logistical nightmares

(Jan 24, 2009)

A common Generalised Linear Model (GLM) for mortality modelling is logistic regression, also sometimes described as a Bernoulli GLM with a logistic link function.  This models mortality at the level of the individual, and models the rate of mortality over a single year.  When age is used as a continuous covariate, logistic regression has some very useful properties for pensioner mortality: exponentially increasing mortality from age 60 to 90 (say), with slower, non-exponential increases at higher ages.  Logistic regression was the foundation of the models presented in a SIAS paper on annuitant mortality.

Although logistic regression for the rate of mortality is nowadays superseded by more-powerful…
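The shape described above can be illustrated numerically with assumed Perks-style parameter values (illustrative only, not fitted to any data):

```python
import math

# Assumed parameters for a logistic rate of mortality in age
alpha, beta = -11.0, 0.11

def q(x):
    """Logistic regression rate of mortality at age x."""
    return math.exp(alpha + beta * x) / (1.0 + math.exp(alpha + beta * x))

# Year-on-year growth in q is near-exponential at younger pensioner ages...
ratio_65 = q(66) / q(65)
# ...but the growth slows at the highest ages as q approaches its ceiling
ratio_95 = q(96) / q(95)
```

Both ratios exceed 1, but the ratio at age 95 is the smaller of the two: exactly the exponential-then-decelerating pattern described for pensioner mortality.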

Tags: GLM, logistic regression

Great Expectations

(Dec 8, 2008)

When fitting statistical models, a number of features are commonly assumed by users.  Chief amongst these assumptions is that the expected number of events according to the model will equal the actual number in the data.  This strikes most people as a thoroughly reasonable expectation.  Reasonable, but often wrong.

For example, in the field of Generalised Linear Models (GLMs), the user has a choice of so-called link functions to specify the model.  For binomial data, the default is the canonical link, the logit, which gives the following function for the rate of mortality, qx:

$q_x = \frac{\exp(\alpha + \beta x)}{1 + \exp(\alpha + \beta x)}$

This is known to actuaries as a simplified version of Perks Law when applied to mortality…

Tags: GLM

Do we need standard tables any more?

(Nov 1, 2008)

Actuaries have long been accustomed to using standard tables. In the UK these are created by the Continuous Mortality Investigation Bureau (CMIB), and the use of certain tables is often prescribed in legislation. As actuaries increasingly move to using statistical models for mortality, it is perhaps natural that they should first consider incorporating standard tables into these models. But are standard tables necessary, or even useful, in such a context?

Although we normally prefer to model the force of mortality, here we will use a model for the rate of mortality, qx, since this is how many actuaries still approach mortality. Our model is actually a generalised linear model (GLM) where the rate of mortality is:

$q_x = \exp(\alpha + \beta x)$

Survival models v. GLMs?

(Aug 12, 2008)

At some point you may be challenged to decide whether to use survival models or the older generalised linear models (GLMs). You could be forgiven for thinking that the two were mutually exclusive, especially since some commercial commentators have tried to frame the debate that way.

In fact, survival models and GLMs are not necessarily mutually exclusive. It is true that GLMs are more commonly used for modelling the rate of mortality, qx, whereas survival models are always used for modelling the force of mortality, μx. Indeed, a survival model can be defined as a model for μx.

However, there are GLMs for the force of mortality as well. One notable example is the Poisson model for the number of deaths,…
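A sketch of the simplest such case, with made-up numbers: for a constant hazard over a cell, the Poisson model $$D \sim \text{Poisson}(E\mu)$$ with a log link has the crude rate as its maximum-likelihood estimate.

```python
# Made-up cell data: D deaths over E years of central exposure
deaths, exposure = 3, 150.0

# Under D ~ Poisson(E mu), the MLE of a constant hazard mu is D / E
mu_hat = deaths / exposure

# Check: the score, d/dmu of (D log mu - E mu), i.e. D/mu - E,
# vanishes at mu_hat
score = deaths / mu_hat - exposure
```

This is why the Poisson GLM for death counts and a survival model for $$\mu_x$$ are so closely related: they share the same likelihood kernel for a piecewise-constant hazard.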