Pension size as a factor
In a previous blog I showed that there was often a statistically significant link between pension size and mortality. It is clearly necessary to account for such a link in an actuarial mortality model, not least because people with larger pensions account for a disproportionate share of portfolio risk.
Pension size is essentially a continuous covariate, and a simple approach is to discretise, i.e. create non-overlapping ranges. We sort the pensions within a portfolio, then define breakpoints such that equal numbers of lives fall into each band (or as close to equal as we can achieve). Each life can then be assigned a sizeband as a risk factor in the same way that they can have a gender; the only difference is that sizeband is an ordinal factor, whereas gender is a categorical factor. Table 1 shows the model fits for a pension scheme using various numbers of sizebands, where the fit is measured by Akaike's Information Criterion (AIC).
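The equal-count banding described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used for the results here: the function names and the ten pension amounts are invented, and a pension exactly equal to a breakpoint is assumed to fall into the higher band.

```python
from bisect import bisect_right


def equal_count_breakpoints(pensions, n_bands):
    """Return n_bands - 1 breakpoints splitting the pensions into near-equal bands."""
    ordered = sorted(pensions)
    n = len(ordered)
    # The k-th breakpoint is the smallest pension belonging to band k + 1.
    return [ordered[(k * n) // n_bands] for k in range(1, n_bands)]


def assign_sizeband(pension, breakpoints):
    """Return the 1-based ordinal sizeband for a single pension amount."""
    # Assumption: a pension equal to a breakpoint goes into the higher band.
    return bisect_right(breakpoints, pension) + 1


# Hypothetical annual pensions, for illustration only.
pensions = [1200, 850, 4300, 15000, 2750, 9800, 620, 5100, 33000, 1900]
breakpoints = equal_count_breakpoints(pensions, 5)
bands = [assign_sizeband(p, breakpoints) for p in pensions]
```

With tied pension amounts the bands cannot always hold exactly equal numbers of lives, which is why the counts are only "as close to equal as we can achieve".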
Table 1. AIC using selected numbers of sizebands in a mortality model. Source: Richards (2020), ENG portfolio for a local-authority pension scheme in England.
Number of sizebands  Lives per sizeband  AIC
2                    42,193              150,179
3                    28,128              150,129
4                    21,094              150,085
5                    16,876              150,042
10                   8,437               149,973
20                   4,208               149,972
Table 1 shows that some material improvements in fit are possible, but that there is little to be gained from using more than ten sizebands. One problem with Table 1 is that many of the sizebands have similar mortality levels (not shown), so a natural question is whether we can improve the fit by using bands with unequal numbers of lives.
An alternative to sizebands with equal numbers of lives is to optimise the breakpoints by starting with (say) 20 bands and merging adjacent bands with similar mortality. We do this by searching for the merges that produce the lowest AIC for a given target number of levels (targeting the BIC would produce the same result, as the number of parameters is held constant). The process is iterative, and the time taken to consider all possible merges increases with both the target number of levels and the initial number of sizebands. For this reason we perform an exhaustive search of breakpoints when targeting two or three factor levels, but avoid a combinatorial explosion by adopting a more limited, time-saving search algorithm when targeting four or more factor levels. In our implementation we have further reduced runtimes by using parallel processing to spread calculations over 63 threads (Butenhof, 1997). The results of this optimisation process are shown in Table 2.
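The exhaustive form of this search can be illustrated with a small Python sketch. It is not the survival model behind Table 2. As a stand-in, it assumes each initial band is summarised by a death count and an exposure, that each merged band has a constant hazard, and that the fit is a Poisson log-likelihood. The function names and the data are invented for illustration. Since AIC = 2k - 2ℓ and BIC = k log(n) - 2ℓ for k parameters, n observations and log-likelihood ℓ, fixing k means both criteria rank the candidate merges by ℓ alone.

```python
from itertools import combinations
from math import log


def group_loglik(deaths, exposure):
    """Poisson log-likelihood (up to a constant) for one merged band."""
    if deaths == 0:
        return 0.0  # the fitted hazard is zero, so the contribution vanishes
    mu = deaths / exposure  # maximum-likelihood constant hazard
    return deaths * log(mu) - mu * exposure


def aic_for_cuts(deaths, exposure, cuts):
    """AIC when the initial bands are merged into groups split at the cut indices."""
    edges = [0, *cuts, len(deaths)]
    loglik = sum(
        group_loglik(sum(deaths[a:b]), sum(exposure[a:b]))
        for a, b in zip(edges, edges[1:])
    )
    n_params = len(edges) - 1  # assumption: one hazard parameter per merged band
    return 2 * n_params - 2 * loglik


def best_merge(deaths, exposure, target_levels):
    """Try every way of merging adjacent bands into target_levels groups."""
    candidates = combinations(range(1, len(deaths)), target_levels - 1)
    return min((aic_for_cuts(deaths, exposure, cuts), cuts) for cuts in candidates)


# Hypothetical data: six initial bands, ordered from smallest to largest pensions.
deaths = [120, 118, 121, 90, 88, 60]
exposure = [1000.0] * 6
best_aic, best_cuts = best_merge(deaths, exposure, 3)
```

Here `best_cuts` is `(3, 5)`: the first three bands are merged, as are the fourth and fifth, and the top band is left on its own. The number of candidates grows combinatorially with the initial number of bands and the target number of levels, which is the motivation for a more limited search at higher targets.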
Table 2. AIC from optimising the breakpoints between sizebands in a mortality model. Source: Richards (2020), ENG portfolio for a local-authority pension scheme in England.
Number of sizebands  Lives in highest income sizeband  AIC
2                    12,648                            150,019
3                    12,648                            149,994
4                    12,648                            149,997
5                    12,648                            149,986
6                    4,208                             149,962
7                    4,208                             149,962
8                    4,208                             149,962
9                    4,208                             149,963
Table 2 shows that, for a given number of sizebands, the model fit is materially improved compared with the same number of levels in the equally sized bands of Table 1. With lives distributed unequally across six sizebands in Table 2 we have a better fit than an equal distribution of lives over ten bands in Table 1. A further refinement might be to optimise from (say) 100 initial discretised sizebands instead of 20, although this would substantially increase the runtime in searching for the optimal breakpoints.
Turning a real-valued variate like pension size into a discrete factor is a neat approach, but it has drawbacks:
 Information loss. Creating such a discrete factor throws away information, which is seldom a good thing. Specifically, it introduces discretisation error: mortality might not be homogeneous within a range, and there will also be jumps in mortality when crossing range boundaries. For example, the boundary between the 18th and 19th sizebands for the portfolio used here is £11,115 p.a.; this means that a hypothetical Pensioner A receiving £11,100 p.a. might be treated as having quite different pension-related mortality from Pensioner B with £11,200 p.a., despite the fact that they receive near-identical incomes.
 Computation time. Note in Table 2 the anomalous increase in AIC when moving from three to four sizebands, even though lower AICs are subsequently found for five and then six bands. It is possible to conclude that you have found an optimal number of bands and breakpoints when a more exhaustive (but far more time-consuming) search might have found a further improvement, as shown here.
 Extrapolation. It is not obvious how to extrapolate mortality effects to pension sizes beyond those observed in the data set, say when creating an actuarial pricing basis.
While discretisation is a handy simplification, actuaries would therefore ideally like to treat pension-related mortality on a continuous basis (Richards, 2020).
References
Butenhof, D. R. (1997) Programming with POSIX threads, Addison-Wesley, Boston, ISBN 9780201633924.
Richards, S. J. (2020) Modelling mortality by continuous benefit amount, Longevitas working paper.