Over-dispersion (reprise for actuaries)

(Jan 3, 2010)

In my previous post I illustrated the effects of over-dispersion in population data.  Of course, an actuary could quite properly ask: why use ONS data?  The CMI data set on assured lives might be felt to be a better guide to the mortality of pensioners, although Stephen has raised a question mark over this assumption in the past.

Figure 1 illustrates what happens with the CMI data set. The over-dispersion parameter is much smaller at 1.82, so the Poisson model gives a reasonable forecast.  Note that the over-dispersion in the CMI data comes from a different source, namely the presence of duplicates causing extra variability in death counts.  However, the same approach to over-dispersion works regardless of the…
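One way such an over-dispersion parameter can be estimated is as the Pearson chi-squared statistic divided by the residual degrees of freedom.  The short Python sketch below illustrates this with entirely made-up death counts and fitted Poisson means; a result well above 1 signals over-dispersion relative to the Poisson assumption.

# Sketch: Pearson estimate of an over-dispersion parameter for death
# counts against fitted Poisson means.  All figures are illustrative.
deaths   = [112, 98, 121, 134, 87, 143]                 # observed deaths by cell
expected = [105.2, 101.7, 118.4, 128.9, 92.3, 139.1]    # fitted Poisson means

n_cells  = len(deaths)
n_params = 2    # assumed number of fitted model parameters

# Pearson chi-squared divided by residual degrees of freedom.
pearson = sum((d - e) ** 2 / e for d, e in zip(deaths, expected))
phi = pearson / (n_cells - n_params)
print(f"estimated over-dispersion parameter: {phi:.2f}")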

Read more

Tags: over-dispersion, duplicates, mortality projections, ICA, Solvency II

Double trouble

(Jan 22, 2009)

Scientists strongly prefer ideas and processes which have undergone anonymous peer review in published, refereed journals.  At Longevitas we not only use peer-reviewed materials in our work, but we also publish our own research and results in academic papers.  We find it a great discipline, and our work is all the better for it.

One example cropped up recently during anonymous peer-review of a paper we had written.  We had included text on the importance of deduplication, which is essential in statistical work with insured data due to the existence of people in portfolios with multiple policies.  The scrutineers of our paper accepted the importance of deduplication, but one of them challenged us with the…

Read more

Tags: deduplication, duplicates, annuities

Confounding Compounding

(Dec 8, 2008)

Earlier posts discussed the importance of deduplication in annuity portfolios and pension schemes and some of the issues around the deduplication of names, specifically the use of double metaphone to look through common variant spellings of the surname or family name.

One problem is that the surname data is often prefixed with first or middle names. It might also be suffixed with a post-nominal term, as in Douglas Fairbanks Junior. Even trickier is the presence of compound names like Simon Van der Valk, and the fact that in teleservicing Van der Valk sounds awfully like Vandervalk or even Vander Valk.

So trying to match Mr Simon Piet Van der Valk with S VanderValk Senior PHD isn't a walk in the park. If we try a metaphone…
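As a sketch of the sort of normalisation that has to happen before any phonetic comparison, the Python snippet below strips titles, post-nominals and single-letter initials and collapses compound-surname spacing, so that the two renderings of Van der Valk above compare equal.  The token lists and matching rule are illustrative assumptions only, not a full double-metaphone treatment.

# Illustrative sketch: crude name normalisation before matching.
# A real deduplication step would add a phonetic comparison such as
# double metaphone; the token lists here are examples, not exhaustive.
TITLES = {"mr", "mrs", "ms", "miss", "dr", "prof"}
POST_NOMINALS = {"junior", "jnr", "jr", "senior", "snr", "sr", "phd"}

def name_key(raw):
    tokens = raw.lower().replace(".", "").split()
    kept = [t for t in tokens
            if t not in TITLES and t not in POST_NOMINALS and len(t) > 1]
    # Collapse spacing so "Van der Valk" and "VanderValk" agree.
    return "".join(kept)

def surnames_match(name_a, name_b):
    a, b = name_key(name_a), name_key(name_b)
    # Treat the shorter key as a candidate surname and see whether it
    # ends the longer one (forenames usually precede the surname).
    return bool(a) and bool(b) and (a.endswith(b) or b.endswith(a))

print(surnames_match("Mr Simon Piet Van der Valk", "S VanderValk Senior PHD"))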

Read more

Tags: deduplication, duplicates

What's in a name?

(Aug 10, 2008)

We have already mentioned the problem of duplication in pension schemes and annuities, and since it is an issue we encounter frequently, it is worth saying a little about some of the technology that can be used to counter it.

What we find in practice is that the unique member identifiers used within financial administration systems are all too frequently, well, not unique. We know that converting policy- or benefit-orientated data into individual, person-orientated data is vital statistically, but how can this be done reliably?

The answer is to use a combination of other data attributes present for each member to create a deduplication key around which multiple records can be merged. One common case would be to merge…
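By way of illustration, one plausible form for such a key is a normalised combination of surname, date of birth and postcode (the attribute choice and field names below are assumptions for the sketch, not a prescription).  The Python snippet groups benefit records on that key and totals the pension amounts for each resulting person.

# Sketch: build a deduplication key from several member attributes and
# merge benefit records that share it.  Records and fields are invented.
from collections import defaultdict

records = [
    {"surname": "Smith",  "dob": "1948-03-02", "postcode": "EH1 1AA", "pension": 4200.0},
    {"surname": "SMITH ", "dob": "1948-03-02", "postcode": "eh1 1aa", "pension": 1300.0},
    {"surname": "Jones",  "dob": "1951-11-17", "postcode": "G2 4QQ",  "pension": 2500.0},
]

def dedup_key(rec):
    # Normalise each attribute so trivial formatting differences
    # do not defeat the match.
    return (rec["surname"].strip().lower(),
            rec["dob"],
            rec["postcode"].replace(" ", "").lower())

people = defaultdict(list)
for rec in records:
    people[dedup_key(rec)].append(rec)

for key, recs in people.items():
    total = sum(r["pension"] for r in recs)
    print(key, f"{len(recs)} record(s), total pension {total:.2f}")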

Read more

Tags: deduplication, duplicates, metaphone

Deduplication and pension schemes

(Aug 7, 2008)

Deduplication is an essential part of data preparation for statistical modelling. The phenomenon of multiple policies per person is a major issue for annuity portfolios, and arises from life companies' policy-orientated view of the world. This makes perfect sense for insurers, of course, since their legal liability is the policy.

My expectation was that it would be less of an issue for pension schemes, which I thought would naturally have a more person-orientated view of their liabilities. However, I recently analysed the mortality of a UK pension scheme with over 38,000 benefit records, of which over 1,300 were clear duplicates. I didn't reckon on the frequency with which people can return to a former employer,…

Read more

Tags: deduplication, duplicates, pensions

Deduplication and annuities

(Jul 30, 2008)

Deduplication is an important step in data preparation for mortality modelling (or any other kind of modelling for that matter). If people in your data set have multiple benefit records, then the crucial independence assumption for statistical modelling is invalidated. An effective algorithm for identifying duplicates is described in a paper presented to the Institute of Actuaries.

The problem of duplicates is a major issue for annuity portfolios, where it is very common for people to have multiple policies. On average I expect around 1.2 annuities per person, although this is obviously portfolio-specific. I also find that the average number of annuities per person tends to increase with age. This might…
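Ratios of this kind are straightforward to compute once records have been grouped by person.  The Python sketch below, using invented policy records, counts annuities per person within five-year age bands; any trend with age would show up in the printed ratios.

# Sketch with invented data: average number of annuity policies per
# deduplicated person, split by five-year age band.
from collections import defaultdict

# One (person_id, age) pair per policy record, after deduplication has
# attached a person identifier.  Entirely illustrative.
policies = [("p1", 67), ("p1", 67), ("p2", 66), ("p3", 72),
            ("p4", 74), ("p4", 74), ("p4", 74), ("p5", 79)]

bands = defaultdict(lambda: {"policies": 0, "people": set()})
for person_id, age in policies:
    lower = (age // 5) * 5
    band = f"{lower}-{lower + 4}"
    bands[band]["policies"] += 1
    bands[band]["people"].add(person_id)

for band in sorted(bands):
    b = bands[band]
    print(band, round(b["policies"] / len(b["people"]), 2), "policies per person")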

Read more

Tags: deduplication, duplicates, annuities
