The Curse of Cause of Death Models

(Mar 22, 2018)

Stephen's earlier blog explained the origin of the very useful result relating the life-table survival probability \({}_tp_x\) and the hazard rate \(\mu_{x+t}\), namely:

\[ {}_tp_x = \exp \left( - \int_0^t \mu_{x+s} \, ds \right). \qquad (1) \]

To complete the picture, we add the assumption that the future lifetime of a person now aged \(x\) is a random variable, denoted by \(T_x\), which is connected to expression (1) by:

\[ {}_tp_x = \Pr[ T_x > t ]. \qquad (2) \]

The 'package' of random lifetime \(T_x\), hazard rate \(\mu_{x+t}\), survival function \({}_tp_x\), and expression (1) tying everything together, sums up the mathematics of a survival model.  For the statistics, we have observations…
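As a rough numerical illustration of expression (1), the sketch below evaluates \({}_tp_x\) by integrating a hazard rate. The Gompertz form of the hazard and the parameter values `ALPHA` and `BETA` are assumptions made purely for this example; they do not come from the post.

```python
import math

# Hypothetical Gompertz parameters, chosen only for illustration.
ALPHA, BETA = -10.0, 0.1


def hazard(age):
    """Assumed hazard rate: mu at a given age is exp(ALPHA + BETA * age)."""
    return math.exp(ALPHA + BETA * age)


def survival_probability(x, t, steps=1000):
    """Approximate tpx = exp(-integral_0^t mu_{x+s} ds) with the midpoint rule."""
    h = t / steps
    integral = sum(hazard(x + (i + 0.5) * h) for i in range(steps)) * h
    return math.exp(-integral)


def survival_probability_exact(x, t):
    """Closed form of the same integral, available for the Gompertz hazard."""
    integrated_hazard = hazard(x) * (math.exp(BETA * t) - 1.0) / BETA
    return math.exp(-integrated_hazard)


p_numeric = survival_probability(70, 10)
p_exact = survival_probability_exact(70, 10)
print(p_numeric, p_exact)
```

The numerical and closed-form answers should agree closely, which is a handy check when the hazard has no closed-form integral and only the numerical route is available.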

Read more

Tags: cause of death, competing risks

Label without a cause

(May 25, 2016)

To talk informally about a concept, we need only give it a recognisable name. For example, we use the label "medical error" and we all know what is meant - or at least we think we do. However, there are clearly large differences between mis-diagnosing a condition, prescribing an incorrect dosage or removing the wrong internal organ, so our informal certainty is of limited practical use. We know that to analyse or measure a problem robustly over time (and ultimately to improve it), we need more than a high-level label. We need a rigorous classification, universally adhered to.

Rigorous classification is what the World Health Organisation's ICD system seeks to provide for analysts working with mortality data.…

Read more

Tags: cause of death, ICD, medical error, research

A shaky foundation?

(Jul 7, 2015)

As with anything that must combine reliable data with hard maths and sound judgement, forecasting mortality is difficult. When using stochastic projection models, reliable data is critical, since without it, we are building an ambitious structure on a shaky foundation. Even with seemingly straightforward all-cause data, problems can and do exist, but when working with mortality data classified by cause-of-death, the challenges are both numerous and difficult to navigate.

A key extra step in the generation of cause-of-death data is the classification process itself. We've previously discussed issues around changes in the classification systems used through time making it difficult to create a consistent…

Read more

Tags: cause of death, autopsy

Correlation complications

(Nov 25, 2012)

A basic result in probability theory is that the variance of the sum of two random variables is not necessarily the same as the sum of their variances. Mathematically, the variance of the sum of two random variables, A and B, is as follows:

\[ \mathrm{Var}(A+B) = \mathrm{Var}(A) + \mathrm{Var}(B) + 2\,\mathrm{Cov}(A,B) \qquad (1) \]

where Var() denotes the variance and Cov() denotes the covariance.  The above result shows that the variance of A+B is only equal to the sum of the variances when their covariance (or correlation) is zero, i.e. when A and B are uncorrelated.  Independence of A and B is sufficient for this, but it is not necessary.  If A and B are positively correlated, for example, then ignoring the covariance term will cause the total variance to be under-estimated.  This basic result is relevant to two common scenarios…
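A small simulation can make the size of the covariance term concrete. The sketch below is illustrative only: the variables `a` and `b` are simulated with a shared component so that they are positively correlated, and none of the numbers come from the post.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Covariance of two equal-length samples (divisor n)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def variance(xs):
    """Variance as the covariance of a sample with itself."""
    return covariance(xs, xs)


random.seed(1)
# A shared component makes a and b positively correlated.
shared = [random.gauss(0, 1) for _ in range(10_000)]
a = [s + random.gauss(0, 1) for s in shared]
b = [s + random.gauss(0, 1) for s in shared]
total = [x + y for x, y in zip(a, b)]

var_total = variance(total)                  # Var(A+B) computed directly
naive = variance(a) + variance(b)            # ignores the covariance term
full = naive + 2 * covariance(a, b)          # includes the covariance term
print(var_total, naive, full)
```

Here `full` reproduces `var_total` up to rounding, while `naive` falls short of it, which is the under-estimation described above.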

Read more

Tags: cause of death, correlation, covariance

Summary judgement

(Jul 18, 2011)

In previous posts we have looked at problems with the quality and reliability of cause-of-death data and a list of hurdles for mortality projections based on such data.  One other issue is that of detail.  While cause-of-death data is spread over literally thousands of individual causes, important detail is lost on the most important mortality risk factor of all: age.  Oeppen (2008) states the problem:

"deaths are often tabulated by 5 year age groups and the open age interval into which the deaths of the oldest-old are aggregated is often defined at a relatively young age such as 85. Unfortunately, it is at these high ages where most of the temporal dynamics are occurring."

Read more

Tags: cause of death, missing data

Shifting sands

(May 16, 2011)

In civil engineering, no building can be sounder than the foundation on which it rests.  A similar comment applies to statistical analysis, which is obviously limited by the quality of the underlying data.  This is an issue for mortality projections, too, since these are ideally based on a historical record of good-quality data.

In the case of all-cause mortality rates, the quality of deaths data in the developed world is high.  In England and Wales, for example, the nationwide registration system means that the majority of deaths are recorded within a week of actual occurrence.  Without being flippant, there is never any real doubt as to whether someone is dead or not.  Projections of all-cause mortality rates…

Read more

Tags: cause of death, data quality, mortality projections

Seven questions for projections by cause of death

(Apr 18, 2011)

I have written several times about the challenges in creating mortality projections based on cause-of-death data.  Those interested in the details can consult my recent paper published in a special edition of the British Actuarial Journal.  For anyone else looking to build (or buy) a methodology based on cause-of-death projections, here is a quick checklist of questions to ask yourself (or your supplier):

  1. How is the inherent bias towards projecting lower improvements corrected?  This point is well documented for both cause-of-death projections and also expectations based on expert opinion.  This issue is of critical importance for reserving for pensions and annuities.
  2. How is socio-economic bias handled? …
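One mechanism sometimes offered for a bias of the kind raised in question 1 can be sketched numerically. This is an illustration under stated assumptions, not necessarily the argument made in the paper: the two causes, their starting rates and their constant rates of decline are all hypothetical. If each cause is projected to keep its own rate of decline, the slowest-improving cause comes to dominate the total, so the implied all-cause improvement falls over time.

```python
import math

# Hypothetical causes: (mortality rate at time 0, constant annual rate of decline).
causes = {"cause A": (0.010, 0.03), "cause B": (0.010, 0.01)}


def total_rate(t):
    """All-cause rate at time t when each cause declines exponentially."""
    return sum(m0 * math.exp(-r * t) for m0, r in causes.values())


def average_improvement(t_start, t_end):
    """Average annual rate of decline in the all-cause rate over an interval."""
    return math.log(total_rate(t_start) / total_rate(t_end)) / (t_end - t_start)


past = average_improvement(-20, 0)    # improvement over the last 20 years
future = average_improvement(0, 20)   # improvement implied for the next 20 years
print(past, future)
```

In this toy setting `future` is lower than `past`, even though no individual cause has slowed down.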

Read more

Tags: cause of death

Cutting the bias

(Aug 12, 2010)

With the exception of dressmaking, bias is generally undesirable.  This is particularly the case when projecting future mortality rates for reserving for pension liabilities.  More precisely, a bias towards over-stating mortality rates would be particularly bad because it would lead to under-reserving.

One method for forecasting future mortality is to project changes by cause of death.  I have written previously about the numerous technical challenges facing the cause-of-death approach.  However, there is also a fundamental academic objection to this approach: it is biased.  This is covered by a number of researchers, two of whom we quote below:

"Mortality projections disaggregated by cause…

Read more

Tags: cause of death, mortality projections

For the record

(Jun 16, 2010)

Stephen has written about the challenges in using population cause-of-death data for mortality analysis and forecasting.   Another potential source of data is computerised patient records such as the General Practice Research Database (GPRD).  However, when using them you need to know their strengths and weaknesses.

These databases are derived by extracting anonymized data from general practice (GP) computer records.  When GP computer systems emerged in the mid 1980s, they were used for keeping registers of patients and for administering repeat prescribing.  Recording of other aspects of the record has evolved incrementally since.  Consequently, data about prescribing is highly reliable, but…

Read more

Tags: cause of death

What's in a word?

(Apr 17, 2010)

Trends in cause of death can be an instructive way of looking at past mortality, although we have previously seen that we have to be very careful that an apparent "trend" is not due to changes in recording.  Leaving aside the problems of shifting classification over time, what of the categories themselves?  Table 1 shows the top ten causes of death for males aged 70-74 in England and Wales in 2000.

Table 1. Top ten causes of death for males aged 70-74 in England & Wales in 2000.  Source: 20th Century Mortality.

Proportion | Description | Secondary description
13.1% | Acute myocardial infarction |
9.7% | Malignant neoplasm of trachea, bronchus and lung | Bronchus and lung, unspecified
8.3% | Other forms of… |

Read more

Tags: cause of death
