More than one kind of information

This collection of blogs is called Information Matrix, and it is named after an important quantity in statistics. If we are fitting a parametric model of the hazard rate, with log-likelihood:

\[ \ell( \alpha_1, \ldots, \alpha_n ) \]

as a function of \(n\) parameters \(\alpha_1, \ldots, \alpha_n\), then the information matrix is the matrix of second-order partial derivatives of \(\ell\). That is, the matrix \({\cal I}\) with \(ij\)th component:

\[ {\cal I}_{ij} = \frac{\partial^2 \ell}{\partial \alpha_i \partial \alpha_j}. \]

It is important because \(-{\cal I}^{-1}\) evaluated at the fitted maximum \((\hat{\alpha}_1, \ldots, \hat{\alpha}_n)\) approximates the variance-covariance matrix of the estimated parameters. This matrix is useful for investigating mis-estimation risk (a.k.a. level risk) in insurance portfolios.
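To make the recipe concrete, here is a minimal sketch in Python. Everything specific in it is an assumption made for illustration and is not taken from any real portfolio: the Gompertz hazard \(\mu_x = \exp(\alpha + \beta x)\), the Poisson log-likelihood for grouped deaths, the ages, the exposures and the parameter values. The deaths are set equal to their expected values so that the maximum is known exactly and no optimiser is needed.

```python
import math

# Illustrative (made-up) grouped data: ages and central exposures.
ages = [60, 65, 70, 75, 80, 85, 90]
exposure = [5000.0, 4500.0, 4000.0, 3200.0, 2400.0, 1500.0, 700.0]

# Hypothetical Gompertz hazard mu_x = exp(alpha + beta * x).
alpha_hat, beta_hat = -12.0, 0.12

# Deaths are set equal to their expected values, so (alpha_hat, beta_hat)
# is exactly the maximum of the Poisson log-likelihood
#   l(alpha, beta) = sum_x [ d_x * (alpha + beta * x) - E_x * exp(alpha + beta * x) ].
deaths = [e * math.exp(alpha_hat + beta_hat * x) for x, e in zip(ages, exposure)]


def gradient(alpha, beta):
    """First-order partial derivatives of l; both are zero at the maximum."""
    g_a = g_b = 0.0
    for x, e, d in zip(ages, exposure, deaths):
        resid = d - e * math.exp(alpha + beta * x)
        g_a += resid
        g_b += x * resid
    return g_a, g_b


def information_matrix(alpha, beta):
    """Matrix of second-order partial derivatives of l, as defined in the text."""
    i_aa = i_ab = i_bb = 0.0
    for x, e in zip(ages, exposure):
        expected = e * math.exp(alpha + beta * x)
        i_aa -= expected
        i_ab -= x * expected
        i_bb -= x * x * expected
    return [[i_aa, i_ab], [i_ab, i_bb]]


def neg_inverse_2x2(m):
    """Return minus the inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[-d / det, b / det], [c / det, -a / det]]


info = information_matrix(alpha_hat, beta_hat)
covariance = neg_inverse_2x2(info)  # approximate variance-covariance matrix

print("standard error of alpha:", math.sqrt(covariance[0][0]))
print("standard error of beta: ", math.sqrt(covariance[1][1]))
print("covariance of alpha and beta:", covariance[0][1])
```

The diagonal entries of `covariance` are the approximate variances of the two estimates, and the off-diagonal entry is their covariance. In a real application the maximum would be found numerically and the derivatives evaluated there.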

Information arises in one other quite different way in a survival model, if we view a person's life history as a story unfolding over time. Then choosing any time \(t\) to be the present 'now' divides all times other than \(t\) into the past and the future. In the random world inhabited by a survival model, we assume that at time \(t\), everything that happened in the past is known, but what may happen in the future may be described only probabilistically. This knowledge of the past, at time \(t\), is conventionally denoted by \({\cal F}_{t^-}\) (we write \(t^-\) to indicate times up to, but not including, time \(t\)). Then the best quantitative statements we can make at time \(t\) about future events will use probabilities conditioned on \({\cal F}_{t^-}\). We see that, as time passes and the future becomes the past, we add to our stock of information and can update and improve our probabilistic statements about events that still lie in the future.

Here is a simple example. What is the probability that a person alive just before age 40 (so 'now' is \(t=40\)) will be alive at age 90? To let us write mathematical statements, define a random variable \(X\) which takes the value 1 if the person is alive at age 90 and 0 otherwise. At present, the value of \(X\) is unknown. Similarly, we represent information at time \(t\) by defining an indicator process, denoted \(Y(t)\), as follows:

\[Y(t)=\begin{cases}\mbox{1 if the person is alive just before age \(t\)} \\ \mbox{0 otherwise}.\end{cases}\]

In this example, the value of \(Y(40)\) can be taken to represent the information \({\cal F}_{40^-}\). We have translated our question into mathematical language: what is \(\Pr[X=1|Y(40)=1]\)? The answer, if we have a life table, is \({}_{50}p_{40}\).

At \(t=40\), we don't know what \(Y(41)\) will be. But we can write:

\begin{eqnarray*} \Pr[ X=1|Y(40)=1] & = & \Pr[X=1|Y(41)=0] \times \Pr[Y(41)=0|Y(40)=1] \\ & & + \Pr[X=1|Y(41)=1] \times \Pr[Y(41)=1|Y(40)=1]. \end{eqnarray*}

This says that between ages 40 and 41, the future may take either of two branches, but the probability calculated now conditioned on \({\cal F}_{40^-}\) includes both possible futures. One year later, a little bit of the future has become the past, and we have acquired more information. We now know either:

\(Y(41)=0\), in which case \(\Pr[X=1|Y(41)=0]=0\), or
\(Y(41)=1\), in which case \(\Pr[X=1|Y(41)=1]={}_{49}p_{41}\).

In either case, we have updated (improved) our probabilistic statement about \(X\).
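The two-branch decomposition above can be checked numerically. The following Python sketch assumes a Gompertz hazard with made-up parameters in place of a life table; the names `ALPHA`, `BETA`, `cumulative_hazard` and `tpx` are hypothetical and serve only this illustration.

```python
import math

# Hypothetical Gompertz parameters, chosen only for illustration:
# hazard at age s is mu_s = exp(ALPHA + BETA * s).
ALPHA, BETA = -10.5, 0.1


def cumulative_hazard(x):
    """Integral of mu_s from age 0 to age x."""
    return (math.exp(ALPHA + BETA * x) - math.exp(ALPHA)) / BETA


def tpx(t, x):
    """Probability that a life alive at age x survives to age x + t."""
    return math.exp(-(cumulative_hazard(x + t) - cumulative_hazard(x)))


p_now = tpx(50, 40)          # Pr[X=1 | Y(40)=1], i.e. 50p40
p_survive_year = tpx(1, 40)  # Pr[Y(41)=1 | Y(40)=1]
p_if_alive = tpx(49, 41)     # Pr[X=1 | Y(41)=1], i.e. 49p41
p_if_dead = 0.0              # Pr[X=1 | Y(41)=0]

# Sum over both possible futures between ages 40 and 41.
total = p_if_dead * (1.0 - p_survive_year) + p_if_alive * p_survive_year

print("probability stated at age 40:      ", p_now)
print("sum over the two branches:         ", total)
print("updated probability if alive at 41:", p_if_alive)
```

Under these assumptions the sum over the two branches reproduces the probability stated at age 40, while the probability conditioned on survival to age 41 is higher, because the risk of dying between ages 40 and 41 is no longer in the future.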

Very similar probability statements can be made if we replace probabilities of future events with expected values of random variables, whose values will be known only in the future, for example \({\rm E}[X|Y(40)=1]\).
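As a short worked step, note that because \(X\) takes only the values 0 and 1, the expected value in this example coincides with the survival probability:

\[ {\rm E}[X|Y(40)=1] = 1 \times \Pr[X=1|Y(40)=1] + 0 \times \Pr[X=0|Y(40)=1] = {}_{50}p_{40}. \]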

Of course, this particular example is trivial, if we are familiar with the life table. But the idea behind it, namely:

acquire information dynamically from observation as time passes; and
update probabilistic statements about future events as new information is acquired,

certainly is not trivial. It is the framework we need when the hazard rates in a survival model may be altered by events that are themselves part of the life history, for example changing states of health, or the death of a spouse, or simply right-censoring. This idea underlies the modern counting-process approach to survival models. Chapter 17 of our book, Modelling Mortality with Actuarial Applications, is an introduction to counting processes for survival models and the ideas outlined above play a key rôle.

References:

Macdonald, A. S., Richards, S. J. and Currie, I. D. Modelling Mortality with Actuarial Applications. Cambridge University Press (forthcoming).

Written by: Angus Macdonald

Publication Date: 19 July 2018

Last Updated: 19 July 2018

Tags: information, indicator process

