Deterministics Anonymous

In Macdonald & Richards (2025), Stephen and I pointed out some benefits of models built up from instantaneous Bernoulli trials by product-integration (both of which have featured in previous blogs). We highlighted three cases, beginning with the one with fewest assumptions: (1) individual life-history data; (2) the pseudo-Poisson model for grouped data; and (3) the true Poisson model for grouped data.

It is also instructive to consider the likelihoods in the reverse order. They are listed thus below, with random quantities highlighted in blue (the rest of the notation is unimportant here${}^*$).

\[\begin{eqnarray*}
\mbox{True Poisson} & \propto & \prod_x \exp \big( - E_x^c \, \mu_x^* \big) \, \big( \mu_x^* \big)^{\color{blue} {D_x}} \\
\mbox{Pseudo-Poisson} & \propto & \prod_x \exp \big( - {\color{blue} {E_x^c}} \, \mu_x^* \big) \, \big( \mu_x^* \big)^{\color{blue} {D_x}} \\
\mbox{Individual Data} & \propto & \prod_i \exp \left( - \int_{0}^{\infty} {\color{blue} {Y^i(t)}} \, \mu_{t}^{\boldsymbol{\theta}} \, dt \right) ({\color{blue}{Y^i(b_i)}} \, \mu_{b_i}^{\boldsymbol{\theta}} \, dt)^{\color{blue} {\Delta N_i(b_i)}}.
\end{eqnarray*} \]

It is clear at once why the Poisson model appears so attractive in applications. There is only one random variable for each age $x$, the total number of deaths ${\color{blue} {D_x}}$. The other two likelihoods contain more random quantities in blue, so they look more complicated.
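As a minimal sketch of how little machinery this needs, the Python below evaluates the logarithm of the first two likelihoods above (they share the same numerical form) and the closed-form maximum $\hat{\mu}_x^* = D_x / E_x^c$. The function names and the figures for deaths and exposures are invented purely for illustration.

```python
import math

def poisson_loglik(mu, deaths, exposure):
    """Log-likelihood, up to an additive constant, for grouped data:
    the sum over ages x of  -E_x * mu_x + D_x * log(mu_x)."""
    return sum(-e * m + d * math.log(m)
               for m, d, e in zip(mu, deaths, exposure))

def poisson_mle(deaths, exposure):
    """Setting the derivative -E_x + D_x / mu_x to zero gives mu_x = D_x / E_x."""
    return [d / e for d, e in zip(deaths, exposure)]

# Hypothetical data for three age groups (illustrative numbers only).
deaths = [12, 20, 35]
exposure = [1000.0, 950.0, 900.0]

mu_hat = poisson_mle(deaths, exposure)
best = poisson_loglik(mu_hat, deaths, exposure)
```

Any other choice of rates, for example `[1.1 * m for m in mu_hat]`, gives a lower value of `poisson_loglik` than `best`.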

However, this pleasing parsimony comes at a price, namely that the true Poisson model is a dead end. Look at the exposures $E_x^c$ in the first likelihood above. They are not blue, because exposure is not treated as random in a Poisson model. It can't be — the only random variable allowed in a Poisson model is the quantity being counted, here deaths. If there are any other random variables, then whatever the model is, it ain't Poisson.

Think, then, about what the true Poisson model would require of mortality data. It would require the exposures $E_x^c$ collected by the observational scheme to be non-random. We would have to know, in advance, what all the $E_x^c$ would be. That condition is sometimes met in controlled experimental set-ups (Andersen et al. 1993) but almost never in actuarial data. There, exposures are as they come; they are not under the actuary's control. In other words, they are random.
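The randomness of exposure is easy to see by simulation. The sketch below assumes, purely for illustration, a constant hazard and a fixed one-year observation window for a hypothetical portfolio; neither assumption comes from the paper. Repeating the same investigation with a different seed gives a different total exposure as well as a different number of deaths.

```python
import random

def simulate_year(n_lives, hazard, seed):
    """Observe n_lives for at most one year under a constant hazard.
    Returns (number of deaths, central exposure in life-years)."""
    rng = random.Random(seed)
    deaths = 0
    exposure = 0.0
    for _ in range(n_lives):
        lifetime = rng.expovariate(hazard)  # time to death, in years
        if lifetime < 1.0:
            deaths += 1
            exposure += lifetime   # exposure stops at death
        else:
            exposure += 1.0        # a survivor contributes the full year
    return deaths, exposure

# Same portfolio, same hazard, five independent replications.
results = [simulate_year(n_lives=1000, hazard=0.05, seed=s) for s in range(5)]
for d, e in results:
    print(f"deaths = {d:3d}, exposure = {e:8.3f}")
```

Nothing in the set-up fixes the exposure in advance: it is an outcome of the experiment, just as the death count is.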

The upside is that random exposures, once we accept the idea, are the key to the whole class of models mentioned at the start of this blog. Our path may be eased by two further steps, also illustrated by the three likelihoods above.

In the middle likelihood, for the pseudo-Poisson model, all we have done is colour the exposures ${\color{blue} {E_x^c}}$ blue to show that they are random. Numerically we have the same likelihoods as in the true Poisson model, so for inference and parameter estimation we can treat the pseudo-Poisson model as if it were a true Poisson model, and re-use without change any software we have for Poisson-related estimation. Conceptually the likelihoods are different because we acknowledge the true nature of the exposures.

In the third likelihood, for individual life-history data, we see the actual mechanism at work. The stochastic process ${\color{blue} {Y^i(t)}}$, also starring in earlier blogs, is the zero-one indicator:

\[ {\color{blue} {Y^i(t)}} = \left\{ \begin{array}{ll} 1 & \mbox{if life $i$ is alive and under observation at time $t^-$} \\ 0 & \mbox{otherwise}. \end{array} \right. \]

Integrating the product ${\color{blue} {Y^i(t)}} \, \mu_t^{\boldsymbol{\theta}}$ (where $\mu_t^{\boldsymbol{\theta}}$ is the hazard rate of death, with parameter vector $\boldsymbol{\theta}$) automatically gives us the integrated hazard over the period during which life $i$ is observed.
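A small sketch may make this concrete. It assumes a Gompertz hazard $\mu_t = e^{\alpha + \beta t}$ as an example of $\mu_t^{\boldsymbol{\theta}}$ (no particular form is specified here), and a life observed from a hypothetical entry age `a` to exit age `b`, so that the indicator is one on $(a, b]$ and zero elsewhere. All function names and parameter values are illustrative assumptions.

```python
import math

def gompertz_hazard(t, alpha, beta):
    """Assumed example hazard: mu_t = exp(alpha + beta * t)."""
    return math.exp(alpha + beta * t)

def at_risk(t, a, b):
    """Y(t): 1 if the life is alive and under observation just before t."""
    return 1.0 if a < t <= b else 0.0

def integrated_hazard(a, b, alpha, beta, t_max=130.0, steps=130_000):
    """Midpoint-rule approximation to the integral of Y(t) * mu_t over (0, t_max)."""
    h = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += at_risk(t, a, b) * gompertz_hazard(t, alpha, beta)
    return total * h

def log_lik_contribution(a, b, died, alpha, beta):
    """Minus the integrated hazard, plus log(mu_b) if death was observed at b.
    The 'dt' factor in the likelihood is a constant and is dropped."""
    ll = -integrated_hazard(a, b, alpha, beta)
    if died:
        ll += math.log(gompertz_hazard(b, alpha, beta))
    return ll

alpha, beta = -12.0, 0.12   # illustrative parameter values
a, b = 60.0, 75.5           # hypothetical entry and exit ages
numeric = integrated_hazard(a, b, alpha, beta)
closed = (math.exp(alpha + beta * b) - math.exp(alpha + beta * a)) / beta
```

The integral is taken over the whole age range, yet `numeric` agrees closely with `closed`, the Gompertz integrated hazard over $(a, b]$ alone: the indicator does the work of selecting the observed period.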

The necessity of taking these steps may be shown by adding one more likelihood, that for a model of a continuous hazard rate $\mu_t^{\boldsymbol{\theta}, \boldsymbol{\beta}' {\color{blue} {\mathbf{z}_i}}}$ depending also on a vector ${\color{blue}{\mathbf{z}_i}}$ of covariates for each life $i$, linearly through a vector of coefficients $\boldsymbol{\beta}$:

\[\begin{eqnarray*}
\mbox{With Covariates} & \propto & \prod_i \exp \left( - \int_{0}^{\infty} {\color{blue} {Y^i(t)}} \, \mu_{t}^{\boldsymbol{\theta}, \boldsymbol{\beta}' {\color{blue} {\mathbf{z}_i}}} \, dt \right) ({\color{blue} {Y^i(b_i)}} \, \mu_{b_i}^{\boldsymbol{\theta}, \boldsymbol{\beta}' {\color{blue} {\mathbf{z}_i}}} \, dt)^{\color{blue} {\Delta N_i(b_i)}}.
\end{eqnarray*}\]
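As a hedged illustration of this last likelihood, the sketch below assumes a hazard of the form $\mu_t = \exp(\alpha + \gamma t + \boldsymbol{\beta}' \mathbf{z}_i)$, so that $\boldsymbol{\theta} = (\alpha, \gamma)$. This form, the function name and the three records are assumptions made for the example, not taken from the paper.

```python
import math

def log_lik(theta, beta, lives):
    """Sum of individual log-likelihood contributions under the assumed
    hazard mu_t = exp(alpha + gamma * t + beta'z)."""
    alpha, gamma = theta
    total = 0.0
    for entry, exit_age, died, z in lives:
        lin = alpha + sum(bk * zk for bk, zk in zip(beta, z))
        # Closed-form integral of the hazard over the observed interval.
        total -= (math.exp(lin + gamma * exit_age)
                  - math.exp(lin + gamma * entry)) / gamma
        if died:
            total += lin + gamma * exit_age  # log-hazard at the age of death
    return total

# Hypothetical records: (entry age, exit age, died?, covariate vector z_i).
lives = [
    (60.0, 72.3, True,  (1.0,)),
    (65.0, 80.0, False, (0.0,)),
    (70.0, 78.9, True,  (0.0,)),
]
value = log_lik(theta=(-12.0, 0.12), beta=(0.3,), lives=lives)
```

Each record carries its own $\mathbf{z}_i$ alongside its own entry and exit ages, which is only possible because the likelihood is a product over individual lives rather than over age groups.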

For actuaries accustomed to Poisson-distributed deaths, accepting random exposures in the pseudo-Poisson model is a step towards individual life histories. In fact, acceptance can be seen as just the first of several steps. Next time your group meets to discuss mortality models, arrange the chairs in a circle, and one by one stand up and say: "My name is [state your name], and I have been treating exposures as deterministic $\ldots$".

${}^*$For completeness, both true and pseudo-Poisson models have a single parameter $\mu_x^*$ for each age group labelled $x$, while the individual data model has a continuous hazard rate $\mu_t^{\boldsymbol{\theta}}$ parameterized by a vector $\boldsymbol{\theta}$.

References:

Andersen, P. K., Borgan, Ø., Gill, R. D. and Keiding, N. (1993). Statistical Models Based on Counting Processes. Springer, New York.

Macdonald, A. S. & Richards, S. J. (2025). On Contemporary Mortality Models for Actuarial Use II: Principles. British Actuarial Journal, 30, e19. Preprint available.

Written by: Angus Macdonald

Publication Date: 07 October 2025

Last Updated: 07 October 2025

Services: Survival Modelling, Projections Toolkit

Tags: Poisson distribution, survival models
