Hi Iain,

Your posts on overdispersion are very interesting, and a good teaser to learn more. The finding in this analysis of CMI data, however, seems counter-intuitive to me from a stochastic point of view.

I have an inkling why overdispersion would improve your P-spline fit. However, if you were to do it "the other way round", i.e. assume that the data points arise from a stochastic process and project that process forward using time-series methods, relaxing the Poisson assumption would widen the confidence intervals, not narrow them. My guess is that overdispersion (or the Negative Binomial assumption) increases parameter uncertainty but, for a given parameter set, narrows the range of stochastic projections. In aggregate, however, the total range of outcomes should be substantially wider.
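As a rough illustration of the first point only, the sketch below compares the spread of a death count under a Poisson assumption with the spread under a Negative Binomial assumption that has the same mean. It assumes the gamma-mixed Poisson parameterisation (variance = mean + mean^2 / shape) and a normal approximation for the interval. The expected death count and the shape parameter are hypothetical values chosen for illustration, not figures from the CMI or ONS data, and the sketch says nothing about the P-spline fit itself.

```python
import math


def poisson_variance(mean):
    """Variance of a Poisson count: equal to its mean."""
    return mean


def negative_binomial_variance(mean, shape):
    """Variance of a gamma-mixed Poisson (Negative Binomial) count.

    Assumed parameterisation: variance = mean + mean**2 / shape,
    so a smaller shape means more overdispersion.
    """
    return mean + mean ** 2 / shape


def approx_interval_width(variance, z=1.96):
    """Width of an approximate 95% interval under a normal approximation."""
    return 2.0 * z * math.sqrt(variance)


# Hypothetical inputs, for illustration only.
expected_deaths = 400.0
shape = 100.0

poisson_width = approx_interval_width(poisson_variance(expected_deaths))
nb_width = approx_interval_width(
    negative_binomial_variance(expected_deaths, shape)
)

print(f"Poisson interval width:           {poisson_width:.1f}")
print(f"Negative Binomial interval width: {nb_width:.1f}")
print(f"Ratio:                            {nb_width / poisson_width:.2f}")
```

For a fixed mean, the Negative Binomial interval is always the wider of the two under this parameterisation; how that interacts with the smoothing and with parameter uncertainty is the open question raised above.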

A comment on the other topic which your post touches on: ONS versus CMI.

I find it intuitively gratifying to see that CMI data is less dispersed than ONS data, which is as it should be, given the wider range of socio-economic strata in population data. However, parameter uncertainty in your stochastic model must be orders of magnitude greater for CMI data, because your ensemble is so much smaller. As before, this would lead to wider confidence intervals than a stochastic projection based on a best-estimate parameter set implies.
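To give a feel for how data volume drives parameter uncertainty, here is a minimal sketch assuming a quasi-Poisson model in which the variance of the death count is the dispersion times the mean. Under that assumption the relative standard error of a crude mortality rate is roughly sqrt(dispersion / deaths). The two death counts are hypothetical round numbers, not actual ONS or CMI volumes, and the function name is invented for this illustration.

```python
import math


def relative_standard_error(deaths, dispersion=1.0):
    """Approximate relative standard error of a crude mortality rate.

    Assumes Var(deaths) = dispersion * E[deaths] (quasi-Poisson), with the
    expected count estimated by the observed count.
    """
    return math.sqrt(dispersion / deaths)


# Hypothetical death counts, for illustration only.
population_deaths = 500_000
portfolio_deaths = 5_000

population_rse = relative_standard_error(population_deaths)
portfolio_rse = relative_standard_error(portfolio_deaths)
ratio = portfolio_rse / population_rse

print(f"Population relative SE: {population_rse:.4%}")
print(f"Portfolio relative SE:  {portfolio_rse:.4%}")
print(f"Ratio:                  {ratio:.1f}")
```

In this simple setting the relative error grows with the square root of the reduction in deaths, and a dispersion above 1 inflates it further by sqrt(dispersion). A full projection model has more parameters than a single crude rate, so this is only a lower-level intuition for the effect.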

Have you done any analysis on this issue?

BTW, you will have seen the ASTIN Bulletin (May 2009) article by Li, Hardy and Tan on overdispersion in mortality forecasts, won't you?

Greetings, Kai