How much data do you need?

(Mar 26, 2010)

We have written before about how survival models make better use of available data.  Another way of viewing this is that survival models can make do with smaller data volumes than methods based on the rate of mortality, qx.  But what do we mean by "data volumes"?  Should we measure this by claim events, by number of lives or by exposure time?  And how much is enough?

For survival models the most sensible measure is a combination of claim events and exposure time.  The number of lives is of secondary importance, since survival models naturally and easily span multi-year investigations.  For a survival model it is less important whether 10,000 life-years of exposure are observed amongst 10,000 people…
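To make the point about events and exposure time concrete, here is a minimal sketch in Python.  It assumes the simplest possible survival model, a constant hazard (an assumption made purely for illustration, not something taken from the text above), for which the log-likelihood is d log(mu) - mu E, where d is the number of deaths and E is the total time exposed.  The record layout, function names and the two tiny portfolios are all hypothetical.

```python
import math

def exposure_and_deaths(records):
    """Total exposure (in years) and death count.

    Each record is a hypothetical tuple (entry_time, exit_time, died),
    with times measured in years from the start of the investigation.
    """
    exposure = sum(exit_time - entry_time for entry_time, exit_time, _ in records)
    deaths = sum(1 for _, _, died in records if died)
    return exposure, deaths

def log_likelihood(mu, records):
    """Log-likelihood under an assumed constant hazard mu.

    It depends on the data only through deaths and total exposure,
    not through the number of lives.
    """
    exposure, deaths = exposure_and_deaths(records)
    return deaths * math.log(mu) - mu * exposure

def constant_hazard_mle(records):
    """Maximum-likelihood estimate of the constant hazard: deaths / exposure."""
    exposure, deaths = exposure_and_deaths(records)
    return deaths / exposure

# Two illustrative portfolios with the same exposure and deaths,
# but a different number of lives.
portfolio_a = [(0.0, 1.0, False), (0.0, 1.0, False), (0.0, 1.0, False), (0.0, 1.0, True)]
portfolio_b = [(0.0, 2.0, False), (0.0, 2.0, True)]

for name, portfolio in (("A", portfolio_a), ("B", portfolio_b)):
    exposure, deaths = exposure_and_deaths(portfolio)
    print(name, len(portfolio), "lives;", exposure, "life-years;", deaths, "death(s);",
          "estimated hazard", constant_hazard_mle(portfolio))
```

Under this assumed model the two portfolios carry identical information about the hazard, even though one has twice as many lives as the other.  A realistic model would let the hazard vary with age, in which case the ages at which the exposure arises also matter.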

