Excel's limits
We have written in the past about some of the reasons why we don't use Excel to fit our models. However, we do use Excel for validation purposes — fitting models using two entirely separate tools is a good way of checking production code. That said, there are some important limits to Excel, especially when it comes to fitting projection models. Some of these limits are rather subtle, so it is important that an analyst is aware of all of Excel's limitations.
The first issue is that Excel's standard Solver feature won't work with more than 200 variables, i.e. parameters which have to be optimised in order to fit the model. This is a problem for a number of important stochastic projection models, as shown in Table 1. One option is to trim the data such that the number of parameters drops below the critical threshold of 200, but this would be a clear case of the tail wagging the dog.
Table 1. Parameters required for various mortality models applied to data for ages 55–104 over the period 1961–2012, i.e. 50 ages, 52 years and 101 cohorts.
Model | Parameters |
---|---|
Lee-Carter | 152 |
Age-Period-Cohort (APC) | 203 |
M5 | 104 |
M6 | 205 |
M7 | 257 |
A more acceptable option is the practice of dropping cohort terms supported by little data, e.g. not estimating cohort parameters with fewer than five observations. This would reduce the parameter count for the APC and M6 models below the 200-parameter threshold and Excel's Solver function would then permit you to fit those models. In the case of M7, however, you would be forced to drop a fairly large section of data in order to get below this limit.
Another issue is that Excel's Solver function will often claim an optimal solution has been found when this is not the case. This can be verified by re-running the Solver several times in succession, where it will often manage to find a better-fitting set of parameters on the second and third attempts. It is therefore important that the analyst re-runs the Solver a few times until no further change is found.
However, we have also noticed a more insidious problem with Excel's Solver that is harder to detect. The issue arises when the number of parameters is large, but still within the 200-variable limit — after several runs of the Solver there are no further changes, but the final solution is not the optimal one. When testing our implementation of M7, for example, we found that pasting in the answers produced by the production code produced a lower deviance than Excel was able to find on its own. It other words, Excel agreed that the alternative parameter set was a better fit, but was unable to find that set on its own.
One option would be to consider one of the commercially supported alternative plug-ins for Excel's Solver, although analysts would need to check that it was indeed capable of finding the solutions that Excel cannot. Another option would be to use the algorithms described by Currie (2014), which can cope with any number of parameters. Either way, it is important that analysts do not rely uncritically on Excel's standard Solver function when fitting stochastic projection models.
Add new comment