In my previous post, I compared two sources of data regarding death counts (one from SPF, Santé Publique France, for covid deaths in hospitals; one from INSEE, for all-cause deaths), in order to get a better idea of the actual death toll of covid-19 in France.
In this post, I would like to do the following:
- describe briefly a new, richer data-set recently published by INSEE (and do some graphs);
- use the updated data (from both sources) to repeat my analysis, with some variants (weekly aggregates, separating men and women);
- reply to a few comments I got on LinkedIn and elsewhere;
- provide a few pointers regarding death counts in other countries (particularly the UK).
New INSEE data
INSEE now provides every Friday an exhaustive data-set that records, for each death that has occurred since 01-01-2018, the following variables: date of birth, date of death, sex, département of death, and so on. Neat. Let’s take this opportunity to do a few plots, such as this one:
(it’s nice to observe this sharp drop) or that one:
The latter plot covers the same period (weeks 13 to 15, 23rd March to 12th April) as in the analysis below. As expected, over-mortality seems to affect mostly people above 60.
Ok, now let’s repeat my previous analysis, based on merging the SPF data (daily covid death counts in hospitals, in each département and each sex) and the aforementioned INSEE data (all-cause deaths). Except this time:
- The overlap between the two datasets now covers more than three weeks (18th March, first date in SPF dataset, to 12th April, latest date in INSEE dataset) so I decided to consider weekly aggregates, for two reasons: they are more stable than daily aggregates, and less affected by artifacts such as delays (e.g. a death occurring during a week-end is reported to the next Monday).
- I also separated men and women.
- I am going to simplify a bit the model, and simply regress excess deaths (number of deaths in 2020 minus the average over 2018 and 2019) on hospital deaths.
First, a joint plot:
So, to recap, each point in this plot corresponds to a pair of death counts, for each département in France, each week between 13 and 15, and for each sex. The corresponding linear regression (without an intercept) gives a slope estimate of 1.79 (95% confidence interval: [1.73, 1.85]). The basic interpretation would be: in each département, when 100 covid deaths occur in hospitals, the number of covid-related (see below) deaths should be approximately 179. The current total number of covid deaths reported by SPF is 22, 614, which is 60% above that the number of covid deaths in hospitals (14050). So this estimate suggests the actual death toll might be a tad larger. More about the interpretation below.
Now for something more interesting: let’s redo the previous plot, but with a different colour for each sex:
Clearly the two linear trends are different; see below the OLS estimates.
|sex||slope estimate||slope 95% confidence interval||R^2|
What is going on? Well, women tend to live longer than men. And the proportion of women in EHPADS (French retirement homes) is 74%. Since the main reason behind the discrepancy between hospital deaths and excess deaths is covid death occurring in pension homes, these results make sense.
Reply to previous comments
What’s the point?
Fair enough, since 4th April, SPF does include in its total estimate both hospital deaths and pension home deaths, and the proportion of the latter is not too far from my estimate. Note that however that:
- it’s really hard to estimate properly the number of covid deaths occurring in pension homes. Apparently several pension homes did not provide any data, while others marked as “covid” all the deaths that have occurred after the first covid deaths.
- My estimate might measure other direct or indirect effects of the pandemic, such as people dying at home, people not receiving proper care because the health system is at capacity and so on.
- The fact that data from two different institutions may be compared, and seem to be somehow consistent, is, in my opinion, a good piece of news which deserves to be reported!
Accounting for car accidents
Boy, that one was popular. Please have a look a the plot on the front page of ONIRS (click on “tués”): yes, the number car-related deaths dropped sharply thanks to the lock-down… But in March of last year, this number was around 250, that is, 1% of the current covid death count. “Fun” fact: this point would have been quite relevant in the 70s! In those years, the number of car-related deaths was about five times larger (18 034 deaths in 1972).
Better predictive model
The idea of comparing the 2020 deaths to the average of the two previous years is a bit crude, and demographers have better models to predict death counts based on age repartition and so on. That said, the notion of “excess deaths” seems quite popular in various countries, as I explain below, so I guess that my approach is not so daft after all.
In other countries
To be honest, I was hoping to apply the same approach to the UK, a country where the official estimate is still limited to hospital deaths, and thus clearly quite biased; see e.g. this Guardian paper. Sadly, Public Health England only reports daily hospital death counts … per nation (nation=England, Scotland, Wales, or Northern Ireland). On the other hand, the Office of National Statistics reports every week the number of “excess deaths” (relative to the five year average), and the proportion of these deaths where the word “covid” is mentioned on the death certificate.
Interestingly, the Guardian paper I mentioned above first complains that the UK only reports hospital deaths, and then claims erroneously that the UK is still behind France in terms of covid mortality. It’s not, if you compare in terms of hospital deaths (UK: 20,319 on Saturday, while France: 14,050). The fact that even journalists reporting on this issue may get it wrong seems indicative of how confusing are covid death data.
More generally, my impression is that looking at “excess deaths” makes far more sense for most countries at the moment: it’s easier to measure (albeit with a delay of course), and easier to interpret. This is also more or less the point made by this NYT paper. (Notice how their plot for France only covers January to April; for the complete plot, see my first plot above!).