Interview: interpreting numbers and data around the coronavirus pandemic

On 28 October I gave an interview on formula TV‘s „business formula“ programme about how to interpret numbers and data around the coronavirus pandemic. It was aired last Tuesday, 3 November. Since it is difficult for a five-minute-long TV edit to do justice to the complexity of the topic, I am going to share a more complete version of the conversation I had with Nino Kvintradze. We touched on a number of topics that I have not yet written about on this blog. Most of it is still very relevant today, and I hope it will stimulate some discussion.

Comments in [square brackets] are mine, as of today.

N.K.: You wrote in your piece last week [on 21 October] that Georgia will have to impose more restrictions again. What are the reasons that lead you to this conclusion?

L.H.: There are three indicators that point in this direction:

First, the situation in other countries. We are still in the growing phase of the epidemic and can observe how countries are doing that are one, two, three weeks ahead of us. As it turns out, almost all countries in Europe with a similar or higher incidence than Georgia have now imposed tougher restrictions than we are seeing here. [With the notable exceptions of Switzerland and Luxembourg, as I pointed out in my follow-up blog post on 2 November.]

Secondly, the reproduction number, which you can monitor and which is a more reliable metric than the daily number of cases [if you use an appropriate method and account for a number of biases – more on that in a separate post later]. Actually, it has been surprisingly low across Georgia over the past weeks – between 1.2 and 1.4 in most regions, and 1.1 in Adjara. The question is: will it drop below one? It has to fall to one just to keep the spread at the current level, and below one to reduce it. But I have not seen much of a decrease in this metric lately. I am not saying that it‘s impossible to get it below one just by means of increased public awareness etc., but I am sceptical.

Thirdly, mobility data such as those that Facebook collects by tracking the location of smartphones. The average level of individual mobility is of course not equivalent to infectious activity, but it is another indicator that might tell us something about whether a population is reacting to the pandemic, and whether we might get the necessary spontaneous reaction without additional government interventions. Since I wrote the piece last week there has been some change: it is now possible to see a little bit of an effect from the new measures in Imereti and Tbilisi [which were imposed on 16 October]. In Tbilisi, average mobility is now almost as low as in Batumi, which is around 7% lower than in February. So there is a reaction, but is it strong enough? Again, comparison with other countries makes me sceptical.

Since the beginning of the pandemic, we have had a debate in Georgia about how the daily confirmed cases are connected to the number of tests that are being conducted. Can you elaborate on the correlation between these two numbers?

You are touching on the issue of possible underreporting, the phenomenon that the confirmed case statistics might not show all of the actual Covid-19 infections. Of course there is a relation between the number of tests conducted and the number of cases that are officially registered. But what is more important is the question: how many of the tests are positive? If we test 100 people who might have been exposed to the virus, and only one of them returns positive, then that is a good ratio. But if 20 of them return positive, chances are high that you are missing people out there who have the virus but are simply not being tested.

There are many people here in Georgia who think they had Coronavirus in January, February or March this year, but they have not been tested.

I would say some of that is due to placebo effect – people had some other infection and think it was Covid. [‚Placebo‘ is of course not technically the correct term in this context.]

But indeed, one of the big revelations back in spring was that in many countries with large outbreaks – in Europe, North America, etc. – there had in fact been around ten times as many infections in March as were reflected in the confirmed cases. This was the conclusion of randomised antibody testing, and maybe we can talk about that in more detail later. Clearly, this level of underreporting was related to the fact that not enough tests were conducted in the early stages of the pandemic [and there was little or no contact tracing in many countries]. The whole discussion of underreporting developed in this context.

So we have to talk about tests. According to the EU, a country in the „green zone“ is supposed to have no more than 3 out of 100 positive tests, and the WHO standard is no more than 5 out of 100 positive tests. If we are looking at Georgia right now, and if we just consider PCR tests, the test positivity rate has risen to around 20 out of 100. That is a level we are also currently seeing in Central and Eastern Europe, and, by the way, also in Switzerland. That is a number where I would not be confident anymore that we are detecting most of the infections.

We should mention that Georgia has, like other countries, started to use more antigen tests as a replacement for PCR in the past couple of days. However, I have not seen any numbers published on how many of these are being conducted. In any case, the PCR tests seem to be stuck at around 10‘000 tests a day, which seems to be the maximum capacity of laboratories. This means that a rising number of cases translates directly into a higher test positivity, at least with respect to PCR.

What was the test positivity in Georgia in the early stages of the pandemic, for instance in March or April?

Georgia always managed to keep this number relatively low until recently. It never really surpassed 7 or 8 percent in March and April. At the end of April the testing capacities were increased massively, and from then onwards the testing positivity always stayed below one percent until September. That was a top ranking number in international comparison. But now, although roughly the same number of tests are conducted every day as during summer, the high number of cases means that test positivity is not looking good anymore.

Speaking of the high number of cases, you made an interesting connection with citizens‘ mobility. You have looked at Facebook‘s mobility data, and you have observed that in summer Batumi was in first place in terms of mobility not only in Georgia, but also compared with cities in other countries. Can you elaborate on that?

I have not checked out all the cities in the world. So this is just based on me more or less randomly probing 20 or 30 cities, mostly in Europe and North America [and East Asia]. I have not found a city where mobility this summer was remotely as high as in Batumi compared to the February baseline. By the way, I also visited Batumi myself this summer [from the end of August until early September] and I saw a city that was bustling with life. I was not surprised to see this reflected in the data.

So to some extent it is not a coincidence that the pandemic in Georgia restarted in Batumi, although I think there was some bad luck involved in the precise moment at which this happened. It could have happened two weeks later, or a month later. There was a superspreading event which re-ignited it. As we now know much better than back in March, this is actually one of the characteristics of this virus: it spreads mostly due to a few infected people transmitting it to a lot of other people, while most infected people end up not transmitting it to anybody at all. So there is an element of chance involved in it, and we were a bit unlucky to have experienced that in Batumi.
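One common way to describe this kind of uneven transmission is to model the number of people each case infects with a negative binomial distribution, which has a mean R and a dispersion parameter k. This is a minimal sketch under that modelling assumption, which is not taken from the interview; the value k = 0.1 is likewise an illustrative choice. Under the model, the share of infected people who infect nobody is (1 + R/k)^(-k).

```python
def share_infecting_nobody(r: float, k: float) -> float:
    """Probability of zero secondary cases under a negative binomial
    offspring distribution with mean r and dispersion k (assumed model)."""
    if r < 0 or k <= 0:
        raise ValueError("r must be non-negative and k must be positive")
    return (1.0 + r / k) ** (-k)


R = 1.3  # within the range of reproduction numbers quoted in the interview

# k = 0.1 is an illustrative, strongly overdispersed value (assumption).
print(f"overdispersed (k=0.1): {share_infecting_nobody(R, 0.1):.0%} infect nobody")

# A very large k approaches the evenly spread (Poisson) case for comparison.
print(f"evenly spread (k=1e6): {share_infecting_nobody(R, 1e6):.0%} infect nobody")
```

The smaller k is, the more the same average R is carried by a few large transmission events, and the more the timing of an outbreak depends on chance.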

But couldn‘t the government have foreseen this level of mobility in Batumi and done something about it?

I think it was to some extent taking a gamble. Maybe with hindsight you could say that enforcement of mask wearing was not exactly as good as it should have been. But we also have to keep in mind that incidence in summer was really very low in most parts of the country.

Do you think there was any chance to prevent this kind of spread of the virus in early September?

The way to prevent such superspreading events from happening is to prevent people from having a large number of contacts, for example by imposing restrictions on large gatherings. When contact is hard to avoid, as in the cake shop where the superspreading in Batumi occurred, you probably should at least enforce mask wearing which makes it much less likely that one person infects 100 people at once.

In your article you wrote that Georgia is experiencing its „first wave“ of Covid right now. What about the wave in spring?

Sure, there was a small first wave in spring, and thanks to the timely reaction it did not really take off. Then the NCDC had time to scale up the testing and contact tracing and get things under control rather quickly. So, in terms of how the population is directly affected by the virus – not indirectly due to the lockdown, economic measures, etc. – we are now certainly experiencing the first real wave here.

What is more devastating to the economy: the spread of the virus or the measures to contain it?

I think to some extent this is a false dichotomy. If we frame the question in this way we kind of suggest that we can just go to the virus and say: „how about we give you 2000 of our lives and in return you give us 3 percent economic growth?“ I don‘t think that is how it works. We have seen other countries trying to achieve that – even if perhaps publicly they don‘t admit it so directly. For example we have seen Sweden try a more economy-friendly approach. There is a debate over whether it has helped them economically. I think the reduction in GDP there was slightly less than in other European countries. But some people would argue that other countries bounced back faster. Certainly it came at the cost of a high death toll. Sweden has seen around 6000 deaths from Covid so far. Compare this with neighbouring Norway, which has had fewer than 300. Does a marginally better economic situation really justify this big difference?

Also, we should never forget that economies are interconnected, and in Georgia a lot of it is probably out of the control of what the government in this country does. A lot of it depends on remittances, it depends on international trade, etc.

To sum it up, I think we have to be very careful not to play these two things – health and economy – against each other, but instead try and find a good middle path that takes into account all aspects of the problem.

In spring when we had the strict lockdown, the government was justifying it by saying that they need time to get the healthcare system ready. But poverty also kills. A lot of people lost their jobs, etc. Was it worth it to have these two months of strict lockdown? I am asking not only about Georgia, but other countries as well.

Yes, I do think so. With respect to Georgia, if you ask me whether two months were perhaps a little too long: maybe we could say so with hindsight. But we have to take into account that in this country until around 20 April fewer than 500 tests were conducted every day. Consider the case numbers we see right now: there would have been absolutely no chance of controlling a situation like that before summer.

Just to keep things in perspective: we have a pretty bad outbreak right now in Batumi. I don‘t have the actual figures, but a rough calculation tells you that Adjara, the worst hit region in the country, is looking at perhaps an excess mortality of 30 or 40 percent in October [i.e. 30-40% more people died overall than in a typical month of October]. Compare this to Bergamo, one of the worst hit communities in Italy in March. They had an excess mortality of 500 percent during that month. So that is on a totally different level. I think the fact that we basically skipped the first wave here enables us now to talk on the level of „do we find it acceptable to have a couple of hundred Covid deaths per month, nationally, or not?“ [Apologies for the degree of cynicism!] Frankly, I believe we would be having totally different discussions now if we had let it run its course in spring.

So let me ask this: had the number of tests in March and April not been 5000 in total but rather 5000 a day, do you believe there would have been more confirmed cases?

Yes. The NCDC started to scale up the number of tests at the end of April. You can compare this to their statistic of asymptomatic cases, which went up from under 20 percent to around 40 percent in that period. 40 percent is roughly the number you would expect if most infections are detected, because we know that around 40 to 50 percent of all Covid-19 infections are asymptomatic (which is not to be confused with presymptomatic at the time of testing). So this seems to be more or less consistent after the end of April, but not before.

Another consistency check is to look at the fatality rate. We now have a number of really good studies and meta-studies that look at the actual fatality rate across countries. I am talking here about the rate of deaths among all infected people, not just the confirmed cases [i.e. the Infection Fatality Ratio].

In particular, I have in mind the recent meta-study by Meyerowitz-Katz and others, an American-Australian research collaboration. It is based on a careful evaluation of randomised antibody testing in a large number of countries. What they found is that the fatality rate of a country is to a very large extent predictable solely from the age structure of the infected population. Their study is based mostly on data from high-income countries (although there were some other countries such as Lithuania included in the study as well), so there is some question mark as to whether we can apply its findings to Georgia. Nevertheless, I have done that for the Georgian demographic and calculated what would be the expected fatality rate if we assume all age segments of the population to be equally affected. The result is almost exactly one percent. In other words, if all age segments of the Georgian population are equally likely to get Covid-19, then one out of 100 infected people would be expected to die.

Compare this to the actual figures. First of all, the NCDC statistics show that all age groups in Georgia do get infected roughly at the same rate. [There are in fact small differences, but they are irrelevant to the argument laid out here.] If we assume that the 1500 confirmed cases until the end of August represent almost all the infections in the country since the beginning of the pandemic, you would get an actual fatality rate of 19/1500 = 1.27 percent [there were 19 deaths in that time period]. Now I also indicated that we probably had a significant number of undetected infections, mainly before mid April (possibly several hundred), which would push that number down closer to 1 percent or even below. Moreover, the stochastic uncertainty due to the small number of deaths is quite large – we are talking about plus-minus half a percent. So, all in all this is perfectly consistent with the study I mentioned (and with most other studies which have looked at the real fatality rate).
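The arithmetic in this paragraph can be reproduced in a few lines. This is a rough sketch, assuming the death count behaves like a Poisson variable (so its standard deviation is about the square root of the count) and using a 95 percent margin. The figure of 400 undetected infections is a hypothetical stand-in for "several hundred".

```python
import math


def fatality_rate_with_margin(deaths: int, infections: int) -> tuple[float, float]:
    """Return (rate, 95% margin), both as fractions.

    Assumes Poisson-distributed deaths, so the standard deviation of the
    count is approximately sqrt(deaths).
    """
    if deaths < 0 or infections <= 0:
        raise ValueError("deaths must be >= 0 and infections must be > 0")
    rate = deaths / infections
    margin = 1.96 * math.sqrt(deaths) / infections
    return rate, margin


# Figures from the interview: 19 deaths among 1500 confirmed cases.
rate, margin = fatality_rate_with_margin(19, 1500)
print(f"confirmed cases only: {100 * rate:.2f}% +/- {100 * margin:.2f} points")

# Hypothetical: add 400 undetected infections ("several hundred").
rate, margin = fatality_rate_with_margin(19, 1500 + 400)
print(f"with undetected infections: {100 * rate:.2f}% +/- {100 * margin:.2f} points")
```

Both variants give a margin of roughly half a percentage point, so an expected rate of one percent sits comfortably inside either interval.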

Now I also looked at the current situation. Since the cases are rising, it is a little more complicated to get a fair live estimate of the fatality rate. But using the theoretical fatality rate from the study, you can in fact do a forward projection of deaths based on the confirmed cases in order to estimate the expected number of deaths up to a certain point in time. Interestingly, we are now seeing almost twice as many deaths as you would expect. [This is a very rough estimate dependent on a number of model assumptions. It could well be 1.5 or 2.5 times as many.]
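A forward projection of this kind could be sketched as follows. This is not the model used for the estimate above: it assumes a single fixed delay between confirmation and death, whereas a real projection would use a delay distribution, and every number in the example (the case series, the 14-day delay, the observed death count) is hypothetical.

```python
def expected_deaths(daily_cases: list[int], ifr: float, delay_days: int) -> float:
    """Deaths expected by the last day of the series if all infections
    were confirmed. Only cases confirmed at least delay_days earlier
    count, because later cases have not had time to result in a death."""
    if delay_days < 0:
        raise ValueError("delay_days must be non-negative")
    matured = daily_cases[: max(0, len(daily_cases) - delay_days)]
    return ifr * sum(matured)


# Hypothetical series: two weeks at 100 cases/day, then two weeks at 200.
daily_cases = [100] * 14 + [200] * 14
projected = expected_deaths(daily_cases, ifr=0.01, delay_days=14)

observed = 28  # hypothetical observed death count
print(f"projected deaths: {projected:.1f}")
print(f"observed / projected: {observed / projected:.1f}")
print(f"implied share of infections detected: {projected / observed:.0%}")
```

If observed deaths run at about twice the projection and the fatality rate is taken as given, the simplest reading is that about half of the infections are missing from the confirmed cases.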

There are several possible explanations for this: for one, it could be that patients are no longer getting the best care possible. Or it could mean that the theoretical fatality rate is actually a little higher in Georgia [perhaps due to the high prevalence of cardiovascular diseases], for example 1.5 percent, which is still inside the confidence interval of what we observed until August. But to me the most likely explanation, also considering the high test positivity rate, is that we are currently missing around half of the infections in the confirmed cases. Some of them might be asymptomatic and in quarantine and just not get tested, but others might not be found by the contact tracers.

What is the fatality rate in other countries?

Ok, so let‘s be precise here: there are two fatality rates, and it is important not to confuse them. The „infection fatality rate“ is what I just called the „real fatality rate“. This is the percentage of all infected people who die. On the other hand, the „case fatality rate“ is the percentage of confirmed cases who die. That is the number you see quoted most often [because it is easier to estimate].

The infection fatality rate or real fatality rate is relatively similar across countries, typically somewhere between 0.5 and 1.3 percent. However, the case fatality rate can vary wildly. There were countries with case fatality rates of 10 percent or higher in spring. Now we know the reason for this: the statistics simply registered only one tenth of all the infections, but a majority of the deaths. [Deaths were also underreported in many countries, but the discrepancy was not nearly as large as with infections.]
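The relation between the two rates can be written down directly: the case fatality rate equals the infection fatality rate multiplied by the share of deaths that are registered and divided by the share of infections that are registered. A minimal sketch, where the detection shares are illustrative round numbers rather than measured values:

```python
def case_fatality_rate(ifr: float, infections_detected: float,
                       deaths_detected: float) -> float:
    """Case fatality rate implied by an infection fatality rate and the
    shares (0..1) of infections and deaths that reach the statistics."""
    if not 0 < infections_detected <= 1 or not 0 <= deaths_detected <= 1:
        raise ValueError("detection shares must lie between 0 and 1")
    return ifr * deaths_detected / infections_detected


# Illustrative: 1% infection fatality rate, all deaths registered.
for share in (1.0, 0.5, 0.1):
    cfr = case_fatality_rate(0.01, infections_detected=share, deaths_detected=1.0)
    print(f"{share:.0%} of infections detected -> case fatality rate {100 * cfr:.1f}%")
```

When only one tenth of infections are registered, an infection fatality rate of one percent shows up as a case fatality rate of about ten percent.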

You mentioned that the true extent of the pandemic in spring was revealed by randomised antibody testing. Can you talk a bit more about that? And why are we not seeing such studies being done in Georgia?

Antibody tests look at whether your immune system has produced antibodies against Covid-19. So they test whether you have been infected with Covid in the past.

A good way to check how many people have been infected with the virus is to randomly pick, say, 1000 people and test how many of them have had the virus. It works just like a political opinion poll. If 100 of those people test positive, you conclude that roughly 10 percent of the population has had Covid.

However, like in an opinion poll there are statistical uncertainties. And here it gets a little messy, so bear with me.

You have uncertainty due to the fact that you only test a sample of 1000 people and not the whole population. This is like the usual „polling error“. In this example, that would be roughly plus-minus one percentage point. But more importantly, you have an uncertainty in the test itself. You have a certain percentage of false negatives, which is easy to correct for, but you also have a small percentage of false positives – I believe it was around 2 percent for the best tests on the market. In other words, if you test 1000 people who never had Covid, you will still get something like 10, 20 or 30 positive tests.

Now let‘s say 1 percent of the population has had Covid. We pick 1000 people randomly and subject them to an antibody test. You would expect that around 10 of them will test positive, right? But the statistical uncertainty throws in a „polling error“ of plus-minus ten positive tests. And on top of that you will have another 10, 20 or 30 false positives. So when you eventually have a result of, say, 25 positives, does that really tell you anything? Not really, the uncertainties are too large and the result is pretty useless!

The situation is different if, say, already 10 percent of the population has had Covid. In that case we expect around 100 of the 1000 tested people to be positive, and an uncertainty of 10, 20 or 30 will not make the result useless.
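The effect of an uncertain false-positive rate can be illustrated with the standard correction for imperfect tests (often called the Rogan-Gladen estimator). This is a sketch under simplifying assumptions: sensitivity is set to 100 percent, sampling error is ignored, and the false-positive rate is varied between 1 and 3 percent to mirror the "10, 20 or 30" false positives per 1000. The survey outcomes of 25 and 118 positives are hypothetical.

```python
def corrected_prevalence(positives: int, sample_size: int,
                         false_positive_rate: float,
                         sensitivity: float = 1.0) -> float:
    """Estimate true prevalence from raw survey positives, correcting
    for false positives and (optionally) false negatives. The result is
    clamped to the range 0..1."""
    apparent = positives / sample_size
    specificity = 1.0 - false_positive_rate
    estimate = (apparent + specificity - 1.0) / (sensitivity + specificity - 1.0)
    return min(1.0, max(0.0, estimate))


SAMPLE_SIZE = 1000
FALSE_POSITIVE_RATES = (0.01, 0.02, 0.03)  # assumed plausible range

# Hypothetical outcomes: 25 positives (low prevalence), 118 (about 10%).
for positives in (25, 118):
    estimates = [
        corrected_prevalence(positives, SAMPLE_SIZE, fpr)
        for fpr in FALSE_POSITIVE_RATES
    ]
    formatted = ", ".join(f"{100 * e:.1f}%" for e in estimates)
    print(f"{positives} positives -> estimated prevalence: {formatted}")
```

With 25 positives the estimate ranges from zero to about one and a half percent depending on which false-positive rate is assumed, so the survey says almost nothing. With 118 positives the same range of assumptions only moves the estimate between roughly 9 and 11 percent.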

Based on what we discussed earlier, I would expect that around 2 percent of the Georgian population has been infected with Covid so far. This means that systematic randomised antibody testing until now made little sense, at least on a national level. But we might soon enter a stage where the numbers are large enough for that to become useful, and I hope it will be done.

Regarding the current situation here in Georgia, NCDC‘s Head Amiran Gamkrelidze said that maybe we will reach three or four thousand daily cases in mid November. Do you have any projections of your own, and do you think making such projections is a good thing?

As I said, the reproduction number is very close to one. This means that if we all reduce our exposure to the virus by around twenty percent on average, cases will stop rising. But we have to take into account that new cases get confirmed one or two weeks after they were exposed to the virus, so I do expect that cases will continue to rise for a bit.
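The "around twenty percent" follows from a simple relation: to bring a reproduction number R down to one, transmission has to fall by 1 - 1/R. The sketch below assumes that transmission scales in direct proportion to exposure, which is a simplification.

```python
def required_reduction(r: float) -> float:
    """Fractional reduction in transmission needed to bring the
    reproduction number r down to exactly one (0.0 if r is already <= 1)."""
    if r <= 0:
        raise ValueError("r must be positive")
    return max(0.0, 1.0 - 1.0 / r)


# Reproduction numbers in the range quoted in the interview.
for r in (1.1, 1.2, 1.25, 1.3, 1.4):
    print(f"R = {r}: reduce exposure by {required_reduction(r):.0%}")
```

For R between 1.2 and 1.3 the required reduction lies between about 17 and 23 percent, which is where the figure of twenty percent comes from.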

If we manage to get the reproduction number below one soon, the numbers mentioned by Gamkrelidze, three or four thousand at the peak, might well check out. However, as I said, I am not yet convinced that this is happening. In my opinion it is a very optimistic scenario if we don‘t do anything more. It could be that the NCDC is anticipating some additional measures after the elections, but at least they did not say so when they projected this peak in mid November. When I saw this, I was asking myself: „why do you think there will be a peak so soon?“ There is no theoretical reason why the epidemic should suddenly stop growing at this point, unless we do something about it, and reduce our exposure to the virus a little bit more.

So when do you anticipate the peak?

If the reproduction number stays at the level it is now, 1.2 or 1.3, every epidemiological model tells you that this is going to be a very long, drawn-out wave throughout winter, for months and months to come. However, if this goes on and on, I do think that people will at some point reduce their exposure a little bit so that the reproduction number will fall below one and the epidemic will peak. But there is no reason why this should necessarily happen in November already. This is only going to be the case if people react now, or if the government reacts now.


Update: Since this interview was held (28 October), daily confirmed cases have risen to 2901 as of 8 November. The government introduced an outdoor mask mandate on 4 November, and a night-time “restriction of movement” will apply in seven cities from tomorrow, 9 November (which in practice amounts to a curfew but goes by a different name, possibly for legal reasons).
