"Just before a major election, and scary Halloween, Doug Bernheim et al released a working paper (“The Effects of Large Group Meetings on the Spread of COVID-19: The Case of Trump Rallies,”
30 Oct.) that purports to show that 21 Trump rallies caused a total of
30,000 cases and 700 deaths. Those cases, they assert, were incremental,
meaning that they would not otherwise have occurred.
Trump rallies, says this study, were superspreader events. For all I
know, they might have been. The world is a complicated place. But we
also know with absolute certainty that, to arrive at its results – along with its definitive conclusion – the
authors had to do a tremendous superspreading of statistical manure.
The authors question studies that show that individual Trump rallies,
like the infamous indoor one in Tulsa, did not cause numerous
infections (e.g., D. M. Dave et al., “Risk Aversion, Offsetting Community Effects, and COVID-19: Evidence from an Indoor Political Rally,” NBER Working Paper 27522) because “COVID-19 outcomes are highly
variable” (p. 3). In other words, some rallies may have been
superspreader events while others probably weren’t.
Fair enough, but simply combining data from 21 rallies to condemn
them all is a bit like saying that a baseball team had a winning record
because it lost 19 games 1 to 0 but won its last one 20 to 0. Yes, it
scored more runs in total than its opponents, 20-19, and also scored
more runs on average than its opponents did, 1 to 0.95, but its record
was still 1 win and 19 losses. In short, “measuring the average treatment effect over multiple events” does NOT produce “more reliable results”; it produces, at best, an estimate of the average result.
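To make the arithmetic concrete, here is the same made-up baseball record in a few lines of Python. Totals and averages both favor the team, yet its record is 1-19:

```python
# The made-up record from the analogy above: 19 losses by 1-0, one 20-0 win.
runs_for = [0] * 19 + [20]
runs_against = [1] * 19 + [0]

total_for, total_against = sum(runs_for), sum(runs_against)
avg_for = total_for / len(runs_for)              # 1.00 runs per game
avg_against = total_against / len(runs_against)  # 0.95 runs per game
wins = sum(f > a for f, a in zip(runs_for, runs_against))

print(f"Run totals:   {total_for} to {total_against}")      # 20 to 19
print(f"Run averages: {avg_for:.2f} to {avg_against:.2f}")  # 1.00 to 0.95
print(f"Record:       {wins}-{len(runs_for) - wins}")       # 1-19
```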
Bernheim et al. also use data for 10 weeks after the rally (instead of 3, as in other studies) because “the effects of a superspreader event
may snowball over time.” Again, that may well be true but error terms
also snowball over time. In the limit, one hits the post hoc ergo propter hoc (after this, therefore because of this) fallacy.
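A quick simulation makes the snowballing point. This is my own sketch, assuming independent weekly shocks rather than whatever error process the authors use; tripling the window from 3 to 10 weeks nearly doubles the noise any "effect" must be teased out of:

```python
import random

def cumulative_noise_sd(weeks: int, trials: int = 10_000) -> float:
    """Standard deviation of summed weekly shocks over a post-event window."""
    totals = [sum(random.gauss(0, 1) for _ in range(weeks)) for _ in range(trials)]
    mean = sum(totals) / trials
    return (sum((t - mean) ** 2 for t in totals) / trials) ** 0.5

for weeks in (3, 10):
    print(f"{weeks}-week window: noise sd ~ {cumulative_noise_sd(weeks):.2f}")
# Prints roughly 1.7 and 3.2: the error grows with the square root of the window.
```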
For example, as the graph on South Dakota’s official state site shows, cases there went up in September and then surged in October.
Was that a snowball from Trump’s Mount Rushmore visit on 3 July? The
Sturgis motorcycle rally in early August? The opening of pheasant
season, which also brought in an influx of out-of-staters?
South Dakota is particularly interesting because it reminds us that
Covid-19 data is a statistical artifact of the real world, where people
and viruses travel across political boundaries with impunity. One person
can contract the virus in one country, state, or county but be counted
in another. Another person can spread the virus elsewhere and remain
asymptomatic and never appear in the official stats at all.
Bernheim et al. have county-level data, so the county becomes a
magical thing in their analysis even though we know reality is much more
complex. They know where the rallies were held but not where the rally
participants actually came from, or went to afterwards. Using the most
convenient data, a common technique in economics, is a lot like the
drunk looking for his keys under the streetlamp even though he lost them
in the dark alley far away.
Imagine, for example, a Trump rally held in a true blue county just
because the event space happened to be located in one and was convenient
to get to. It drew in no one from that county but lots of people from a
nearby red state. Cases then increased in this blue county 6, 7, 8
weeks later simply because it had not been hit previously. Bernheim et al. would blame the rally for that spread.
Without actually tracing transmission from person to person (which, if anyone could actually do it, would have stopped the virus in March), how can one tease the signal out of all the noise in a world full of possible explanations for the spread of a virus that sometimes causes disease? Well, through statistics, of course! And assumptions. Another assumption (they do cite a single CNN
article) is that compliance with “social distancing” rules was
uniformly low at the Trump rallies examined, which all took place
between 20 June and 22 September. Yet the authors insist the outcomes
were “highly variable,” almost as if social distancing compliance is not
that important to spread. (This is one reason I called for more controlled experiments way back in April.)
In fact, despite their confident assertion that Trump rallies killed
700 people, the authors admit that “the dynamics of COVID-19 are
complex” (p. 5) and vary over time and place. “Substantial cross-county
heterogeneity with respect to the manner in which the dynamic process
has evolved over time” means that “familiar econometric specifications
with time and county fixed effects are not able to accommodate the
first-order patterns in the data. Specifications that include
interactions between time-fixed effects and a handful of county
characteristics perform only marginally better” (p. 6).
Translation: the data did not show what the authors wanted them to show, and, knowing that a lot of people will be checking their results, they did not want to employ the usual econometric cheat codes.
Ergo, “an entirely different approach is required” (p. 6). That
approach is called “matched sampling.” It basically tries to replicate
an actual experiment with control and treatment groups by identifying
counties “‘similar’” (p. 6) to the ones where rallies were held and
comparing their Covid cases and deaths to those of rally counties. The
concept of similarity of counties is so fraught that the authors
themselves actually place quotation marks around it (hence the double
quotation marks above).
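As a rough sketch of the idea (my own minimal reconstruction, not the authors' code; every county name and number below is invented): pair each rally county with the non-rally county whose pre-event trajectory is closest, then call the post-event difference the treatment effect.

```python
def distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two pre-event case trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Invented pre-event weekly cases per 100,000:
rally_counties = {"Rally County": [10.0, 12.0, 15.0]}
candidate_controls = {"Control X": [11.0, 12.0, 14.0],
                      "Control Y": [40.0, 35.0, 30.0]}
# Invented post-event cases per 100,000:
post_event = {"Rally County": 60, "Control X": 25, "Control Y": 28}

for name, trajectory in rally_counties.items():
    match = min(candidate_controls,
                key=lambda c: distance(trajectory, candidate_controls[c]))
    print(f"{name} matched to {match}; "
          f"estimated effect = {post_event[name] - post_event[match]}")
```

Everything then rides on whether the matched county is "similar" in the dimensions that actually matter.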
Obviously, the comparison counties cannot be identical because Trump
did not hold rallies in them and he most assuredly was not holding
rallies in random counties. But maybe the comparison counties are
similar enough? In what ways? What characteristics are most important to
compare? The authors assume “the most important dimension of
comparability is the pre-event trajectory of COVID-19 cases” (p. 7) but
of course “trajectories” stemming from different causes may land in very
different places. The trajectory of a baseball and a missile, for
example, may be similar at first but one lands in the catcher’s mitt and
the other goes into orbit.
The authors find that cases increased in rally counties compared to
their matched counties. The 95 percent confidence interval is huge (150 to 500 cases per 100,000) but greater than zero. The size of the effect
diminishes as more controls, like “total population,” “percent less than
college educated,” and “Trump vote share in 2016” (p. 10) are added,
which usually happens in statistical studies with low predictive power
and the addition of spurious controls (like total population where
population density would make more sense).
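Here is a hedged illustration of why that shrinkage is a warning sign, using synthetic data (nothing here comes from the paper): when the "treatment" is correlated with an omitted characteristic that actually drives the outcome, the estimated effect looks large until the characteristic is controlled for.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000
characteristic = rng.normal(size=n)   # stand-in for a county trait
treated = (characteristic + rng.normal(size=n) > 0).astype(float)
outcome = 0.5 * characteristic + rng.normal(size=n)  # driven by the trait, not treatment

X_bare = np.column_stack([np.ones(n), treated])
X_full = np.column_stack([np.ones(n), treated, characteristic])
b_bare, *_ = np.linalg.lstsq(X_bare, outcome, rcond=None)
b_full, *_ = np.linalg.lstsq(X_full, outcome, rcond=None)

print(f"'Effect' without the control: {b_bare[1]:.2f}")  # spuriously large
print(f"'Effect' with the control:    {b_full[1]:.2f}")  # shrinks toward zero
```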
The authors’ death estimates are based on simple multiplication of the known death rate by the number of incremental cases presumably
caused by the rally, but they admit that “the same death rate might not
apply to baseline cases and incremental cases” and that “total deaths
might not rise at all” (p. 11). They could not, however, find
statistical evidence of changes in average “death rates,” by which they
must have meant CFR (case fatality rates). But who knows, because like
many econometricians they do not show their work.
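Their headline numbers at least let us back out the arithmetic they must have run; this uses only the paper's 30,000 cases and 700 deaths, nothing more:

```python
incremental_cases = 30_000   # the paper's headline case estimate
deaths = 700                 # the paper's headline death estimate
print(f"Implied fatality rate: {deaths / incremental_cases:.2%}")  # ~2.33%
# Reversing it: deaths = incremental_cases * assumed_rate. Every error in the
# case estimate, plus the untested assumption that the same rate applies to
# incremental cases, flows straight into the death count.
```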
Cognizant that those statistical analyses may strike some as
unconvincing, the authors then take a close look at the number of tests
and their positive rates in two Wisconsin counties where Trump held
rallies “because testing data are readily available for Wisconsin” (p.
13). The number of tests did not go up immediately post-rally but
positivity rates did in the two rally counties, although not “in
aggregate” for the rest of the state. I suspect the authors, who believe
themselves to be econometric geniuses, chose to report the data for the
rest of the state in aggregate because they cannot explain why
positivity rates increased in some counties far from the rally sites but
declined in nearby counties.
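Positivity is just positives divided by tests, so the aggregation choice does all the work. A hypothetical sketch (invented counts, not Wisconsin's actual data) of how aggregation hides exactly that county-level variation:

```python
# Invented post-rally test counts for illustration only:
reports = {
    "rally county 1": {"tests": 1_000, "positives": 80},   # 8.0%
    "rally county 2": {"tests": 1_200, "positives": 96},   # 8.0%
    "far county":     {"tests": 2_000, "positives": 200},  # 10.0%, far from rallies
    "near county":    {"tests": 2_000, "positives": 60},   # 3.0%, next door
}
for name, r in reports.items():
    print(f"{name}: positivity = {r['positives'] / r['tests']:.1%}")

rest = [r for n, r in reports.items() if "rally" not in n]
agg = sum(r["positives"] for r in rest) / sum(r["tests"] for r in rest)
print(f"rest of state, in aggregate: {agg:.1%}")  # 6.5%: the variation vanishes
```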
In short, contra the paper’s conclusion, the authors’ analysis suggests,
rather than “strongly supports” (p. 13), that large group gatherings
can spread Covid-19 under certain unknown conditions. They present no
data on social distancing guideline compliance at the rallies, so their
claim that large group gatherings where “the use of masks and social
distancing is low” are “particularly” (p. 13) likely to encourage spread
is essentially obiter dictum.
Finally, the assertion that the communities where “Trump rallies took
place paid a high price in terms of disease and death” (p. 13) is
normative because it imposes their own value judgments about cases,
which not everyone believes are negative. Death is certainly negative
but by their own admission they do not know how many of those deaths
occurred because of direct attendance at a rally, and they strongly imply that, in fact, there was at least one intervening infection, and perhaps several, before the virus struck a vulnerable or elderly person who succumbed.
A lawyer or an insurance adjuster would immediately invoke the
concept of contributory negligence here. Just because A caused B, which
combined with other events to cause C, and others to cause D, doesn’t
mean that A is wholly, or even necessarily partly, or even at all,
responsible for D. Heck, why not blame Patient Zero for everything and leave it at that?
When it comes to Covid-19, it is not easily spread by asymptomatic people. Still, if you don’t want to catch it, you need to take precautions. Don’t blame Trump, or anyone else except the person who actually transmitted it to you, if you do. Even
then it might not be the transmitter’s fault in the legal or insurance
sense because like the tango, it takes two to get Covid.
Discerning cause and effect in any area of life is a difficult
proposition. Doing so for the transmission trajectory of a widespread
virus, within a radically heterogeneous population over large land
masses where millions of variables are at work, is harder still. One
might even say: impossible. It surely is not enough to throw together a paper that is 95% statistical mumbo-jumbo, capped with a firm conclusion bearing on an earth-shaking election only three days away. Scientific integrity requires much more than coming to a politically correct result summed up with a made-for-media soundbite.