We noted with great interest the study by Loeb et al., published in the December 2022 issue of Annals of Internal Medicine, as well as a helpful commentary and various reactions to the findings. In this commentary, we discuss the notion of a “non-inferiority” trial and address several concerns.
What are non-inferiority trials?
When an effective treatment is expensive or inconvenient, we might consider a reasonable alternative that is cheaper and/or more convenient. How much reduction in effectiveness is acceptable? Research studies can be designed as “non-inferiority” trials that, critically, set a numerical threshold for how much reduced effectiveness one is willing to tolerate in exchange for the known benefits of lower cost and/or greater convenience. If the study can rule out the possibility of an effectiveness reduction as large as the numerical threshold, then the cheaper treatment can be deemed “non-inferior.” Ideally, the threshold is not based on one research team’s opinion, but rather on empirical data from those who would experience both the reduced benefits and the better cost/harm profile. Even if data-based, the threshold remains a middle-of-the-road judgment with no objectively correct value, as different people would set it higher or lower.
In this study by Loeb et al, the chosen threshold was a doubling of risk when wearing a medical mask instead of an N95 respirator (e.g., 10% risk versus 5% risk); this was based on a study on SARS among critical care nurses, as well as expert input. In a commentary accompanying the 2022 article, Chou called this a “generous noninferiority threshold which may be unacceptable to many health workers.” The study authors stated that even if people disagree with the threshold, the article provides the numerical range of possible values (up to a 4.9% increase in risk if you use medical masks), and “readers and policy makers can decide for themselves about this,” as reported in Medscape.
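The decision rule behind a non-inferiority trial can be sketched in a few lines of Python. This is a minimal illustration, not the trial's actual statistical analysis: the function name `is_non_inferior` and the example upper bounds are hypothetical, and only the default margin of 2.0 (a doubling of risk) is taken from the study as described here.

```python
def is_non_inferior(ci_upper: float, margin: float = 2.0) -> bool:
    """Return True when the upper confidence bound of the risk (or hazard)
    ratio, cheaper option versus standard option, lies below the margin.

    ci_upper: upper limit of the confidence interval for the ratio.
    margin:   the pre-specified non-inferiority threshold (2.0 = doubling).
    """
    if ci_upper <= 0 or margin <= 1:
        raise ValueError("expect a positive ratio and a margin above 1")
    return ci_upper < margin


# Hypothetical upper bounds, for illustration only (not trial data).
for upper in (1.5, 2.4):
    verdict = "non-inferior" if is_non_inferior(upper) else "non-inferiority not shown"
    print(f"upper bound {upper}: {verdict}")
```

Note that the same interval can pass or fail depending on the margin: a hypothetical upper bound of 1.5 clears a margin of 2.0 but would not clear a stricter margin of 1.25, which is why the choice of threshold draws so much scrutiny.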
Was this study necessary?
Some have asserted that the study was pointless, on logical and scientific grounds as well as on the basis of prior empirical work. Scientifically, aerosol transmission of SARS-CoV-2 is well-documented, and while N95 respirators are intentionally designed to prevent aerosol transmission (i.e., respiratory protection), medical masks are not. Moreover, N95 respirators are intended to fit the wearer’s face closely in order to eliminate gaps where air could flow around the edges, while medical masks are not fitted. Other forms of transmission (e.g., contact, droplet, non-hospital) would not be expected to differ between respirators and masks. These observations suggest that we already know that N95 respirators are better at preventing transmission.
Questioning superiority, however, was not the point of the study. As described above, the study investigated non-inferiority for any type of transmission (i.e., how much worse is the cheaper/easier option?).
Empirically, five systematic reviews on this or related topics were published in 2020 and 2021. While the balance of evidence implies the superiority of N95 respirators, the notion of non-inferiority was not addressed in any of the five reviews. Three reviews (Collins et al; Yin et al; Chou et al) found clear evidence of the superiority of N95 respirators, but the other two reviews (Jefferson et al; Bartoszko et al) concluded that when the outcome was laboratory-confirmed infection, there is likely little important difference between N95 respirators and medical masks. The reviews used different inclusion criteria, investigated different infections, and analyzed different outcomes.
Was this study biased?
As the key mechanistic design difference between medical masks and N95 respirators involves the prevention of aerosol transmission, other types of transmission (e.g., contact transmission in the hospital, large droplet transmission in the hospital, non-hospital transmission) would not be expected to differ between the two groups. This means that an analysis of any transmission could dilute any real difference in aerosol transmission. Some commenters on the study (Greenhalgh et al; Wen) noted that two-thirds of the cases occurred in Egypt and Pakistan during an omicron surge with high community spread.
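The dilution argument can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that infection risks from different routes are small and simply add up, and that the mask type changes only the aerosol route. The function name `overall_risk_ratio`, the aerosol shares, and the threefold aerosol risk ratio are all hypothetical and are not estimates from the trial.

```python
def overall_risk_ratio(aerosol_share: float, aerosol_ratio: float) -> float:
    """Overall risk ratio (medical mask vs. N95) when only the aerosol
    route differs between groups.

    aerosol_share: fraction of infections in the N95 group that arise from
                   in-hospital aerosol transmission (0 to 1).
    aerosol_ratio: risk ratio for that aerosol route alone.

    With additive risks, overall ratio = 1 + share * (aerosol_ratio - 1).
    """
    if not 0.0 <= aerosol_share <= 1.0:
        raise ValueError("aerosol_share must be between 0 and 1")
    return 1.0 + aerosol_share * (aerosol_ratio - 1.0)


# Hypothetical scenario: medical masks triple the aerosol-route risk.
for share in (1.0, 0.5, 0.1):
    ratio = overall_risk_ratio(share, aerosol_ratio=3.0)
    print(f"aerosol share {share:.0%}: overall risk ratio {ratio:.2f}")
```

Under these assumptions, a threefold difference in aerosol risk appears as a ratio of only about 1.2 overall when aerosol exposure in the hospital accounts for one-tenth of infections, so a large route-specific effect can look small in an analysis of any transmission.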
To address non-hospital transmission, the authors conducted a post hoc subgroup analysis showing nearly identical hazard ratios for those who did and did not have household or community exposure (1.08 and 1.06, respectively). Regarding in-hospital usage, while some commenters claimed that healthcare workers may have only used their masks “intermittently” (Greenhalgh et al; Fisman and MacIntyre), thereby biasing the study toward finding no difference in overall transmission rates, the authors responded that all participants were “expected” to use the assigned mask whenever in the hospital, with key exceptions (e.g., all participants used N95 respirators during aerosol-generating procedures). Unfortunately, the authors could not conduct an analysis of aerosol transmission specifically.
Commenters raised other bias concerns, including variation across countries (CIDRAP News; Medscape; Wen; Greenhalgh et al; Fisman and MacIntyre), the lack of an unmasked control group (Brosseau; CIDRAP News), the reliance on self-reporting of community exposure (CIDRAP News), better adherence in the medical mask group (91% versus 81%) (Wen), the inclusion of some workers who never cared for COVID patients (thereby diluting the effect) (CIDRAP News), the senior author’s public skepticism about aerosol transmission as well as the possible harms of N95 respirators (Medscape), and statistical power (Medscape; Fisman and MacIntyre). While many of these concerns may be valid, the commenters themselves could also be biased. Several seem to think the study concluded that medical masks and N95 respirators are clearly equivalent with respect to COVID transmission risk, which misunderstands the goal of a non-inferiority trial. One commenter receives funding from 3M, a manufacturer of N95 respirators (as well as medical masks, as pointed out by Osterholm).
The study concluded that its data “rule out a doubling in hazard.” Other technically accurate phrasings could have been used, such as (1) “the finding of noninferiority in this trial was consistent with up to a relative 70% increased risk” (as suggested by one commenter) or (2) “medical masks were noninferior” (as suggested by another commenter). The first seems more pro-N95 than the authors’ conclusion, whereas the second seems more anti-N95 because it omits the “doubling.” We appreciated the authors’ wording, because it aligns with the non-inferiority design and the interpretation of the confidence interval, and also forces readers to consider for themselves whether a doubling is a reasonable threshold.
In a December 20, 2022, comment, the journal editors stated that the data were inconclusive: “We chose to publish the report because even inconclusive trials contribute evidence towards answering important questions.” We agree that inconclusive trials should be published, but this particular trial was clearly “conclusive,” given its design and data, as well as the authors’ “Conclusion” section of their abstract.
In this commentary, we have attempted to shed light on many underlying issues. The strong reactions to this study reveal the importance of the topic, the vehemence of prior beliefs, the tension between mechanistic explanations and empirical demonstrations, and the difficulty of interpreting a pragmatic trial of a complex intervention.