For those who see the ban on smoking in enclosed public spaces as a small start towards banning smoking anywhere, the key is any evidence that smoking in private homes and cars impinges on the rights of powerless third parties. So the news that passive smoking increases the risk of stillbirths by a whopping 13% and that of birth malformations by 23% was reported widely.
The BBC news site quoted the press release freely:
"Fathers-to-be should stop smoking to protect their unborn child from the risk of stillbirth or birth defects, scientists say. They looked at 19 previous studies from around the world. A UK expert said it was 'vital' women knew the risks of second-hand smoke."
Vital that women knew the risks?
So what are the risks? The paper (abstract) was not primary research, but combined data from multiple studies, which sounds good. But most of the studies were either of poor quality or did not address the desired health outcomes. In the end, it came down to 19 studies with four separate outcome measures. Two of them, the risk of miscarriage and the risk of perinatal or neonatal death, came out negative: no increased risk. The other two came out with the 13% (4 studies) and 23% (7 studies) increased risk.
So, the news reports could have started with headlines of "Passive Smoking Does Not Cause Miscarriage" or "New Study Produces Contradictory Results", or even "We're Trying Really Hard But We Still Can't Prove Passive Smoking is Particularly Dangerous". Although I can't imagine researchers from the UK Tobacco Control research Network policy advocacy group pushing that last one!
Statistical Significance
When researchers attribute risks to particular behaviours, they calculate not only the best estimate of the increased risk (eg an odds ratio of 1.13, or an increase of 13%), but also the high and low limits within which they are confident that the 'true' risk lies. Any measurement will have uncertainties, and to be confident that a risk is real it must be repeatable: that is, doing the whole study again would produce the same result.
Obviously, you can't wait until the next study before you publish, so you use the mathematics of chance to see what the results might have been if things had gone slightly differently during the study. The outcome, then, is not just a single 'best' figure but a 'confidence interval': a range that would contain the 'true' result 95% of the time (and miss it 5% of the time).
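What "95% of the time" means can be seen by running an imaginary study over and over. The sketch below is purely illustrative and is not the paper's method: it assumes a made-up measurement with a known spread, and the function name and all its parameters are invented for the example.

```python
import random
from statistics import NormalDist

def coverage(true_value=0.0, sigma=1.0, n=50, studies=10_000, level=0.95, seed=1):
    """Fraction of simulated studies whose interval contains the true value.

    Assumption: each study averages n normally distributed measurements
    with a known standard deviation sigma.
    """
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(studies):
        sample_mean = sum(rng.gauss(true_value, sigma) for _ in range(n)) / n
        # The interval is sample_mean +/- half_width; does it cover the truth?
        if abs(sample_mean - true_value) <= half_width:
            hits += 1
    return hits / studies

print(f"{coverage():.3f}")  # close to 0.95
```

Roughly one simulated study in twenty produces an interval that misses the true value altogether, even though nothing was done wrong in any of them.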
Confidence Tricks
The study found that two of the outcomes had confidence intervals that started below an odds ratio of 1. That is, there is a real chance that there was no risk at all, even though the 'best' figure was higher. So the results are dismissed as not significant.
What of the other two? Stillbirth came out as 1.01 - 1.26 (middle value 1.13), with malformations as 1.09 - 1.38 (middle value 1.23). So, even without a further look, stillbirths could be increased by perhaps 1%, or as much as 26%. We can't tell which, but we can tell that presenting 13% as the figure is misleading.
But it is worse than that. The researchers looked at several outcomes and picked out for publicity the ones with the wanted results. Testing several outcomes makes it far more likely that at least one of them will look significant by chance alone. As an example, let's say that you roll four dice. The chance that any particular one of them will come up a six is 1/6 (or 17%), but the chance that at least one of the four will come up a six is much greater, at 52%.
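The dice arithmetic is easy to check, and the same formula applies to statistical tests. A minimal sketch, assuming the events are independent of one another (true for dice, only an approximation for outcomes measured in overlapping studies); the function name is made up for the example.

```python
def chance_of_at_least_one(p, n):
    """Chance that an event of probability p happens at least once in n independent tries."""
    return 1 - (1 - p) ** n

# Four dice: at least one six.
print(round(chance_of_at_least_one(1 / 6, 4), 3))   # 0.518

# Four tests, each with a 5% chance of a false alarm: at least one false alarm.
print(round(chance_of_at_least_one(0.05, 4), 3))    # 0.185
```

So with four outcome measures, the chance of at least one spurious 'significant' result is nearer 19% than 5%.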
For the researchers to be confident in an overall result of, say, 'passive smoking causes harm to unborn babies', an allowance must be made (eg the Bonferroni correction) for these multiple comparisons, to get the overall confidence back up to 95%. For four tests, as here, the intervals should be widened by a factor of around 1.27, so they become:
Stillbirth relative risk: 0.98 - 1.30
Congenital malformation relative risk: 0.91 - 1.40
Note that both now include a relative risk of 1 (ie no increased risk) in the range. On this test, none of the outcomes is significant.
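For anyone wanting to see where a factor of around 1.27 comes from, here is a rough sketch. It assumes a normal approximation and an interval that is symmetric on the log scale, as is usual for ratios; the exact method used for the figures above is not stated, and the function names are invented for the example. It is applied to the stillbirth interval of 1.01 - 1.26 quoted earlier.

```python
from math import exp, log
from statistics import NormalDist

def bonferroni_factor(tests, level=0.95):
    """How much wider each interval must be so that all `tests` intervals hold jointly."""
    alpha = 1 - level
    z_single = NormalDist().inv_cdf(1 - alpha / 2)              # about 1.96
    z_adjusted = NormalDist().inv_cdf(1 - alpha / (2 * tests))  # alpha shared between tests
    return z_adjusted / z_single

def widen(low, high, factor):
    """Stretch a ratio's confidence interval about its centre on the log scale."""
    centre = (log(low) + log(high)) / 2
    half_width = (log(high) - log(low)) / 2 * factor
    return exp(centre - half_width), exp(centre + half_width)

factor = bonferroni_factor(4)
print(f"{factor:.2f}")               # 1.27

low, high = widen(1.01, 1.26, factor)
print(f"{low:.2f} - {high:.2f}")     # 0.98 - 1.30
```

The Bonferroni correction is on the cautious side; other adjustments exist, but the principle of widening the intervals when several outcomes are tested is the same.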
Two Bites at the Cherry
The upshot is this. If you use statistical arguments to judge outcomes, you should know that the more measurements you make the more likely you are to come up with spurious results, so you should make allowances for it.
The headline should have been, at best, "Our Research Was Too Underpowered to be Sure of Anything, but it is Worth Asking for More Funding".
Unlikely to be reported in the papers, but honest.