
If at first you don’t succeed . . . statistical manipulation might help

Anti-fluoride campaigners are promoting yet another new study they claim shows community water fluoridation lowers children’s IQ. For example, the Fluoride Free NZ (FFNZ) press release Ground Breaking Study – Fluoridated Water Lowers Kid’s IQs claims the study confirms “our worst fears, linking exposure to fluoridated water during pregnancy to lowered IQ for the developing child.”

Yet the study itself shows no significant difference in children whose mothers lived in fluoridated or unfluoridated areas during pregnancy. Here is the relevant data from Table 1 in the paper:

Mean IQ of children whose mothers drank fluoridated or unfluoridated water during pregnancy (SD = 11.9–14.7)

                 Nonfluoridated   Fluoridated
All children         108.07          108.21
Boys                 106.31          104.78
Girls                109.86          111.47

The differences between fluoridated and nonfluoridated are not statistically significant.
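
For readers who like to check such claims, here is a minimal sketch of the relevant test – Welch’s t-test computed from summary statistics alone. The SD of 13 and the equal group sizes of 200 are my assumptions for illustration (the paper’s Table 1 gives the real counts), but no plausible choice changes the conclusion:

```python
# A minimal check of the "no significant difference" claim, using only
# the summary statistics above. Group sizes and SD are assumed.
from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(
    mean1=108.07, std1=13.0, nobs1=200,  # nonfluoridated (n assumed)
    mean2=108.21, std2=13.0, nobs2=200,  # fluoridated (n assumed)
    equal_var=False,                     # Welch's t-test
)
print(f"t = {t:.2f}, p = {p:.2f}")       # t = -0.11, p = 0.91
```

With means this close and SDs of 12–15, the groups would need to be implausibly large before the difference approached significance.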

The paper has just been published and is:

Green, R., Lanphear, B., Hornung, R., Flora, D., Martinez-Mier, E. A., Neufeld, R., … Till, C. (2019). Association Between Maternal Fluoride Exposure During Pregnancy and IQ Scores in Offspring in Canada. JAMA Pediatrics, 1–9.

Surprisingly, the authors do not discuss the data in the table above. It’s as if the data didn’t exist, despite being given in their Table 1. I find this strange because their discussion is aimed at finding a difference – specifically, a decrease in child IQ due to fluoridation – and surely these mean values must be relevant. Were the authors embarrassed by these figures because they did not show the effect they wanted?

So how did they manage to find an effect they could attribute to fluoride, or fluoridation, despite the mean values above? They basically resort to statistical manipulation – and this has opened up an intense controversy about the paper.

An unprecedented “Editor’s Note”

The journal editor, Dimitri A. Christakis, published a note alongside the paper (see Decision to Publish Study on Maternal Fluoride Exposure During Pregnancy), together with a piece in the Opinion section by David C. Bellinger (see Is Fluoride Potentially Neurotoxic?). This opinion piece is described as an editorial although Bellinger is not an editor of the journal or on the Editorial Board.

This is, in my experience, completely unprecedented. Editors don’t comment on the quality of papers or the refereeing process, and I can only conclude that within the journal’s editorial board, and among those who reviewed the paper, there were sharp differences about its quality and whether it should be published. While an editorial may sometimes bring attention to an article, in this case it is likely that Bellinger was one of the paper’s reviewers, is expressing his viewpoint on it, and supports its publication.

Christakis writes “The decision to publish this article was not easy.” He goes on to imply the journal supports publication “regardless of how contentious the results may be.”  But surely there is no need to defend a good quality paper in this way just because the results may be “contentious.”

Interestingly, FFNZ interpreted the publication of the Editor’s Note as making the publication of the paper more “impactful,” not realising that the Note is probably not positive for the paper, as it reveals controversy over the paper’s quality and whether it was worthy of publication. FFNZ also chose to describe Bellinger’s comments in his opinion piece as representing the views of the journal’s editors. However, it would be inappropriate for an editor to make such comments.

I think Bellinger has his own biases and preferences which lead him to advocate for papers like this. I commented on Bellinger’s role in the review of another paper promoting an anti-fluoride perspective in my articles Poor peer-review – a case study and Poor peer review – and its consequences.

A large amount of controversy

I am surprised at the degree of controversy around this paper – and its loudness. The fact that it started on the same day the paper was made public reveals that various actors have had access to the paper and have been debating it for some time. This could have been stoked by the unorthodox statistical analysis used and contradictions in the findings.

But it appears this controversy has gone far wider than the journal editors and reviewers of the paper, judging by the immediate reactions from anti-fluoride organisations like the Fluoride Action Network (see BREAKING: GOVERNMENT-FUNDED STUDY LINKS FLUORIDATED WATER DURING PREGNANCY TO LOWER IQS IN OFFSPRING), some leading newspapers, professional bodies (see AADR Comment on Effect of Fluoride Exposure on Children’s IQ Study) and the UK Science Media Centre, which published a collection of expert reactions (see expert reaction to study looking at maternal exposure to fluoride and IQ in children).

This suggests to me a large degree of lobbying – not only from activists and anti-fluoride scientists or reviewers, but also from the authors and their institutions. I am not really surprised, as I have often seen how politics, activism, commercial interests, and scientific ambitions coordinate in these situations.

How to discover an effect from a nonsignificant difference

So how do we get from the data in the table above – showing no statistically significant difference between fluoridated and unfluoridated areas – to a situation where the authors (who don’t refer to that data in their discussion) say:

“higher levels of fluoride exposure during pregnancy were associated with lower IQ scores in children measured at age 3 to 4 years. These findings were observed at fluoride levels typically found in white North American women. This indicates the possible need to reduce fluoride intake during pregnancy.”

In their press releases and statements to media, where they are not constrained by a journal’s need for evidence and objectivity, they come out even more vocally against community water fluoridation.

Well, it appears to me, by statistical manipulation. One of the Science Media Centre experts referred to above, Prof Thom Baguley, wrote:

“First, the claim that maternal fluoride exposure is associated with a decrease in IQ of children is false. This finding was non-significant (but not reported in the abstract). They did observe a decrease for male children and a slight increase in IQ (but non-significant) for girls. This is an example of subgroup analysis – which is frowned upon in these kinds of studies because it is nearly always possible to identify some subgroup which shows an effect if the data are noisy. Here the data are very noisy.”

It appears the authors found a significant effect of child sex on IQ, so decided to do a subgroup analysis – of boys and girls – and this produced a significant association of IQ with maternal urinary fluoride for the boys. This resort to subgroup analysis may have, in itself, produced a misleading significant relationship.
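
A small simulation illustrates Baguley’s point. This is a sketch under assumed distributions, not the paper’s data: even in a world where fluoride has no effect at all, testing boys and girls separately nearly doubles the chance of a spurious “significant” result:

```python
# Sketch: the cost of subgroup analysis under the null hypothesis.
# Distributions are assumed for illustration; this is not the study data.
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(0)
n, sims, hits = 512, 2000, 0
for _ in range(sims):
    fluoride = rng.lognormal(mean=-0.7, sigma=0.5, size=n)  # urinary F proxy
    sex = rng.integers(0, 2, size=n)                        # 0 = girl, 1 = boy
    iq = rng.normal(108, 13, size=n)                        # no fluoride effect at all
    p_boys = linregress(fluoride[sex == 1], iq[sex == 1]).pvalue
    p_girls = linregress(fluoride[sex == 0], iq[sex == 0]).pvalue
    hits += (p_boys < 0.05) or (p_girls < 0.05)
print(f"Chance of at least one 'significant' subgroup: {hits / sims:.2f}")  # ~0.10
```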

Adam Krutchen, a biostatistics PhD student at the University of Pittsburgh, also illustrates how the relationship with child sex has confused the analysis. He comments on the data he managed to extract from the paper’s Figure 3:

“There were drastic sex-specific IQ differences in the children, which is of course strange. We shouldn’t expect anything like that to happen. This difference is very significant. There’s also some outlier extremely low IQ values among the male children.”

He is saying that his regression analysis showed a strong effect of child sex on IQ, quite irrespective of maternal urinary F or drinking water F. However, once that effect of child sex is taken into account, he found no relationship of child IQ with maternal urinary F. He says:

“with such a significant effect of sex on IQ, does fluoride have any remaining relationship? The answer is a resounding no in the digitized data.”

It appears that including child sex in the regression analysis produces no significant relationship of fluoride to child IQ once the strong relationship of IQ with child sex is taken into account. But when the data are divided into subgroups and analysed separately (a technique statisticians “frown on” “because it is nearly always possible to identify some subgroup which shows an effect if the data are noisy”), a significant relationship of IQ with maternal urinary fluoride can be produced for boys (but not girls).
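
For concreteness, this is roughly what “taking child sex into account” means in practice – a single regression with sex as a covariate rather than two separate subgroup regressions. The file name and column names below are my hypothetical stand-ins for digitized Figure 3 data, not the authors’ code:

```python
# Sketch: one model with child sex as a covariate, instead of subgroups.
# "digitized_figure3.csv" and its columns (iq, urinary_f, sex) are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("digitized_figure3.csv")
model = smf.ols("iq ~ urinary_f + C(sex)", data=df).fit()
print(model.summary())  # the urinary_f coefficient is the fluoride "effect"
                        # after adjusting for the sex difference in IQ
```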

Interestingly, a second part of the Green et al. (2019) study investigated a relationship of child IQ with estimated fluoride intake by the pregnant mothers (the estimation method was not verified, so may be questionable). No sex difference appeared in that data set.

How strong are the reported relationships?

Perhaps it is not necessary to go any further. Perhaps the data for mean IQ in the table above are sufficient to show there is no effect of fluoride on IQ. Or perhaps the critique of the subgroup analysis is sufficient to make the reported conclusions suspect.

However, perhaps a comment on the weakness of the relationships reported by Green et al. is useful – if only because I took the trouble to digitally extract the data from the figures in the paper and do my own regression analyses on the data.

Of course, digital extraction does not recover all the data – if only because overlapping points merge. In this case, I managed to extract 410 data points from Figure 3A, which showed the relationship of child IQ with maternal urinary F concentrations during pregnancy. This is quite a bit smaller than the 512 data pairs the authors reported in their Table 1 and suggests to me they had not plotted all their data. However, the values for means and coefficients obtained by my own regression were very similar to those reported by Green et al. (2019).

The authors reported a significant (p = 0.02) negative relationship of boys’ IQ with maternal urinary F. They do not discuss how strong that relationship is – although the wide scatter of data points in the figures suggests it is not strong. My regression analysis showed the relationship explained only 1.3% of the variance in IQ. I do not think that is worth much. With such low explanatory power, I think the authors overstate their conclusions.
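
To see just how weak that is, a back-of-the-envelope calculation from the R² value alone:

```python
# What "explains only 1.3% of the variance" means, from R^2 alone.
import math

r2 = 0.013
r = math.sqrt(r2)                # implied correlation coefficient
residual_sd = math.sqrt(1 - r2)  # residual SD as a fraction of the raw SD
print(f"r = {r:.2f}")            # r = 0.11, a very weak correlation
print(f"residual SD = {residual_sd:.1%} of the raw SD")  # 99.3% - knowing the
                                 # fluoride level removes almost none of the scatter
```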

I think this is another case of placing far too much reliance on p-values and ignoring other results of the statistical analysis. I discussed this in a previous article (see Anti-fluoride activists misrepresent a new kidney/liver study).

Conclusions

I think this paper has been overblown. It has problems with its statistical analyses as well as other limitations referred to in the paper. I do not think it should have been published in its present form – surely reviewers should have picked up on these problems. I can only conclude that intense arguments occurred within the journal’s editorial board and amongst reviewers – and most probably more widely amongst institutes and activist groups. In the end, the publication decision was most likely political.


“Crusade Against Multiple Regression Analysis” – don’t throw baby out with bathwater


Richard Nisbett is a professor of psychology and co-director of the Culture and Cognition Program at the University of Michigan.

Edge has an interesting talk about the problems of research relying on regression analyses (see The Crusade Against Multiple Regression Analysis). Unfortunately, it has some important faults – not the least of which is his use of the term “multiple regression” when he is really complaining about simple regression analysis.

Professor Richard Nisbett quite rightly points out that many studies using regression analysis are worthless, even misleading – even, as he suggests, “quite damaging.” Damaging because these studies get reported in the popular media and their faulty conclusions are “taken as gospel” by many readers. Nisbett says:

“I hope that in the future, if I’m successful in communicating with people about this, there’ll be a kind of upfront warning in New York Times articles: These data are based on multiple regression analysis. This would be a sign that you probably shouldn’t read the article because you’re quite likely to get non-information or misinformation.

Knowing that the technique is terribly flawed and asking yourself—which you shouldn’t have to do because you ought to be told by the journalist what generated these data—if the study is subject to self-selection effects or confounded variable effects, and if it is, you should probably ignore them. What I most want to do is blow the whistle on this and stop scientists from doing this kind of thing. As I say, many of the very best social psychologists don’t understand this point.

I want to do an article that will describe, similar to the way I have done now, what the problem is. I’m going to work with a statistician who can do all the formal stuff, and hopefully we’ll be published in some outlet that will reach scientists in all fields and also act as a kind of “buyer beware” for the general reader, so they understand when a technique is deeply flawed and can be alert to the possibility that the study they’re reading has the self-selection or confounded-variable problems that are characteristic of multiple regression.”

I really hope he does work with a statistician who can explain to him the mistakes he is making. The fact that he raises the issue of “confounded-variable problems” shows he is really talking about simple regression analysis. This problem can be reduced by including additional explanatory variables – potential confounders – in the analysis: by the use of multiple regression analysis, the very thing he makes central to his attack!
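
Here is a minimal simulation sketch of that point (all numbers invented for illustration): a confounder that drives both the exposure and the outcome produces a strong “effect” in a simple regression, and adding the confounder as a second regressor removes it:

```python
# Sketch: a simulated confounder fools simple regression, not multiple regression.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1000
confounder = rng.normal(0, 1, n)                   # e.g. socioeconomic status
exposure = 0.8 * confounder + rng.normal(0, 1, n)
outcome = 0.8 * confounder + rng.normal(0, 1, n)   # no true exposure effect

simple = sm.OLS(outcome, sm.add_constant(exposure)).fit()
both = np.column_stack([exposure, confounder])
multiple = sm.OLS(outcome, sm.add_constant(both)).fit()
print(f"simple regression slope:   {simple.params[1]:.2f}")    # ~0.39, spurious
print(f"multiple regression slope: {multiple.params[1]:.2f}")  # ~0.00
```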

The self-selection problem

Nisbett gives a couple of examples of the self-selection problem:

“A while back, I read a government report in The New York Times on the safety of automobiles. The measure that they used was the deaths per million drivers of each of these autos. It turns out that, for example, there are enormously more deaths per million drivers who drive Ford F150 pickups than for people who drive Volvo station wagons. Most people’s reaction, and certainly my initial reaction to it was, “Well, it sort of figures—everybody knows that Volvos are safe.”

Let’s describe two people and you tell me who you think is more likely to be driving the Volvo and who is more likely to be driving the pickup: a suburban matron in the New York area and a twenty-five-year-old cowboy in Oklahoma. It’s obvious that people are not assigned their cars. We don’t say, “Billy, you’ll be driving a powder blue Volvo station wagon.” Because of this self-selection problem, you simply can’t interpret data like that. You know virtually nothing about the relative safety of cars based on that study.

I saw in The New York Times recently an article by a respected writer reporting that people who have elaborate weddings tend to have marriages that last longer. How would that be? Maybe it’s just all the darned expense and bother—you don’t want to get divorced. It’s a cognitive dissonance thing.

Let’s think about who makes elaborate plans for expensive weddings: people who are better off financially, which is by itself a good prognosis for marriage; people who are more educated, also a better prognosis; people who are richer; people who are older—the later you get married, the more likelihood that the marriage will last, and so on.”

You get the idea. But how many academic studies rely on regression analysis of data from a self-selected sample of people? The favourite groups for many studies are psychology undergraduates at universities!

Confounded variable problem

I have, in past articles, discussed some examples of this related to fluoride and community water fluoridation.

See also: Prof. Nisbett’s “Crusade” Against Regression

Conclusions 

Simple regression analyses are too prone to confirmation bias, and Nisbett should have chosen his words more carefully and wisely. Multiple regression is not a silver bullet – but it is far better than a simple correlation analysis. Replication and proper peer review at all research and publication stages also help. And we should always be aware of these and other limitations of exploratory statistical analysis. Ideally, use of such analyses should be limited to providing a guide for future, more controlled, studies.

Unfortunately, simple correlation studies are widespread, and reporters seem to see them as easy material for mainstream media articles. This is dangerous because such coverage gives these limited studies more influence on readers, and their actions, than they really warrant. And in the psychological and health fields there are ideologically motivated groups who will promote such poor-quality studies because they fit their own agendas.
