This week, a team from the University of California, Los Angeles claimed to have found several epigenetic marks—chemical modifications of DNA that don’t change the underlying sequence—that are associated with homosexuality in men. Postdoc Tuck Ngun presented the results yesterday at the American Society of Human Genetics 2015 conference. Nature News were among the first to break the story based on a press release issued by the conference organisers. Others quickly followed suit. “Have They Found The Gay Gene?” said the front page of Metro, a London paper, on Friday morning.
Meanwhile, the mood at the conference has been decidedly less complimentary, with several geneticists criticizing the methods presented in the talk, the validity of the results, and the coverage in the press.
Ngun’s study was based on 37 pairs of identical male twins who were discordant—that is, one twin in each pair was gay, while the other was straight—and 10 pairs in which both twins were gay. He analysed 140,000 regions in the genomes of the twins and looked for methylation marks—chemical Post-It notes that dictate when and where genes are activated. He whittled these down to around 6,000 regions of interest, and then built a computer model that would use data from these regions to classify people based on their sexual orientation.
The best model used just five of the methylation marks, and correctly classified the twins 67 percent of the time. “To our knowledge, this is the first example of a biomarker-based predictive model for sexual orientation,” Ngun wrote in his abstract.
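Ngun’s code and data haven’t been published, but the general shape of a filter-then-classify analysis like the one he described is easy to sketch. Everything below is a stand-in: the data are synthetic, and the feature filter and classifier are generic choices, not the ones the team actually used.

```python
# A minimal sketch of a filter-then-classify pipeline of the kind described
# in the talk. All data here are synthetic stand-ins; the study's real
# preprocessing, feature filter, and model were not published.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_subjects, n_regions = 47, 6000               # illustrative sizes only
X = rng.normal(size=(n_subjects, n_regions))   # methylation levels (synthetic)
y = rng.integers(0, 2, size=n_subjects)        # orientation labels (synthetic)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Keep the five regions most associated with the labels, then fit a classifier.
model = make_pipeline(SelectKBest(f_classif, k=5), LogisticRegression())
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```

Note that here the feature selection happens inside the pipeline, fitted only on the training data. As we’ll see, that detail matters enormously.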
The problems begin with the size of the study, which is tiny. The field of epigenetics is littered with the corpses of statistically underpowered studies like this one, which simply lack the numbers to produce reliable, reproducible results.
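A back-of-the-envelope calculation shows why. The study’s exact test-set size wasn’t reported, so the number below is a hypothetical, but it illustrates how little 67 percent accuracy means at this scale:

```python
# How surprising is 67 percent accuracy on a small test set if the
# classifier is actually guessing? The test-set size below is hypothetical,
# since the split used in the study wasn't reported.
from scipy.stats import binomtest

n_test = 15                         # hypothetical test-set size
n_correct = round(0.67 * n_test)    # 10 correct classifications
p = binomtest(n_correct, n_test, p=0.5, alternative="greater").pvalue
print(f"{n_correct}/{n_test} correct; p = {p:.3f} under pure guessing")
# With n_test = 15, p is about 0.15: 67 percent accuracy on a sample this
# small is entirely compatible with a coin flip.
```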
Unfortunately, the problems don’t end there. The team split their group into two: a “training set” whose data they used to build their algorithm, and a “testing set”, whose data they used to verify it. That’s standard and good practice—exactly what they should have done. But splitting the sample means that the study goes from underpowered to really underpowered.
There’s another, larger issue. As far as could be judged from the unpublished results presented in the talk, the team used their training set to build several models for classifying their twins, and eventually chose the one with the greatest accuracy when applied to the testing set. That’s a problem because in research like this, there has to be a strict firewall between the training and testing sets; the team broke that firewall by essentially using the testing set to optimise their algorithms.
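A simple simulation shows how badly this can mislead. Feed pure noise into the same pick-the-best-model-by-test-accuracy procedure (the sizes and models below are illustrative, not the team’s) and an impressive-looking accuracy falls out anyway:

```python
# Simulating the firewall problem: on pure-noise data, train many candidate
# models and report the one that scores best on the test set. The winning
# "accuracy" is inflated even though there is nothing to find.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(47, 6000))    # synthetic methylation data, no signal
y = rng.integers(0, 2, size=47)    # random labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

best = 0.0
for _ in range(200):               # 200 candidate models
    feats = rng.choice(6000, size=5, replace=False)   # a random five-mark model
    acc = (LogisticRegression()
           .fit(X_tr[:, feats], y_tr)
           .score(X_te[:, feats], y_te))
    best = max(best, acc)

print("best test accuracy on pure noise:", best)   # typically well above 0.5
```

Because the test set is reused to pick the winner, its score no longer measures anything; verifying the chosen model would require a third, untouched set of twins.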
If you use this strategy, chances are you will find a positive result through random chance alone. Some combination of the original 6,000 methylation marks is likely to be significantly linked to sexual orientation, whether or not those marks genuinely affect it. This is a well-known statistical problem that can be at least partly countered by running what’s called a correction for multiple testing. The team didn’t do that. (In an email to The Atlantic, Ngun denied that such a correction was necessary.)
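The arithmetic is stark. Test 6,000 noise-only marks at the usual p &lt; 0.05 threshold and roughly 300 of them will look “significant” by chance; a correction such as Bonferroni’s raises the bar accordingly. A sketch, with illustrative sizes:

```python
# Why a multiple-testing correction matters: test 6,000 pure-noise marks
# against random labels, count the nominally "significant" hits, then apply
# a Bonferroni threshold. Sizes are illustrative, not the study's.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
X = rng.normal(size=(47, 6000))    # no real association anywhere
y = rng.integers(0, 2, size=47)

pvals = ttest_ind(X[y == 0], X[y == 1]).pvalue   # one p-value per mark
print("nominal p < 0.05:", np.sum(pvals < 0.05))                  # ~300 false hits
print("Bonferroni p < 0.05/6000:", np.sum(pvals < 0.05 / 6000))   # ~0
```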
And, “like everyone else in the history of epigenetics studies they could not resist trying to interpret the findings mechanistically,” wrote John Greally from the Albert Einstein College of Medicine in a blog post. By which he means: they gave the results an imprimatur of plausibility by noting the roles of the genes affected by the five epi-marks. One is involved in controlling immune genes that have been linked to sexual attraction. Another is involved in moving molecules along neurons. Could epi-marks on these genes influence someone’s sexual attraction? Maybe. It’s also plausible that someone’s sexual orientation influences epi-marks on these genes. Correlation, after all, does not imply causation.
So, ultimately, what we have is an underpowered fishing expedition that used inappropriate statistics and that snagged results which may be false positives. Epigenetic marks may well be involved in sexual orientation. But this study, despite its claims, does not prove that and, as designed, could not have.
In a response to Greally’s post, Ngun admitted that the study was underpowered. “The reality is that we had basically no funding,” he said. “The sample size was not what we wanted. But do I hold out for some impossible ideal or do I work with what I have? I chose the latter.” He also told Nature News that he plans to “replicate the study in a different group of twins and also determine whether the same marks are more common in gay men than in straight men in a large and diverse population.”
Great. Replication and verification are the cornerstones of science. But to replicate and verify, you need a sturdy preliminary finding upon which to build and expand, and that’s not the case here. It may seem like the noble choice to work with what you’ve got. But when what you’ve got are the makings of a fatally weak study, of a kind well known to cause problems in the field, the best option may be to not do it at all. (The same could be said for journalists outside the conference choosing to cover the study based on a press release.)
As Greally wrote in his post: “It’s not personal about [Ngun] or his colleagues, but we can no longer allow poor epigenetics studies to be given credibility if this field is to survive. By ‘poor,’ I mean uninterpretable.”
“This is only representative of the broader literature,” he told me. “The problems in the field are systematic. We need to change how epigenomics research is performed throughout the community.”