To put it bluntly, statistics is a messy science. Previous statistical studies have not adequately established a consensus because almost all of these studies have been substantially subpar.
Although it can be tempting to jump to conclusions, more studies with large, randomly selected samples must be conducted before it will be clear whether there are differences between same-sex and opposite-sex parenting.
Because of a scarcity of data on the subject, much of the social science around same-sex parenting has deviated from the ideal in order to glean what knowledge can be obtained from limited data sets. While this is understandable, it is nevertheless a concession. Researchers should not “settle” by prematurely accepting a consensus about what’s possible in select same-sex parenting communities rather than pursuing the truth about what is probable in the population of same-sex households at large. Whether it is probable that children of same-sex couples fare better or worse than those of opposite-sex couples still remains largely unknown because of the failings of the vast majority of prior studies.
While no statistical study is perfect, most of the previous literature falls well short of even basic criteria for making convincing arguments. In terms of random sampling and large sample size, which are the most basic of social scientific criteria, the vast majority of the previous literature simply fails. Meanwhile, among the small number of studies that pass even these foundational tests, the findings are still split.
How Big Is Big Enough?
Allen’s study has far more cases than most of its peers, with nearly 1400 young adults raised by same-sex couples. Meanwhile, among studies conducted prior to 2010 (which constitute forty-four of the fifty studies reviewed by Allen), the average sample size was just sixty-nine children raised by gay or lesbian couples. The largest of these early studies included only 475 children, making it just barely acceptable in terms of sample size.
Why is a small sample size a damning characteristic of such studies?
Suppose a researcher wishes to compare two groups based on their likelihood of exhibiting a certain behavior: in the case of Allen’s new study, completing high school. Further, suppose that the true (but unknown) graduation rate in the underlying population is 87% for Group A and 90% for Group B. This means that the failure rates for the two groups are 13% and 10%, respectively. Now, a 3-percentage-point difference may not seem like much, but it would represent a 30% increase in the rate at which children fail to finish high school in Group A as compared to Group B (13% is 1.3 times 10%). This indicates a dramatic distinction that could have far-reaching implications for such children’s future job prospects and quality of life.
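For readers who want to check the arithmetic, here is a minimal sketch in Python. It reproduces the 30% figure and then gives a rough sense of how much sampling noise surrounds a difference of this size. It is an illustration, not the method used in Allen's study or any of the studies reviewed. It assumes that the two groups are equally large, that the samples are simple random samples, and that a normal approximation is adequate. The function names are hypothetical.

```python
import math

def relative_increase(rate_a, rate_b):
    """Relative increase of rate_a over rate_b, as a fraction (0.30 means 30%)."""
    return (rate_a - rate_b) / rate_b

def margin_of_error(p_a, p_b, n_a, n_b, z=1.96):
    """Approximate 95% margin of error for the difference between two
    independent sample proportions (normal approximation)."""
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return z * se

# Hypothetical true graduation rates from the example in the text.
grad_a, grad_b = 0.87, 0.90
fail_a, fail_b = 1 - grad_a, 1 - grad_b

# Relative increase in the failure rate of Group A over Group B.
rel = relative_increase(fail_a, fail_b)

# Assumption: equal group sizes. 69 is the average pre-2010 sample size
# cited in the text; 1400 approximates the size of Allen's sample.
margin_small = margin_of_error(grad_a, grad_b, 69, 69)
margin_large = margin_of_error(grad_a, grad_b, 1400, 1400)

print(f"Relative increase in failure rate: {rel:.0%}")
print(f"Margin of error, 69 per group:   {margin_small:.3f}")
print(f"Margin of error, 1400 per group: {margin_large:.3f}")
```

Under these assumptions, the margin of error with sixty-nine children per group comes to roughly 10.6 percentage points, several times larger than the 3-point gap being measured. With 1,400 per group it shrinks to roughly 2.4 points, which is smaller than the gap.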