The Complex “Question” of Gender Identity

By Watson Scott Swail, President & Senior Research Scholar, Educational Policy Institute


Which gender are you? Seems like a simple question, but not so much anymore. With the coming US Census as well as other annual surveys by the federal government in the US and in other countries around the world, there are more considerations about how we get to the issue of gender identity and sexual preferences. This is an issue that will impact every school district, college, and university to some degree.

According to Gallup, 3.8 percent of Americans selected gay or lesbian in a recent sample of 58,000 persons. Interestingly, the American public grossly overstates the percentage of people who are gay or lesbian. In a 2015 Gallup poll, Americans believed that 23 percent of the population is gay or lesbian.[1] The number and percentage of transgender people is understandably smaller. In 2016, transgender people accounted for 0.6 percent of the US population.[2] Put in relative language, that means 1 out of 167 persons in the US is transgender, or 1.4 million people. Interestingly, the District of Columbia had by far the highest rate of transgender people at 2.8 percent. Even California was modest at 0.8 percent. North Dakota had the lowest percentage (0.3 percent), but over the border in South Dakota it was back at the national average. That simple variation seems odd to me.
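For readers who want to check the arithmetic, here is a minimal sketch using only the two figures quoted above (0.6 percent and 1.4 million). The variable names are illustrative, and the "implied base" is derived from those two numbers rather than taken from the cited source.

```python
# Figures quoted in the paragraph above.
share_transgender = 0.006      # 0.6 percent
estimated_people = 1_400_000   # 1.4 million

# "1 out of N" is simply the reciprocal of the share.
one_in_n = 1 / share_transgender
print(round(one_in_n))  # 167

# Population base implied by the two quoted figures, in millions
# (derived here for illustration; not a number stated in the source).
implied_base = estimated_people / share_transgender
print(round(implied_base / 1e6, 1))  # 233.3
```

The implied base of roughly 233 million is smaller than the total US population, which suggests the estimate refers to a subset of the population, presumably adults.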

Even if these numbers are small, they are sizeable and important from a research and public policy perspective. It is always arguable how accurate data are on issues that are personal to people, such as gender identity. Some people do not trust the government, let alone other surveyors, to protect information. The Census Bureau goes as far as carefully warning that members of the Department face prison and a $250,000 fine if they divulge any personal information from their surveys.

As more people want information on how people perceive themselves with regard to gender identity, the research community is similarly pushed to consider both how best to categorize the different identities and what this will mean for research. And while it may not seem like much to the non-researcher out there, it is a challenging issue for the research community.

Historically, and even currently in some cases, the term “sex” was used to discuss whether the person was a male or female. Now, more often than not, surveys use the term “gender” for the same purpose. For that reason, they have been fairly interchangeable in common use. In recent years, I argue that the term “sex” has been used more as a verb rather than a noun. That is, to describe the act rather than the gender.

Many of us see “gender” similarly, and the truth is that people really don’t like throwing the word “sex” around in society. It is also one of the reasons that half my readers won’t get this article: most IT systems have spam-blockers that do not like the term “sex” (I have literally re-drafted articles due to the use of that and other seemingly innocuous words, including “mortgage,” if you can believe it). A recent article by a University of Victoria researcher argues that both “sex” and “gender” are societal social statuses. I disagree. In research, “sex” has been used to define the biological variances between people. As described in Social Problems: Continuity and Change, “Sex refers to the anatomical and other biological differences between females and males that are determined at the moment of conception and develop in the womb and throughout childhood and adolescence.”

As researchers, policymakers, and advocates, we want to know whether someone is male, female, or other, for biological reasons, as well as how they perceive themselves through gender identity. These are two separate issues. In most surveys, the gender question is offered only two responses: male and female. This becomes a conundrum for many people and some have argued that only having the two options is “ethically wrong.”[3] There is a third option, of course, in the form of “prefer not to say,” “other,” or “transgender.” Researchers typically do not like the “prefer not to say” option because it mucks up the analysis.

None of this deals with gender identities, of course, which is another complex but important issue due in part to the fluidity of the issue. Even recently, the term “gay” was widely used to describe homosexuals. LGB was introduced in the 1980s to group lesbian, gay, and bisexual people. A “T” was added in the 1990s for transgender people, and finally, “Q” was recently added for “Queer.” Still, you will often find the acronym LGBT used in many writings. “Q” has been controversial at best due to the historically negative connotation of the term “queer,” which was used to disparage certain people. It is more than likely that our terminology will have many more iterations over the coming decades, impacting our ability to consider how best to use identities on surveys. Thus, the use of multiple questions may be good advice so that variations can be parsed out for future comparisons of today’s data versus tomorrow’s.

To give you an idea of how many identities there are, Tumblr lists 112 and keeps growing.[4] I won’t list them here (a previous draft did) because then every spam blocker would stop this article from going through. For a more thorough discussion of identities, click here.

In a 2018 study by the US Bureau of Labor Statistics, the researchers found that there were a “number of obstacles to accurate collection of gender identities,” but that the information was deemed valuable to collect. One particular worry is the accuracy of responses, given that in the US Census and the Current Population Survey, typically one person answers for the entire household. Thus, does that household member know exactly how a dependent or other person in the household identifies? It sounds simple enough, but with so many categories, it is likely that mistakes would be made. Even in the BLS study, it is apparent that not everyone agreed on the definition of transgender and how it should be used. This becomes a problem when people within that community cannot come up with common terminology.

Even SurveyMonkey has written on this topic, claiming that it may be best to use several questions rather than one question to get at the issue of sexual and gender identities.

From an empirical point of view, and only that, building new and varied categories is always a bit of a pain for analysts. In the end, and usually due to the “N,” or number of those surveyed, categories get lumped together in order to gain any statistical significance. This doesn’t happen in something as large as the US Census, but it does in many other surveys. Thus, breaking apart single categories into many categories can be problematic from a statistical perspective. For instance, we may want to ask about income by category in a survey, but if we divide income into too many levels, we may run into an analytical problem due to small cells. The same is true of race/ethnic groupings. The US Census categories include White, Black, Hispanic/Latino, Asian, American Indian/Alaska Native, and Native Hawaiian or Other Pacific Islander. Even on large, national surveys, we run into sample size issues that often impede us from conducting any heavy statistical lifting on the last two groups, much to the chagrin of those who want to see those groups in the analysis. The same thing will happen as we expand our definitions and categories of sex and gender identities.
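To make the “small cells” problem concrete, here is a minimal sketch of the kind of lumping analysts end up doing. The function name, the category labels, the counts, and the cutoff of 30 are all hypothetical, chosen only for illustration; real projects set their own minimum cell sizes.

```python
def collapse_small_cells(counts, min_n, other_label="Other"):
    """Merge every category with fewer than min_n respondents into one bucket."""
    collapsed = {}
    for category, n in counts.items():
        # Keep large cells as they are; route small cells to the shared bucket.
        key = category if n >= min_n else other_label
        collapsed[key] = collapsed.get(key, 0) + n
    return collapsed


# Hypothetical counts from a small institutional survey (not real data).
responses = {
    "Category A": 412,
    "Category B": 388,
    "Category C": 9,
    "Category D": 4,
    "Category E": 2,
}

print(collapse_small_cells(responses, min_n=30))
# {'Category A': 412, 'Category B': 388, 'Other': 15}
```

Five response options go in and three analyzable groups come out, which is exactly the loss of detail described above.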

In March 2017, the Administration removed proposed questions on sexual orientation and gender identities from the upcoming 2020 Census. A few months later, HR 3273, the LGBT Data Inclusion Act,[5] was introduced into the US House of Representatives. This bill would require the identification and revision or inclusion of questions related to sexual orientation and gender identities on federal surveys, including the 2020 Census. The LGBTQ community has been strongly behind this bill, but it is not likely to make any headway through this Congress due to other looming issues.[6]

Other US Census Bureau surveys have stayed away from the issue. The American Community Survey, which is administered annually by the Bureau, currently uses the terms male and female with no other option and no reference to identity.[7] In the end, the 2020 Census will add a question about “same-sex” or “opposite-sex” marriage, but not include any questions about gender identity.[8]

Surveys by the National Center for Education Statistics (NCES), a division of the US Department of Education, have also left things at male or female in the past. However, the upcoming administration of the Baccalaureate and Beyond (B&B) in 2018 will include the following questions:

What is your gender? Your gender is how you feel inside and can be the same or different from your biological or birth sex. 
(Please choose all that apply)
Transgender, male-to-female
Transgender, female-to-male
Genderqueer or gender nonconforming
Please describe
A different identity
Please describe

In addition, the B&B survey also asks, “What sex were you assigned at birth?” Thus, it gets at both issues of biological or birth sex and identity.
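As a rough sketch of how the two questions might be combined in analysis, the snippet below flags respondents whose reported gender differs from their sex assigned at birth. This is an assumption about one possible approach, not the method NCES uses. The function name and the respondents are hypothetical, and “Male” and “Female” as gender options are assumed here because the option list as quoted above does not show them.

```python
def gender_differs_from_birth_sex(sex_at_birth, gender_selections):
    """Return True or False, or None when either answer is missing."""
    if not sex_at_birth or not gender_selections:
        return None
    # The gender item is "choose all that apply", so compare as a set.
    # Anything other than exactly the birth-sex label counts as different.
    return set(gender_selections) != {sex_at_birth}


# Hypothetical respondents (not real data).
respondents = [
    {"id": 1, "sex_at_birth": "Female", "gender": ["Female"]},
    {"id": 2, "sex_at_birth": "Male", "gender": ["Transgender, male-to-female"]},
    {"id": 3, "sex_at_birth": "Female", "gender": ["Genderqueer or gender nonconforming"]},
    {"id": 4, "sex_at_birth": "Male", "gender": []},
]

flags = {
    r["id"]: gender_differs_from_birth_sex(r["sex_at_birth"], r["gender"])
    for r in respondents
}
print(flags)  # {1: False, 2: True, 3: True, 4: None}
```

The rule is deliberately simple. A real analysis would need to decide how to treat multiple selections and write-in responses.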

From a research point of view, we will have to stay flexible in how we deal with this issue, knowing that, for much of the time, we will be consolidating many variables due to our statistical significance issues. Thus, do not be surprised if, on institutional and other smaller surveys, the number of variables analyzed is significantly reduced. Researchers may be able to tell you how many students selected each gender identity box, but the statistical footing for analyzing these questions will fall away because the numbers are too small to reach significance.

Given that this is a complex issue, I am interested in your thoughts. Please comment accordingly.


I received information from a colleague at RTI, who provided these useful details (edited):

Often LGBTQ is combined altogether, where most of those populations are defined by sexuality, whereas transgender/gender non-conforming is about gender identity. Make a clear distinction between capturing other minority populations and how it relates to what we know about capturing gender identity in surveys, because there are separate methods to doing so (i.e., 2-step process for gender, whereas sexuality can be: preference, identification, behavior, etc.). Too often one may not realize that transgender identification is not about sexuality, and it would make the blog post stronger to have a clear justification for including those comparisons to the collection of data on sexual minority populations.   

-The two-step process of capturing gender identity is not mentioned explicitly, which is the method generally accepted in the field. Here is an assessment of different methods, that will also provide a great ref list:

-This is also the method utilized in B&B:08/18. Before collecting gender identification, respondents are first asked about biological sex. This allows researchers to make inferences about respondents whose gender does not align with their biological sex, regardless of self-reported gender identification. This is ideal given some of the consequences and considerations already mentioned in the blog post.

What sex were you assigned at birth (what the doctor put on your birth certificate)?

o   Male

o   Female








