5 strategies to uncover bias in data

A poll, survey, or other dataset may look like an example of objective truth. But human choices shape the creation of a data product — and its interpretation.

So how can journalists fairly report on this data? To answer this question, the National Press Club Journalism Institute and the National Association of Science Writers (NASW) hosted a panel with:

  • Fernand Amandi, managing partner of Bendixen & Amandi, the nation’s leading multilingual and multiethnic public opinion research and strategic communications consulting firm
  • Caroline Chen, health care reporter at ProPublica, and 2019 winner of the June L. Biedler Cancer Prize for Cancer Journalism for her series with Riley Wong on racial disparities in clinical trials
  • Dr. Kyler J. Sherman-Wilkins, assistant professor in the Sociology and Anthropology department at Missouri State University and a Mellon Emerging Faculty Leader for 2021
  • Moderator: Tinsley Davis, executive director of the National Association of Science Writers 

Pause and think: What is the most important component from the dataset that readers want to know, and will this data provide the right context? 

“You can take one number, put it in a headline, and give readers totally the opposite impression of even what you meant to say,” Chen said. “Our job as journalists is to take those numbers, make sure we know the necessary context, and then be careful about which numbers we pick to put forward.”

Chen recommends continually asking yourself: What is not yet answerable by the data? What steps need to be taken for this to be answered? What can I NOT say from this study?

Be familiar with the methodology by taking time to interview the researchers behind the study.

Learn how many interviews were conducted. Know when the interviews were conducted. Look for missing pieces from the methodology. Consider the sponsors of the research. 

“I may not explain the whole thing to my readers because they don’t need to know that, but if I don’t understand the methodology inside out myself, then I’m not qualified to pull the biggest highlights,” Chen said.

Ask for (and carefully review) the results.

Before reporting on a dataset, journalists should take time to get to know the actual wording of the survey questionnaire or instrument, as well as its results. This will help identify any bias in the numbers.

“Understanding who was asked, what they were asked, is so very important,” Sherman-Wilkins said. 

“It’s incumbent upon the reporter to look at the way the questions are written,” Amandi said. “If it’s confusing to you, it’s going to be even probably more confusing to the respondent, so these are things that the reporters should also think about and flag when asking for instruments.”

Treat researchers, pollsters, and polling companies the same as any other source.

“Sometimes you have sources that get things wrong, but they get things wrong in good faith, it just doesn’t check out,” Amandi said. “The same applies to a research firm, or a polling firm, or whoever you’re engaging with — the vetting needs to be much more stringent.”

“One question I like to ask is: If somebody who you really respected was to criticize this paper, what would they say about it?” Chen said. “I think that there are often really interesting criticisms that can come up and when you, sort of, nudge someone to think about it … I’ve gotten some really interesting answers to that question.” 

Disaggregate the data for nuance.

Disaggregated data is data that has been broken down into detailed sub-categories, such as gender, race, ethnicity, or level of education.

“Disaggregating the data is a really important tool that we use to uncover some of the bias that might exist,” Sherman-Wilkins said. “Specifically the bias of generalizing to an entire population without really considering the intersections.”

Digging into this data allows for more nuanced reporting, because details can get obscured when all of the numbers are aggregated together as one set. Sherman-Wilkins's example: A headline from a recent jobs report indicated that women accounted for 100% of the 140,000 jobs lost. But that was not the complete story.

“What was important to note about that story was that the jobs lost were entirely by women of color and white women actually made gains,” Sherman-Wilkins said. “As a sociologist who’s interested in understanding how the world works, and how to address issues with policy and recommending policy, it’s important to understand and disaggregate those numbers.”
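For reporters who work with spreadsheets or code, the idea can be sketched in a few lines of Python. This is only an illustration. The records, the group labels (group_a, group_b, group_c), and the net_change_by helper are all hypothetical, and the numbers are invented for the example rather than taken from any jobs report. The sketch shows how a single total can hide subgroups that moved in opposite directions.

```python
from collections import defaultdict

# Hypothetical net job changes by subgroup. These figures are invented for
# illustration only and do not come from any real jobs report.
HYPOTHETICAL_RECORDS = [
    {"gender": "women", "race": "group_a", "net_change": -60},
    {"gender": "women", "race": "group_b", "net_change": -50},
    {"gender": "women", "race": "group_c", "net_change": 10},
    {"gender": "men", "race": "group_a", "net_change": 5},
    {"gender": "men", "race": "group_b", "net_change": -10},
    {"gender": "men", "race": "group_c", "net_change": 5},
]


def net_change_by(records, *keys):
    """Sum net_change for each combination of the given keys.

    With no keys, every record falls into one bucket: the aggregate total.
    """
    totals = defaultdict(int)
    for row in records:
        bucket = tuple(row[key] for key in keys)
        totals[bucket] += row["net_change"]
    return dict(totals)


# One aggregated number, the kind that ends up in a headline.
overall = net_change_by(HYPOTHETICAL_RECORDS)

# Broken down by one category, then by two categories at once.
by_gender = net_change_by(HYPOTHETICAL_RECORDS, "gender")
by_gender_and_race = net_change_by(HYPOTHETICAL_RECORDS, "gender", "race")

print("Overall:", overall)
print("By gender:", by_gender)
for bucket, change in sorted(by_gender_and_race.items()):
    print(bucket, change)
```

In this made-up dataset the overall total and the women's total are both negative, yet one subgroup of women shows a gain. That pattern only becomes visible once the data is broken down by two categories at the same time.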

“Disaggregation really leads to richer stories and richer impressions for audiences,” Davis said. “It reminds me that we as reporters are not writing for a lumped audience either. Knowing who our audiences and who our readers are, and writing for that context, is really important.”
