5 strategies to uncover bias in data
A poll, survey, or other dataset may look like an example of objective truth. But human choices shape the creation of a data product — and its interpretation.
So how can journalists fairly report on this data? To answer this question, the National Press Club Journalism Institute and the National Association of Science Writers (NASW) hosted a panel with:
- Fernand Amandi, managing partner of Bendixen & Amandi, the nation’s leading multilingual and multiethnic public opinion research and strategic communications consulting firm
- Caroline Chen, health care reporter at ProPublica, and 2019 winner of the June L. Biedler Cancer Prize for Cancer Journalism for her series with Riley Wong on racial disparities in clinical trials
- Dr. Kyler J. Sherman-Wilkins, assistant professor in the Sociology and Anthropology department at Missouri State University and a Mellon Emerging Faculty Leader for 2021
- Moderator: Tinsley Davis, executive director of the National Association of Science Writers
Pause and think: What is the most important component from the dataset that readers want to know, and will this data provide the right context?
“You can take one number, put it in a headline, and give readers totally the opposite impression of even what you meant to say,” Chen said. “Our job as journalists is to take those numbers, make sure we know the necessary context, and then be careful about which numbers we pick to put forward.”
Chen recommends continually asking yourself: What is not yet answerable by the data? What steps need to be taken for this to be answered? What can I NOT say from this study?
Be familiar with the methodology by taking time to interview the researchers behind the study.
Learn how many interviews were conducted. Know when the interviews were conducted. Look for missing pieces from the methodology. Consider the sponsors of the research.
“I may not explain the whole thing to my readers because they don’t need to know that, but if I don’t understand the methodology inside out myself, then I’m not qualified to pull the biggest highlights,” Chen said.
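Knowing how many interviews were conducted also lets a reporter sanity-check a poll’s stated margin of error. The following is a minimal sketch, not a description of any pollster’s actual method. It assumes a simple random sample, a 95% confidence level (z = 1.96), and the worst-case proportion of 50%. The function name `margin_of_error` and its parameters are hypothetical, chosen for illustration.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a proportion.

    Assumptions (not from the panel): simple random sampling,
    a 95% confidence level (z = 1.96), and a worst-case
    proportion of 0.5 unless another value is passed in.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# 1,004 is the sample size named in one of the resources listed below.
full_sample = margin_of_error(1004)

# A hypothetical subgroup of 100 respondents within the same poll.
subgroup = margin_of_error(100)

print(f"Full sample of 1,004: +/- {full_sample:.1%}")
print(f"Subgroup of 100:      +/- {subgroup:.1%}")
```

Under these assumptions, about 1,000 interviews gives a margin of roughly 3 percentage points, while a subgroup of 100 respondents has a margin close to 10 points. Weighting and more complex sample designs can widen the margin, so treat this as a rough check and ask the pollster how the published figure was calculated.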
Ask for (and carefully review) the results.
Before reporting on a dataset, journalists should take time to get to know the actual wording and results of the survey questionnaire or instrument. This will help determine any bias in the numbers.
“Understanding who was asked, what they were asked, is so very important,” Sherman-Wilkins said.
“It’s incumbent upon the reporter to look at the way the questions are written,” Amandi said. “If it’s confusing to you, it’s going to be even probably more confusing to the respondent, so these are things that the reporters should also think about and flag when asking for instruments.”
Treat researchers, pollsters, and polling companies the same as any other source.
“Sometimes you have sources that get things wrong, but they get things wrong in good faith, it just doesn’t check out,” Amandi said. “The same applies to a research firm, or a polling firm, or whoever you’re engaging with — the vetting needs to be much more stringent.”
“One question I like to ask is: If somebody who you really respected was to criticize this paper, what would they say about it?” Chen said. “I think that there are often really interesting criticisms that can come up and when you, sort of, nudge someone to think about it … I’ve gotten some really interesting answers to that question.”
Disaggregate the data for nuance.
Disaggregated data is data that has been broken down into detailed sub-categories, such as gender, race, ethnicity, or level of education.
“Disaggregating the data is a really important tool that we use to uncover some of the bias that might exist,” Sherman-Wilkins said. “Specifically the bias of generalizing to an entire population without really considering the intersections.”
Digging into this data allows for more nuanced reporting, because details can get obscured when all of the numbers are aggregated into one set. Sherman-Wilkins’s example: A headline about a recent jobs report indicated that women accounted for 100% of the 140,000 jobs lost. But that was not the complete story.
“What was important to note about that story was that the jobs lost were entirely by women of color and white women actually made gains,” Sherman-Wilkins said. “As a sociologist who’s interested in understanding how the world works, and how to address issues with policy and recommending policy, it’s important to understand and disaggregate those numbers.”
“Disaggregation really leads to richer stories and richer impressions for audiences,” Davis said. “It reminds me that we as reporters are not writing for a lumped audience either. Knowing who our audiences and who our readers are, and writing for that context, is really important.”
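To make the idea concrete, here is a minimal sketch of how one total can hide opposite trends in its subgroups. Every number and group label below is made up for illustration and does not come from the jobs report discussed above. The helper `net_change_by` is also hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (gender, subgroup, net job change).
# These figures are invented for illustration only.
rows = [
    ("women", "group A", -120),
    ("women", "group B", 40),
    ("men", "group A", 30),
    ("men", "group B", -20),
]


def net_change_by(records, key_indexes):
    """Sum the net change (last field) for each combination of key fields."""
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[i] for i in key_indexes)
        totals[key] += record[-1]
    return dict(totals)


# Aggregated view: one number per gender.
by_gender = net_change_by(rows, [0])

# Disaggregated view: one number per gender and subgroup.
by_gender_and_group = net_change_by(rows, [0, 1])

print(by_gender)
print(by_gender_and_group)
```

In this made-up data, the aggregated view shows a net loss for women overall, while the disaggregated view shows that one subgroup lost jobs and the other gained them. That is the kind of pattern a single headline number can hide.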
Additional resources
- CDC report on Provincetown COVID outbreak misinterpreted on social media
- CDC morbidity and mortality weekly report (MMWR)
- Statistical power and underpowered statistics — Statistics done wrong
- Why you shouldn’t say ‘this study is underpowered’
- 20 questions a journalist should ask about poll results
- ClinCalc sample size calculator
- Survey sampling methods
- National Council on Public Polls: FAQs
- How can a poll of only 1,004 Americans represent 260 million people with only a 3 percent margin of error?
- Methods 101: Random sampling – Pew Research Center methods
- Sense about Science USA – STATS check
- Forging a path from solutions journalism to reader revenue
If you have questions about this program, please contact Julie Moos, Institute executive director.
About the speakers
Fernand R. Amandi is the managing partner of Bendixen & Amandi, the nation’s leading multilingual and multiethnic public opinion research and strategic communications consulting firm. He manages the firm and brings over a decade’s worth of experience in research and strategic management with an emphasis in corporate, political and public affairs consulting for clients including the United Nations, the Inter-American Development Bank, the World Bank, Univision, New America Media, the White House, the John S. and James L. Knight Foundation, the California Endowment, US Senator John Kerry and US Senator Robert Menendez. He has also conceived, produced and edited a number of successful television commercials for B&A’s media practice, including the highly regarded “Nuestra Amiga” television spot for the Hillary Clinton Presidential Campaign, which Rolling Stone magazine lauded as “one of the more charming moments in the history of the political ad wars.”
Caroline Chen covers health care for ProPublica. She is currently reporting on the coronavirus pandemic. Her 2019 stories on a heart transplant program in New Jersey that prioritized metrics over patient care won the Livingston Award for local reporting. Her story on racial disparities in cancer clinical trials with Riley Wong in 2018 won the June L. Biedler Cancer Prize for Cancer Journalism in online/multimedia reporting. Previously, she worked at Bloomberg News, where her coverage included the unraveling of blood test maker Theranos and the 2014 Ebola outbreak. She received her Master’s degree from the Stabile Program in Investigative Journalism at Columbia University, where she was awarded a Pulitzer Traveling Fellowship.
Dr. Kyler J. Sherman-Wilkins is an assistant professor in the Sociology and Anthropology department at Missouri State University and a Mellon Emerging Faculty Leader for 2021. He works to empower more people of color on the university’s campus. His highlighted work, “Social Determinants of Cognitive Functioning Among Diverse Older Adults in the United States,” illustrates his research focus on aging. He was also a recipient of the 2017 Diversity Scholar Award at Missouri State University.