Interviewing and Simplifying Data
Sam An Mardy is an Information Technology and Website Manager at Open Development Cambodia (ODC). He manages and oversees ODC’s platform and network infrastructure, provides technical guidance and support, and works on the implementation of projects. In 2022, he received training on data literacy from data journalism experts. He has been a co-trainer in data literacy programs for government officials, NGO staff, journalists, and media students at national and regional levels. Mardy and Vong Pisith, Senior Data Research & GIS Officer at ODC, co-led the session on Interviewing and Simplifying Data, attended by 29 participants (11 of them women).
Interviewing data shares many similarities with interviewing a human being. However, instead of putting questions to a field expert, researchers work out which questions a dataset can answer and use the answers to uncover new information. Interviewing data is a process that requires setting up a hypothesis, forming questions, and analyzing data to test the hypothesis. A hypothesis is what you assume to be true based on the background of the topic and the problem. It sets the scope of the topic, allowing you to test the assumption quickly and check whether it holds.
Good interview questions have four characteristics: they help researchers produce data to test hypotheses, they are not too broad, they are based on pre-established hypotheses and available datasets, and they allow researchers to measure a problem, its impact, its causes, and its solutions.
Data analysis consists of using data to answer specific questions. The essential operations include data aggregation, query analysis, and visualization. Therefore, it is essential to have good data analysis skills and to be comfortable working with spreadsheets and statistics. During his presentation, Mardy shared several data analysis tools: Microsoft Excel, Google Sheets, R, SPSS, and Python. Data interviewing can be particularly helpful in situations where interviewing human beings is less convenient. For example, implementing Cambodia’s national internet gateway (NIG) will likely generate large quantitative datasets reflecting the changes the NIG brings to the online sphere. Additionally, since reduced internet freedom can be a sensitive topic that many people might not want to discuss publicly with researchers, interviewing data offers an alternative way of understanding the social and political implications of the gateway.
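As an illustration of how a hypothesis turns into questions that a dataset can answer, here is a minimal sketch in Python, one of the tools mentioned in the session. The records, the field names (region, year, complaints), and the helper functions total_by and where are all hypothetical and invented for this example; they are not data or code from the session.

```python
from collections import defaultdict

# Hypothetical records: made-up values for illustration only.
records = [
    {"region": "A", "year": 2021, "complaints": 12},
    {"region": "A", "year": 2022, "complaints": 20},
    {"region": "B", "year": 2021, "complaints": 7},
    {"region": "B", "year": 2022, "complaints": 5},
    {"region": "C", "year": 2021, "complaints": 3},
    {"region": "C", "year": 2022, "complaints": 9},
]


def total_by(rows, key, value):
    """Aggregation: sum `value` for each distinct `key`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


def where(rows, **conditions):
    """Query: keep only the rows that match every condition."""
    return [r for r in rows if all(r[k] == v for k, v in conditions.items())]


# Hypothesis: complaints rose between 2021 and 2022.
# Question 1: what is the total for each year?
by_year = total_by(records, "year", "complaints")
hypothesis_supported = by_year[2022] > by_year[2021]

# Question 2 (narrower follow-up): which regions account for the 2022 total?
by_region_2022 = total_by(where(records, year=2022), "region", "complaints")

print(by_year)
print(hypothesis_supported)
print(by_region_2022)
```

Each question is narrow enough to be answered by a single aggregation or query, which mirrors the advice that good interview questions should not be too broad. The same steps could be carried out with a pivot table in a spreadsheet.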
To understand data findings, further steps are necessary. Pisith discussed the importance of simplifying data, as it leads to a better understanding of the findings and can help generate new insights. Simplifying data also allows the audience to understand complex ideas drawn from extensive datasets with which they might not be familiar.
There are three main techniques to simplify data. The first is making findings relatable to people. Pisith said, “simplifying is more than just simplifying numbers but making it human relatable”. Researchers should therefore not expect the audience to understand figures and statistical analysis without some help and a simplified version of the analysis. The second is using simple words and avoiding jargon. Researchers should avoid using too many technical terms, writing long sentences, and packing multiple findings into one sentence. The third is choosing how to report numbers: as a percentage (the proportion of a part to the whole) or as a ratio (an individual category compared to the whole). It is necessary to think carefully about which is more appropriate so the audience can easily understand the message.
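The choice between a percentage and a ratio can be sketched with the attendance figures reported for this session (11 women among 29 participants). The snippet below is only an illustration: the function names are hypothetical, and reading "ratio" as an "N in 10" phrasing is an assumption made for this example, not a definition given in the session.

```python
def as_percentage(part, whole):
    """Report the part as a rounded percentage of the whole."""
    return f"{round(100 * part / whole)}%"


def as_n_in_ten(part, whole):
    """Report the part as a rounded 'N in 10' phrase (assumed reading of 'ratio')."""
    return f"about {round(10 * part / whole)} in 10"


women, participants = 11, 29  # figures from the session report

print(as_percentage(women, participants))  # 38%
print(as_n_in_ten(women, participants))    # about 4 in 10
```

Both lines describe the same finding. "About 4 in 10 participants were women" may be easier for a general audience to picture, while "38%" is more precise and easier to compare across sessions.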
Before ending the session, Pisith reminded the participants that simplifying data is not only about making the findings more straightforward but also about humanizing them, making the audience a key agent in the process so they can relate to the story. Pisith also asked the participants to produce a story about the session by formulating hypotheses and data-driven questions.