Open Questions about the Visualization of Sociodemographic Data

Florent Cabric, Margrét Vilborg Bjarnadóttir, Anne-Flore Cabouat, Petra Isenberg
2023-08-23
Abstract: This paper collects a set of open research questions on how to visualize sociodemographic data. Sociodemographic data is a common part of datasets related to people, including institutional censuses, health data systems, and human-resources files. This data is sensitive, and its collection, sharing, and analysis require careful consideration. For instance, the European Union, through the General Data Protection Regulation (GDPR), protects the collection and processing of any personal data, including sexual orientation, ethnicity, and religion. Data visualization of sociodemographic data can reinforce stereotypes, marginalize groups, and lead to biased decision-making. It is, therefore, critical that these visualizations are created based on good, equitable design principles. In this paper, we discuss and provide a set of open research questions around the visualization of sociodemographic data. Our work contributes to an ongoing reflection on representing data about people and highlights some important future research directions for the VIS community. A version of this paper and its figures are available online at http://osf.io/a2u9c.
Human-Computer Interaction
What problem does this paper attempt to address?
This paper addresses how to visualize sociodemographic data without reinforcing stereotypes, marginalizing groups, or encouraging biased decisions. Specifically, the authors discuss the following aspects:

1. **Data Sensitivity**: Sociodemographic data appears in institutional censuses, health data systems, and human-resources files, among other sources. This data is sensitive, and its collection, sharing, and analysis require careful consideration. For example, the European Union protects the collection and processing of personal data, including sexual orientation, ethnicity, and religion, through the General Data Protection Regulation (GDPR).
2. **Potential Harm**: Visualizations of sociodemographic data can reinforce stereotypes, marginalize certain groups, and lead to biased decisions. Avoiding these harms requires visualizations built on good, equitable design principles.
3. **Open Research Questions**: The authors propose a series of open research questions on the visualization of sociodemographic data, aiming to promote in-depth discussion in this field. These questions include:
   - **Balancing Efficiency and Harmlessness**: How to avoid stereotypical visual encodings while keeping decision-making efficient.
   - **Balancing Simplicity and Inclusiveness**: How to simplify visualizations for readability while ensuring that marginalized groups remain adequately represented.
   - **Inclusiveness of Different Representation Types**: How to choose the representation that best conveys the individuals behind sociodemographic data and promotes diversity and inclusiveness.

Through these questions, the authors hope to encourage the VIS community to conduct more research and reflection on how to visualize sociodemographic data effectively and fairly.