Abstract: The capacity to collect and analyse data is growing exponentially. Referred to as ‘Big Data’, this scientific, social and technological trend has helped create destabilising amounts of information, which can challenge accepted social and ethical norms. Big Data remains a fuzzy idea, emerging across social, scientific, and business contexts sometimes seemingly related only by the gigantic size of the datasets being considered. As is often the case with the cutting edge of scientific and technological progress, understanding of the ethical implications of Big Data lags behind. In order to bridge such a gap, this article systematically and comprehensively analyses academic literature concerning the ethical implications of Big Data, providing a watershed for future ethical investigations and regulations. Particular attention is paid to biomedical Big Data due to the inherent sensitivity of medical information. By means of a meta-analysis of the literature, a thematic narrative is provided to guide ethicists, data scientists, regulators and other stakeholders through what is already known or hypothesised about the ethical risks of this emerging and innovative phenomenon. Five key areas of concern are identified: (1) informed consent, (2) privacy (including anonymisation and data protection), (3) ownership, (4) epistemology and objectivity, and (5) ‘Big Data Divides’ created between those who have or lack the necessary resources to analyse increasingly large datasets. Critical gaps in the treatment of these themes are identified with suggestions for future research. Six additional areas of concern are then suggested which, although related, have not yet attracted extensive debate in the existing literature. It is argued that they will require much closer scrutiny in the immediate future: (6) the dangers of ignoring group-level ethical harms; (7) the importance of epistemology in assessing the ethics of Big Data; (8) the changing nature of fiduciary relationships that become increasingly data saturated; (9) the need to distinguish between ‘academic’ and ‘commercial’ Big Data practices in terms of potential harm to data subjects; (10) future problems with ownership of intellectual property generated from analysis of aggregated datasets; and (11) the difficulty of providing meaningful access rights to individual data subjects who lack the necessary resources. Considered together, these eleven themes provide a thorough critical framework to guide ethical assessment and governance of emerging Big Data practices.

The Ethics of Big Data: Current and Foreseeable Issues in Biomedical Contexts