Discussion of “Experiences with big data: Accounts from a data scientist’s perspective”

Timothy J. Robinson, Richard C. Giles, Rasika U. Rajapakshage
DOI: https://doi.org/10.1080/08982112.2020.1758333
2020-06-05
Quality Engineering
Abstract: Kulachi, Frumosu, Khan, Rensch and Sponner have initiated an important discussion about the implications of Big Data for production analytics within Industry 4.0. Throughout their discussion, the authors refer to themselves as data scientists. Given the wide variability in perceptions of what "data science" encompasses, we begin our discussion by defining the paradigm of data science through the data lifecycle. Using this lifecycle as an organizational framework, our discussion focuses on data storage, access, wrangling and analytical methods associated with unstructured data. It is widely estimated that over 80% of the data encountered in Industry 4.0 will be in an unstructured format (e.g., videos, imaging data, audio files, PDF text files, social media posts), and the tools used to glean information from unstructured data are vastly different from those used with structured data. Our discussion also remarks on the importance of publishing reproducible research as our field tackles challenges associated with Industry 4.0. In our conclusion, we suggest that traditional, static report presentations of data analytics be replaced with dynamic reports that enable the data consumer to interact and engage with the data.
engineering, industrial; statistics & probability