Data Science Studies: State-of-the-Art and Trends
Le-men CHAO, Chun-xiao XING, Yong ZHANG
DOI: https://doi.org/10.11896/j.issn.1002-137X.2018.01.001
2018-01-01
Abstract: The arrival of the big data era has given rise to a new discipline called data science. First, the differences between domain-general data science and domain-specific data science were set out, based on an in-depth discussion of the discipline's basic concepts, brief history, scientific roles, and body of knowledge. Second, the top ten challenges faced by data science were identified by describing the debates on paradoxical topics, including shifts in thinking pattern (knowledge pattern or data pattern), perspectives on data (active or passive), implementation of intelligence (via AI or via big data), bottlenecks in data product development (computing intensive or data intensive), data preparation (data preprocessing or data wrangling), quality of services (service performance or user experience), data analysis (explanatory or predictive), evaluation of algorithms (by complexity or by scalability), research paradigm (third paradigm or fourth paradigm), and the main motivation of education (to cultivate data engineers or data scientists). Third, the top ten trends in data science studies were proposed: to value predictive models and correlation analysis; to pay more attention to model integration and meta-analysis; to embrace the "data first, model later or never" paradigm; to be led by realism and ensure data consistency; to support multiple copies and data locality; the coexistence of diverse implementation technologies and integrated applications; to be dominated by simple computing and pragmatism; to develop data products and embedded applications of data science; to embrace the Pro-Am and metadata; and to cultivate data scientists and develop curricula or majors. Finally, some suggestions on how to conduct further studies were also proposed.