Data driven identification of international cutting edge science and technologies using SpaCy

Chunqi Hu,Huaping Gong,Yiqing He
DOI: https://doi.org/10.1371/journal.pone.0275872
IF: 3.7
2022-10-14
PLoS ONE
Abstract:Difficulties in collecting, processing, and identifying massive data have slowed research on cutting-edge science and technology hotspots. Promoting these technologies will not be successful without an effective data-driven method to identify cutting-edge technologies. This paper proposes a data-driven model for identifying global cutting-edge science technologies based on SpaCy. In this model, we collected data released by 17 well-known American technology media websites from July 2019 to July 2020 using web crawling with Python. We combine graph-based neural network learning with active learning as the research method in this paper. Next, we introduced a ten-fold cross-check to train the model through machine learning with repeated experiments. The experimental results show that this model performed very well in entity recognition tasks with an F value of 98.11%. The model provides an information source for cutting-edge technology identification. It can promote innovations in cutting-edge technologies through its effective identification and tracking and explore more efficient scientific and technological research work modes.
multidisciplinary sciences
What problem does this paper attempt to address?