A Probabilistic Model For Co-Occurrence Analysis in Bibliometrics

Xiaobei Zhou, Miao Zhou, Desheng Huang, Lei Cui
DOI: https://doi.org/10.21203/rs.3.rs-634136/v1
2021-06-22
Abstract: Co-occurrence analysis of Medical Subject Heading (MeSH) terms in bibliographic databases is widely used in bibliometrics. To make the results interpretable, the co-occurrence matrix must in practice be filtered to remove low-frequency items, which are poorly representative and should not appear in the final result. Unfortunately, little research has addressed how to determine a critical threshold for removing noise from the data in co-occurrence analysis. Here, we propose a probabilistic model for co-occurrence analysis that calculates the statistical significance (p-values) of co-occurring items. With this model, the dimensionality of the co-occurrence network can be reduced conveniently by selecting different p-value thresholds. The conceptual framework of the model, simulations, and practical applications are illustrated in the manuscript. Further details (including all reproducible code) can be downloaded from the project website: https://github.com/Miao-zhou/Co-occurrence-analysis.git.
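The abstract does not state which probability model the authors use, so the following is only an illustrative sketch of the general idea of assigning a p-value to each co-occurring pair and then pruning by a threshold. It assumes a hypergeometric null (the two terms are assigned to documents independently, given their individual frequencies), which may differ from the model in the manuscript. The function names `hypergeom_sf`, `cooccurrence_pvalues`, and `filter_pairs`, and the toy documents, are hypothetical and not taken from the project repository.

```python
from itertools import combinations
from math import comb


def hypergeom_sf(k, n_docs, n_a, n_b):
    """Upper-tail probability P(X >= k) under a hypergeometric null.

    X is the number of documents containing both terms when term A appears
    in n_a of n_docs documents and term B in n_b of them, independently.
    """
    total = comb(n_docs, n_b)
    upper = min(n_a, n_b)
    tail = sum(comb(n_a, i) * comb(n_docs - n_a, n_b - i)
               for i in range(k, upper + 1))
    return tail / total


def cooccurrence_pvalues(docs):
    """Map each co-occurring term pair to (co-occurrence count, p-value).

    docs is a list of sets, one set of terms per document.
    """
    n_docs = len(docs)
    term_counts = {}
    pair_counts = {}
    for terms in docs:
        for term in terms:
            term_counts[term] = term_counts.get(term, 0) + 1
        # Sort so each unordered pair has a single canonical key.
        for a, b in combinations(sorted(terms), 2):
            pair_counts[(a, b)] = pair_counts.get((a, b), 0) + 1

    return {
        (a, b): (k, hypergeom_sf(k, n_docs, term_counts[a], term_counts[b]))
        for (a, b), k in pair_counts.items()
    }


def filter_pairs(results, alpha=0.05):
    """Keep only the pairs whose p-value is at or below alpha (assumed cutoff)."""
    return {pair: value for pair, value in results.items() if value[1] <= alpha}


if __name__ == "__main__":
    # Toy data: each set stands in for the MeSH terms of one article.
    toy_docs = [
        {"A", "B"}, {"A", "B"}, {"A", "B"}, {"C"},
        {"C", "D"}, {"D"}, {"A", "C"}, {"B", "D"},
    ]
    for pair, (count, p_value) in sorted(cooccurrence_pvalues(toy_docs).items()):
        print(pair, count, round(p_value, 4))
```

Lowering `alpha` in `filter_pairs` keeps fewer edges, which is one way to read the abstract's point about reducing the network by choosing different p-value thresholds. The sketch applies no multiple-testing correction, which a real analysis over many term pairs would likely need.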