Predicting scientific impact based on h-index
Samreen Ayaz, Nayyer Masood, Muhammad Arshad Islam
DOI: https://doi.org/10.1007/s11192-017-2618-1
IF: 3.801
2017-12-16
Scientometrics
Abstract: Predicting the future impact of a scientist or researcher is a critical task. The objective of this work is to evaluate different h-index prediction models for the field of Computer Science. Different combinations of parameters are identified to build the models, which are applied to a large Computer Science data set taken from Arnetminer comprising almost 1.8 million authors and 2.1 million publication records. Regression, a machine learning prediction technique, is used to find the set of parameters best suited to h-index prediction for scientists of all career ages, without imposing any constraint on their current h-index values, with R² as the measure of accuracy. These parameters are further evaluated for different career ages and different h-index thresholds. Predictions 1 year ahead are strong, with R² of 0.93, but for 5 years ahead R² declines to 0.82 on average. We therefore infer that predicting the h-index is more difficult over longer periods. Predictions for researchers with 1 year of experience are not precise, with R² of 0.60 for 1 year ahead and 0.33 for 5 years ahead. Across career ages, the average R² for researchers with 20–36 years of experience was 0.99. Across h-index values, researchers with a low h-index were difficult to predict. The parameter set comprising current h-index, average citations per paper, number of coauthors, years since publishing the first article, number of publications, number of impact factor publications, and number of publications in distinct journals performed better than all other combinations.
information science & library science; computer science, interdisciplinary applications
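The abstract relies on two quantities, the h-index and R², without defining them. The following is a minimal Python sketch of their standard definitions, not the authors' implementation. The function names `h_index` and `r_squared` and the sample numbers are hypothetical and chosen only for illustration; none of them come from the paper or its data set.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations each.

    `citations` is a list of per-paper citation counts (hypothetical input format).
    """
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Illustrative values only (not taken from the paper).
example_citations = [10, 8, 5, 4, 3]
example_h = h_index(example_citations)

future_h_actual = [4, 7, 12, 20]
future_h_predicted = [5, 7, 11, 19]
example_r2 = r_squared(future_h_actual, future_h_predicted)
```

An R² of 1 means the predicted h-index values match the actual ones exactly, while an R² of 0 means the model does no better than always predicting the mean. This is the scale on which the reported values of 0.93 and 0.82 should be read.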