Scientometrics in a Changing Research Landscape
Lutz Bornmann,Loet Leydesdorff
DOI: https://doi.org/10.15252/embr.201439608
IF: 9.071
2014-01-01
EMBO Reports
Abstract:Science & Society, 11 November 2014. Bibliometrics has become an integral part of research quality evaluation and has been changing the practice of research. Lutz Bornmann (Division for Science and Innovation Studies, Administrative Headquarters of the Max Planck Society, Munich, Germany) and Loet Leydesdorff (Amsterdam School of Communication Research (ASCoR), University of Amsterdam, Amsterdam, The Netherlands). EMBO Reports (2014) 15: 1228-1232. Quality assessments permeate the entire scientific enterprise, from funding applications to promotions, prizes and tenure. Their remit can encompass the scientific output of individual scientists, whole departments or institutes, or even entire countries.
Peer review has traditionally been the major method used to determine the quality of scientific work, either to arbitrate if the work should be published in a certain journal, or to assess the quality of a scientist's or institution's total research output. Since the 1990s, quantitative assessment measures in the form of indicator-supported procedures, such as bibliometrics, have gained increasing importance, especially in budgetary decisions where numbers are more easily compared than peer opinion, and are usually faster to produce. In particular, quantitative procedures can provide important information for quality assessment when it comes to comparing a large number of units, such as several research groups or universities, as individual experts are not capable of handling so much information in a single evaluation procedure. Thus, for example, the new UK Research Excellence Framework (REF) puts more emphasis on bibliometric data and less on peer review than did its predecessor. Even though bibliometrics and peer review are often thought of as alternative methods of evaluation, their combination in what is known as informed peer review can lead to more accurate assessments: peer reviewers can enhance their qualitative assessment on the basis of bibliometric and other indicator-supported empirical results. This reduces the risk of distortions and mistakes as discrepancies between the peers' judgements and the bibliometric evaluation become more transparent. Although this combination of peer review and bibliometrics is regarded as the ideal method for research evaluation, the weighting of both can differ. The German Research Foundation (DFG), for example, encourages applicants to submit only their five most relevant publications, which is a manageable number for the reviewers. On the other side, the Australian Research Council (ARC) and the UK REF focus on bibliometric instruments for national evaluations to the detriment of peer review. 
The weighting of the two instruments can also change over time: the new REF weights bibliometrics higher than the former Research Assessment Exercise. Bibliometrics has various advantages that make it suitable for the evaluation of research. The most important one is that bibliometrics analyses data that concern the essence of scientific work. In virtually all research disciplines, publishing relevant research results is crucial; results that are not published are usually of no importance. Furthermore, authors of scientific publications have to discuss the context and implications of their research with reference to the state of the art and appropriately cite the methods, data sets and so on that they have used. Citations are embedded in the reputation system of research, as researchers use them to express their recognition of others' work and its influence. Another advantage of using bibliometrics in research evaluation is that the bibliometric data can be easily found and assessed for a broad spectrum of disciplines using appropriate databases: for example, Web of Science (WoS) or Scopus. The productivity and impact even of large research units can therefore be measured with reasonable effort. Finally, the results of bibliometrics correlate well with other indicators of research quality, including external funding or scientific prizes 1, 2. Since there is now hardly any evaluation that does not count publications and citations, bibliometrics seems to have established itself as a reliable tool in the general assessment of research. Indeed, it would not last long if reputations and awards based on bibliometric analyses were arbitrary or undeserved. However, bibliometrics also has a number of disadvantages. These, though, do not relate to its general applicability in research evaluation, which is no longer doubted, but to whether such an analysis is done professionally according to standards 3, which are often known only to experts.
First, bibliometrics can only be applied to disciplines where the literature and its citations are available from appropriate databases. While the natural sciences are well represented in such databases, the literature of the technical sciences, the social sciences, and the humanities (TSH) is only partly included. Bibliometrics can therefore only yield limited results for these disciplines. Google Scholar is often seen as a solution, but it is not clear what Google Scholar considers as a citation; the validity of the data is therefore not guaranteed 4. Second, bibliometric data are numerical data with highly skewed distributions. Their evaluation therefore requires appropriate statistical methods. For example, the arithmetic mean is relatively inappropriate for citation analysis, since it is strongly influenced by highly cited publications. Thus, Göttingen University in Germany achieved a good place in the current Leiden ranking, which uses a mean-based indicator, because it could boast one extremely highly cited publication in recent years. The Journal Impact Factor, the best known indicator for the importance of journals, is similarly affected by this problem: since it gives the average number of citations for the papers in a journal during the preceding 2 years, it may be determined by a few highly cited papers and hardly at all by the mass of papers, which are cited very little or not at all. The h-index, a bibliometric indicator that is now as well known as the Journal Impact Factor, is unaffected by this problem, as it is not based on the mean. Rather, it measures the publications in a set with a specific minimum of citations (namely h) so that the few highly cited publications play only a small role in its calculations.
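The contrast between mean-based indicators and the h-index can be illustrated with a minimal Python sketch. The citation counts below are hypothetical and chosen only to mimic a skewed distribution with one extremely highly cited paper; the function name `h_index` is illustrative and not taken from any bibliometric tool.

```python
from statistics import mean, median


def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for ten papers (assumption, not real data).
citations = [0, 0, 1, 1, 2, 3, 4, 5, 8, 900]

print("mean:", mean(citations))        # pulled up by the single outlier
print("median:", median(citations))    # closer to the typical paper
print("h-index:", h_index(citations))  # outlier counts as just one paper
```

With these numbers the mean is 92.4, the median is 2.5 and the h-index is 4; replacing the 900 with 8 would leave the median and the h-index unchanged while the mean would drop to 3.2.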
The h-index, however, has other weaknesses that make its use in research evaluation questionable; the arbitrary limit for the selection of the significant publications with at least h citations is criticised; it could just as well be h2 citations. Third, citations need time to accumulate. Research evaluation on the basis of bibliometrics can therefore say nothing about more recent publications. It has now become standard practice in bibliometrics to allow at least 3 years for a reliable measurement of the impact of publications. This disadvantage of bibliometrics is chiefly a problem with the evaluation of institutions where the research performance of recent years is generally assessed, about which bibliometrics (the measurement of impact based on citations) can say little. In the assessment of recent years, one can only use bibliometric instruments to evaluate the productivity of the researchers of an institution and their success in publishing their manuscripts in respected journals. Here, the most important question is how long the citation window should be to achieve reliable and valid impact measurement. There are many examples where the importance of research results has become apparent only decades after publication 5. For example, the “Shockley–Queisser limit” describes the limited efficiency of solar cells on the basis of absorption and reemission processes. The original reception of the paper was rather muted, but today, it has become one of the relatively few highly cited papers in a field that has developed relatively synchronously with rapidly growing solar-cell and photovoltaic research. Although such papers constitute probably one in every 10,000 papers 5, the standard practice of using a citation window of only 3 years nevertheless seems to be too short.
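How the choice of citation window changes which papers count as highly cited can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the citation counts are hypothetical, the helper names `top_set` and `retained_fraction` are invented for this example, and ties are broken by paper position, which is an arbitrary choice.

```python
def top_set(counts, share=0.10):
    """Return the indices of the top `share` of papers by citation count."""
    k = max(1, round(len(counts) * share))
    ranked = sorted(range(len(counts)), key=lambda i: (-counts[i], i))
    return set(ranked[:k])


def retained_fraction(short_window, long_window, share=0.10):
    """Fraction of the long-window top papers also in the short-window top."""
    long_top = top_set(long_window, share)
    short_top = top_set(short_window, share)
    return len(long_top & short_top) / len(long_top)


# Hypothetical cumulative citations for 20 papers (assumption, not real data).
# Paper 2 is a late bloomer: almost uncited early, highly cited decades later.
after_3_years = [30, 25, 1] + [5] * 17
after_30_years = [200, 60, 500] + [20] * 17

print(retained_fraction(after_3_years, after_30_years))
```

In this toy data set, only one of the two papers in the long-window top 10% is also in the short-window top 10%, so the retained fraction is 0.5.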
In one study, of the 10% of highest cited papers identified using a 30-year window, more than 40% are excluded from this elite collection when a 3-year window is used 6. When a 20-year window is used, 92% are still included, and a 10-year window yields 82% of the 30-year highest cited papers. Based on his results, Wang recommends that researchers should report “the potential errors in their evaluations when using short-time windows, providing a paragraph such as: ‘Although a citation window of 5 years is used here, note that the Spearman correlation between these citation counts and long-term (31 years) citation counts will be about 0.87. Furthermore, the potential error of using a 5-year time window will be higher for highly cited papers because papers in the top 10% most cited papers in year 5 have a 32% chance of not being in the top 10% in year 31’” 5. This tendency to focus on the citations of papers published during the last 2 or 3 years assumes a rapid research front, as in the biomedical sciences. However, disciplines differ in terms of the existence and speed of research fronts and their historical developments. A recent study has distinguished between “transitory knowledge claims” in research papers at the research front and “sticky knowledge claims” that may accumulate citations during ten or even more years 7. As bibliometrics has developed into a standard procedure in research evaluation, with both advantages and disadvantages, a further question is now whether bibliometric measurement and assessment is likely to change scientific practice, as fixing on particular indicators for measuring research performance generally leads to an adaptation of researchers' behaviour. This may well be intentional: one reason for research evaluation is to increase research performance, namely productivity. However, there are also unintended effects. 
For example, in order to achieve a desired increase in publication volume, some researchers choose a publication strategy known as salami slicing: The results of a research project are published in many small parts, although they could also be published in a few large papers or a single one. This behaviour is not generally considered to help the progress of research, but it may improve bibliometric scores. It is also desirable for researchers to publish in respected journals. Yet since these journals only publish newsworthy results or results with a possible high impact, a stronger focus on respected journals in research evaluation raises the risk of scientific malpractice when results are manipulated or falsified to satisfy this requirement. The risk of this behaviour should not be unreasonably increased by research evaluation processes, in which, for example, scientists in China are sometimes financially rewarded according to the Impact Factors of the journals in which they publish their papers 8. In national scientific systems, in which research evaluation or bibliometrics plays a major role, indicators are often used without sufficient knowledge of the subject. Since the demand for such figures is high and the numbers are often required speedily or inexpensively, they are sometimes produced by analysts with little understanding of bibliometrics. For example, such amateur bibliometricians may be inclined to use the h-index because it is a popular and modern indicator that is readily available and easy to calculate. Yet, these assessments often do not take into account that the h-index is unsuitable for comparing researchers from different subject areas and with different academic ages. Amateur bibliometricians also often wrongly use the Journal Impact Factor to measure the impact of single pieces of work, although the Journal Impact Factor only provides information about the performance of a journal. 
There is a community of professional experts in bibliometrics who develop advanced indicators for productivity and citation impact measurements. Only experts from this community should undertake a bibliometric study that involves comparisons across fields of science. These centres of professional expertise have generated analytical versions of the databases and can be found, for example, at the Centre for Science and Technology Studies (CWTS, Leiden) or the Centre for Research & Development Monitoring (ECOOM, Leuven). Fourth, a range of suppliers of bibliometric data, such as Elsevier or Thomson Reuters, have developed research evaluation systems that allow decision-makers to produce results about any given research unit at the press of a button. This “desktop bibliometrics” also increases the risk that such analyses are applied without sufficient knowledge of the subject. Furthermore, these systems often present themselves as a black box: the user does not know how the results are calculated; but even simple indicators such as the h-index can be calculated in different ways. This is why the results of bibliometric analyses do not always correspond to the current standards in bibliometrics. Fifth, bibliometrics can be applied well in the natural sciences, but its application to TSH is limited. Even if research in these disciplines is published, these publications and their citations are only poorly represented in the literature databases that can be used for bibliometrics. The differing citation culture, in particular the different average number of references per paper and thereby the different probability of being cited, is widely regarded as the cause of this variation.
Based on an analysis of all WoS records published in 1990, 1995, 2000, 2005 and 2010, however, a study found that almost all disciplines show similar numbers of references in the reference lists 9. This suggests that the comparatively low citation rates in the humanities are not so much the result of a lower average number of references per paper, but caused by the low fraction of references that are published in the core set of journals covered by WoS. Furthermore, the research output in TSH is not only publications, but other products such as software and patents. These products and their citations are hardly reflected in the literature databases. Thus, for example, a large part of the publications and other research products from the TSH area are missing from the Leiden University Ranking, which is based on data in WoS. Even the indicator report of the German Competence Centre for Bibliometrics (KB), which assesses German research based on bibliometric data from WoS, underrepresents publications from the TSH areas. So far, scientometric research has developed no satisfactory solution to evaluate TSH in the same sophisticated way that is used for the natural sciences. Various initiatives have therefore tried to develop alternative quality criteria. For example, the cooperative project “Developing and Testing Research Quality Criteria in the Humanities, with an emphasis on Literature Studies and Art History” of the Universities of Zurich and Basel, supplies Swiss universities with instruments to measure research performance and compare research performance internationally. Until the 1990s, politicians had faith that pushing the quality of science to the highest levels would automatically generate returns for society. Quality controls in research were primarily concerned with the use of research for research. 
Triggered by the financial crisis and by growing competition between nations, the direct societal benefits of research have moved increasingly into the foreground of quality assessments. The state no longer has faith that excellent research alone is automatically best for society. Basic research in particular has become subject to scrutiny, since it is more difficult to show a link between its results and beneficial applications. Recent years have therefore seen a tendency to implement evaluation procedures that attempt to provide information on the societal impacts of research. For example, applicants to the US National Science Foundation have to state what benefits their research would bring beyond science. As part of the UK REF, British institutions also have to provide information about the societal impacts of their research. Evaluating the societal impacts of research does not stop at the traditional products of research, such as prizes or publications, but includes other elements such as software, patents or data sets. The impact itself is also measured more broadly to include effects on society and not just on research. However, there are still no accepted standard procedures that yield reliable and valid information. Often, a case study is carried out in which an institution describes one or several examples of the societal impacts of its research. The problem is that the results of case studies cannot be generalised and compared owing to a lack of standardisation. So-called altmetrics (the number of page views, downloads, shares, saves, recommendations, and comments from social media platforms, such as Twitter, Mendeley and Facebook) could provide a possible alternative to bibliometric data. A perceived advantage of altmetrics is the ability to provide recent data, whereas citations need time to accumulate.
Another perceived advantage is that alternative metrics can also measure the impact of research in other sectors of society, as social media platforms are used by individuals and institutions from many parts of society. However, it is not clear to what extent these advantages—speed and breadth of impact—really matter. The study of altmetrics began only a few years ago and is now in a state similar to that of research into traditional metrics in the 1970s. Before alternative metrics can be applied to research evaluation—with possible effects on funding decisions or promotions—there are a number of open questions. What kind of impact do the metrics measure, and with what category of persons? How reliable are the data obtained from social media platforms? How can the manipulation of social media data by users be counteracted or prevented? Finally, metrics need to be validated by correlating them with other indicators: is there, for example, a connection between alternative metrics and the judgment of experts as to the societal relevance of publications? This new challenge of measuring the broad impact of research on society has triggered a scientific revolution in scientometrics. This assertion is based on a fundamental change in the taxonomy of scientometrics: productivity no longer only means publication output, and the impact of publications can no longer be equated simply with citations. Scientometrics should therefore soon enter a phase of normal science to find answers to the questions mentioned above. Such corresponding alternative indicators should be applied in research evaluation only after altmetrics has been thoroughly scrutinised in further studies. It is clear that scientometrics has become an integral part of research evaluation and plays a crucial role in making decisions about national research policies, funding, promotions, job offers and so on, and thereby on the careers of scientists. 
Scientometrics therefore has demonstrated that it provides reliable, transparent and relevant results, which it largely achieves with citation-based data if it is done correctly. The next challenge will be to develop altmetrics to the same standards.
Conflict of interest: The authors declare that they have no conflict of interest.
References
1. Diekmann A, Naf M, Schubiger M (2012) The impact of (Thyssen)-awarded articles in the scientific community. Kölner Z Soz Sozialpsychol 64: 563–581
2. Luhmann N (1992) Die Wissenschaft der Gesellschaft. Frankfurt am Main, Germany: Suhrkamp
3. Bornmann L, Marx W (2014) How to evaluate individual researchers working in the natural and life sciences meaningfully? A proposal of methods based on percentiles of citations. Scientometrics 98: 487–509
4. Bornmann L, Marx W, Schier H, Rahm E, Thor A, Daniel HD (2009) Convergent validity of bibliometric Google Scholar data in the field of chemistry. Citation counts for papers that were accepted by Angewandte Chemie International Edition or rejected but published elsewhere, using Google Scholar, Science Citation Index, Scopus, and Chemical Abstracts. J Inform 3: 27–35
5. van Raan AFJ (2004) Sleeping beauties in science. Scientometrics 59: 467–472
6. Wang J (2013) Citation time window choice for research impact evaluation. Scientometrics 94: 851–872
7. Baumgartner SE, Leydesdorff L (2014) Group-based trajectory modeling (GBTM) of citations in scholarly literature: dynamic qualities of “transient” and “sticky knowledge claims”. J Assoc Inf Sci Technol 65: 797–811
8. Shao J, Shen H (2011) The outflow of academic papers from China: why is it happening and can it be stemmed? Learned Publishing 24: 95–97
9. Marx W, Bornmann L (2014) On the causes of subject-specific citation rates in Web of Science. Scientometrics (in press)
EMBO Reports, Volume 15, Issue 12, December 2014.
-
Technology and Innovation Has Made Impact Factor Redundant—Better Alternatives Are Here to Thrive
Purvish M. Parikh,Amish Vora
DOI: https://doi.org/10.1055/s-0043-1767699
2023-04-11
South Asian Journal of Cancer
Abstract:Three recent publications have let the cat out of the bag. They are "Stop Congratulating Colleagues for Publishing in High Impact-Factor Journals," "Who are the real parasite publishers and journals? What prevents all medical data from being open access in real time?" and "Against Parasite Publishers: Making Journals Free."[1] [2] [3] For a long time, the impact factor (IF) has ruled the academic publishing world. It has expanded its sphere of influence to career progress, appointments in academic institutions, promotion reviews, and grant applications. IF has been popular since 1975 and is based on the number of citations to what the journal published in the previous 2 years. For instance, if our article is published in the year 2022, the IF of the journal in which our article is published will be decided by the citations to what the journal published in the years 2020 and 2021. Authors are cognizant of the significance of IF because journals with higher IF are considered respectable, their review process is supposed to be selective, there is greater scrutiny of articles submitted, and if published, the authors are considered worthy by their peers. Publications in journals with high IF also cascade into wider publicity through reporting by science journals and social media. They also have a higher chance of being included in reports on new publications. As the definition suggests, there is also a higher chance of being cited by other scientists working in the field. However, IF has a very important flaw. All the metrics are related to the journal. There is no evaluation of the individual publication or the author who has done the research work. Therefore, having a publication in a high IF journal does not guarantee that your work will be cited. In fact, more than two-thirds of publications in such journals have fewer citations than the IF of the journal.[4] By experience, we have found that using the right keywords does wonders for online access.
When the research work is great, those interested in the subject can easily find it on the net. Searchable databases such as Scopus, Google Scholar, and Web of Science make this possible. There are also niche areas in science and medicine, where journals with high IF simply do not exist. The research communities in such fields are small and usually know each other well. So, they tend to find the work of colleagues online even if it is published in journals with an IF of 3 or 4. As journals with high IF are in "great demand," their review process is time consuming and ultimately most of the work submitted does not get accepted. In the process, our data may become redundant in today's exponentially progressive research environment. Can we afford to face this? The final stumbling block is that if such journals are open access, they command a high premium in terms of publication "processing fees." For instance, last year, Nature announced a princely sum of 9,500 euros as their charges if the authors wanted their article to have open access. Clearly, this is an elitist attitude where rich publishing houses want to get richer at the cost of dissemination of information. This has a major health care implication. Work from low- and medium-income countries (LMIC) will not be published in such journals. Or the work will be behind a paywall that the LMIC colleagues cannot surmount.[5] No wonder global citation inequality is increasing. Data between 2000 and 2015 encompassing 1 million authors and 26 million scientific publications show interesting confirmation of the same.[6] Citations have increased from 14 to 21% only for the top 1% of most cited researchers. The increasing trend was most prominent in the Netherlands, Denmark, Australia, and the United Kingdom. Interestingly, it showed a decline in the United States and China. IF has actually become so frustrating and misleading that many grant providers have started ignoring any reference to it.
The European Research Council has gone a step further by banning mention of journal IF from all grant proposals and bids submitted to them.[7] So, the question that begs an answer is, what are the alternative metrics that we can use and which of them is the most likely to replace IF? SCImago Journal and Country Rank, Eigenfactor Metrics, and Science Gateway need mention only to say that they exist but pale in comparison to IF. In 2005, Hirsch published another metric to evaluate the scientific research output of individual scientists and researchers.[8] This was labeled as the H index factor. It is based on two data points. One is the number of journal articles the author has published. The other is the number of citations received by those articles (of the same author). An example from the publications of one of us shows a total of 12,103 citations and an H index of 34, as calculated using Google Scholar.[9] Let us take another example. Charles -Abstract Truncated-
-
What Is Wrong With the Current Evaluative Bibliometrics?
Endel Põder
DOI: https://doi.org/10.3389/frma.2021.824518
2022-01-21
Frontiers in Research Metrics and Analytics
Abstract:Bibliometric data are relatively simple and describe objective processes of publishing articles and citing others. It seems quite straightforward to define reasonable measures of a researcher's productivity, research quality, or overall performance based on these data. Why do we still have no acceptable bibliometric measures of scientific performance? Instead, there are hundreds of indicators with nobody knowing how to use them. At the same time, an increasing number of researchers and some research fields have been excluded from the standard bibliometric analysis to avoid manifestly contradictory conclusions. I argue that the current biggest problem is the inadequate rule of credit allocation for multiple authored articles in mainstream bibliometrics. Clinging to this historical choice excludes any systematic and logically consistent bibliometrics-based evaluation of researchers, research groups, and institutions. During the last 50 years, several authors have called for a change. Apparently, there are no serious methodologically justified or evidence-based arguments in favor of the present system. However, there are intractable social, psychological, and economic issues that make adoption of a logically sound counting system almost impossible.
-
The application of bibliometrics to research evaluation in the humanities and social sciences: An exploratory study using normalized Google Scholar data for the publications of a research institute
Lutz Bornmann,Andreas Thor,Werner Marx,Hermann Schier
DOI: https://doi.org/10.1002/asi.23627
2016-03-03
Journal of the Association for Information Science and Technology
Abstract:In the humanities and social sciences, bibliometric methods for the assessment of research performance are (so far) less common. This study uses a concrete example in an attempt to evaluate a research institute from the area of social sciences and humanities with the help of data from Google Scholar (GS). In order to use GS for a bibliometric study, we developed procedures for the normalization of citation impact, building on the procedures of classical bibliometrics. In order to test the convergent validity of the normalized citation impact scores, we calculated normalized scores for a subset of the publications based on data from the Web of Science (WoS) and Scopus. Even if scores calculated with the help of GS and the WoS/Scopus are not identical for the different publication types (considered here), they are so similar that they result in the same assessment of the institute investigated in this study: For example, the institute's papers whose journals are covered in the WoS are cited at about an average rate (compared with the other papers in the journals).
information science & library science,computer science, information systems
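The idea of comparing a paper's citations with those of the other papers in the same journal can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract above does not spell out the study's normalization procedure, so the ratio to the mean of a reference set is used here as one common choice, and the function name `normalized_score` and all numbers are hypothetical.

```python
from statistics import mean


def normalized_score(paper_citations, reference_citations):
    """Ratio of a paper's citations to the mean of its reference set.

    A value near 1.0 means the paper is cited at about the average rate
    of the reference set. Returns None if the reference mean is zero.
    """
    expected = mean(reference_citations)
    if expected == 0:
        return None
    return paper_citations / expected


# Hypothetical citation counts of papers in the same journal and year.
reference_set = [2, 4, 6, 8]

print(normalized_score(5, reference_set))   # cited at the average rate
print(normalized_score(10, reference_set))  # cited at twice the average
```

Because citation distributions are skewed, a mean-based reference value is itself sensitive to a few highly cited papers; percentile-based variants are an alternative.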
-
Do bibliometrics introduce gender, institutional or interdisciplinary biases into research evaluations?
Mike Thelwall,Kayvan Kousha,Emma Stuart,Meiko Makita,Mahshid Abdoli,Paul Wilson,Jonathan Levitt
DOI: https://doi.org/10.1016/j.respol.2023.104829
IF: 7.2
2023-06-14
Research Policy
Abstract:Systematic evaluations of publicly funded research sometimes use bibliometrics alone or bibliometric-informed peer review, but it is not known whether bibliometrics introduce biases when supporting or replacing peer review. This article assesses this by comparing three alternative mechanisms for scoring 73,612 UK Research Excellence Framework (REF) journal articles from all 34 field-based Units of Assessment (UoAs) 2014–17: REF peer review scores, field normalised citations, and journal average field normalised citation impact. The results suggest that in almost all academic fields, bibliometric scoring can disadvantage departments publishing high quality research, as judged by peer review, with the main exception of article citation rates in chemistry. Thus, introducing journal or article level citation information into peer review exercises may have a regression to the mean effect. Bibliometric scoring slightly advantaged women compared to men, but this varied between UoAs and was most evident in the physical sciences, engineering, and social sciences. In contrast, interdisciplinary research gained from bibliometric scoring in about half of the UoAs, but relatively substantially in two. In conclusion, out of the three potential sources of bibliometric bias examined, the most serious seems to be the tendency for bibliometric scores to work against high quality departments, assuming that the peer review scores are correct. This is almost a paradox: although high quality departments tend to get the highest bibliometric scores, bibliometrics conceal the full extent of departmental quality advantages, as judged by peer review. This should be considered when using bibliometrics or bibliometric informed peer review.
management
-
Methods for the generation of normalized citation impact scores in bibliometrics: Which method best reflects the judgements of experts?
Lutz Bornmann,Werner Marx
DOI: https://doi.org/10.1016/j.joi.2015.01.006
IF: 3.7
2015-04-01
Journal of Informetrics
Abstract:Evaluative bibliometrics compare the citation impact of researchers, research groups and institutions with each other across time scales and disciplines. Both factors, discipline and period, have an influence on the citation count which is independent of the quality of the publication. Normalizing the citation impact of papers for these two factors started in the mid-1980s. Since then, a range of different methods have been presented for producing normalized citation impact scores. The current study uses a data set of over 50,000 records to test which of the methods so far presented correlate better with the assessment of papers by peers. The peer assessments come from F1000Prime, a post-publication peer review system of the biomedical literature. Of the normalized indicators, the current study involves not only cited-side indicators, such as the mean normalized citation score, but also citing-side indicators. As the results show, the correlations of the indicators with the peer assessments all turn out to be very similar. Since F1000 focuses on biomedicine, it is important that the results of this study are validated by other studies based on datasets from other disciplines or (ideally) based on multi-disciplinary datasets.
information science & library science,computer science, interdisciplinary applications
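The abstract above names the mean normalized citation score as a cited-side indicator that corrects for discipline and publication period. The following is only a minimal sketch of that idea, not the procedure used in the study: each paper's citation count is divided by the mean citation count of a reference set from the same field and year, and the ratios are averaged. The record layout (`field`, `year`, `citations`), the function names and the toy numbers are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def expected_citations(reference_set):
    """Mean citations per (field, year) group in a reference set (hypothetical layout)."""
    groups = defaultdict(list)
    for paper in reference_set:
        groups[(paper["field"], paper["year"])].append(paper["citations"])
    return {key: mean(values) for key, values in groups.items()}


def mean_normalized_citation_score(papers, baselines):
    """Average of observed/expected citation ratios for the evaluated papers.

    Assumes every (field, year) of an evaluated paper has a non-zero baseline.
    """
    ratios = [
        paper["citations"] / baselines[(paper["field"], paper["year"])]
        for paper in papers
    ]
    return mean(ratios)


# Toy reference set: two fields with very different citation levels.
reference_set = [
    {"field": "cell biology", "year": 2008, "citations": 10},
    {"field": "cell biology", "year": 2008, "citations": 20},
    {"field": "cell biology", "year": 2008, "citations": 30},
    {"field": "sociology", "year": 2008, "citations": 2},
    {"field": "sociology", "year": 2008, "citations": 4},
    {"field": "sociology", "year": 2008, "citations": 6},
]

# Papers of the unit being evaluated (invented numbers).
unit_papers = [
    {"field": "cell biology", "year": 2008, "citations": 20},  # ratio 20 / 20
    {"field": "sociology", "year": 2008, "citations": 8},      # ratio 8 / 4
]

baselines = expected_citations(reference_set)
score = mean_normalized_citation_score(unit_papers, baselines)
print(score)
```

In this toy case the sociology paper has fewer raw citations than the cell biology paper but a higher normalized ratio, which is the kind of field effect normalization is meant to remove. A score of 1.0 would correspond to the average of the reference set.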
-
A scientists' view of scientometrics: Not everything that counts can be counted
R. Kenna,O. Mryglod,B. Berche
DOI: https://doi.org/10.5488/CMP.20.13803
2017-03-30
Abstract:Like it or not, attempts to evaluate and monitor the quality of academic research have become increasingly prevalent worldwide. Performance reviews range from the level of individuals, through research groups and departments, to entire universities. Many of these are informed by, or functions of, simple scientometric indicators and the results of such exercises impact onto careers, funding and prestige. However, there is sometimes a failure to appreciate that scientometrics are, at best, very blunt instruments and their incorrect usage can be misleading. Rather than accepting the rise and fall of individuals and institutions on the basis of such imprecise measures, calls have been made for indicators to be regularly scrutinised and for improvements to the evidence base in this area. It is thus incumbent upon the scientific community, especially the physics, complexity-science and scientometrics communities, to scrutinise metric indicators. Here, we review recent attempts to do this and show that some metrics in widespread use cannot be used as reliable indicators of research quality.
Physics and Society,Digital Libraries
-
The Assessment of Science: The Relative Merits of Post-Publication Review, the Impact Factor, and the Number of Citations
Adam Eyre-Walker,Nina Stoletzki
DOI: https://doi.org/10.1371/journal.pbio.1001675
IF: 9.8
2013-10-08
PLoS Biology
Abstract:The assessment of scientific publications is an integral part of the scientific process. Here we investigate three methods of assessing the merit of a scientific paper: subjective post-publication peer review, the number of citations gained by a paper, and the impact factor of the journal in which the article was published. We investigate these methods using two datasets in which subjective post-publication assessments of scientific publications have been made by experts. We find that there are moderate, but statistically significant, correlations between assessor scores, when two assessors have rated the same paper, and between assessor score and the number of citations a paper accrues. However, we show that assessor score depends strongly on the journal in which the paper is published, and that assessors tend to over-rate papers published in journals with high impact factors. If we control for this bias, we find that the correlation between assessor scores and between assessor score and the number of citations is weak, suggesting that scientists have little ability to judge either the intrinsic merit of a paper or its likely impact. We also show that the number of citations a paper receives is an extremely error-prone measure of scientific merit. Finally, we argue that the impact factor is likely to be a poor measure of merit, since it depends on subjective assessment. We conclude that the three measures of scientific merit considered here are poor; in particular subjective assessments are an error-prone, biased, and expensive method by which to assess merit. We argue that the impact factor may be the most satisfactory of the methods we have considered, since it is a form of pre-publication review. However, we emphasise that it is likely to be a very error-prone measure of merit that is qualitative, not quantitative.
biochemistry & molecular biology,biology
-
What is the impact of a research publication?
Seena Fazel,Achim Wolf
DOI: https://doi.org/10.1136/eb-2017-102668
2017-04-06
Abstract:An increasing number of metrics are used to measure the impact of research papers. Despite being the most commonly used, the 2-year impact factor is limited by a lack of generalisability and comparability, in part due to substantial variation within and between fields. Similar limitations apply to metrics such as citations per paper. New approaches compare a paper's citation count to others in the research area, while others measure social and traditional media impact. However, none of these measures take into account an individual author's contribution to the paper or the number of authors, which we argue are key limitations. The UK's 2014 Research Excellence Framework included a detailed bibliometric analysis comparing 15 selected metrics to a 'gold standard' evaluation of almost 150 000 papers by expert panels. We outline the main correlations between the most highly regarded papers by the expert panel in the Psychiatry, Clinical Psychology and Neurology unit and these metrics, most of which were weak to moderate. The strongest correlation was with the SCImago Journal Rank, a variant of the journal impact factor, while the amount of Twitter activity showed no correlation. We suggest that an aggregate measure combining journal metrics, field-standardised citation data and alternative metrics, including weighting or colour-coding of individual papers to account for author contribution, could provide more clarity.
psychiatry
-
Bibliometrics, a useful tool within the field of research
Karla Salinas-Ríos,Angélica Janneire García López
DOI: https://doi.org/10.29057/jbapr.v3i6.6829
2022-01-05
Journal of Basic and Applied Psychology Research
Abstract:Scientific research activity can be studied, measured, compared, analyzed and objectified through scientometrics, a discipline that applies mathematical and statistical methods to all literature of a scientific character, so that social aspects of science can be quantified. Scientific publications (the tangible products of research) derive from the scientific literature and are studied specifically by bibliometrics. Bibliometrics is a branch of scientometrics guided by the assumption that scientific discoveries and research results are published in scientific journals, so its unit of analysis is the scientific article. The word bibliometrics was first defined by Alan Pritchard in 1969, and multiple concepts of the term have been developed since then. However, a consensus has been reached that this methodological tool makes it possible to describe scientific production (in quantity, quality and impact) across topics, journals, authors and countries, among others. Its main research lines are the methodology of bibliometrics, the scientific disciplines, and health management and policies. It also has descriptive, evaluative and supervisory/monitoring functions with respect to research activity, on which its classification into levels (micro, meso and macro) directly depends. Because it has components from various sciences, among them mathematics, its methodology and theory are based on mathematical models, from which the bibliometric indicators are derived. Although there are other types of research, such as systematic reviews and meta-analyses, these require a better command of research methods and statistical measurement, as well as more resources. 
On the other hand, a bibliometric study has the merit of being within the reach of students and researchers owing to its methodology, practicality, relevance, resource savings, potential to extend to most scientific areas and multiple applications, and because it helps avoid ethical misconduct related to research. Finally, although bibliometrics is often underestimated, its power and importance as a tool to manage evidence-based knowledge and to serve as a basis for other types of studies such as systematic reviews must be emphasized.
English Else
-
Peer review and citation data in predicting university rankings, a large-scale analysis
David Pride,Petr Knoth
DOI: https://doi.org/10.48550/arXiv.1805.08529
2018-05-22
Digital Libraries
Abstract:Most Performance-based Research Funding Systems (PRFS) draw on peer review and bibliometric indicators, two different methodologies which are sometimes combined. A common argument against the use of indicators in such research evaluation exercises is their low correlation at the article level with peer review judgments. In this study, we analyse 191,000 papers from 154 higher education institutes which were peer reviewed in a national research evaluation exercise. We combine these data with 6.95 million citations to the original papers. We show that when citation-based indicators are applied at the institutional or departmental level, rather than at the level of individual papers, surprisingly large correlations with peer review judgments can be observed, up to r <= 0.802, n = 37, p < 0.001 for some disciplines. In our evaluation of ranking prediction performance based on citation data, we show we can reduce the mean rank prediction error by 25% compared to previous work. This suggests that citation-based indicators are sufficiently aligned with peer review results at the institutional level to be used to lessen the overall burden of peer review on national evaluation exercises leading to considerable cost savings.
-
Citations versus journal impact factor as proxy of quality: Could the latter ever be preferable?
Giovanni Abramo,Ciriaco Andrea D'Angelo,Flavia Di Costa
DOI: https://doi.org/10.1007/s11192-010-0200-1
2018-11-05
Abstract:In recent years bibliometricians have paid increasing attention to research evaluation methodological problems, among these being the choice of the most appropriate indicators for evaluating quality of scientific publications, and thus for evaluating the work of single scientists, research groups and entire organizations. Much literature has been devoted to analyzing the robustness of various indicators, and many works warn against the risks of using easily available and relatively simple proxies, such as journal impact factor. The present work continues this line of research, examining whether it is valid that the use of the impact factor should always be avoided in favour of citations, or whether the use of impact factor could be acceptable, even preferable, in certain circumstances. The evaluation was conducted by observing all scientific publications in the hard sciences by Italian universities, for the period 2004-2007. Performance sensitivity analyses were conducted with changing indicators of quality and years of observation.
Digital Libraries
-
The validation of (advanced) bibliometric indicators through peer assessments: A comparative study using data from InCites and F1000
Lutz Bornmann,Loet Leydesdorff
DOI: https://doi.org/10.1016/j.joi.2012.12.003
IF: 3.7
2013-04-01
Journal of Informetrics
Abstract:The data of F1000 and InCites provide us with the unique opportunity to investigate the relationship between peers’ ratings and bibliometric metrics on a broad and comprehensive data set with high-quality ratings. F1000 is a post-publication peer review system of the biomedical literature. The comparison of metrics with peer evaluation has been widely acknowledged as a way of validating metrics. Based on the seven indicators offered by InCites, we analyzed the validity of raw citation counts (Times Cited, 2nd Generation Citations, and 2nd Generation Citations per Citing Document), normalized indicators (Journal Actual/Expected Citations, Category Actual/Expected Citations, and Percentile in Subject Area), and a journal-based indicator (Journal Impact Factor). The data set consists of 125 papers published in 2008 and belonging to the subject category cell biology or immunology. As the results show, Percentile in Subject Area achieves the highest correlation with F1000 ratings; we can assert that for three further indicators (Times Cited, 2nd Generation Citations, and Category Actual/Expected Citations) the “true” correlation with the ratings reaches at least a medium effect size.
information science & library science,computer science, interdisciplinary applications
-
Quantitative analysis of automatic performance evaluation systems based on the h-index
Marc P. Hauer,Xavier C. R. Hofmann,Tobias D. Krafft,Katharina A. Zweig
DOI: https://doi.org/10.1007/s11192-020-03407-7
IF: 3.801
2020-03-14
Scientometrics
Abstract:Since the h-index was invented, it has been the most frequently discussed bibliometric value and one of the most commonly used metrics to quantify a researcher’s scientific output. The more popular it becomes to use the metric as an indication of the quality of a job applicant or an employee, the more important it is to assure its correctness. Many platforms offer the h-index of a scientist as a service, sometimes without the explicit knowledge of the respective person. In this article we show that looking up the h-index for a researcher on the five most commonly used platforms, namely AMiner, Google Scholar, ResearchGate, Scopus and Web of Science, results in a variance that is in many cases as large as the average value. This is due to the varying definitions of what a scientific article is, the underlying data basis, and different qualities of the entity recognition problem. To perform our study, we crawled the h-index of the world’s top researchers according to two different rankings, all the Nobel Prize laureates except Literature and Peace, and the teaching staff of the computer science department of the TU Kaiserslautern, Germany, for whom we additionally computed the h-index manually. Thus we showed that the individual h-indices differ to an alarming extent between the platforms. We observed that researchers with an extraordinarily high h-index and researchers with an index appropriate to the scientific career path and the respective scientific field are affected alike by these problems.
information science & library science,computer science, interdisciplinary applications
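For readers unfamiliar with the metric discussed in the abstract above: the h-index is the largest number h such that h of a researcher's papers have at least h citations each. The sketch below is a minimal illustration of that definition and of how two data sources that index a different set of papers can yield different values. The function name and both citation lists are invented for illustration and are not taken from the study.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for the same researcher on two platforms.
# The second platform indexes one more paper and records more citations.
platform_x = [25, 8, 5, 3, 3]
platform_y = [25, 8, 5, 4, 4, 1]

print(h_index(platform_x), h_index(platform_y))
```

Here the two invented profiles differ only slightly in coverage, yet the resulting h-index values differ (3 versus 4), which is the kind of platform dependence the abstract describes.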
-
Professional and citizen bibliometrics: complementarities and ambivalences in the development and use of indicators—a state-of-the-art report
Loet Leydesdorff,Paul Wouters,Lutz Bornmann
DOI: https://doi.org/10.1007/s11192-016-2150-8
IF: 3.801
2016-01-01
Scientometrics
Abstract:Bibliometric indicators such as journal impact factors, h-indices, and total citation counts are algorithmic artifacts that can be used in research evaluation and management. These artifacts have no meaning by themselves, but receive their meaning from attributions in institutional practices. We distinguish four main stakeholders in these practices: (1) producers of bibliometric data and indicators; (2) bibliometricians who develop and test indicators; (3) research managers who apply the indicators; and (4) the scientists being evaluated with potentially competing career interests. These different positions may lead to different and sometimes conflicting perspectives on the meaning and value of the indicators. The indicators can thus be considered as boundary objects which are socially constructed in translations among these perspectives. This paper proposes an analytical clarification by listing an informed set of (sometimes unsolved) problems in bibliometrics which can also shed light on the tension between simple but invalid indicators that are widely used (e.g., the h-index) and more sophisticated indicators that are not used or cannot be used in evaluation practices because they are not transparent for users, cannot be calculated, or are difficult to interpret.
-
Model-based evaluation of scientific impact indicators
Matus Medo,Giulio Cimini
DOI: https://doi.org/10.1103/PhysRevE.94.032312
2016-06-14
Abstract:Using bibliometric data artificially generated through a model of citation dynamics calibrated on empirical data, we compare several indicators for the scientific impact of individual researchers. The use of such a controlled setup has the advantage of avoiding the biases present in real databases, and allows us to assess which aspects of the model dynamics and which traits of individual researchers a particular indicator actually reflects. We find that the simple citation average performs well in capturing the intrinsic scientific ability of researchers, whatever the length of their career. On the other hand, when productivity complements ability in the evaluation process, the notorious $h$ and $g$ indices reveal their potential, yet their normalized variants do not always yield a fair comparison between researchers at different career stages. Notably, the use of logarithmic units for citation counts allows us to build simple indicators with performance equal to that of $h$ and $g$. Our analysis may provide useful hints for a proper use of bibliometric indicators. Additionally, our framework can be extended by including other aspects of the scientific production process and citation dynamics, with the potential to become a standard tool for the assessment of impact metrics.
Physics and Society,Digital Libraries,Social and Information Networks
-
Comparison of a citation-based indicator and peer review for absolute and specific measures of research-group excellence
O. Mryglod,R. Kenna,Yu. Holovatch,B. Berche
DOI: https://doi.org/10.1007/s11192-013-1058-9
2013-05-27
Abstract:Many different measures are used to assess academic research excellence and these are subject to ongoing discussion and debate within the scientometric, university-management and policy-making communities internationally. One topic of continued importance is the extent to which citation-based indicators compare with peer-review-based evaluation. Here we analyse the correlations between values of a particular citation-based impact indicator and peer-review scores in several academic disciplines, from natural to social sciences and humanities. We perform the comparison for research groups rather than for individuals. We make comparisons on two levels. At an absolute level, we compare total impact and overall strength of the group as a whole. At a specific level, we compare academic impact and quality, normalised by the size of the group. We find very high correlations at the former level for some disciplines and poor correlations at the latter level for all disciplines. This means that, although the citation-based scores could help to describe research-group strength, in particular for the so-called hard sciences, they should not be used as a proxy for ranking or comparison of research groups. Moreover, the correlation between peer-evaluated and citation-based scores is weaker for soft sciences.
Digital Libraries,Physics and Society
-
Comparative study of science evaluation practices
Nedra Ibrahim,Anja Habacha Chaibi,Henda Ben Ghézala
DOI: https://doi.org/10.1108/vjikms-12-2021-0293
2024-07-27
VINE Journal of Information and Knowledge Management Systems
Abstract:Purpose: Given the magnitude of the literature, a researcher must be selective of research papers and publications in general. In other words, only papers that meet strict standards of academic integrity and adhere to reliable and credible sources should be referenced. The purpose of this paper is to approach this issue from the prism of scientometrics according to the following research questions: Is it necessary to judge the quality of scientific production? How do we evaluate scientific production? What are the tools to be used in evaluation? Design/methodology/approach: This paper presents a comparative study of scientometric evaluation practices and tools. A systematic literature review is conducted based on articles published in the field of scientometrics between 1951 and 2022. To analyze the data, the authors performed three types of analysis: usage analysis based on classification and comparison between the different scientific evaluation practices, type and level analysis based on classifying different scientometric indicators according to their types and application levels, and similarity analysis based on studying the correlation between different quantitative metrics to identify similarity between them. Findings: This comparative study leads to a classification of scientific evaluation practices into externalist and internalist approaches. The authors categorized the different quantitative metrics according to their types (impact, production and composite indicators), their levels of application (micro, meso and macro) and their use (internalist and externalist). Moreover, the similarity analysis has revealed a high correlation between several scientometric indicators such as author h-index, author publications, citations and journal citations. Originality/value: The value of this study lies in identifying the strengths and weaknesses of research groups and guiding their actions. 
This evaluation contributes to the advancement of scientific research and to the motivation of researchers. Moreover, this paper can be applied as a complete in-depth guide to help new researchers select appropriate measurements to evaluate scientific production. The selection of evaluation measures is made according to their types, usage and levels of application. Furthermore, our analysis shows the similarity between the different indicators which can limit the overuse of similar measures.
-
Misuse of Bibliometric Indexes Published on International Portals for the Evaluation of Academic Staff at Universities in Bosnia and Herzegovina
Izet Masic
DOI: https://doi.org/10.5455/ijbh.2023.11.284-306
2023-01-01
International Journal on Biomedicine and Healthcare
Abstract:Background: Scientific researchers interact through articles published in scientific journals or presentations at scientific and professional conferences, where they can influence practices that benefit society, their country, or the world. For this reason, scientists are encouraged, after completing a project and finalizing their research investigation, to publish the outcomes of their scientific work in professional and scientific journals. Objective: The aim of this article was to describe scientometric and bibliometric indexes and explain their importance for evaluating and measuring the quality of published papers in scientific journals deposited in indexed scientific databases. The author also critically analyzed the advantages and disadvantages of current bibliometric portals that create lists of universities and their academic staff by counts of articles deposited in databases and the number of their citations. Methods: The author searched the most influential online databases and analyzed deposited papers by scientometric/bibliometric indexes, and used a descriptive method to review important facts about scientometrics experiences in scientific and academic practice. The author used data from the main international portals to analyze the number of citations of scientific papers deposited on the Scopus and Google Scholar platforms (h-Index, i10-Index and number of citations), which serve as the basic data for top lists of the most cited scientists in almost all countries of the world. Results and Discussion: Bibliometric methods are used for quantitative analysis of written materials. Citation provides guidelines for scientific work because it stimulates scientists to deal with the most current research areas and organizes scientific articles at the world level or shapes and directs them. 
Citation is influenced by: article quality, understanding of the article, language in which the article is written, loyalty to a group of researchers, article type, etc. Some indicators used in evaluating scientific work are Impact factor (IF); Citation of the article; Journal citations; Number and order of authors, etc. The impact factor depends on the quality of the journal, the language in which it was printed, the area it covers, and the journal distribution system. Finally, three portals and their platforms (Webometrics, "AD Scientific Index" and Stanford Bibliometric List) are not fully relevant for assessing the quality of universities and their academic staff, or of potential members of academies of sciences, as was done in the past. They need to improve in the future. In this article, we pointed out that the h-Index and Google Scholar indexes represent valuable measures to determine scientific excellence. These criteria should be necessary for quality assessment of the scientific curriculum of scientists and their published papers in journals when experts from indexed databases such as Medline, PMC, Scopus, etc., review journals that have applied for potential inclusion in those databases. Conclusion: Current academies and academicians can propose criteria for improving the indexing of scientific papers in consultation with scientific bodies and experts at universities in one country, selected regions, or worldwide. Only quality research with exact results offers the scientific community new information about the examined problem, gives the researcher personal satisfaction, and opens opportunities to receive critical reviews from those who have insight into the research.
-
Using Conventional Bibliographic Databases for Social Science Research: Web of Science and Scopus are not the Only Options
Esther Isabelle Wilder,William H. Walters
DOI: https://doi.org/10.29024/sar.36
2021-01-01
Scholarly Assessment Reports
Abstract:Although large citation databases such as Web of Science and Scopus are widely used in bibliometric research, they have several disadvantages, including limited availability, poor coverage of books and conference proceedings, and inadequate mechanisms for distinguishing among authors. We discuss these issues, then examine the comparative advantages and disadvantages of other bibliographic databases, with emphasis on (a) discipline-centered article databases such as EconLit, MEDLINE, PsycINFO, and SocINDEX, and (b) book databases such as Amazon.com, Books in Print, Google Books, and OCLC WorldCat. Finally, we document the methods used to compile a freely available data set that includes five-year publication counts from SocINDEX and Amazon along with a range of individual and institutional characteristics for 2,132 faculty in 426 U.S. departments of sociology. Although our methods are time-consuming, they can be readily adopted in other subject areas by investigators without access to Web of Science or Scopus (i.e., by faculty at institutions other than the top research universities). Data sets that combine bibliographic, individual, and institutional information may be especially useful for bibliometric studies grounded in disciplines such as labor economics and the sociology of professions. Policy highlights: While nearly all research universities provide access to Web of Science or Scopus, these databases are available at only a small minority of undergraduate colleges. Systematic restrictions on access may result in systematic biases in the literature of scholarly communication and assessment. The limitations of the largest citation databases influence the kinds of research that can be most readily pursued. 
In particular, research problems that use exclusively bibliometric data may be preferred over those that draw on a wider range of information sources. Because books, conference papers, and other research outputs remain important in many fields of study, journal databases cover just one component of scholarly accomplishment. Likewise, data on publications and citation impact cannot fully account for the influence of scholarly work on teaching, practice, and public knowledge. The automation of data compilation processes removes opportunities for investigators to gain first-hand, in-depth understanding of the patterns and relationships among variables. In contrast, manual processes may stimulate the kind of associative thinking that can lead to new insights and perspectives.
-
Professional and Citizen Bibliometrics: Complementarities and ambivalences in the development and use of indicators
Loet Leydesdorff,Paul Wouters,Lutz Bornmann
DOI: https://doi.org/10.48550/arXiv.1609.04793
2016-09-23
Abstract:Bibliometric indicators such as journal impact factors, h-indices, and total citation counts are algorithmic artifacts that can be used in research evaluation and management. These artifacts have no meaning by themselves, but receive their meaning from attributions in institutional practices. We distinguish four main stakeholders in these practices: (1) producers of bibliometric data and indicators; (2) bibliometricians who develop and test indicators; (3) research managers who apply the indicators; and (4) the scientists being evaluated with potentially competing career interests. These different positions may lead to different and sometimes conflicting perspectives on the meaning and value of the indicators. The indicators can thus be considered as boundary objects which are socially constructed in translations among these perspectives. This paper proposes an analytical clarification by listing an informed set of (sometimes unsolved) problems in bibliometrics which can also shed light on the tension between simple but invalid indicators that are widely used (e.g., the h-index) and more sophisticated indicators that are not used or cannot be used in evaluation practices because they are not transparent for users, cannot be calculated, or are difficult to interpret.
Digital Libraries