Scholars on Twitter: who and how many are they?

Rodrigo Costas, Jeroen van Honk, Thomas Franssen
DOI: https://doi.org/10.48550/arXiv.1712.05667
2017-12-15
Abstract: In this paper we present a novel methodology for identifying scholars with a Twitter account. By combining bibliometric data from Web of Science and Twitter users identified by Altmetric.com we have obtained the largest set of individual scholars matched with Twitter users made so far. Our methodology consists of a combination of matching algorithms, considering different linguistic elements of both author names and Twitter names; followed by a rule-based scoring system that weights the common occurrence of several elements related with the names, individual elements and activities of both Twitter users and scholars matched. Our results indicate that about 2% of the overall population of scholars in the Web of Science is active on Twitter. By domain we find a strong presence of researchers from the Social Sciences and the Humanities. Natural Sciences is the domain with the lowest level of scholars on Twitter. Researchers on Twitter also tend to be younger than those that are not on Twitter. As this is a bibliometric-based approach, it is important to highlight the reliance of the method on the number of publications produced and tweeted by the scholars, thus the share of scholars on Twitter ranges between 1% and 5% depending on their level of productivity. Further research is suggested in order to improve and expand the methodology.
Digital Libraries
What problem does this paper attempt to address?
This paper addresses how to systematically identify scholars who are active on Twitter. The authors develop a methodology that matches scholars with their Twitter accounts by combining bibliometric data from Web of Science (WoS) with Twitter account data recorded by Altmetric.com. The method aims to overcome the small sample sizes and heavy manual effort of previous studies, and to identify scholars' Twitter accounts systematically and at large scale.

### Main problems:
1. **How to identify scholars' Twitter accounts on a large scale**: The paper proposes a method that combines bibliometric data and social media data to systematically identify scholars' Twitter accounts.
2. **The activity level of scholars on Twitter**: With this method, researchers can analyze the Twitter activity of scholars across disciplinary fields and levels of academic output.
3. **The characteristics of scholars on Twitter**: The paper explores characteristics of scholars on Twitter, such as age and disciplinary distribution, and how these relate to their performance in the academic community.

### Methodology:
- **Data sources**: Author data from the Web of Science database and Twitter account data from the Altmetric.com database.
- **Matching algorithm**: Normalize and clean the data, then match on a combination of elements such as names, institutions, and geographical locations.
- **Scoring system**: Apply a rule-based scoring system to the candidate matches to determine the most likely correct pairs.

### Results:
- **Matching scale**: More than 385,000 scholars' Twitter accounts were matched, the largest such matching of scholars to Twitter accounts so far.
- **Disciplinary distribution**: Scholars in the social sciences and humanities have the strongest presence on Twitter, while scholars in the natural sciences have the lowest.
- **Age distribution**: Scholars who are active on Twitter are generally younger than those who are not.

### Significance:
- **Systematic approach**: It provides a systematic method for identifying scholars' Twitter accounts in large-scale datasets.
- **Verifiability**: It has been verified against external standards (such as data provided by ORCID), which improves the reliability of the method.
- **Extensibility**: Once scholars' Twitter accounts are identified, other bibliometric data, such as institutions, research fields, and citation impact, can be linked to them for a more comprehensive analysis of scholars' online activities.

By solving these problems, this research provides a new perspective for understanding scholars' behavior on social media and lays the foundation for future related research.
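
The matching and scoring steps described under Methodology can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Scholar` and `TwitterUser` record layouts, the four rules, the weights in `WEIGHTS`, and the `threshold` value are all hypothetical and chosen only to show how name normalization and a rule-based score might fit together.

```python
import unicodedata
from dataclasses import dataclass


def normalize(text: str) -> str:
    """Lowercase, strip accents, and replace punctuation with spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in stripped.lower())
    return " ".join(cleaned.split())


@dataclass
class Scholar:  # hypothetical record layout
    name: str
    country: str
    dois: frozenset  # DOIs of the scholar's own publications


@dataclass
class TwitterUser:  # hypothetical record layout
    handle: str
    display_name: str
    location: str
    tweeted_dois: frozenset  # DOIs mentioned in the user's tweets


# Hypothetical weights; the paper's actual rules and weights are not reproduced here.
WEIGHTS = {"full_name": 3, "surname_in_handle": 1, "location": 1, "own_paper_tweeted": 2}


def score(scholar: Scholar, user: TwitterUser) -> int:
    """Sum the weights of every rule that fires for a scholar/user pair."""
    s_name = normalize(scholar.name)
    u_name = normalize(user.display_name)
    handle = normalize(user.handle).replace(" ", "")
    country = normalize(scholar.country)
    surname = s_name.split()[-1] if s_name else ""

    total = 0
    if s_name and s_name == u_name:
        total += WEIGHTS["full_name"]
    if surname and surname in handle:
        total += WEIGHTS["surname_in_handle"]
    if country and country in normalize(user.location):
        total += WEIGHTS["location"]
    if scholar.dois & user.tweeted_dois:
        total += WEIGHTS["own_paper_tweeted"]
    return total


def best_match(scholar, users, threshold=4):
    """Return the highest-scoring user at or above the threshold, else None."""
    best = max(users, key=lambda u: score(scholar, u), default=None)
    if best is not None and score(scholar, best) >= threshold:
        return best
    return None
```

Usage would amount to calling `best_match(scholar, candidate_users)` for each scholar. Requiring a minimum score before accepting a pair reflects the idea that a name match alone is weak evidence and that corroborating signals, such as a user tweeting the scholar's own papers, should carry weight.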