Luca Pajola, Saskia Laura Schröer, Pier Paolo Tricomi, Mauro Conti, Giovanni Apruzzese
Abstract: Billions of individuals engage with Online Social Networks (OSN) daily. The owners of OSN try to meet the demands of their end-users while complying with business necessities. Such necessities may, however, lead to the adoption of restrictive data access policies that hinder research activities from "external" scientists -- who may, in turn, resort to other means (e.g., rely on static datasets) for their studies. Given the abundance of literature on OSN, we -- as academics -- should take a step back and reflect on what we have done so far, after having written thousands of papers on OSN. This is the first paper that provides a holistic outlook to the entire body of research that focused on OSN -- since the seminal work by Acquisti and Gross (2006). First, we search through over 1 million peer-reviewed publications, and derive 13,842 papers that focus on OSN: we organize the metadata of these works in the Minerva-OSN dataset, the first of its kind -- which we publicly release. Next, by analyzing Minerva-OSN, we provide factual evidence elucidating trends and aspects that deserve to be brought to light, such as the predominant focus on Twitter or the difficulty in obtaining OSN data. Finally, as a constructive step to guide future research, we carry out an expert survey (n=50) with established scientists in this field, and coalesce suggestions to improve the status quo such as an increased involvement of OSN owners. Our findings should inspire a reflection to "rescue" research on OSN. Doing so would improve the overall OSN ecosystem, benefiting both their owners and end-users and, hence, our society.
What problem does this paper attempt to address?
This paper addresses the lack of an overall analysis of the trends and current state of online social network (OSN) research. Specifically, by reviewing the large body of OSN-related literature published since 2006, the authors aim to:
1. **Comprehensively examine the development of OSN research**: Covering the period from Acquisti and Gross's 2006 research on Facebook to the latest progress in 2023, the authors systematically searched more than 1 million peer-reviewed publications and selected 13,842 OSN-related papers from them to build a dataset named Minerva-OSN.
2. **Reveal trends and problems in the research**: By analyzing the Minerva-OSN dataset, the paper provides empirical evidence of important trends in OSN research, such as the disproportionate attention given to specific OSNs (especially Twitter) and the difficulty of obtaining OSN data.
3. **Put forward suggestions for improvement**: To guide future research, the authors conducted an expert survey (n = 50), collected the opinions of established OSN researchers, and proposed ways to improve the current situation, such as greater involvement of OSN owners.
### Main contributions:
- **Large-scale meta-analysis**: Conducted the most comprehensive meta-analysis of OSN-related literature so far, revealed research trends, and linked them to the OSN ecosystem.
- **Public dataset**: Released the Minerva-OSN dataset, the first of its kind, which contains metadata for 13,842 papers covering 91 OSNs.
- **Expert survey**: Collected the opinions of 50 established OSN researchers through an anonymous expert survey to validate the identified problems and gather suggestions for improvement.
### Research background:
Online social networks (OSNs) have become part of the daily lives of billions of people. They are used not only for social interaction but also in fields such as career development, news dissemination, and e-commerce. However, to meet business needs, OSN owners often adopt restrictive data access policies, which hinder the research activities of external scientists. The academic community therefore needs to reflect on past research in order to better guide future work.
### Research methods:
- **Data collection**: Systematically collected OSN-related papers published between 2006 and 2023 through the Google Scholar and Scopus databases.
- **Topic modeling**: Used the BERTopic model to perform topic modeling on paper abstracts and identified 17 major research topics.
- **Expert verification**: Verified the accuracy of topic modeling through a triple-review system.
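The data collection step above can be illustrated with a small sketch. It assumes, purely for illustration, that screening amounts to keeping records inside the 2006-2023 window whose title or abstract names a known platform. The summary does not describe the authors' actual filtering procedure, and `OSN_NAMES`, `mentioned_osns`, `screen`, and the record fields are hypothetical names.

```python
import re

# Hypothetical, illustrative list; the paper considers far more platforms.
OSN_NAMES = ["Twitter", "Facebook", "YouTube"]

# Case-insensitive pattern matching any platform name as a whole word.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, OSN_NAMES)) + r")\b", re.IGNORECASE
)

def mentioned_osns(text):
    """Return the sorted list of platform names mentioned in the text."""
    canonical = {name.lower(): name for name in OSN_NAMES}
    return sorted({canonical[m.lower()] for m in _PATTERN.findall(text)})

def screen(records, first_year=2006, last_year=2023):
    """Keep records in the year window whose title or abstract names an OSN.

    Assumption: each record is a dict with 'title', 'abstract', and 'year'.
    """
    kept = []
    for rec in records:
        if not (first_year <= rec["year"] <= last_year):
            continue
        osns = mentioned_osns(rec["title"] + " " + rec["abstract"])
        if osns:
            kept.append({**rec, "osns": osns})
    return kept
```

A keyword match of this kind is only a first pass: it would let through papers that mention a platform in passing, which is one reason a manual review stage is useful.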
### Main findings:
- **High research concentration**: Although 296 OSNs were considered, only 91 (30.6%) appear in academic research, and most of these have fewer than 100 research papers each.
- **Twitter's dominant position**: Twitter is the most studied OSN, followed by Facebook, Wikipedia, and YouTube.
- **Diverse research topics**: The main research topics include multimedia retrieval and tagging, knowledge mining, and security and privacy risks, among others, and research interest in these topics changes over time.
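Concentration figures like those above can be derived from per-paper metadata by counting papers per platform. The following is a minimal sketch under assumed inputs: the `osns` field, the function names, and the toy records are hypothetical and do not reflect the actual Minerva-OSN schema.

```python
from collections import Counter

def papers_per_osn(records):
    """Count how many papers mention each platform (once per paper)."""
    counts = Counter()
    for rec in records:
        counts.update(set(rec["osns"]))  # set() avoids double counting
    return counts

def coverage(counts, n_considered):
    """Fraction of considered platforms that appear in at least one paper."""
    studied = sum(1 for c in counts.values() if c > 0)
    return studied / n_considered

# Toy records, invented for illustration only.
toy = [
    {"osns": ["Twitter"]},
    {"osns": ["Twitter", "Facebook"]},
    {"osns": ["Twitter", "YouTube"]},
]
counts = papers_per_osn(toy)
ranking = counts.most_common()  # platforms ordered by paper count
```

Ranking platforms with `most_common()` surfaces a dominant platform directly, and `coverage` expresses how many of the considered platforms were studied at all.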
### Conclusion:
Through a systematic meta-analysis and an expert survey, this paper describes the current state of OSN research, identifies its problems, and proposes improvements. These findings help the academic community better understand how OSN research is developing, support improvement of the overall OSN ecosystem, and ultimately benefit society.