Abstract: In the evolving landscape of scientific publishing, it is important to understand the drivers of high-impact research, to equip scientists with actionable strategies to enhance the reach of their work, and to understand trends in the use of modern scientific publishing tools to inform their further development. Here, we study trends in the use of early preprint publications and revisions on ArXiv and the use of X (formerly Twitter) for promotion of such papers in computer science and physics. We find that early submissions to ArXiv and promotion on X have soared in recent years. Estimating the effect that each of these modern affordances has on the number of citations of scientific publications, we find that peer-reviewed conference papers in computer science that are submitted early to ArXiv gain on average $21.1 \pm 17.4$ more citations, those revised on ArXiv gain $18.4 \pm 17.6$ more citations, and those promoted on X gain $44.4 \pm 8$ more citations in the first 5 years after initial publication. In contrast, journal articles in physics experience comparatively lower boosts in citation counts, with increases of $3.9 \pm 1.1$, $4.3 \pm 0.9$, and $6.9 \pm 3.5$ citations respectively for the same interventions. Our results show that promoting one's work on ArXiv or X has a large impact on the number of citations, as well as on the number of influential citations computed by Semantic Scholar, and thereby on the careers of researchers. These effects are also present for publications in physics, but they are relatively smaller. The larger relative effect sizes, the accumulation of promotion effects over time, and the higher unpredictability of citation counts in computer science compared with physics suggest a greater role of word-of-mouth spreading in computer science than in physics.

On the Use of ArXiv as a Dataset

unarXive 2022: All arXiv Publications Pre-Processed for NLP, Including Structured Full-Text and Citation Network

Large Synthetic Data from the arXiv for OCR Post Correction of Historic Scientific Articles

Text mining arXiv: a look through quantitative finance papers

Scientometric engineering: Exploring citation dynamics via arXiv eprints

Merging the Citations Received by Arxiv-Deposited E-Prints and Their Corresponding Published Journal Articles: Problems and Perspectives.

LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content

Quantitative Analysis of AI-Generated Texts in Academic Research: A Study of AI Presence in Arxiv Submissions using AI Detection Tool

Subdivisions and Crossroads: Identifying Hidden Community Structures in a Data Archive's Citation Network

Can Microsoft Academic be used for citation analysis of preprint archives? The case of the Social Science Research Network

Discovering Mathematical Objects of Interest -- A Study of Mathematical Notations

WithdrarXiv: A Large-Scale Dataset for Retraction Study

Patterns of Text Reuse in a Scientific Corpus

Measuring the Evolution of a Scientific Field through Citation Frames

arXivEdits: Understanding the Human Revision Process in Scientific Writing

The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices

AceParse: A Comprehensive Dataset with Diverse Structured Texts for Academic Literature Parsing

All Data on the Table: Novel Dataset and Benchmark for Cross-Modality Scientific Information Extraction

Effects of Research Paper Promotion via ArXiv and X

Public Git Archive: a Big Code dataset for all

Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models