A Dataset of Venture Capitalist Types in China (1978–2021): A Machine-Human Hybrid Approach

Jin Chen,Ruining Cao,Yifei Song,Anan Hu,Ying Ding
DOI: https://doi.org/10.1038/s41597-024-04108-z
2024-01-01
Scientific Data
Abstract:Despite escalating interest in distinguishing among various types of venture capitalists (VCs) and their roles in shaping entrepreneurship and innovation, such research remains sparse in the world's second-largest VC market, i.e., China. To address this important gap, we have devised a machine-human hybrid approach to perform the classification task for VC types. Specifically, we have compiled a list of 49,187 VCs that made investments in China before 2021 from CVSource database, collected VC ownership information from other public sources, developed machine-learning algorithms to predict VC types, and used human coders when machine-learning failed to produce a prediction. Utilizing this hybrid approach, we have classified VCs into one of the following types: GVC (public agency-affiliated, state-owned enterprise-affiliated), CVC (corporate VC), IVC (independent VC), BVC (bank-affiliated VC), FVC (financial/non-bank-affiliated VC), UVC (university-affiliated VC), and PenVC (pension-fund-affiliated VC). We not only provide the most up-to-date database for VC types in the Chinese setting but also demonstrate how to leverage machine-learning algorithms to devise a transparent coding approach for VC-type classifications.
What problem does this paper attempt to address?