Cast vote records: A database of ballots from the 2020 U.S. Election

Shiro Kuriwaki, Mason Reece, Samuel Baltz, Aleksandra Conevska, Joseph R. Loffredo, Can Mutlu, Taran Samarth, Kevin E. Acevedo Jetter, Zachary Djanogly Garai, Kate Murray, Shigeo Hirano, Jeffrey B. Lewis, James M. Snyder Jr., Charles H. Stewart III
2024-10-24
Abstract: Ballots are the core records of elections. Electronic records of actual ballots cast (cast vote records) are available to the public in some jurisdictions. However, they have been released in a variety of formats and have not been independently evaluated. Here we introduce a database of cast vote records from the 2020 U.S. general election. We downloaded publicly available unstandardized cast vote records, standardized them into a multi-state database, and extensively compared their totals to certified election results. Our release includes vote records for President, Governor, U.S. Senate and House, and state upper and lower chambers -- covering 42.7 million voters in 20 states who voted for more than 2,204 candidates. This database serves as a uniquely granular administrative dataset for studying voting behavior and election administration. Using this data, we show that in battleground states, 1.9 percent of solid Republicans (as defined by their congressional and state legislative voting) in our database split their ticket for Joe Biden, while 1.2 percent of solid Democrats split their ticket for Donald Trump.
Computers and Society, Applications
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the following main problems:

1. **Standardization and centralization of election data**:
   - The paper introduces a database of cast vote records covering 42.7 million voters in the 2020 U.S. general election. The records come from 20 states and cover contests for President, Governor, U.S. Senate and House, and state upper and lower chambers.
   - The authors downloaded publicly available, unstandardized cast vote records and standardized them into a multi-state database to support broader research and analysis.
2. **Verification and comparison of election data**:
   - The authors extensively compared vote totals in the database with certified election results to assess the accuracy and reliability of the data.
   - This comparison helps verify the integrity of the database, although it is not intended as an election audit.
3. **Research on voting behavior**:
   - The database provides ballot-level records, allowing researchers to study voting behavior more accurately. For example, by comparing a voter's choices across contests on the same ballot, split-ticket voting (supporting candidates of different parties for different offices) can be measured more precisely.
   - Researchers can use these data to explore a range of electoral phenomena, such as voters' preferences in specific contests and the behavior of different types of voters across contests.
4. **Research on election administration and law**:
   - The database is also valuable for studying election law, election administration, and election integrity. Analyzing cast vote records helps clarify technical details of, and potential problems in, the election process.
   - The release of the database informs the balance between transparency and privacy, especially how to protect voters' privacy in the election process.

### Specific application scenarios

- **Political science research**:
  - Researchers can use these data to study voting behavior, especially split-ticket voting. For example, the paper reports that in battleground states, 1.9% of solid Republicans (as defined by their congressional and state legislative votes) split their ticket for Biden, while 1.2% of solid Democrats split their ticket for Trump.
- **Election administration research**:
  - Election administrators can use these data to evaluate the efficiency and fairness of the election process, especially in handling large numbers of ballots and ensuring accurate vote counting.
- **Legal and policy research**:
  - Legal scholars and policymakers can use these data to explore legal issues in the election process, such as voter trust, election transparency, and privacy protection.

### Method overview

- **Data collection and standardization**:
  - The authors obtained the original cast vote records from multiple states and counties and standardized them so that they are comparable across jurisdictions.
  - Standardization includes identifying each candidate's party affiliation, unifying the coding of invalid votes, and standardizing candidate names.
- **Data verification**:
  - The authors compared the standardized data with certified election results to check accuracy and completeness.
  - A county's data is released only when the candidate-level discrepancy is no more than 1%.
- **Privacy protection**:
  - To protect voters' privacy, the authors further aggregated some cast vote records so that individual voters' choices are not revealed.

Through these methods, the paper provides a high-quality, standardized database of cast vote records that supports research on voting behavior and election administration.
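
The county-level release rule from the data verification step (release a county only if every candidate's discrepancy is no more than 1%) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the dictionary layout, and the choice to measure discrepancy relative to the certified total are all assumptions, and the vote counts are made-up toy numbers.

```python
def candidate_discrepancies(cvr_totals, certified_totals):
    """Relative discrepancy per candidate between CVR and certified totals.

    Assumption: discrepancy = |CVR count - certified count| / certified count.
    The source summary does not spell out the exact formula.
    """
    discrepancies = {}
    for candidate in sorted(set(cvr_totals) | set(certified_totals)):
        counted = cvr_totals.get(candidate, 0)
        certified = certified_totals.get(candidate, 0)
        if certified == 0:
            # No certified votes: any CVR votes count as an unbounded mismatch.
            discrepancies[candidate] = 0.0 if counted == 0 else float("inf")
        else:
            discrepancies[candidate] = abs(counted - certified) / certified
    return discrepancies


def county_passes(cvr_totals, certified_totals, threshold=0.01):
    """True if every candidate's discrepancy is within the threshold (1% by default)."""
    discrepancies = candidate_discrepancies(cvr_totals, certified_totals)
    return all(d <= threshold for d in discrepancies.values())


# Hypothetical toy data for a single county and a single contest.
certified = {"Candidate A": 1004, "Candidate B": 800}
cvr_ok = {"Candidate A": 1000, "Candidate B": 795}   # both within 1%
cvr_bad = {"Candidate A": 1000, "Candidate B": 780}  # Candidate B is off by 2.5%

print(county_passes(cvr_ok, certified))   # True
print(county_passes(cvr_bad, certified))  # False
```

Checking every candidate rather than the contest total means that offsetting errors between candidates cannot cancel each other out, which is one plausible reason to apply the threshold at the candidate level.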