Berkeley Open Extended Reality Recordings 2023 (BOXRR-23): 4.7 Million Motion Capture Recordings from 105,852 Extended Reality Device Users

Vivek Nair, Wenbo Guo, Rui Wang, James F. O'Brien, Louis Rosenberg, Dawn Song
DOI: https://doi.org/10.1109/TVCG.2024.3372087
2023-10-01
Abstract: Extended reality (XR) devices such as the Meta Quest and Apple Vision Pro have seen a recent surge in attention, with motion tracking "telemetry" data lying at the core of nearly all XR and metaverse experiences. Researchers are just beginning to understand the implications of this data for security, privacy, usability, and more, but currently lack large-scale human motion datasets to study. The BOXRR-23 dataset contains 4,717,215 motion capture recordings, voluntarily submitted by 105,852 XR device users from over 50 countries. BOXRR-23 is over 200 times larger than the largest existing motion capture research dataset and uses a new, highly efficient purpose-built XR Open Recording (XROR) file format.
Human-Computer Interaction
What problem does this paper attempt to address?
The paper addresses the lack of large-scale human motion datasets, particularly for users of extended reality (XR) devices. Researchers are only beginning to understand what motion tracking data implies for security, privacy, and usability, and existing datasets are too small to support that work.

To fill this gap, the paper introduces BOXRR-23, a dataset of 4,717,215 motion capture recordings from 105,852 XR device users, which makes it over 200 times larger than the largest existing motion capture research dataset. Beyond its scale, the dataset is diverse in origin: it covers users from over 50 countries and a variety of XR devices.

The dataset can be used in the following areas:

1. **Security and Privacy Research**: Analyzing user motion patterns for passive identification and authentication of XR users.
2. **Human-Computer Interaction Research**: Studying user interaction patterns in XR environments to identify interactions that may cause discomfort or injury.
3. **Motion Synthesis Research**: Training large-scale generative models to synthesize natural human motion data.
4. **Machine Learning Research**: Developing new machine learning techniques that process long sequence data efficiently.

The paper also describes the data collection process, the purpose-built XR Open Recording (XROR) file format, and the structure and content of the dataset, and it discusses the dataset's potential applications and limitations.