Automobile Driver Fingerprinting

Miro Enev, Alex Takakuwa, Karl Koscher, Tadayoshi Kohno
DOI: https://doi.org/10.1515/popets-2015-0029
2015-09-08
Proceedings on Privacy Enhancing Technologies
Abstract: Today’s automobiles leverage powerful sensors and embedded computers to optimize efficiency, safety, and driver engagement. However the complexity of possible inferences using in-car sensor data is not well understood. While we do not know of attempts by automotive manufacturers or makers of after-market components (like insurance dongles) to violate privacy, a key question we ask is: could they (or their collection and later accidental leaks of data) violate a driver’s privacy? In the present study, we experimentally investigate the potential to identify individuals using sensor data snippets of their natural driving behavior. More specifically we record the in-vehicle sensor data on the controller-area network (CAN) of a typical modern vehicle (popular 2009 sedan) as each of 15 participants (a) performed a series of maneuvers in an isolated parking lot, and (b) drove the vehicle in traffic along a defined ~50 mile loop through the Seattle metropolitan area. We then split the data into training and testing sets, train an ensemble of classifiers, and evaluate identification accuracy of test data queries by looking at the highest voted candidate when considering all possible one-vs-one comparisons. Our results indicate that, at least among small sets, drivers are indeed distinguishable using only in-car sensors. In particular, we find that it is possible to differentiate our 15 drivers with 100% accuracy when training with all of the available sensors using 90% of driving data from each person. Furthermore, it is possible to reach high identification rates using less than 8 minutes of training data. When more training data is available it is possible to reach very high identification using only a single sensor (e.g., the brake pedal). As an extension, we also demonstrate the feasibility of performing driver identification across multiple days of data collection.
What problem does this paper attempt to address?
The paper asks whether individual drivers can be identified from the data produced by a modern car's in-vehicle sensors. The authors test this experimentally, using short snippets of in-car sensor data, in order to assess the privacy risk that such data carries. Their concern is whether data collected by car manufacturers or by third-party components (such as insurance dongles) could compromise drivers' privacy.

### Background of the Paper

Modern cars integrate powerful sensors and embedded computers to optimize efficiency, safety, and the driving experience. However, the range of inferences that can be drawn from in-car sensor data is not well understood. The authors know of no attempts by car manufacturers or makers of after-market components to violate privacy, but they pose a key question: could the collection of this data, or a later accidental leak of it, violate a driver's privacy?

### Experimental Design

To answer this question, the authors carried out the following experiment:

1. **Data Collection**: In-vehicle sensor data was recorded from the CAN bus of a 2009 sedan while each of 15 participants completed:
   - Driving tasks in an isolated parking lot (such as parallel parking and slalom driving through poles)
   - A drive in traffic along a predefined route of approximately 50 miles through the Seattle metropolitan area
2. **Data Processing**: The collected data was split into training and test sets, an ensemble of classifiers was trained, and identification accuracy was evaluated by taking the candidate with the most votes across all one-vs-one comparisons.

### Main Findings

- **High Recognition Rate**: High identification rates can be reached with less than 8 minutes of training data. When training on all available sensors with 90% of each participant's driving data, the 15 drivers are distinguished with 100% accuracy.
- **Effectiveness of a Single Sensor**: With more training data, a single sensor can be enough. For example, using only the brake pedal and the first 15 minutes of open-road driving data, the recognition accuracy is 87%.
- **Feasibility of Multi-Day Data**: The authors also demonstrate that driver identification is feasible across multiple days of data collection.

### Significance

The results show that in-car sensor data supports powerful inferences, some of which may work against drivers. The study does not aim to reduce consumers' demand for interactive, personalized, and connected driving experiences. It does, however, reveal privacy risks in computerized cars. The authors hope that their work can help manufacturers, drivers, and market participants design policies and mechanisms that balance data utility against privacy protection.

### Conclusion

The paper demonstrates experimentally that in-car sensor data can be used to tell drivers apart, at least among small sets of drivers. It underlines the importance of data privacy and provides experimental evidence for future policy-making.
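
The one-vs-one voting scheme mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the ensemble of classifiers used in the paper is not described here, so a simple nearest-centroid rule stands in for each pairwise classifier. The function names (`train_pairwise`, `identify`), the two toy features, and all data values are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pairwise(train):
    """Build one stand-in 'classifier' per pair of drivers.

    train maps a driver label to that driver's training feature vectors.
    Each pairwise model is just the two drivers' centroids (an assumption,
    not the classifier family used in the paper).
    """
    cents = {driver: centroid(rows) for driver, rows in train.items()}
    return {(a, b): (cents[a], cents[b]) for a, b in combinations(sorted(train), 2)}


def identify(models, query):
    """Run every one-vs-one comparison and return the most-voted driver."""
    votes = Counter()
    for (a, b), (cent_a, cent_b) in models.items():
        winner = a if sq_dist(query, cent_a) <= sq_dist(query, cent_b) else b
        votes[winner] += 1
    # Ties are broken by the order in which drivers first received a vote.
    return votes.most_common(1)[0][0]


# Made-up features per snippet: [mean brake pedal position, mean speed].
train = {
    "driver_a": [[0.20, 30.0], [0.25, 32.0]],
    "driver_b": [[0.60, 45.0], [0.55, 47.0]],
    "driver_c": [[0.90, 60.0], [0.85, 58.0]],
}
models = train_pairwise(train)
print(identify(models, [0.58, 46.0]))  # driver_b
```

With 15 drivers, this scheme would involve 15 × 14 / 2 = 105 pairwise comparisons per query, and the driver who wins the most comparisons is reported as the identification.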