Driving behavior analysis and classification by vehicle OBD data using machine learning

Raman Kumar, Anuj Jain
DOI: https://doi.org/10.1007/s11227-023-05364-3
2023-05-19
Abstract: The transportation industry's focus on improving performance and reducing costs has driven the integration of IoT and machine learning technologies. The correlation between driving style and both fuel consumption and emissions has highlighted the need to classify different drivers' driving patterns. In response, vehicles now come equipped with sensors that gather a wide range of operational data. The proposed technique collects critical vehicle performance data, including speed, engine RPM, pedal position, calculated engine load, and over 50 other parameters, through the OBD interface. The OBD-II diagnostics protocol, the primary diagnostic process used by technicians, acquires this real-time operational data via the car's communication port. These data are used to collect engine operation-related characteristics and assist with fault detection. The proposed method uses machine learning techniques, such as SVM, AdaBoost, and Random Forest, to classify drivers into ten classes based on fuel consumption, steering stability, velocity stability, and braking patterns. This research work uses data extracted from the engine's internal sensors via the OBD-II protocol, eliminating the need for additional sensors. The collected data are used to build a model that classifies drivers' behavior and can be used to provide feedback to improve driving habits. Key driving events, such as high-speed braking, rapid acceleration, deceleration, and turning, are used to characterize individual drivers.
Visualization techniques, such as line plots and correlation matrices, are used to compare drivers' performance. The model takes time-series values of the sensor data as input. Supervised learning methods are employed to compare all driver classes. The SVM, AdaBoost, and Random Forest classifiers achieve accuracies of 99%, 99%, and 100%, respectively. The suggested model offers a practical approach to examining driving behavior and suggesting necessary measures to enhance driving safety and efficiency.
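
As an illustration of how driving events such as harsh braking and rapid acceleration might be derived from an OBD speed time series, the following minimal sketch counts the samples whose acceleration crosses a threshold. The thresholds, the one-second sampling interval, the `count_events` helper, and the sample trip are all assumptions made for this example; the abstract does not state the authors' event definitions or feature pipeline.

```python
from dataclasses import dataclass

# Hypothetical thresholds in m/s^2; the paper does not give its values.
HARSH_BRAKE_MS2 = -3.0
RAPID_ACCEL_MS2 = 2.5


@dataclass
class EventCounts:
    harsh_brakes: int = 0
    rapid_accels: int = 0


def count_events(speeds_kmh, dt_s=1.0):
    """Count samples whose acceleration crosses a threshold.

    speeds_kmh: vehicle speed readings in km/h, evenly spaced in time.
    dt_s: assumed sampling interval in seconds.
    """
    counts = EventCounts()
    for prev, curr in zip(speeds_kmh, speeds_kmh[1:]):
        # Convert the km/h difference to m/s, then divide by the interval.
        accel = (curr - prev) / 3.6 / dt_s
        if accel <= HARSH_BRAKE_MS2:
            counts.harsh_brakes += 1
        elif accel >= RAPID_ACCEL_MS2:
            counts.rapid_accels += 1
    return counts


# Made-up trip: pull away quickly, cruise, then brake hard to a stop.
trip = [0, 12, 25, 30, 31, 30, 15, 2, 0]
print(count_events(trip))
```

Per-trip counts of this kind could serve as input features for a supervised classifier such as the SVM, AdaBoost, or Random Forest models named in the abstract, alongside the raw sensor values.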