Abstract: Cooperative, connected, and automated mobility (CCAM) infrastructure plays a key role in understanding and enhancing the environmental perception of autonomous vehicles (AVs) driving in complex urban settings. However, deploying CCAM infrastructure requires careful selection of the computational processing layer and of the machine learning (ML) and deep learning (DL) models placed on it, so that AVs perform well in complex urban environments. In this paper, we propose a computational framework and analyze the effectiveness of a custom-trained DL model (YOLOv8) when deployed on diverse devices and in diverse settings across a vehicle-edge-cloud layered architecture. Our main focus is the relationship between the DL model's accuracy and its execution time when deployed within the layered framework. We therefore investigate the trade-off between accuracy and time by deploying the YOLOv8 model on each layer of the computational framework, taking into account the CCAM infrastructure at each layer, i.e., sensory devices, computation, and communication. The findings reveal that the performance metrics of the deployed DL model (e.g., 0.842 mAP@0.5) remain consistent regardless of the device type and the layer of the framework. Inference times for object detection tasks, however, differ substantially depending on the device and environment in which the DL model is deployed. For instance, the Jetson AGX (non-GPU) outperforms the Raspberry Pi (non-GPU) by reducing inference time by 72%, whereas the Jetson AGX Xavier (GPU) outperforms the Jetson AGX ARMv8 (non-GPU) by reducing inference time by 90%. A complete comparison of the average transfer time, preprocess time, and total time across devices such as the Apple M2 Max, Intel Xeon, Tesla T4, NVIDIA A100, and Tesla V100 is provided in the paper. Our findings guide researchers and practitioners in selecting the most appropriate device type and environment for deploying DL models in production.
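The per-stage time comparison and the percentage reductions quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' benchmarking code: the helper names (`time_stage`, `profile_pipeline`, `percent_reduction`), the stand-in stage functions, and the millisecond values in the usage lines are all assumptions made for illustration, and only the Python standard library is used.

```python
import time
from statistics import mean


def time_stage(fn, payload, repeats=5):
    """Run fn(payload) `repeats` times; return (last result, mean time in ms)."""
    durations_ms = []
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(payload)
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return result, mean(durations_ms)


def profile_pipeline(stages, payload, repeats=5):
    """Time each (name, fn) stage in order, feeding each output to the next.

    Returns a dict of mean per-stage times in ms plus a "total" entry.
    """
    timings = {}
    for name, fn in stages:
        payload, timings[name] = time_stage(fn, payload, repeats=repeats)
    timings["total"] = sum(timings.values())
    return timings


def percent_reduction(baseline_ms, candidate_ms):
    """Relative reduction in time of the candidate device versus the baseline."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return (baseline_ms - candidate_ms) / baseline_ms * 100.0


# Hypothetical stand-ins for the transfer, preprocess, and inference stages.
stages = [
    ("transfer", lambda frame: list(frame)),
    ("preprocess", lambda frame: [x / 255.0 for x in frame]),
    ("inference", lambda frame: sum(frame)),
]
timings = profile_pipeline(stages, payload=range(10_000))

# Illustrative numbers only: 100 ms on a baseline device, 28 ms on a faster one.
reduction = percent_reduction(100.0, 28.0)
```

In a real deployment the stand-in lambdas would be replaced by the actual data transfer, image preprocessing, and model forward pass on each device, and the same harness would be run once per device so that the per-stage means can be compared with `percent_reduction`.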

A Vehicle-Edge-Cloud Framework for Computational Analysis of a Fine-Tuned Deep Learning Model