Abstract: Fast object tracking on embedded devices is of great importance for applications such as autonomous driving, unmanned aerial vehicles, and intelligent monitoring. However, most previous general solutions fail to reach this goal because of (i) the high computational complexity and heterogeneous operation steps of tracking models and (ii) the limited parallelism and bloated design of general-purpose hardware platforms (e.g., CPU/GPU). Although previously proposed devices leverage neural dynamics and near-data processing for efficient tracking, their flexibility is limited by tight integration with the vision sensor, and their effectiveness on various video datasets has yet to be fully demonstrated. On the other hand, many-core architectures with massive parallelism and optimized memory locality are now widely used to execute neural networks flexibly and with high performance. This motivates us to adapt an object tracking model based on attractor neural networks with continuous and smooth attractor dynamics and map it onto neural network chips for fast tracking. To make the model hardware friendly, we add a local-connection restriction. We analyze the tracking accuracy and observe that the model achieves comparable results on typical video datasets. Then, we design a many-core neural network architecture with several computation and transformation operations to support the model. Moreover, by discretizing the continuous dynamics into a discrete counterpart, designing a slicing scheme for efficient topology mapping, and introducing a constant-restricted scaling chain rule for data quantization, we build a complete mapping framework to implement the tracking model on the many-core architecture. We fabricate a many-core neural network chip to evaluate the real execution performance. Results show that a single chip can accommodate the whole tracking model and achieve a tracking speed of nearly 800 FPS (frames per second). This work enables high-speed object tracking on embedded devices, which normally have limited resources and energy.
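
To make the ideas of a local-connection restriction and discretized attractor dynamics more concrete, the following is a minimal illustrative sketch and not the model from this work. It assumes a commonly used one-dimensional continuous attractor formulation with divisive normalization, a forward-Euler update as the discretization, and a Gaussian kernel truncated at a fixed radius as the local-connection restriction. All names and parameter values (`local_kernel`, `step`, `track`, `sigma`, `radius`, `k`, `dt_over_tau`) are hypothetical choices for illustration.

```python
import numpy as np


def local_kernel(n, sigma, radius):
    """Gaussian recurrent weights on a ring, zeroed beyond `radius` (assumed local restriction)."""
    idx = np.arange(n)
    d = np.abs(idx[:, None] - idx[None, :])
    d = np.minimum(d, n - d)  # circular distance between neurons
    w = np.exp(-d**2 / (2.0 * sigma**2))
    w[d > radius] = 0.0  # each neuron only connects to nearby neurons
    return w


def step(u, w, stim, dt_over_tau=0.2, k=0.5):
    """One forward-Euler step of tau * du/dt = -u + w @ r + stim (assumed form)."""
    u_pos = np.maximum(u, 0.0)
    # Divisive normalization keeps the firing rates bounded.
    r = u_pos**2 / (1.0 + k * np.sum(u_pos**2))
    return u + dt_over_tau * (-u + w @ r + stim)


def track(positions, n=64, sigma=3.0, radius=8, steps_per_frame=50):
    """Feed one Gaussian stimulus per frame and decode the bump position by argmax."""
    w = local_kernel(n, sigma, radius)
    u = np.zeros(n)
    idx = np.arange(n)
    decoded = []
    for p in positions:
        d = np.abs(idx - p)
        d = np.minimum(d, n - d)
        stim = np.exp(-d**2 / (2.0 * sigma**2))  # stand-in for a per-frame feature map
        for _ in range(steps_per_frame):
            u = step(u, w, stim)
        decoded.append(int(np.argmax(u)))
    return decoded


if __name__ == "__main__":
    target = [10, 12, 14, 16, 18, 20]
    print("target :", target)
    print("decoded:", track(target))
```

In this sketch the activity bump is pulled toward the stimulus in each frame, so the decoded index should follow the target. A real tracker would replace the synthetic Gaussian stimulus with features computed from video frames and would work in two dimensions.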

Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices

Towards Memory-Efficient Inference in Edge Video Analytics

APPTracker: Improving Tracking Multiple Objects in Low-Frame-Rate Videos

An Adaptive Video Acquisition Scheme for Object Tracking and its Performance Optimization

High Performance Multi Transform Coding Hardware Architecture Design for H.264/AVC

An Efficient Real-Time Object Detection Framework on Resource-Constricted Hardware Devices via Software and Hardware Co-design

P2M-DeTrack: Processing-in-Pixel-in-Memory for Energy-efficient and Real-Time Multi-Object Detection and Tracking

FPGA-Based Vehicle Detection and Tracking Accelerator

Efficient Multi-Object Tracking on Edge Devices via Reconstruction-Based Channel Pruning

Fast and Resource-Efficient Object Tracking on Edge Devices: A Measurement Study

Stereo Matching Accelerator With Re-Computation Scheme and Data-Reused Pipeline for Autonomous Vehicles

A 1096fps Hardware Architecture For Fast Training In Object Tracking

An FPGA Accelerator for High-Speed Moving Objects Detection and Tracking With a Spike Camera

A 58.6mW Real-Time Programmable Object Detector with Multi-Scale Multi-Object Support Using Deformable Parts Model on 1920x1080 Video at 30fps

A Joint Intensity-Neuromorphic Event Imaging System for Resource Constrained Devices

ViTrack: Efficient Tracking on the Edge for Commodity Video Surveillance Systems

A Hardware-Efficient Multi-Resolution Block Matching Algorithm and Its VLSI Architecture for High Definition MPEG-Like Video Encoders

MobileSAM-Track: Lightweight One-Shot Tracking and Segmentation of Small Objects on Edge Devices

Lightweight and Energy-Efficient Deep Learning Accelerator for Real-Time Object Detection on Edge Devices

Fast Object Tracking on a Many-Core Neural Network Chip

A Highly Data Reusable And Standard-Compliant Motion Estimation Hardware Architecture