Learning Grammar of Complex Activities via Deep Neural Networks

Becky Mashaido
DOI: https://doi.org/10.48550/arXiv.2101.02774
2021-01-08
Abstract: Motivated by the growing amount of publicly available video data on online streaming services and an increased interest in applications that analyze continuous video streams, such as autonomous driving, this technical report provides theoretical insight into deep neural networks for video learning under label constraints. I build upon previous work in video learning for computer vision, make observations on model performance, and propose further mechanisms to help improve on the observed performance.
Computer Vision and Pattern Recognition, Artificial Intelligence