Guest Editorial Introduction to the Special Section on Representation Learning for Visual Content Understanding
Jiwen Lu, Yuxin Peng, Guo-Jun Qi, Jun Yu
DOI: https://doi.org/10.1109/tcsvt.2020.3009095
IF: 5.859
2020-09-01
IEEE Transactions on Circuits and Systems for Video Technology
Abstract: Representation learning methods allow a system to automatically learn robust and discriminative features from raw data for a given goal, and they play an important role in various visual content understanding applications, such as visual object segmentation, detection, tracking, recognition, and search. The performance of visual content understanding tasks depends heavily on the choice of data representation (or features) used. Conventional feature representation methods usually employ transformations of the data that make it easier to extract useful information, such as the scale-invariant feature transform (SIFT), local binary patterns (LBP), and the histogram of oriented gradients (HOG). In recent years, deep learning techniques have been widely applied to learn data-driven representations from supervised annotations and have achieved great success in different visual content understanding tasks. Representative methods include ResNet for image classification, DeepFace for face recognition, and feature pyramid networks (FPNs) for object detection. Despite recent progress in deep representation learning with large amounts of annotated data, effectively learning visual representations from limited data annotations still requires considerable effort. This special section focuses on data-effective representation learning methods for visual content understanding.
engineering, electrical & electronic