Abstract: Feature representation and learning is at the core of many computer vision problems such as image classification, object recognition, action recognition, object tracking, image search, biometrics, and many others. In the past two decades, remarkable progress has been witnessed in feature representation and learning, which mainly consists of two important development stages. In the first stage, from 1995 to 2012 (i.e., the pre-deep learning era), the field was dominated by milestone handcrafted feature descriptors such as SIFT, SURF, HOG, LBP, Bag of Visual Words, Fisher Vector, etc. The second stage, i.e., the deep learning era, starts from 2012, when a team led by Hinton won the prestigious ImageNet Challenge using deep learning techniques rather than traditional handcrafted features. The second stage is characterized by deep learning based representations, especially Deep Convolutional Neural Networks (DeepCNNs), which can learn powerful feature representations with multiple levels of abstraction directly from data.

Deep learning techniques have attracted enormous attention and have brought about considerable breakthroughs for many problems in computer vision. Increased computational power, deeper and more complicated deep neural networks, and the availability of large-scale datasets are fueling computer vision systems. Despite the great success, the known deficiencies of deep neural networks have not been fully addressed, such as being data hungry, being energy hungry, and lacking theoretical interpretability.

Nowadays, intelligence is moving towards edge devices. Running machine learning systems on end devices (e.g., smartphones, automobiles, wearable devices, or Internet of Things devices) rather than in the cloud has various benefits, such as immediate response, enhanced reliability, increased privacy, and efficient use of network bandwidth. However, many settings, such as online and incremental learning, mobile, embedded, or wearable devices with limited resources and tight power budgets, and real-time systems constrained by a limited economic budget, expose the inadequacies of existing algorithms and require feature representations that are computationally and memory efficient. In addition, applications where only limited amounts of annotated training data can be gathered (such as many visual inspection or medical diagnostics tasks) pose great challenges for applying state-of-the-art deep neural networks. Therefore, despite the great strides, especially over recent years, there is a continued need for vigorous research in this area to solve many challenging problems by developing compact, efficient feature representations from three aspects: computational efficiency, label efficiency, and sample efficiency.

Since 2017, we have organized four international workshops associated with top conferences (ICCV2017, ECCV2018, CVPR2019 and ICCV2019), explicitly devoted to the topic “Compact and Efficient Feature Representation and Learning in Computer Vision”. This is a clear sign of the growing interest in computer vision around these themes. The goal of this special section has been to solicit and publish high-quality papers that bring a clear picture of the state of the art along this direction and identify promising future research directions. As guest editors of this special section, we were happy to receive 25 submissions. After a careful review process, we accepted ten papers for publication. We thank the reviewers who provided detailed, insightful, and timely reviews, leading to the high quality of the accepted papers. We also thank TPAMI EIC Sven Dickinson and the Associate EICs for recognizing the widespread interest in this field, which warrants this special section. The ten accepted papers in this special section can be grouped into five different main categories:

Guest Editorial Introduction to the Special Section on Intelligent Visual Content Analysis and Understanding

Guest Editorial Introduction to the Special Section on Representation Learning for Visual Content Understanding

Guest Editorial Introduction to the Special Section on Video and Language

Guest Editorial Special Section on Visual Saliency Computing and Learning.

Editorial IEEE Transactions on Multimedia Special Section on Video Analytics: Challenges, Algorithms, and Applications.

Introduction to the Special Section on Contextual Object Analysis in Complex Scenes

Guest Editorial Introduction to the Special Issue on Advanced Machine Learning Methodologies for Large-Scale Video Object Segmentation and Detection

Guest Editorial Introduction to the Special Issue on Label-Efficient Learning on Video Data

Introduction to the Special Section on Intelligent Visual Interfaces for Text Analysis

IEEE ACCESS SPECIAL SECTION EDITORIAL: RECENT ADVANTAGES OF COMPUTER VISION

IEEE Access Special Section Editorial: Biologically Inspired Image Processing Challenges and Future Directions

Intelligent Visual Media Processing: when Graphics Meets Vision.

Introduction to the Special Section on Compact and Efficient Feature Representation and Learning in Computer Vision

New advances in visual computing for intelligent processing of visual media and augmented reality

Guest Editorial: Content-Aware Visual Systems: Analysis, Streaming, and Retargeting

Introduction to the Special Section on Visual Computing in the Cloud: Fundamentals and Applications

Introduction to the Special Section on Deep Learning in Video Enhancement and Evaluation: the New Frontier

Community-Aware Photo Quality Evaluation by Deeply Encoding Human Perception

Guest Editorial: Visual Analytics in Multimedia—Opportunities and Research Challenges

Visuals to Text: A Comprehensive Review on Automatic Image Captioning

Guest Editorial Introduction to the Issue on Pre-Trained Models for Multi-Modality Understanding