Abstract: Convolutional neural networks (CNNs) are widely used in computer vision and perform especially well in image classification. However, limited by their convolution kernels, CNN-based classifiers struggle to extract global features from the original image, even though the global features contain the exact locations of objects in the environment. One popular way to improve global feature extraction is to use a graph neural network (GNN), which aggregates global information through the connections between nodes. This work proposes a novel end-to-end GNN architecture in which local and global-attention features are used simultaneously to achieve more accurate predictions. In this architecture, a CNN block learns local features and a graph convolutional network (GCN) learns global features. The global-attention feature used for the final prediction is down-sampled from the global features by the proposed global multi-head self-attention pooling (GMSAPool). Based on the self-attention mechanism, GMSAPool reconstructs the input graph by introducing a virtual node and automatically assigns a different weight to each node, yielding a more representative global-attention feature. In addition, the proposed architecture can be trained without converting images to graphs in advance, which also reduces the computational burden. The approach is evaluated on three open datasets (Agricultural Disease, Caltech256 and CIFAR-100). The experimental results show that: (1) the proposed model achieves Macro-F1 scores of 84.46%, 77.80%, and 83.33% on the three datasets respectively, improving over the best baselines; (2) GMSAPool extracts, from numerous nodes, a global-attention feature that is more conducive to the final prediction, improving Macro-P, Macro-R and Macro-F1 by 3.655%, 1.12% and 2.715% on average, respectively.
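The pooling idea in the abstract (a virtual node attends to every graph node, and multi-head attention weights decide each node's contribution) can be illustrated with a minimal sketch. This is not the authors' GMSAPool: the function name `virtual_node_attention_pool`, the choice to initialise the virtual node as the mean of the node features, and the absence of learned query/key/value projections are all assumptions made to keep the example small and dependency-free.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def virtual_node_attention_pool(nodes, num_heads):
    """Pool N node feature vectors into one vector via a virtual node.

    Hypothetical sketch: the virtual node acts as the query, the graph
    nodes act as keys and values, and the feature dimension is split
    evenly across heads. A trained model would apply learned projections
    before this step; they are omitted here.
    """
    n, dim = len(nodes), len(nodes[0])
    if dim % num_heads != 0:
        raise ValueError("feature dimension must be divisible by num_heads")
    head_dim = dim // num_heads

    # Assumption: the virtual node starts as the mean of all node features.
    virtual = [sum(node[j] for node in nodes) / n for j in range(dim)]

    pooled, all_weights = [], []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        query = virtual[lo:hi]
        # Scaled dot-product score between the virtual node and each node.
        scores = [
            sum(q * k for q, k in zip(query, node[lo:hi])) / math.sqrt(head_dim)
            for node in nodes
        ]
        weights = softmax(scores)  # one weight per node, summing to 1
        all_weights.append(weights)
        # Weighted sum of this head's slice of the node features.
        for j in range(lo, hi):
            pooled.append(sum(w * node[j] for w, node in zip(weights, nodes)))
    return pooled, all_weights


if __name__ == "__main__":
    example_nodes = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
    feature, weights = virtual_node_attention_pool(example_nodes, num_heads=2)
    print(feature)
    print(weights)
```

Each head produces its own weight distribution over the nodes, so different heads can emphasise different nodes; the concatenated result plays the role of the single global-attention feature that the abstract describes being combined with the CNN's local feature.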

Multilabel Recognition Algorithm With Multigraph Structure

Multi-Label Image Recognition With Graph Convolutional Networks

Learning Graph Convolutional Networks for Multi-Label Recognition and Applications

Multi-Label Classification with Label Graph Superimposing

GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition

Graph Convolution Network Based Representation for Multi-View Multi-Label Learning

Attention-Driven Dynamic Graph Convolutional Network for Multi-label Image Recognition

MG-GCN: Multi-Granularity Graph Convolutional Neural Network for Multi-Label Classification in Multi-Label Information System

Multi-graph Fusion Graph Convolutional Networks with pseudo-label supervision

Multi-label remote sensing image classification with deformable convolutions and graph neural networks

Multigraph Fusion for Dynamic Graph Convolutional Network

Multi-scale Graph Convolutional Networks with Self-Attention

Multi-Modal Multi-Instance Multi-Label Learning with Graph Convolutional Network

Multi-Channel Graph Convolutional Networks for Graphs with Inconsistent Structures and Features

Attention Multihop Graph and Multiscale Convolutional Fusion Network for Hyperspectral Image Classification

Multi-label Image Classification using Adaptive Graph Convolutional Networks: from a Single Domain to Multiple Domains

G-CAM: Graph Convolution Network Based Class Activation Mapping for Multi-label Image Recognition

Multilabel Classification Based on Graph Neural Networks

A GNN Architecture with Local and Global-Attention Feature for Image Classification