Abstract: Image classification has improved markedly with the advent of deep learning, especially Deep Convolutional Neural Networks. The present study explores the intersection of image classification based on Deep Convolutional Neural Networks and human visual cognition. Drawing on the ability of these networks to learn hierarchical features automatically from raw pixel data, this research examines their potential for classifying images from diverse, complex datasets, with emphasis on the widely used ImageNet dataset. The first part of the study involves training and evaluating models based on Deep Convolutional Neural Networks on the ImageNet dataset, which comprises millions of labeled images spanning thousands of categories. Well-established network architectures such as AlexNet, VGGNet, GoogLeNet, and ResNet are employed, and their performance on the challenging task of image classification is assessed. Rigorous experiments highlight the strengths and weaknesses of each model and underline the prospects of transfer learning and fine-tuning. Following this, the interpretability of Deep Convolutional Neural Networks is explored by using visualization techniques to understand the learned feature representations. Visualizing activation maps and class-specific saliency maps yields insight into the regions of interest that guide the models' decisions. Moreover, the correlation between the features extracted by these models and human visual attention mechanisms is examined to shed light on where the models focus their attention. The study also addresses the challenges that adversarial attacks, data bias, and limited generalization pose to Deep Convolutional Neural Networks. Strategies to enhance the robustness and adaptability of the models across various domains are examined, and these observations are linked to human cognitive behavior.
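The comparison between class-specific saliency maps and human visual attention mentioned in the abstract can be illustrated with a minimal sketch. This is not the study's method: `toy_class_score`, `WEIGHTS`, and the `human_attention` map are hypothetical stand-ins (a tiny linear scorer in place of a trained network, and made-up fixation densities), and the gradient is approximated by finite differences rather than backpropagation.

```python
import math

def saliency_map(score_fn, image, eps=1e-4):
    """Approximate |d score / d pixel| with central finite differences."""
    h, w = len(image), len(image[0])
    sal = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            orig = image[i][j]
            image[i][j] = orig + eps
            up = score_fn(image)
            image[i][j] = orig - eps
            down = score_fn(image)
            image[i][j] = orig  # restore the pixel before moving on
            sal[i][j] = abs(up - down) / (2 * eps)
    return sal

def pearson(a, b):
    """Pearson correlation between two equally sized 2-D maps."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical class scorer: a linear model that weights the center pixel most.
WEIGHTS = [[0.0, 0.1, 0.0],
           [0.1, 1.0, 0.1],
           [0.0, 0.1, 0.0]]

def toy_class_score(img):
    return sum(WEIGHTS[i][j] * img[i][j] for i in range(3) for j in range(3))

# Illustrative 3x3 "image" and a made-up human fixation density map.
image = [[0.2, 0.5, 0.1],
         [0.4, 0.9, 0.3],
         [0.0, 0.6, 0.2]]
human_attention = [[0.0, 0.2, 0.0],
                   [0.2, 1.0, 0.2],
                   [0.0, 0.2, 0.0]]

sal = saliency_map(toy_class_score, image)
r = pearson(sal, human_attention)
print(f"saliency/attention correlation: {r:.3f}")
```

For a real network, the saliency map would typically come from the gradient of the class score with respect to the input pixels, and the human map from eye-tracking data; the correlation step stays the same.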

Leveraging Attention-Based Visual Clue Extraction for Image Classification

Class attention network for image recognition

Learning More Discriminative Clues with Gradual Attention for Fine-Grained Visual Categorization

Attention-Aware Deep Feature Embedding for Remote Sensing Image Scene Classification

Attention-based cropping and erasing learning with coarse-to-fine refinement for fine-grained visual classification

Attention Graph: Learning Effective Visual Features for Large-Scale Image Classification

GEA-net - Global Embedded Attention Neural Network for Image Classification

Research on image classification based on residual group multi-scale enhanced attention network

Beyond the Attention: Distinguish the Discriminative and Confusable Features For Fine-grained Image Classification

Aggregate Attention Module for Fine-Grained Image Classification

An Image Classification Method Based on Adaptive Attention Mechanism and Feature Extraction Network

Attention-aware Perceptual Enhancement Nets for Low-Resolution Image Classification

Dense Attention Convolutional Network for Image Classification

Visual Attention in Multi-Label Image Classification

Research on image classification leveraging deep convolutional neural networks and visual cognition

Attend and Guide (AG-Net): A Keypoints-driven Attention-based Deep Network for Image Recognition

Feature Channel Adaptive Enhancement for Fine-Grained Visual Classification

Fine-grained Image Recognition Via Attention Interaction and Counterfactual Attention Network

Deep Learning Methods With the Improved Attention for Explainable Image Recognition

Attending Category Disentangled Global Context for Image Classification

Focus Longer to See Better: Recursively Refined Attention for Fine-Grained Image Classification