Abstract:Introduction: During the last few years, a heightened interest has been shown in classifying scene images depicting diverse robotic environments. The surge in interest can be attributed to significant improvements in visual sensor technology, which has enhanced image analysis capabilities. Methods: Advances in vision technology have a major impact on the areas of multiple object detection and scene understanding. These tasks are an integral part of a variety of technologies, including integrating scenes in augmented reality, facilitating robot navigation, enabling autonomous driving systems, and improving applications in tourist information. Despite significant strides in visual interpretation, numerous challenges persist, encompassing semantic understanding, occlusion, orientation, insufficient availability of labeled data, uneven illumination including shadows and lighting, variation in direction, and object size and changing background. To overcome these challenges, we proposed an innovative scene recognition framework, which proved to be highly effective and yielded remarkable results. First, we perform preprocessing using kernel convolution on scene data. Second, we perform semantic segmentation using UNet segmentation. Then, we extract features from these segmented data using discrete wavelet transform (DWT), Sobel and Laplacian, and textual (local binary pattern analysis). To recognize the object, we have used deep belief network and then find the object-to-object relation. Finally, AlexNet is used to assign the relevant labels to the scene based on recognized objects in the image. Results: The performance of the proposed system was validated using three standard datasets: PASCALVOC-12, Cityscapes, and Caltech 101. The accuracy attained on the PASCALVOC-12 dataset exceeds 96% while achieving a rate of 95.90% on the Cityscapes dataset. Discussion: Furthermore, the model demonstrates a commendable accuracy of 92.2% on the Caltech 101 dataset. This model showcases noteworthy advancements beyond the capabilities of current models.

Object segmentation for image indexing in large database

ObjectFusion: an Object Detection and Segmentation Framework with RGB-D SLAM and Convolutional Neural Networks

Resource-Constrained Simultaneous Detection and Labeling of Objects in High-Resolution Satellite Images

Joint Segmentation and Recognition of Categorized Objects from Noisy Web Image Collection

Unsupervised Object Cosegmentation Method Devoted to Image Classification

Pos-DANet: A dual-branch awareness network for small object segmentation within high-resolution remote sensing images

CNN Refinement Based Object Recognition Through Optimized Segmentation

Pixel Objectness: Learning to Segment Generic Objects Automatically in Images and Videos

Efficient Object Detection and Segmentation for Fine-Grained Recognition

A Lightweight YOLOv5-Based Model with Feature Fusion and Dilation Convolution for Image Segmentation

Efficient Object Segmentation and Recognition Using Multi-Layer Perceptron Networks

Attention-Guided Multi-Scale Fusion Network for Similar Objects Semantic Segmentation

HB-net: Holistic bursting cell cluster integrated network for occluded multi-objects recognition

Segmenting objects with Bayesian fusion of active contour models and convnet priors

Fast and Accurate Online Video Object Segmentation Via Tracking Parts.

Fast-SegNet: fast semantic segmentation network for small objects

Cooperative Object Search and Segmentation in Internet Images

Object-Guided Instance Segmentation for Biological Images

Remote intelligent perception system for multi-object detection

SegEIR-Net: A Robust Histopathology Image Analysis Framework for Accurate Breast Cancer Classification

Real-Time Image Segmentation via Hybrid Convolutional-Transformer Architecture Search