Abstract: Countries in South Asia regularly experience catastrophic flooding events. Image classification can expedite search and rescue initiatives by identifying flood zones, including those containing houses and humans. We create a new dataset of aerial imagery collected from flooding events across South Asian countries. For classification, we propose a fine-tuned Compact Convolutional Transformer (CCT) based approach, alongside several other cutting-edge transformer-based and Convolutional Neural Network (CNN) based architectures. We also implement the YOLOv8 object detection model to detect houses and humans within the imagery of our proposed dataset, and then compare its performance with our classification-based approach. Since the countries in South Asia have similar topography, housing structure, flood water color, and vegetation, this work is more applicable to this region than to the rest of the world. The images are divided evenly into four classes: 'flood', 'flood with domicile', 'flood with humans', and 'no flood'. On our proposed dataset, our fine-tuned CCT model, which has comparatively fewer weight parameters than many other transformer-based architectures designed for computer vision, achieves an accuracy and macro average precision of 98.62% and 98.50%, respectively. The other transformer-based architectures that we implement are the Vision Transformer (ViT), Swin Transformer, and External Attention Transformer (EANet), which give accuracies of 88.66%, 84.74%, and 66.56%, respectively. We also implement DCECNN (Deep Custom Ensembled Convolutional Neural Network), a custom ensemble model that we create by combining MobileNet, InceptionV3, and EfficientNetB0, and obtain an accuracy of 98.78%. The architectures we implement are fine-tuned to achieve optimal performance on our dataset.
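The abstract describes DCECNN as an ensemble of MobileNet, InceptionV3, and EfficientNetB0 but does not say how the three models' outputs are combined. As a rough illustration only, the sketch below assumes soft voting (averaging per-class probabilities), which is one common way to build such an ensemble and may differ from the authors' method. The function names and the probability values are hypothetical; only the four class names come from the abstract.

```python
from typing import List, Sequence

# The four scene classes named in the abstract.
CLASSES = ["flood", "flood with domicile", "flood with humans", "no flood"]


def soft_vote(prob_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities across models (soft voting).

    ASSUMPTION: the paper does not state its combination rule; averaging
    is used here purely as an illustrative choice.
    """
    n_models = len(prob_sets)
    n_classes = len(prob_sets[0])
    if any(len(p) != n_classes for p in prob_sets):
        raise ValueError("all models must score the same set of classes")
    return [sum(p[c] for p in prob_sets) / n_models for c in range(n_classes)]


def predict_label(prob_sets: Sequence[Sequence[float]]) -> str:
    """Return the class name with the highest averaged probability."""
    avg = soft_vote(prob_sets)
    best = max(range(len(avg)), key=avg.__getitem__)
    return CLASSES[best]


# Made-up softmax outputs for a single image (not results from the paper).
mobilenet_probs = [0.10, 0.60, 0.25, 0.05]
inception_probs = [0.05, 0.40, 0.50, 0.05]
efficientnet_probs = [0.10, 0.55, 0.30, 0.05]

all_probs = [mobilenet_probs, inception_probs, efficientnet_probs]
ensemble_probs = soft_vote(all_probs)
ensemble_label = predict_label(all_probs)
```

In this toy case one model prefers 'flood with humans' while the other two prefer 'flood with domicile'; averaging lets the two agreeing models outweigh the dissenting one.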

What problem does this paper attempt to address?

This paper aims to solve the problem of inefficient search and rescue (SAR) faced by South Asian countries during flood disasters. Specifically, the authors propose a fine-tuned attention-based architecture that classifies flood scenes from aerial images taken by drones in order to accelerate rescue operations.

#### Main problem background

1. **Frequent and severe flood disasters**:
   - South Asian countries such as Bangladesh, India, and Pakistan are often hit by floods. These floods not only damage houses and infrastructure but also cause large numbers of casualties and displacements.
   - For example, the flood in northeastern Bangladesh in June 2024 affected about 1.8 million people and inundated many houses; in 2022, the flood in Pakistan submerged one-third of the country's territory and affected 33 million people.
2. **Limitations of traditional rescue methods**:
   - When a flood occurs, government agencies and other aid organizations usually rely on boats and planes for physical search. This method is time-consuming and reduces rescue efficiency.
   - From the ground, houses and landmarks are covered by floodwater, making it difficult to locate survivors quickly.

#### Solutions

To solve the above problems, the authors propose the following methods:

1. **Construction of a new dataset**:
   - Collect and construct a new aerial image dataset covering flood events in South Asian countries, divided into four classes: 'flood', 'flood with domicile', 'flood with humans', and 'no flood'.
2. **Fine-tuned Compact Convolutional Transformer (CCT) and other models**:
   - Use the fine-tuned Compact Convolutional Transformer (CCT) and other cutting-edge Transformer and Convolutional Neural Network (CNN) architectures for classification experiments.
   - The experimental results show that the CCT model achieves high accuracy (98.62%) and macro-average precision (98.50%) on this dataset.
3. **Application of an object detection model**:
   - Implement the YOLOv8 object detection model to detect houses and humans in the images and compare the results with the classification method.
4. **Cross-dataset verification**:
   - Apply the same model to another public flood image dataset, FloodNet, to verify its generalization ability.

#### Expected effects

By introducing image classification, especially on aerial images obtained by drones, flood areas, houses, and people can be identified and mapped more quickly and accurately, thereby improving the efficiency of rescue operations and reducing the casualties and property losses caused by floods.

In conclusion, this paper aims to strengthen flood disaster response capacity in South Asia by combining advanced deep learning techniques with practical application requirements, in particular by providing more efficient support for search and rescue.
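The accuracy and macro-average precision figures quoted in the summary are both derived from a model's confusion matrix. The following is a minimal sketch of that arithmetic for a four-class problem. The matrix values are invented for illustration and are not results from the paper, and the function names are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def accuracy(cm: Matrix) -> float:
    """Fraction of all samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def macro_precision(cm: Matrix) -> float:
    """Unweighted mean of per-class precision.

    Rows are true classes and columns are predicted classes, so the
    precision of class j is cm[j][j] divided by the sum of column j.
    """
    n = len(cm)
    per_class = []
    for j in range(n):
        predicted_as_j = sum(cm[i][j] for i in range(n))
        # A class that is never predicted contributes 0.0 by convention here.
        per_class.append(cm[j][j] / predicted_as_j if predicted_as_j else 0.0)
    return sum(per_class) / n


# Toy confusion matrix (NOT from the paper), 50 images per class, in the order
# 'flood', 'flood with domicile', 'flood with humans', 'no flood'.
toy_cm = [
    [48, 1, 1, 0],
    [2, 47, 1, 0],
    [1, 2, 47, 0],
    [0, 0, 0, 50],
]

toy_accuracy = accuracy(toy_cm)
toy_macro_precision = macro_precision(toy_cm)
```

Because the dataset is described as evenly divided across the four classes, macro averaging weights every class equally without being skewed by class size, which is one reason the two metrics end up close to each other.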

Aerial Flood Scene Classification Using Fine-Tuned Attention-based Architecture for Flood-Prone Countries in South Asia

Transformer-based Flood Scene Segmentation for Developing Countries

UAVs in Disaster Management: Application of Integrated Aerial Imagery and Convolutional Neural Network for Flood Detection

DAM-Net: Global Flood Detection from SAR Imagery Using Differential Attention Metric-Based Vision Transformers

PDCA-Former: Prior-Diagonal Cross Attention-Guided Transformer for Flood Mapping from SAR Imagery: A Case in Khartoum

Enabling Quick, Accurate Crowdsourced Annotation for Elevation-Aware Flood Extent Mapping

Flood Detection Using Multi-Modal and Multi-Temporal Images: A Comparative Study

Application of Deep Learning on UAV-Based Aerial Images for Flood Detection

Optimized Deep Learning Model for Flood Detection Using Satellite Images

Multi-Scale and Context-Aware Framework for Flood Segmentation in Post-Disaster High Resolution Aerial Images

Floodwater Extraction from UAV Orthoimagery Based on a Transformer Model

DeepFlood: A deep learning based flood detection framework using feature-level fusion of multi-sensor remote sensing images

A Deep Learning-based Approach to Predict the Flood Patterns Using Sentinel-1A Time Series Images

Inferring the past: a combined CNN-LSTM deep learning framework to fuse satellites for historical inundation mapping

Flood Detection in Urban Areas Using Satellite Imagery and Machine Learning

FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding

Near Real-Time Flood Monitoring Using Multi-Sensor Optical Imagery and Machine Learning by GEE: An Automatic Feature-Based Multi-Class Classification Approach

Integrating deep learning, satellite image processing, and spatial-temporal analysis for urban flood prediction

Flood Susceptibility Assessment in Urban Areas via Deep Neural Network Approach

Detecting floodwater on roadways from image data with handcrafted features and deep transfer learning

Cross-Geography Generalization of Machine Learning Methods for Classification of Flooded Regions in Aerial Images