Abstract: To support the general problem of Autonomous Underwater/Surface Vehicle (AUV/ASV) based chemical detection and source localization, we propose a system that fuses an AUV/ASV running Q-learning with a real-time underwater mass spectrometer, which provides the feedback and reward signal for in situ source localization. Additionally, an autonomous sampler can be coupled to the system, permitting molecular material to be archived for subsequent expanded measurement and validation in the lab. Combining a real-time chemical sensor with archived sample capture and verification yields an adaptive sensing and sampling system. The in situ mass spectrometer allows real-time measurement of membrane-compatible chemistries such as volatile organic compounds (VOCs) and lightweight gases, while the sampler purifies, enriches and accurately isolates targeted molecular compounds in the field for subsequent full mass spectrometer analysis back in the lab. In the overall AUV system design, the battery-driven mass spectrometer provides real-time signals for reinforcement learning (RL) behaviors, and the portable adaptive sampling system automates sample collection, molecular purification/concentration and preservation. The mass spectrometer is of the membrane inlet type, and the automated sampler is a combination of customizable fluidic management systems, pumps, valve arrays and motion control systems. For field use, the prototype sampling module is designed for triggered sensing and sampling, but it can also be actuated to collect variable volumes over any period of time. The mass spectrometer and sampling systems can be hosted on AUVs/ASVs for most chemical source localization activities. The entire mobile system (AUV mobile platform, reinforcement learning controller, mass spectrometer, and sampler) constitutes an adaptive chemical sampling platform.
The ‘back end’ laboratory identification can be performed using any type of mass spectrometer and can provide high-confidence verification of the specific material archived. The results of the lab verification can also inform the design of the reward signal for subsequent Q-learning training on mass spectrometer data, increasing the accuracy of the source localization policy. Training a Q-learning based agent on mass spectrometer data allows the team to pretrain the agent with real sensory data similar to that which will be seen in the field during future deployments. Appropriately simulated data can approximate the anticipated environment and distribution patterns, supporting the development of a custom reward function that represents the mission objective. Preliminary simulations have shown promising results when testing the agent’s performance with a trained policy in a similar environment in which the location of a generic ‘pollution source’ has been perturbed from the training scenario. The policy is acquired by training on pollution data for a set environment, with the trade-off between exploration and exploitation defined appropriately for the environment size, pollution distribution and training duration to optimize the agent’s learning. That policy is then tested in a similar but slightly perturbed environment. This method can be applied to future missions to allow for continual policy updates based on the observed data. The approach is advantageous in two ways: it limits the need for operator-vehicle communication, giving the agent sufficient autonomy to locate the source based on its prior training, and it circumvents the need for a model-based decision and control approach, since the agent becomes better trained through real-world observations. This is a model-free learning approach requiring no a priori knowledge of the environment. 
This has a distinct benefit over model-based approaches, which depend on the accuracy and fidelity of the environmental model used to train the agent; building such a model is notoriously difficult both logistically and computationally.
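
The training stage described above can be illustrated with a minimal tabular Q-learning sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the 6x6 grid, the Gaussian `concentration` function standing in for the mass spectrometer signal, the reward (change in signal plus a bonus on reaching the source), the linear epsilon schedule, and all hyperparameter values. The table is indexed by vehicle position, so this sketch covers only training and a greedy rollout in the training environment; evaluating a perturbed source location, as in the preliminary simulations, would likely need a state built from sensor observations rather than position.

```python
import numpy as np

# Four grid moves: up, down, left, right (hypothetical action set).
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def concentration(pos, source, sigma=2.0):
    """Synthetic stand-in for the mass spectrometer signal (assumed Gaussian falloff)."""
    d2 = (pos[0] - source[0]) ** 2 + (pos[1] - source[1]) ** 2
    return float(np.exp(-d2 / (2.0 * sigma ** 2)))


def step(pos, action, size):
    """Move one cell, clipping at the grid boundary."""
    row = min(max(pos[0] + action[0], 0), size - 1)
    col = min(max(pos[1] + action[1], 0), size - 1)
    return (row, col)


def train(size=6, source=(4, 4), episodes=3000, max_steps=30,
          alpha=0.5, gamma=0.9, eps_start=1.0, eps_end=0.05, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy and linearly decaying epsilon."""
    rng = np.random.default_rng(seed)
    q = np.zeros((size, size, len(ACTIONS)))
    for ep in range(episodes):
        # Exploration/exploitation trade-off: explore early, exploit late.
        eps = eps_start + (eps_end - eps_start) * ep / max(episodes - 1, 1)
        pos = (int(rng.integers(size)), int(rng.integers(size)))
        if pos == source:
            continue  # nothing to learn when starting on the source
        for _ in range(max_steps):
            if rng.random() < eps:
                a = int(rng.integers(len(ACTIONS)))
            else:
                a = int(np.argmax(q[pos[0], pos[1]]))
            nxt = step(pos, ACTIONS[a], size)
            # Assumed reward: change in measured signal, plus a bonus at the source.
            reward = concentration(nxt, source) - concentration(pos, source)
            done = nxt == source
            if done:
                reward += 1.0
            best_next = 0.0 if done else float(np.max(q[nxt[0], nxt[1]]))
            target = reward + gamma * best_next
            q[pos[0], pos[1], a] += alpha * (target - q[pos[0], pos[1], a])
            pos = nxt
            if done:
                break
    return q


def rollout(q, start, source, size=6, max_steps=30):
    """Follow the greedy policy; return steps taken to reach the source, or None."""
    pos = start
    for t in range(max_steps + 1):
        if pos == source:
            return t
        if t == max_steps:
            break
        a = int(np.argmax(q[pos[0], pos[1]]))
        pos = step(pos, ACTIONS[a], size)
    return None


if __name__ == "__main__":
    q_table = train()
    print("steps to source from (0, 0):", rollout(q_table, (0, 0), (4, 4)))
```

The update inside `train` never consults a model of the plume: it uses only the measured change in signal, which is the sense in which the approach is model-free. In a deployment, `concentration` would be replaced by the live instrument reading.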

Deep Reinforcement Multi-agent Learning framework for Information Gathering with Local Gaussian Processes for Water Monitoring

Informative Deep Reinforcement Path Planning for Heterogeneous Autonomous Surface Vehicles in Large Water Resources

Intelligent Wide-Area Water Quality Monitoring and Analysis System Exploiting Unmanned Surface Vehicles and Ensemble Learning

Censored deep reinforcement patrolling with information criterion for monitoring large water resources using Autonomous Surface Vehicles

UW-MARL: Multi-Agent Reinforcement Learning for Underwater Adaptive Sampling using Autonomous Vehicles

AquaFeL-PSO: A Monitoring System for Water Resources using Autonomous Surface Vehicles based on Multimodal PSO and Federated Learning

Multi-AUV Cooperative Localization in Adaptive Sampling for Marine Environmental Monitoring

Asynchronous Localization for Underwater Acoustic Sensor Networks: A Continuous Control Deep Reinforcement Learning Approach

Smart Underwater Pollution Detection Based on Graph-Based Multi-Agent Reinforcement Learning Towards AUV-Based Network ITS

Multi-vehicle Dynamic Water Surface Monitoring

Multi-agent reinforcement learning framework for real-time scheduling of pump and valve in water distribution networks

DeepAqua: Self-Supervised Semantic Segmentation of Wetland Surface Water Extent with SAR Images using Knowledge Distillation

Multi-Agent Deep Reinforcement Learning Framework Strategized by Unmanned Aerial Vehicles for Multi-Vessel Full Communication Connection

Integrating vision-based AI and large language models for real-time water pollution surveillance

Multi-USVs Coordinated Detection in Marine Environment with Deep Reinforcement Learning

Hierarchical Heterogeneous Multi-Agent Cross-Domain Search Method Based on Deep Reinforcement Learning

Using Model-Free Reinforcement Learning Combined With Underwater Mass Spectrometer and Material Archiving Coupled to Lab Analysis for Autonomous Chemical Source Verifications

Hierarchical probabilistic regression for AUV-based adaptive sampling of marine phenomena

A Deep Reinforcement Learning Framework and Methodology for Reducing the Sim-to-Real Gap in ASV Navigation

Data-Driven Learning and Planning for Environmental Sampling

Optimizing Plastic Waste Collection in Water Bodies Using Heterogeneous Autonomous Surface Vehicles with Deep Reinforcement Learning