M. Planer, J. M. Sierchio, for BAE Systems
Abstract: We explore the impact of environmental conditions on the competency of machine learning agents and how real-time competency assessments improve the reliability of ML agents. We learn a representation of conditions which impact the strategies and performance of the ML agent, enabling determination of actions the agent can make to maintain operator expectations, in the case of a convolutional neural network that leverages visual imagery to aid in the obstacle avoidance task of a simulated self-driving vehicle.
What problem does this paper attempt to address?
This paper addresses the performance and reliability of machine learning (ML) systems under varying environmental conditions, and in particular how to assess and improve the competency of ML systems in real time. Specifically, the paper focuses on:
1. **The impact of environmental conditions on the competency of machine learning agents**:
   - Machine learning models (such as random forests and neural networks) usually lack inherent interpretability, and their outputs do not come with a clear estimate of expected accuracy.
   - ML models often encounter situations outside the scope of their training data, which can lead to degraded performance or unreliable results.
2. **Real-time competency assessment to improve the reliability of ML systems**:
   - The paper proposes learning a representation of the conditions that affect the ML agent's strategies and performance, and using it to determine which actions the agent can take to maintain the operator's expectations.
   - The approach is demonstrated on a simulated self-driving vehicle, where a convolutional neural network (CNN) uses visual imagery to aid obstacle avoidance.
### Main research contents
- **Definition and evaluation of competency**:
  - **Performance metrics**: For example, when estimating the distance to an obstacle, the mean squared error (MSE) can serve as the performance metric.
  - **Strategy**: The specific behavior pattern the ML agent adopts when completing a task. For a CNN, the activation pattern produced by an input image can be regarded as a behavior, and a collection of similar activation patterns is considered a strategy.
- **Learning and predicting the competency of ML agents**:
  - An agent based on the AlexNet CNN was trained to estimate the distance to obstacles as an aid to collision avoidance. The training data comprised 80,000 images generated in the Gazebo simulation environment under different environmental conditions (such as rain, snow, dusk, and night).
  - Topic distributions representing competency-controlling conditions were generated with hierarchical Dirichlet processes (HDPs) to better capture how environmental conditions affect the ML agent's performance.
- **Evaluating the performance of the competency-aware system**:
  - **Coverage**: Measures the ability of the competency-aware ML (CAML) system to correctly identify competency-controlling conditions. Experimental results show coverage above 95%.
  - **Correctness**: Reports the proportion of strategies that CAML predicts correctly. Correctness is 90% when a single strategy is estimated and rises to 100% when the ML agent's internal variability is taken into account.
  - **Fidelity**: Verifies whether a given estimate falls within the expected performance range. When evaluated with complex conditions derived from HDPs, the fidelity score reached 99%.
  - **Reliability**: Measures how often the ML agent meets or exceeds the requirements defined by the operator. For example, in sunny conditions, 99% of obstacles must be detected in time to avoid a collision.
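
The MSE, fidelity, and reliability measures described above can be illustrated with a minimal Python sketch. This summary does not give the paper's exact formulas, so the definitions below are assumptions based only on the wording of the bullets, and the function names (`mean_squared_error`, `fidelity`, `reliability`) are hypothetical.

```python
def mean_squared_error(predicted, actual):
    """MSE between predicted and true obstacle distances."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


def fidelity(observed, expected_low, expected_high):
    """Assumed definition: fraction of observed performance values
    that fall inside the expected performance range [low, high]."""
    if not observed:
        raise ValueError("observed must be non-empty")
    inside = sum(1 for v in observed if expected_low <= v <= expected_high)
    return inside / len(observed)


def reliability(requirement_met):
    """Assumed definition: fraction of trials in which the agent met
    or exceeded the operator-defined requirement (True/False per trial)."""
    if not requirement_met:
        raise ValueError("requirement_met must be non-empty")
    return sum(1 for ok in requirement_met if ok) / len(requirement_met)


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    print(mean_squared_error([10.0, 12.0], [11.0, 10.0]))
    print(fidelity([0.5, 1.5, 3.0, 0.9], 0.0, 2.0))
    print(reliability([True, True, False, True]))
```

Under these assumed definitions, an operator requirement such as "99% of obstacles detected in time under sunny conditions" would correspond to checking that `reliability(...)` for the sunny-condition trials is at least 0.99.
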
### Conclusion
The research in this paper aims to improve the reliability and performance of ML agents under different environmental conditions by assessing their competency in real time. The method not only helps improve the reliability of existing ML systems but can also be extended to other complex ML tasks, such as deep reinforcement learning and drone swarm control.