Abstract: International Journal of Software Engineering and Knowledge Engineering, Ahead of Print. Continuous automated testing throughout each cycle helps ensure the security of the continuous integration (CI) development lifecycle. Test case prioritization (TCP) is a critical factor in optimizing automated testing: it schedules the test cases most likely to fail first, which improves testing efficiency. In CI automated testing, TCP is a continuous decision-making process that can be solved with reinforcement learning (RL). RL-based CITCP continuously generates a TCP strategy for each CI development lifecycle, with the reward mechanism at its core. The reward mechanism consists of the reward function and the reward strategy. However, RL-based CITCP faces new challenges in real industrial CI testing. First, under high-frequency iteration, the reward function is often calculated from a fixed length of historical information, ignoring the spatial characteristics of the current cycle. To address this, a dynamic time window (DTW)-based reward function is proposed, which adaptively adjusts the range of recent historical information according to the integration cycle. Second, under low-failure testing, the reward strategy usually rewards only failed test cases, which creates a sparse reward problem in RL. To address this, a similarity-based reward strategy is proposed, which extends the reward to some passed test cases that are similar to the failed test cases. The DTW-based reward function and the similarity-based reward strategy together constitute the proposed adaptive reward mechanism for RL-based CITCP. To validate the effectiveness of the adaptive reward mechanism, experiments are carried out on 13 industrial data sets. The experimental results show that the adaptive reward mechanism improves TCP performance: the average NAPFD improves by up to 7.29%, the average Recall improves by up to 6.04%, and the average TTF improves by 6.81 positions, with a maximum of 63.77.
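To make the two ideas in the abstract concrete, the following Python sketch shows one possible way a window-based reward and a similarity-based reward extension could fit together. It is not the authors' implementation: the abstract does not give the formulas, so the window rule (`window_length`), the similarity measure (share of matching recent verdicts), the `threshold`, and the `discount` applied to passed test cases are all assumptions chosen for illustration, as are all function and parameter names.

```python
from typing import Dict, List

# Verdict history per test case: 1 = failed, 0 = passed, most recent last.
# All names and rules below are illustrative assumptions, not the paper's API.
History = Dict[str, List[int]]


def window_length(cycles_seen: int, min_len: int = 2, max_len: int = 10) -> int:
    """Hypothetical window rule: grow with the cycle count, within fixed bounds."""
    return max(min_len, min(max_len, cycles_seen // 2))


def windowed_failure_rate(verdicts: List[int], window: int) -> float:
    """Fraction of failures among the last `window` recorded verdicts."""
    recent = verdicts[-window:]
    return sum(recent) / len(recent) if recent else 0.0


def similarity(a: List[int], b: List[int], window: int) -> float:
    """Assumed similarity: share of matching verdicts over the recent window."""
    ra, rb = a[-window:], b[-window:]
    n = min(len(ra), len(rb))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(ra[-n:], rb[-n:])) / n


def rewards(
    history: History,
    current: Dict[str, int],
    cycles_seen: int,
    threshold: float = 0.8,  # assumed cut-off for "similar enough"
    discount: float = 0.5,   # assumed down-weighting for passed test cases
) -> Dict[str, float]:
    """Reward failed tests by windowed failure rate; extend a discounted
    reward to passed tests whose recent history resembles a failed test's."""
    window = window_length(cycles_seen)
    failed = [t for t, v in current.items() if v == 1]
    out: Dict[str, float] = {}
    for test, verdict in current.items():
        past = history.get(test, [])
        if verdict == 1:
            # Include the current failure when computing the windowed rate.
            out[test] = windowed_failure_rate(past + [1], window)
        else:
            best = max(
                (similarity(past, history.get(f, []), window) for f in failed),
                default=0.0,
            )
            out[test] = discount * best if best >= threshold else 0.0
    return out
```

With this sketch, a passed test case receives a non-zero reward only when its recent verdicts closely match those of a test case that failed in the current cycle, which is one way to densify the reward signal when failures are rare.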

Dynamic Time Window Based Reward for Reinforcement Learning in Continuous Integration Testing.

Adaptive Reward Computation in Reinforcement Learning-Based Continuous Integration Testing

Security Development Lifecycle-Based Adaptive Reward Mechanism for Reinforcement Learning in Continuous Integration Testing Optimization

A systematic study of reward for reinforcement learning based continuous integration testing

TRCC: Transferable Congestion Control with Reinforcement Learning

Dynamic TCP Initial Windows and Congestion Control Schemes Through Reinforcement Learning

Reinforcement Learning for Test Case Prioritization

A Comparison of Reinforcement Learning Frameworks for Software Testing Tasks

Reducing Web Latency Through Dynamically Setting TCP Initial Window with Reinforcement Learning

ZiXia: A Reinforcement Learning Approach via Adjusted Ranking Reward for Internet Congestion Control

RAN Information-Assisted TCP Congestion Control Using Deep Reinforcement Learning with Reward Redistribution

A deep transfer‐learning‐based dynamic reinforcement learning for intelligent tightening system

DeepQTest: Testing Autonomous Driving Systems with Reinforcement Learning and Real-world Weather Data

An Incremental Optimization Approach to Address the Spatiotemporal Reward Coupling Effects in Deep Reinforcement Learning for Path Planning

RLTF: Reinforcement Learning from Unit Test Feedback

Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination

The Why, When, What, and How about Predictive Continuous Integration: A Simulation-Based Investigation

Reinforcement Learning for Safety Testing: Lessons from A Mobile Robot Case Study

Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off

Diagnosing Reinforcement Learning for Traffic Signal Control

Continual portfolio selection in dynamic environments via incremental reinforcement learning