D3D: Conditional Diffusion Model for Decision-Making under Random Frame Dropping

Bo Xia, Yifu Luo, Yongzhe Chang, Bo Yuan, Zhiheng Li, Xueqian Wang
DOI: https://doi.org/10.1109/ro-man60168.2024.10731194
2024-01-01
Abstract: Frame drops caused by corrupted communications or malfunctioning sensors present a significant challenge to an agent’s decision-making, especially in remote control scenarios. Classical reinforcement learning (RL) usually assumes a continuous data stream without frame drops and relies heavily on online interactions, which are time-consuming, resource-intensive, and often impractical. Consequently, the performance of RL may deteriorate significantly in the face of non-negligible frame drops. To tackle this challenge, we propose Conditional Diffusion Model for Decision-Making under Random Frame Dropping (D3D), an offline algorithm that effectively enhances performance robustness in frame dropping scenarios. D3D addresses the issue through a two-phase approach: 1) During the policy generation phase, D3D adopts a return-conditional diffusion model for decision making rather than temporal difference learning, deriving its policy from offline datasets of return-labeled trajectories without information loss. 2) When frame dropping occurs during evaluation, D3D seamlessly substitutes the missing state with its corresponding prediction in the horizon made by the diffusion model. Extensive experiments are conducted on MuJoCo and Adroit tasks to validate D3D’s robustness and efficiency. The results demonstrate that D3D consistently outperforms state-of-the-art RL algorithms, especially on tasks featuring severe drop rates.
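The evaluation-time substitution described in phase 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function `fill_dropped_frames`, the callable `plan_states` (a stand-in for the diffusion model that returns predicted future states over a horizon), the use of `None` to mark a dropped frame, and the rule of reusing the last prediction when drops outlast the horizon are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Sequence

State = List[float]


def fill_dropped_frames(
    observations: Sequence[Optional[State]],
    plan_states: Callable[[State], List[State]],
) -> List[State]:
    """Return the state used at each step, substituting predictions for drops.

    `plan_states(s_t)` is a hypothetical stand-in for the diffusion model: it
    returns predicted future states [s_{t+1}, ..., s_{t+H}] with H >= 1.
    A dropped frame is represented by `None` (an assumption of this sketch).
    """
    if not observations or observations[0] is None:
        raise ValueError("the first frame must be observed to seed the plan")

    used: List[State] = []
    plan: List[State] = []
    offset = 0  # consecutive dropped frames since the last real observation

    for obs in observations:
        if obs is not None:
            # Frame arrived: use it and refresh the predicted horizon.
            state = obs
            plan = plan_states(state)
            offset = 0
        else:
            # Frame dropped: use the matching prediction from the last plan.
            # Assumption: if drops outlast the horizon, reuse the last prediction.
            state = plan[min(offset, len(plan) - 1)]
            offset += 1
        used.append(state)
    return used


# Toy usage: a made-up planner that predicts the state increasing by 1 per step.
toy_planner = lambda s: [[s[0] + k] for k in range(1, 4)]
print(fill_dropped_frames([[0.0], None, None, [3.5], None], toy_planner))
# → [[0.0], [1.0], [2.0], [3.5], [4.5]]
```

In this sketch the plan is refreshed only when a real observation arrives, so consecutive drops walk further along the same predicted horizon. Whether D3D replans from substituted states is not stated in the abstract.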