Subspace-wise Hybrid RL for Articulated Object Manipulation

Yujin Kim, Sol Choi, Bum-Jae You, Keunwoo Jang, Yisoo Lee
2024-12-12
Abstract: Articulated object manipulation is a challenging task, requiring constrained motion and adaptive control to handle the unknown dynamics of the manipulated objects. While reinforcement learning (RL) has been widely employed to tackle various scenarios and types of articulated objects, the complexity of these tasks, stemming from multiple intertwined objectives, makes learning a control policy in the full task space highly difficult. To address this issue, we propose a Subspace-wise hybrid RL (SwRL) framework that learns policies for each divided task space, or subspace, based on independent objectives. This approach enables adaptive force modulation to accommodate the unknown dynamics of objects. Additionally, it effectively leverages the previously underlooked redundant subspace, thereby maximizing the robot's dexterity. Our method enhances both learning efficiency and task execution performance, as validated through simulations and real-world experiments. Supplementary video is available at https://youtu.be/PkNxv0P8Atk
Robotics
What problem does this paper attempt to address?
The problem this paper addresses is **how to effectively manipulate articulated objects (such as doors, valves and drawers)**. These tasks require constrained motion and adaptive control because the dynamics of the object are unknown. Reinforcement learning (RL) has been widely applied to various scenarios and types of articulated objects, but these tasks combine several intertwined objectives, which makes it very difficult to learn a single control policy over the full task space.

To solve this problem, the authors propose a framework named **Subspace-wise hybrid RL (SwRL)**. The framework divides the task space into subspaces and learns a policy for each subspace based on its own independent objective. Specifically:

1. **Geometric Constraint Subspace (SG)**: Deals with constraints related to the geometric configuration of the object.
2. **Kinematics Constraint Subspace (SK)**: Ensures that the end-effector follows the kinematic structure of the object joints, which is achieved through force control.
3. **Redundancy Subspace (SR)**: Uses the degrees of freedom that are not directly constrained to optimize other criteria, such as avoiding collisions or reducing joint stress.

In this way, the SwRL framework adapts better to the unknown dynamics of the object and exploits the redundant subspace to increase the robot's dexterity. The method also improves learning efficiency and task execution performance, as verified in simulations and real-world experiments.
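The partitioning idea above can be illustrated with a small sketch. It assumes a six-dimensional task space and 0/1 diagonal selection vectors; the axis names and the assignment of axes to SG, SK and SR are hypothetical and chosen only for illustration, since the actual assignment depends on the object and the grasp. The names `AXES`, `SUBSPACES`, `project` and `combine` are likewise made up here.

```python
# Hypothetical assignment of six task-space axes to the three subspaces.
# This is an illustration only, not the assignment used in the paper.
AXES = ("x", "y", "z", "rx", "ry", "rz")
SUBSPACES = {
    "SG": {"x", "y", "rx", "ry"},  # geometric constraint
    "SK": {"z"},                   # kinematic constraint (force-controlled)
    "SR": {"rz"},                  # redundant degrees of freedom
}


def selection_diagonal(name):
    """Return the 0/1 diagonal that selects the axes of one subspace."""
    return [1.0 if axis in SUBSPACES[name] else 0.0 for axis in AXES]


def project(vec, name):
    """Keep only the components of a 6D vector that lie in the subspace."""
    return [s * v for s, v in zip(selection_diagonal(name), vec)]


def combine(commands):
    """Sum per-subspace commands into a single 6D task-space command.

    Each command is first projected onto its own subspace, so a policy
    cannot write into axes that belong to another subspace.
    """
    total = [0.0] * len(AXES)
    for name, vec in commands.items():
        for i, component in enumerate(project(vec, name)):
            total[i] += component
    return total


if __name__ == "__main__":
    per_policy = {
        "SG": [0.1, 0.2, 9.0, 0.0, 0.0, 9.0],  # the 9.0 entries are masked out
        "SK": [0.0, 0.0, 5.0, 0.0, 0.0, 0.0],
        "SR": [0.0, 0.0, 0.0, 0.0, 0.0, 0.3],
    }
    print(combine(per_policy))
```

Because the three selection diagonals are disjoint and sum to all ones, every task-space axis is driven by exactly one policy, which is what lets each policy pursue its own objective.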
### Formula Presentation

In the SwRL framework, the hybrid force/motion control law is:

\[ \tau = J^T \left( \Lambda S (K_p X_e + K_d V_e) + (I - S) F_d \right) \]

where:

- \(\tau\) is the joint torque vector;
- \(J\) is the Jacobian matrix;
- \(\Lambda\) is the task-space inertia matrix, \(\Lambda = J^{+T} M J^{+}\), where \(M\) is the joint-space inertia matrix and the superscript \(+\) denotes the pseudo-inverse;
- \(S\) is the selection matrix for motion control;
- \(K_p\) and \(K_d\) are the proportional and derivative gain matrices respectively;
- \(X_e\) and \(V_e\) are the position and velocity error vectors in the task space respectively;
- \(F_d\) is the desired force vector.

The generated force depends on the joint type, with \(F\) the magnitude of the applied force:

- For a prismatic joint, the force acts along the direction of motion: \[ F_{\text{prismatic}} = F \, \frac{v}{\|v\|} \] where \(v\) is the velocity vector parallel to the \(Z_o\) axis.
- For a revolute joint, the force acts along the tangential direction: \[ F_{\text{revolute}} = F \, \frac{\omega \times r}{\|\omega \times r\|} \] where \(\omega\) is the angular velocity vector and \(r\) is the position vector relative to the \(Z_o\) axis.

In this way, SwRL can achieve more efficient and flexible control in complex articulated-object manipulation tasks.
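As a rough numerical illustration of the expressions above, the following NumPy sketch evaluates the control law and the two joint-force directions. It is not the authors' implementation. The function names and the test values are hypothetical, and the joint-force expressions are read here as a scalar magnitude `F` applied along a unit direction, which is an assumption about the intended formulas.

```python
import numpy as np


def task_space_inertia(J, M):
    """Lambda = J^{+T} M J^{+}, with J^{+} the Moore-Penrose pseudo-inverse."""
    J_pinv = np.linalg.pinv(J)
    return J_pinv.T @ M @ J_pinv


def hybrid_torque(J, M, S, Kp, Kd, x_err, v_err, F_d):
    """tau = J^T (Lambda S (Kp x_err + Kd v_err) + (I - S) F_d).

    S selects the motion-controlled task-space directions;
    (I - S) selects the force-controlled ones.
    """
    Lam = task_space_inertia(J, M)
    identity = np.eye(S.shape[0])
    motion_term = Lam @ S @ (Kp @ x_err + Kd @ v_err)
    force_term = (identity - S) @ F_d
    return J.T @ (motion_term + force_term)


def prismatic_force(F, v):
    """Force of magnitude F along the unit velocity direction (assumed reading)."""
    v = np.asarray(v, dtype=float)
    return F * v / np.linalg.norm(v)


def revolute_force(F, omega, r):
    """Force of magnitude F along the unit tangent omega x r (assumed reading)."""
    tangent = np.cross(np.asarray(omega, dtype=float), np.asarray(r, dtype=float))
    return F * tangent / np.linalg.norm(tangent)


if __name__ == "__main__":
    # Hypothetical 7-joint arm with a trivial Jacobian and unit inertia.
    J = np.hstack([np.eye(6), np.zeros((6, 1))])
    M = np.eye(7)
    S = np.diag([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])  # first three axes motion-controlled
    Kp, Kd = 2.0 * np.eye(6), np.zeros((6, 6))
    x_err, v_err = np.ones(6), np.zeros(6)
    F_d = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 5.0])
    print(hybrid_torque(J, M, S, Kp, Kd, x_err, v_err, F_d))
    print(revolute_force(10.0, [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]))
```

Both direction helpers divide by a norm, so they are undefined when the object is at rest (`v = 0`) or when `r` is parallel to `omega`; a practical controller would need a fallback for those cases.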