Abstract: Robotic cloth manipulation faces challenges due to the fabric's complex dynamics and the high dimensionality of configuration spaces. Previous methods have largely focused on isolated smoothing or folding tasks and are overly reliant on simulations, often failing to bridge the significant sim-to-real gap in deformable object manipulation. To overcome these challenges, we propose a two-stream architecture with sequential and spatial pathways, unifying smoothing and folding tasks into a single adaptable policy model that accommodates various cloth types and states. The sequential stream determines the pick and place positions for the cloth, while the spatial stream, using a connectivity dynamics model, constructs a visibility graph from partial point cloud data of the self-occluded cloth, allowing the robot to infer the cloth's full configuration from incomplete observations. To bridge the sim-to-real gap, we utilize a hand tracking detection algorithm to gather and integrate human demonstration data into our novel end-to-end neural network, improving real-world adaptability. Our method, validated on a UR5 robot across four distinct cloth folding tasks with different goal shapes, consistently achieves folded states from arbitrary crumpled initial configurations, with success rates of 99%, 99%, 83%, and 67%. It outperforms existing state-of-the-art cloth manipulation techniques and demonstrates strong generalization to unseen cloth with diverse colors, shapes, and stiffness in real-world experiments. Videos and source code are available at: https://zcswdt.github.io/SSFold/

What problem does this paper attempt to address?

This paper addresses the challenges that robots face when manipulating cloth. The complex dynamics and high-dimensional configuration space of cloth make it difficult for robots to smooth and fold it effectively. Previous methods usually focus on isolated smoothing or folding tasks and rely heavily on simulation, which leaves a large sim-to-real gap and limits how well they adapt to deformable objects in the real world.

To solve these problems, the authors propose a framework named SSFold. It unifies the smoothing and folding tasks by combining human demonstration data with a learned two-stream policy, and it handles various types of cloth and initial states. Specific contributions include:

1. **Two-stream architecture**: A two-stream architecture with sequential and spatial pathways unifies the smoothing and folding tasks into a single adaptable policy model.
2. **Visibility graph construction**: A visibility graph is constructed from partial point-cloud data to overcome cloth self-occlusion, enabling the robot to infer the complete configuration of the cloth from incomplete observations.
3. **Human demonstration data integration**: A hand-tracking detection algorithm is used to collect human demonstration data and integrate it into the network, improving the model's adaptability and generalization in the real world.
4. **Efficient data collection**: Hand tracking and keypoint detection run on a low-cost monocular camera system, avoiding the need for complex and expensive traditional equipment.

Finally, SSFold was validated on a UR5 robot across four cloth-folding tasks with different goal shapes, reaching the target folded shape from arbitrary crumpled initial configurations with success rates of 99%, 99%, 83%, and 67%, respectively. According to the authors, the method also generalizes to unseen cloth with diverse colors, shapes, and stiffness.

### Formula summary

- **Definition of the edges of the visibility graph**:

  \[ E_C=\{e_{ij}\mid \|v_i - v_j\|_2 < R\} \]

  where \(e_{ij}\) represents the connection between nodes \(v_i\) and \(v_j\), and \(R\) is the distance threshold.

- **Optimal placement position selection**:

  \[ i^*=\arg\max_i\,\max_{(u,v)}P_i\left(u, v\mid T_{\text{pick}}^i\right) \]

  \[ T_{\text{place}}=\arg\max_{(u,v)}P_{i^*}\left(u, v\mid T_{\text{pick}}^{i^*}\right) \]

- **Grasping direction optimization**:

  \[ x = \frac{T_{\text{pick}}-T_{\text{place}}}{\|T_{\text{pick}}-T_{\text{place}}\|} \]

  \[ y=\frac{[0,0,-1]\times x}{\|[0,0,-1]\times x\|} \]

  \[ x = y\times[0,0,-1] \]

  \[ R=[x\quad y\quad[0,0,-1]] \]

  The third line recomputes \(x\) so that it is orthogonal to both \(y\) and the vertical axis \([0,0,-1]\). Here \(R\) is the matrix whose columns are these three axes, not the distance threshold used in the edge definition.

Together, these formulas describe how SSFold connects the cloth graph, selects the pick-and-place action, and sets the grasp orientation.
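The edge definition for the visibility graph can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `radius_graph_edges`, the sample points, and the 0.05 threshold are hypothetical, and the sketch assumes undirected edges with self-loops excluded, which the formula does not state explicitly.

```python
import numpy as np

def radius_graph_edges(points: np.ndarray, radius: float) -> np.ndarray:
    """Return index pairs (i, j) with i < j whose Euclidean distance is below `radius`.

    Hypothetical helper: `points` is an (N, 3) array of observed cloth points.
    """
    # Pairwise differences via broadcasting, shape (N, N, 3)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Strict upper triangle: each undirected edge once, no self-loops (assumption)
    i, j = np.nonzero(np.triu(dist < radius, k=1))
    return np.stack([i, j], axis=1)

# Illustrative points: the first two are 3 cm apart, the third is far away
points = np.array([[0.00, 0.0, 0.0],
                   [0.03, 0.0, 0.0],
                   [0.20, 0.0, 0.0]])
edges = radius_graph_edges(points, radius=0.05)
print(edges)
```

The dense distance matrix needs O(N²) memory, so for large point clouds a spatial index such as a KD-tree would likely be the more practical choice.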
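The two-step placement selection can be sketched as follows. The sketch reads \(P_i(u, v \mid T_{\text{pick}}^i)\) as a per-pixel placement score map conditioned on the i-th pick candidate, which is an interpretation of the formula rather than something the summary spells out. The function name, array shapes, and the (row, column) pixel convention are all assumptions made for illustration.

```python
import numpy as np

def select_pick_and_place(place_maps: np.ndarray, pick_candidates: np.ndarray):
    """Pick the candidate whose best placement score is highest, then its best pixel.

    Hypothetical shapes: place_maps is (K, H, W), where place_maps[i] plays the
    role of P_i(u, v | T_pick^i); pick_candidates is (K, 2) pixel coordinates.
    """
    k, h, w = place_maps.shape
    flat = place_maps.reshape(k, -1)
    # Inner max over (u, v), then outer argmax over candidates i
    i_star = int(flat.max(axis=1).argmax())
    # argmax over (u, v) of the selected candidate's map
    u, v = np.unravel_index(int(flat[i_star].argmax()), (h, w))
    pick = tuple(int(c) for c in pick_candidates[i_star])
    return i_star, pick, (int(u), int(v))

# Illustrative data: candidate 1 has the higher peak score
maps = np.zeros((2, 4, 4))
maps[0, 1, 2] = 0.6
maps[1, 3, 0] = 0.9
picks = np.array([[0, 0], [2, 2]])
i_star, t_pick, t_place = select_pick_and_place(maps, picks)
print(i_star, t_pick, t_place)
```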
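The grasping direction formulas amount to building an orthonormal frame whose third axis points straight down. A minimal NumPy sketch, with a hypothetical function name and made-up coordinates, follows the four equations line by line:

```python
import numpy as np

def grasp_rotation(t_pick, t_place) -> np.ndarray:
    """Build R = [x y z] with z = [0, 0, -1], following the summary's formulas.

    Hypothetical helper; t_pick and t_place are 3D positions.
    """
    z = np.array([0.0, 0.0, -1.0])
    # x: unit vector along T_pick - T_place, as written in the summary
    x = np.asarray(t_pick, dtype=float) - np.asarray(t_place, dtype=float)
    x /= np.linalg.norm(x)
    # y: horizontal axis orthogonal to both z and x
    y = np.cross(z, x)
    y /= np.linalg.norm(y)
    # Recompute x so that it is exactly orthogonal to y and z
    x = np.cross(y, z)
    return np.column_stack([x, y, z])

# Illustrative points with a small height difference between pick and place
R = grasp_rotation([0.4, 0.1, 0.05], [0.2, -0.1, 0.0])
print(R)
```

The construction is undefined when the pick and place points are vertically aligned, because the cross product in the second step then has zero length. A real controller would need a fallback for that case.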

SSFold: Learning to Fold Arbitrary Crumpled Cloth Using Graph Dynamics from Human Demonstration
