Abstract: Employing cognitive robots capable of throwing and catching is a strategy for speeding up logistics in Industry 4.0 smart manufacturing plants, specifically the transportation of small manufacturing parts. Because the flight of mechanically thrown objects is inherently unpredictable, the catching robot must observe the initial trajectory with high precision and intelligently forecast the final catching position to ensure accurate real-time grasping. This study uses multi-camera tracking to monitor mechanically thrown objects. It involves the creation of a 3D simulation that supports controlled mechanical throwing of objects within the internal logistics environment of Industry 4.0. The simulation lets users define the attributes of the thrown object and capture its trajectory with a simulated pinhole camera, which can be placed at any position and orientation within the in-plant logistics environment of flexible manufacturing systems. The simulation enabled extensive experimentation to determine the optimal camera positions for accurately observing the 3D interception positions of a flying object from its apparent size on the camera's sensor plane. Subsequently, a variety of calibrated multi-camera setups were tested with cameras placed at the identified optimal positions, and the most effective multi-camera configuration was derived from the results. Finally, a training dataset was prepared from 3000 simulated throwing experiments, in which the initial part of each trajectory consists of interception positions observed through the best multi-camera setup and the final part consists of actual positions. An encoder–decoder Bi-LSTM deep neural network is proposed and trained on this dataset. 
The trained model outperformed the current state of the art by accurately predicting the final 3D catching point, achieving a mean absolute error of 5 mm and a root-mean-square error of 7 mm in 200 real-world test experiments.
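
The idea of recovering a 3D position from an object's apparent size on the sensor plane can be illustrated with a minimal sketch under an ideal pinhole model. Everything named here is an assumption made for illustration and is not taken from the study: the function names, the spherical object of known radius, the focal length in pixels (f_px), the principal point (cx, cy), the small-object approximation for the projected diameter, and the per-throw Euclidean error used for the two reported metrics.

```python
import numpy as np

def project_sphere(center_cam, radius, f_px, cx, cy):
    """Project a sphere centre (camera frame, metres) onto the image plane.

    Returns pixel coordinates (u, v) and the apparent diameter in pixels,
    using the small-object approximation d_px = f_px * 2r / Z.
    """
    X, Y, Z = center_cam
    u = f_px * X / Z + cx
    v = f_px * Y / Z + cy
    d_px = f_px * 2.0 * radius / Z
    return u, v, d_px

def backproject(u, v, d_px, radius, f_px, cx, cy):
    """Recover the 3D centre from pixel position and apparent diameter."""
    Z = f_px * 2.0 * radius / d_px   # depth from apparent size
    X = (u - cx) * Z / f_px          # invert the pinhole projection
    Y = (v - cy) * Z / f_px
    return np.array([X, Y, Z])

def mae_rmse(pred, true):
    """Mean absolute and root-mean-square Euclidean error over N points.

    Assumption: the error of one throw is the Euclidean distance between
    predicted and actual 3D points; the study may define it differently.
    """
    err = np.linalg.norm(np.asarray(pred) - np.asarray(true), axis=1)
    return err.mean(), np.sqrt((err ** 2).mean())

# Hypothetical camera and object values, chosen only for the example.
u, v, d_px = project_sphere((0.2, -0.1, 2.0), 0.03, 800.0, 320.0, 240.0)
recovered = backproject(u, v, d_px, 0.03, 800.0, 320.0, 240.0)
```

In this sketch the depth estimate depends entirely on the measured diameter, so a one-pixel measurement error matters more the farther the object is from the camera. That sensitivity is one plausible reason camera placement and multi-camera setups would affect observation accuracy.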

Multi-camera tracking of mechanically thrown objects for automated in-plant logistics by cognitive robots in Industry 4.0
