Abstract:<p>Conventional displacement sensing techniques (e.g., laser, linear variable differential transformer) have been widely used in structural health monitoring over the past two decades. Although these techniques can measure displacement time histories with high accuracy, distinct shortcomings remain, such as point-to-point contact sensing, which limits their applicability to real-world problems. Video cameras have been widely adopted in recent years owing to their low price, agility, high spatial sensing resolution, and non-contact operation. Compared with target-tracking approaches (e.g., digital image correlation, template matching), the phase-based method is powerful for detecting small subpixel motions without the use of paints or markers on the structure surface. Nevertheless, its complex computational procedure limits its real-time inference capacity. To address this fundamental issue, we develop a deep learning framework based on convolutional neural networks (CNNs) that enables real-time extraction of full-field subpixel structural displacements from videos. In particular, two new CNN architectures are designed and trained on a dataset generated by the phase-based motion extraction method from a single lab-recorded high-speed video of a dynamic structure. Because displacement is reliable only in regions with sufficient texture contrast, the sparsity of the motion field induced by the texture mask is accounted for through the network architecture design and the loss function definition. Results show that, with the supervision of both the full and sparse motion fields, the trained network is capable of identifying the pixels with sufficient texture contrast as well as their subpixel motions.
The performance of the trained networks is tested on various videos of other structures to extract the full-field motion (e.g., displacement time histories); the results indicate that the trained networks generalize, accurately extracting full-field subpixel displacements for pixels with sufficient texture contrast.</p>
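
The abstract states that the texture mask enters the loss function but does not give the formula. The following is a minimal sketch of one common way to do this, a mean squared error restricted to masked pixels. It is an illustrative assumption, not the authors' definition. The names `texture_mask` and `masked_mse`, the gradient-magnitude threshold criterion, and the plain-list image representation are all hypothetical.

```python
from typing import List

Grid = List[List[float]]
Mask = List[List[int]]


def texture_mask(gradient_magnitude: Grid, threshold: float) -> Mask:
    """Flag pixels whose gradient magnitude exceeds a threshold.

    Assumption: texture contrast is approximated by a simple threshold
    on gradient magnitude; the paper may use a different criterion.
    """
    return [[1 if g > threshold else 0 for g in row] for row in gradient_magnitude]


def masked_mse(pred: Grid, target: Grid, mask: Mask) -> float:
    """Mean squared error computed over masked pixels only.

    Pixels with mask value 0 (low texture contrast) are ignored.
    Returns 0.0 when no pixel is selected by the mask.
    """
    total = 0.0
    count = 0
    for p_row, t_row, m_row in zip(pred, target, mask):
        for p, t, m in zip(p_row, t_row, m_row):
            if m:
                total += (p - t) ** 2
                count += 1
    return total / count if count else 0.0


# Toy 2x2 example: only two pixels have enough texture to be supervised.
grad = [[0.0, 2.0], [3.0, 0.1]]
mask = texture_mask(grad, threshold=1.0)
pred = [[5.0, 0.5], [0.25, 9.0]]
target = [[0.0, 0.0], [0.0, 0.0]]
loss = masked_mse(pred, target, mask)
```

In this toy case the large errors at the two low-texture pixels do not contribute to the loss, which is the intended effect of supervising with a sparse motion field.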

Equal-scale Structure from Motion Method Based on Deep Learning

Two-Stage Multi-Camera Constrain Mapping Pipeline for Large-Scale 3D Reconstruction

LODM: Large-scale Online Dense Mapping for UAV

A Scaled Monocular 3D Reconstruction Based on Structure from Motion and Multi-View Stereo

Towards Scale-Aware Self-Supervised Multi-Frame Depth Estimation with IMU Motion Dynamics

TC-SfM: Robust Track-Community-Based Structure-from-Motion

Spatio-Temporal Depth Recovery of Dynamic Scenes with Multiple Handheld Cameras

Accurate and Robust Scale Recovery for Monocular Visual Odometry Based on Plane Geometry

Inertial-Based Scale Estimation for Structure from Motion on Mobile Devices

Unsupervised Scale-consistent Depth and Ego-motion Learning from Monocular Video

Monocular Depth and Ego-motion Estimation with Scale Based on Superpixel and Normal Constraints

Monocular Vision-Based Structural Out-of-plane Motion Estimation Using a Deep Learning Method

MCSfM: Multi-Camera-Based Incremental Structure-From-Motion

Deep Permutation Equivariant Structure from Motion

Towards 3D Scene Reconstruction from Locally Scale-Aligned Monocular Video Depth

Depth-Guided Sparse Structure-from-Motion for Movies and TV Shows

A Monocular Visual SLAM System Augmented by Lightweight Deep Local Feature Extractor Using In-House and Low-Cost LIDAR-camera Integrated Device

Visual Geometry Grounded Deep Structure From Motion

Every Pixel Counts: Unsupervised Geometry Learning with Holistic 3D Motion Understanding

Geometry-aware Feature Matching for Large-Scale Structure from Motion

Extracting full-field subpixel structural displacements from videos via deep learning