Nonlinear Regression of Remaining Surgery Duration from Videos Via Bayesian LSTM-based Deep Negative Correlation Learning.

Junyang Wu, Xiaoyang Zou, Rong Tao, Guoyan Zheng
DOI: https://doi.org/10.1016/j.compmedimag.2023.102314
IF: 7.422
2023-01-01
Computerized Medical Imaging and Graphics
Abstract: In this paper, we address the problem of estimating remaining surgery duration (RSD) from surgical video frames. We propose a Bayesian long short-term memory (LSTM) network-based Deep Negative Correlation Learning approach, called BD-Net, for accurate RSD regression together with estimation of prediction uncertainty. Our method aims to extract discriminative visual features from surgical video frames and to model the temporal dependencies among frames in order to improve RSD prediction accuracy. To this end, we propose to train an ensemble of Bayesian LSTMs on top of a backbone network by way of deep negative correlation learning (DNCL). More specifically, we learn a pool of decorrelated Bayesian regressors with sound generalization capabilities by managing their intrinsic diversities. BD-Net is simple and efficient. After training, it can produce both an RSD prediction and an uncertainty estimate in a single inference run. We demonstrate the efficacy of BD-Net on publicly available datasets of two different types of surgeries: one containing 101 cataract microscopic surgeries with short durations and the other containing 80 cholecystectomy laparoscopic surgeries with relatively longer durations. Experimental results on both datasets demonstrate that the proposed BD-Net achieves better results than the state-of-the-art (SOTA) methods. A reference implementation of our method can be found at: https://github.com/jywu511/BD-Net.
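The idea of learning decorrelated regressors can be illustrated with the classical negative correlation learning penalty, in which each ensemble member pays for its own squared error but is rewarded for deviating from the ensemble mean. The sketch below is only an illustration under stated assumptions: the abstract does not give the exact BD-Net loss, so the function names `ncl_loss` and `ensemble_spread`, the weight `lam`, and the use of the spread across members as a rough uncertainty proxy are all hypothetical and are not taken from the authors' implementation.

```python
from statistics import mean, pstdev


def ncl_loss(member_preds, targets, lam=0.5):
    """Classical negative correlation learning loss (illustrative only).

    member_preds: list of K lists, each holding N predictions (one list per regressor).
    targets:      list of N ground-truth values (e.g. remaining minutes).
    lam:          hypothetical weight of the diversity term; lam=0 gives
                  independently trained regressors.
    Returns (average loss over members and samples, ensemble mean prediction).
    """
    k = len(member_preds)
    n = len(targets)
    # Ensemble output: the average of the member predictions at each sample.
    ensemble = [mean(m[i] for m in member_preds) for i in range(n)]
    total = 0.0
    for m in member_preds:
        for i in range(n):
            accuracy = 0.5 * (m[i] - targets[i]) ** 2   # fit the target
            diversity = (m[i] - ensemble[i]) ** 2       # differ from the ensemble
            total += accuracy - lam * diversity
    return total / (k * n), ensemble


def ensemble_spread(member_preds):
    """Population standard deviation across members at each sample.

    Assumption: used here as a simple stand-in for an uncertainty estimate.
    """
    n = len(member_preds[0])
    return [pstdev(m[i] for m in member_preds) for i in range(n)]


if __name__ == "__main__":
    preds = [[1.0, 3.0], [3.0, 5.0]]   # two regressors, two frames
    rsd = [2.0, 4.0]                   # made-up remaining durations
    loss, ensemble = ncl_loss(preds, rsd, lam=0.5)
    print(loss, ensemble, ensemble_spread(preds))
```

In this toy case the two members err in opposite directions, so their mean matches the target exactly; the diversity term lowers the loss relative to `lam=0`, which is the behaviour that encourages decorrelated members.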