Liu Shiyuan, Fan Jingfan, Song Dengpan, Fu Tianyu, Lin Yucong, Xiao Deqiang, Song Hong, Wang Yongtian, Yang Jian
Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Photonics, Beijing Institute of Technology, Beijing, 100081, China.
School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China.
Biomed Opt Express. 2022 Apr 11;13(5):2707-2727. doi: 10.1364/BOE.457475. eCollection 2022 May 1.
Building an in vivo three-dimensional (3D) surface model from monocular endoscopy is an effective way to improve the intuitiveness and precision of clinical laparoscopic surgery. This paper proposes a multi-loss rebalancing-based method for jointly estimating depth and motion from a monocular endoscopy image sequence. Feature descriptors provide supervisory signals for the depth estimation network and the motion estimation network. The depth estimation network considers the epipolar constraints between sequence frames within neighborhood spatial information to improve depth accuracy. The motion estimation network uses the reprojection information from depth estimation to reconstruct the camera motion through a multi-view relative pose fusion mechanism. A relative response loss, a feature consistency loss, and an epipolar consistency loss are defined to improve the robustness and accuracy of the proposed unsupervised learning-based method. The method is evaluated on public datasets. The motion estimation error in three scenes decreased by 42.1%, 53.6%, and 50.2%, respectively, and the average 3D reconstruction error is 6.456 ± 1.798 mm. These results demonstrate that the method can generate reliable depth estimation and trajectory reconstruction results for endoscopy images and has meaningful clinical applications.
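The abstract does not give the form of the epipolar consistency loss, so the following is only a minimal sketch of the standard epipolar constraint that such a loss is usually built on, not the authors' formulation. It assumes normalized (intrinsics-removed) image coordinates and a relative pose (R, t) mapping frame 1 into frame 2; all function names are hypothetical.

```python
import numpy as np


def skew(t):
    """Return the 3x3 cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])


def epipolar_residual(x1, x2, R, t):
    """Absolute epipolar residual |x2^T E x1| per correspondence.

    x1, x2: (N, 3) homogeneous normalized image points in frames 1 and 2.
    R, t:   relative pose with X2 = R @ X1 + t (an assumed convention).
    """
    E = skew(t) @ R  # essential matrix for the assumed convention
    return np.abs(np.einsum("ni,ij,nj->n", x2, E, x1))


def demo():
    """Compare the mean residual under the true pose and under a wrong translation."""
    rng = np.random.default_rng(0)
    # Synthetic 3D points in front of camera 1 (illustrative data, not from the paper).
    points = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(50, 3))

    angle = 0.1  # small rotation about the y axis, in radians
    c, s = np.cos(angle), np.sin(angle)
    R = np.array([[c, 0.0, s],
                  [0.0, 1.0, 0.0],
                  [-s, 0.0, c]])
    t = np.array([0.2, 0.0, 0.05])

    cam2 = points @ R.T + t            # the same points in the camera-2 frame
    x1 = points / points[:, 2:3]       # perspective division to normalized coordinates
    x2 = cam2 / cam2[:, 2:3]

    good = epipolar_residual(x1, x2, R, t).mean()
    bad = epipolar_residual(x1, x2, R, np.array([0.0, 0.2, 0.05])).mean()
    return good, bad


if __name__ == "__main__":
    good, bad = demo()
    print(f"mean residual, true pose:  {good:.2e}")
    print(f"mean residual, wrong pose: {bad:.2e}")
```

With the true pose the residual vanishes up to floating-point error, while an incorrect translation yields a clearly nonzero value; a training loss in this spirit would penalize that residual over matched features between frames.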