Regular Article FastTrack
Volume: 0 | Article ID: 040503
MonoHybrid: Self-Supervised Monocular Depth Estimation with Hybrid Network
Abstract

Monocular depth estimation (MDE) is a widely used technique in autonomous driving and 3D reconstruction. However, inconsistent and fragmented depth outputs can significantly undermine the reliability of MDE applications in practice. To address this issue, the authors introduce MonoHybrid, a novel self-supervised hybrid network that effectively integrates Transformer and dilated convolutional architectures. This design enables the extraction of both global and local features, enhancing the receptive field and ensuring robust and continuous depth estimation. Additionally, the authors present a new Feature Fusion Module that fuses convolutional and Transformer features, resulting in improved depth estimation performance. Through comprehensive experiments, the proposed network demonstrates notable accuracy and generalization compared to other advanced methods in the field.

  Cite this article 

Wei Jiang, Bingfei Nan, Guofa Wang, "MonoHybrid: Self-Supervised Monocular Depth Estimation with Hybrid Network" in Journal of Imaging Science and Technology, 2026, pp 1 - 12, https://doi.org/10.2352/J.ImagingSci.Technol.2026.70.4.040503

  Copyright statement 
Copyright © Society for Imaging Science and Technology 2026
  Article timeline 
  • received March 2025
  • accepted January 2026
