08/07/2025
By Alimire Nabijiang

The Kennedy College of Science, Richard A. Miner School of Computer & Information Sciences, invites you to attend a doctoral dissertation defense by Alimire Nabijiang titled "Deep Learning Pipelines for Polyp Size Estimation and Monocular Depth Enhancement in Colonoscopy."

Candidate Name: Alimire Nabijiang
Date: Friday, Aug. 22, 2025
Time: 9-10 a.m. EDT
Location: This will be a virtual defense via Zoom.

Committee Members:

  • Yu Cao (Advisor), Professor, Director, Miner School of Computer & Information Sciences, UMass Center for Digital Health (CDH)
  • Benyuan Liu (Advisor), Professor, Director, Miner School of Computer & Information Sciences, UMass Center for Digital Health (CDH), Computer Networking Lab, CHORDS
  • Hengyong Yu (Member), Professor, FIEEE, FAAPM, FAIMBE, FAAIA, FAIIA, Department of Electrical and Computer Engineering
  • QiLei Chen (Member), Research Scientist, Miner School of Computer & Information Sciences

Abstract:

Colorectal cancer (CRC) remains a major global health burden, and early detection through colonoscopy plays a critical role in reducing its incidence and mortality. Polyp size is a key clinical parameter influencing surveillance intervals, therapeutic strategies, and long-term follow-up. In current practice, size assessment is primarily limited to linear metrics, particularly maximum diameter. While alternative metrics such as surface area and volume offer richer characterizations of polyp morphology, they remain under-explored in clinical research. Diameter-based measurement is also highly subjective and often inaccurate, particularly when it is estimated visually during procedures. These limitations underscore the need for deep learning-based methods that can deliver automated, objective polyp sizing solutions. Current deep learning approaches for polyp sizing typically depend on monocular depth estimation, which is a well-studied task in natural image domains but poses significant challenges in medical imaging. This is largely due to the lack of ground truth depth data for real endoscopic images. This thesis addresses these challenges by developing deep learning methods that improve polyp size estimation and enhance monocular depth estimation for colonoscopic imagery.

The first contribution of this research introduces a novel pipeline for estimating polyp surface area from monocular endoscopic images. Conventional diameter-based methods fail to represent complex, protuberant 3D geometries. Our approach combines a metric depth estimation network based on a canonical camera space transformation, robust segmentation, and a Poisson surface reconstruction algorithm to generate 3D surface models from single-view images. While the use of surface area as a metric is still theoretical, our results on synthetic datasets demonstrate the technical feasibility of our approach and lay the groundwork for future clinical applications in which surface area could complement existing sizing metrics.

The second contribution presents a practical and clinically aligned pipeline for estimating polyp diameter. We develop a robust end-to-end system that segments the polyp and applies ViTCAN-Depth, a novel monocular depth model that fuses parallel CNN and Vision Transformer encoders via channel-attention gating. This is coupled with a novel Depth–Pixel Linear (DPL) module, a lightweight component that estimates polyp diameter in real-world units using a learned scalar, thereby eliminating the need for manual calibration or reference tools. Quantitative and qualitative evaluations on synthetic and real colonoscopy frames show that our approach outperforms existing methods and maintains strong performance across varied polyps and temporal variations.

To further advance monocular depth estimation in colonoscopy, we propose a two-phase, self-supervised teacher–student framework. This approach leverages large-scale, unlabeled real data and integrates knowledge-distillation cues with an edge-guided, patch-wise supervision scheme to enhance spatial detail retention without compromising global metric scale. The framework is intended to bridge the performance gap between synthetic and real data, balance the trade-off between fine-detail preservation and metric accuracy, and address the critical need for both accuracy and computational efficiency in clinical settings.

Collectively, these contributions aim to develop more accurate, efficient, and clinically meaningful solutions for polyp size and depth estimation in colonoscopy, paving the way for real-time, AI-assisted diagnostic tools and future clinical applications.