03/15/2024
By Chenxi Zhang
The Kennedy College of Sciences, Miner School of Computer & Information Sciences, invites you to attend a doctoral dissertation defense by Chenxi Zhang on "Deep learning for automatic endoscope assist system, from disease detection to quality control."
Ph.D. Candidate: Chenxi Zhang
Date: Monday, March 25, 2024
Time: 10 a.m.
Location: This will be a virtual defense via Zoom
Committee Members:
- Yu Cao (advisor), Professor, Director, UMass Center for Digital Health (CDH), Miner School of Computer & Information Sciences
- Benyuan Liu (advisor), Professor, Director, Miner School of Computer & Information Sciences
- Mohammad Arif Ul Alam, Assistant Professor, Miner School of Computer & Information Sciences
- Heyong Yu (member), Professor, Department of Electrical & Computer Engineering
Abstract:
The past decade has witnessed the rise of Convolutional Neural Networks (CNNs) in deep learning. CNN models have been successful in a variety of general application domains, such as Computer Vision (CV), Natural Language Processing (NLP), and Automatic Speech Recognition (ASR). Since 2017, transformer-based models have dominated NLP, and vision transformers can even outperform CNNs on computer vision tasks. The outstanding achievements of CNNs and transformers indicate their potential in the field of health care. For example, endoscopic surgery has been an important application scenario for CNN-based deep learning methods. The purpose of this dissertation is to explore in detail the CNN-based and transformer-based systems used in various aspects of endoscopic surgery, including algorithms for quality control in colonoscopy and disease classification algorithms that help physicians improve diagnosis.
Over one-third of the world’s population suffers from digestive tract diseases, ranging from gastric erosion, ulcers, and intestinal polyps to severe illnesses like cancer. Endoscopy is a standard screening procedure in which a camera attached to a rubber tube is inserted into the patient’s digestive tract to visualize the upper and lower digestive system. The ultimate goal of our project is to develop a system that will assist physicians in improving their surgical skills, diagnostic precision, and efficiency during surgery.
To address the problems mentioned earlier, we explored a variety of deep learning methods. First, we detect artifacts using object detection algorithms. In particular, we studied the advantages and disadvantages of SSD (single-shot multibox detector) and Faster R-CNN, two popular object detection models that represent the single-stage and two-stage approaches, respectively. Second, we build a system for detecting common gastric diseases such as ulcers and erosion, which can improve diagnosis by determining whether an image shows gastric disease. Lastly, we introduce a system for classifying gastric locations, for which we train and assess various deep learning models, including conventional convolutional neural networks and more recent transformer models. To improve the model’s accuracy on neighboring location classes, we propose two approaches, one targeting the architectural design and the other the training procedure. Incorporating an attention module and employing MoCo self-supervised training enables further improvement in classifying adjacent locations.