11/04/2022
By Naga Pranathi Rayavaram

The Richard A. Miner School of Computer & Information Sciences invites you to attend a master’s thesis defense by Pranathi Rayavaram on "Single Image Trained Unsupervised CBIR for Real Time Environments."

Candidate Name: Pranathi Rayavaram
Defense Date: Friday, Nov. 18, 2022
Time: 9 to 10 a.m. EST
Location: WAN 445, Wannalancit Mills, East Campus
Thesis/Dissertation Title: Single Image Trained Unsupervised CBIR for Real Time Environments

Committee Members:

  • Sashank Narain (advisor), Computer Science Department, University of Massachusetts Lowell
  • Ian Chen, Computer Science Department, University of Massachusetts Lowell
  • Mohammad Arif Ul Alam, Computer Science Department, University of Massachusetts Lowell

Abstract:
Due to the increasing accessibility of electronic devices with visual capture capabilities, there has been a significant increase in the amount of image data streaming across the world. Researchers and industry professionals are focusing their efforts on developing image retrieval techniques suited to retrieving information from these huge image resources. These techniques are expected to be fast and effective. Traditionally, image retrieval is performed by supervised algorithms. In the contemporary digital world, where image data is massive and volatile, effectively classifying such a large amount of rapidly changing data is difficult and time-consuming, rendering supervised approaches unsuitable for real-time systems. Unsupervised and semi-supervised similarity learning techniques can solve this problem by performing image retrieval with the objective of avoiding or minimizing labeling of the data. Although these techniques solve the data classification problem, many of them fall short in training time, which can run to hours, and in resiliency, since the model must be retrained on the entire dataset whenever a new class is added. Hence, there is a need for a system that is unsupervised and does not require hours of training time to perform image retrieval in real-time environments.

To address the aforementioned concerns, we propose a new unsupervised approach that is fast, resilient and suitable for use in real-time environments. In this approach, we trained a lightweight autoencoder on a single query image without any labels, with the objective of significantly reducing the training time. The test image encodings are generated by the trained model and used to calculate the similarity score between the query image and each test image. Additionally, we emphasized the color aspects of the image by generating histograms for the query and test images. The histogram difference between these distributions contributes to the similarity score, and images are retrieved in the order determined by this score. We also focused on developing a weighted custom loss function that emphasizes specific aspects of images, such as high-level features and structural similarities. We evaluated our system on the Sign-Language MNIST, MNIST, Chinese MNIST, and Fruits datasets, and compared our results to a few prevalent image retrieval algorithms, such as deep CBIR and TensorFlow Similarity. On the basis of these comparisons, we demonstrate that our technique outperforms many contemporary similarity systems, including the aforementioned systems, in terms of training time, with training time as short as 3 to 10 seconds, compared to hours for certain related works. We also analyzed the precision and retrieval time of the technique, which are comparable to those of the above-mentioned systems.
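
The scoring step described in the abstract (an encoding-based similarity combined with a histogram difference, followed by ranking) might be sketched roughly as follows. This is only an illustration, not the thesis implementation: the abstract does not give the actual formula, so the cosine similarity, the L1 histogram difference, the `hist_weight` parameter, and all function names here are assumptions. The encodings are taken as given inputs and would come from the trained autoencoder.

```python
import numpy as np

def color_histogram(image, bins=16):
    """Normalized intensity histogram for pixel values in [0, 255]."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def histogram_difference(h1, h2):
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return 0.5 * float(np.abs(h1 - h2).sum())

def encoding_similarity(e1, e2):
    """Cosine similarity between two flattened encodings (assumed metric)."""
    e1 = np.ravel(e1).astype(float)
    e2 = np.ravel(e2).astype(float)
    denom = np.linalg.norm(e1) * np.linalg.norm(e2)
    return float(e1 @ e2 / denom) if denom > 0 else 0.0

def similarity_score(q_enc, t_enc, q_img, t_img, hist_weight=0.3):
    """Weighted mix of encoding similarity and histogram agreement.

    hist_weight is a hypothetical parameter, not a value from the thesis.
    """
    enc_sim = encoding_similarity(q_enc, t_enc)
    hist_sim = 1.0 - histogram_difference(
        color_histogram(q_img), color_histogram(t_img)
    )
    return (1.0 - hist_weight) * enc_sim + hist_weight * hist_sim

def rank_images(q_enc, q_img, test_encs, test_imgs, hist_weight=0.3):
    """Return test image indices sorted by descending score, plus the scores."""
    scores = [
        similarity_score(q_enc, t_enc, q_img, t_img, hist_weight)
        for t_enc, t_img in zip(test_encs, test_imgs)
    ]
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return order, scores
```

Under these assumptions, a test image identical to the query receives the maximum score of 1.0 and is retrieved first, while images that differ in both encoding and color distribution fall lower in the ranking.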