11/04/2022
By Neha Mishra

The Richard A. Miner School of Computer & Information Sciences invites you to attend a doctoral dissertation defense by Neha Mishra on "Permissioned Blockchain-based Personal Data Vault using Predictive Prefetching."

Ph.D. Candidate: Neha Mishra
Date: Friday, Nov. 18, 2022
Time: 11 a.m. EST
Location: To be announced.

Committee Members:

  • Haim Levkowitz (advisor), Department Chair, Department of Computer Science
  • Sashank Narain (member), Assistant Professor, Department of Computer Science
  • Saira Latif (member), Associate Professor, Finance Department, Manning School of Business, UMass Lowell

Abstract:
We predict that in the not-so-distant future, most if not all documents will no longer be available in paper format. We envision a world in which a person can store all of their digital documents much as a filing cabinet stores paper documents, but with additional security, safety, and integrity. To realize this vision, we propose a Personal Data Vault (PDV). The PDV is a framework for storing, saving, protecting, and sharing a person’s lifetime of digital documents in a verifiably secure manner that can ascertain a document’s authenticity and integrity and allow its owner to share it securely.

In the PDV, each document is encrypted, compressed, and securely stored in the cloud, and its index is entered in the distributed ledger of a permissioned blockchain (Hyperledger Fabric). This design significantly reduces the risk of data leakage, and because the ledger is immutable and tamper-proof, integrity is maintained. All access rights are written as an access list in smart contracts, which are later used for verification. We combined predictive prefetching with a Markov tree to design a model that successfully predicts the next request (or sequence of requests) likely to occur and pre-executes it, that is, executes it before the request arrives.
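
The split between cloud storage and ledger index described above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation and does not use Hyperledger Fabric: `ToyLedger`, `IndexEntry`, and `prepare_for_storage` are hypothetical names, the ledger is an in-memory dictionary, and encryption is omitted (only compression is shown) to keep the example within the standard library.

```python
import hashlib
import zlib
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    """Hypothetical ledger record: it points at the stored object
    and never contains the document itself."""
    doc_id: str
    storage_uri: str
    sha256: str                       # digest of the stored blob
    access_list: set = field(default_factory=set)


class ToyLedger:
    """In-memory, append-only stand-in for a ledger (assumption:
    a real deployment would use chaincode on a permissioned network)."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Refuse overwrites so existing records cannot be silently changed.
        if entry.doc_id in self._entries:
            raise ValueError("doc_id already registered")
        self._entries[entry.doc_id] = entry

    def can_access(self, doc_id, requester):
        # Access-list check performed before any document is released.
        return requester in self._entries[doc_id].access_list

    def verify(self, doc_id, stored_blob):
        # Integrity check: recompute the digest and compare with the record.
        digest = hashlib.sha256(stored_blob).hexdigest()
        return digest == self._entries[doc_id].sha256


def prepare_for_storage(plaintext: bytes) -> bytes:
    # Compression only; a real vault would also encrypt before upload.
    return zlib.compress(plaintext)
```

In this sketch a recipient is served only if `can_access` returns true, and the blob fetched from storage is accepted only if `verify` confirms that its digest matches the recorded one.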

We also evaluated and improved the performance of our PDV by focusing on Hyperledger Fabric version 2.x (v2.x), particularly taking advantage of its new chaincode lifecycle. We conducted several experiments, monitoring and optimizing network parameters such as block size, endorsement policy, and number of clients. These experiments were conducted with a performance measurement tool, the Hyperledger Caliper benchmark, version 0.4.2 (v0.4.2).

First, we observed changes in performance while varying network parameters (e.g., block size, endorsement policy (EP), and number of clients). Then, for further evaluation, we selected the set of network parameters that showed the best performance for a given number of clients, and we continued to increase throughput and decrease latency, improving the performance of the PDV up to its theoretical limits.
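
The selection step described above can be sketched in Python as follows. This is an illustrative assumption about how such results might be summarized, not the procedure used in the dissertation and not how Hyperledger Caliper computes its reports. `RunResult`, `summarize`, and `best_per_client_count` are hypothetical names, and throughput is taken here as transactions divided by the span from first submission to last commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    block_size: int          # transactions per block
    endorsement_policy: str  # label only, e.g. "AND" or "OR"
    clients: int
    throughput_tps: float
    avg_latency_s: float


def summarize(submit_times, commit_times):
    """Return (throughput in tx/s, average latency in s) for one run."""
    if len(submit_times) != len(commit_times) or not submit_times:
        raise ValueError("need matching, non-empty timestamp lists")
    duration = max(commit_times) - min(submit_times)
    if duration <= 0:
        raise ValueError("commits must end after submissions start")
    latencies = [c - s for s, c in zip(submit_times, commit_times)]
    return len(submit_times) / duration, sum(latencies) / len(latencies)


def best_per_client_count(results):
    """For each client count, keep the run with the highest throughput,
    breaking ties in favor of lower average latency."""
    best = {}
    for r in results:
        cur = best.get(r.clients)
        key = (r.throughput_tps, -r.avg_latency_s)
        if cur is None or key > (cur.throughput_tps, -cur.avg_latency_s):
            best[r.clients] = r
    return best
```

Given one `RunResult` per tested configuration, `best_per_client_count` returns a dictionary mapping each client count to the configuration that performed best under this particular ranking.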

The contributions of this dissertation are:

  • A permissioned blockchain-based personal data vault (PDV) that provides the ability to share personal data while guaranteeing that the data is valid, authentic, and has not been tampered with.
  • Predictive prefetching, one of the most efficient ways to reduce latency. It can estimate when a data object is expected to be requested even if that object has been accessed only once. Our framework predicts what kind of data request a recipient is likely to make and prepares the required document as early as possible. By training a predictive model based on a Markov tree with k-means clustering, we take the guesswork out of our predictions and more accurately prepare data ahead of time.
  • An evaluation and improvement of the performance of our Personal Data Vault, focusing on Hyperledger Fabric (HLF) version 2.x (v2.x) and taking advantage of its new chaincode lifecycle.
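
To illustrate the idea behind predictive prefetching, the following Python sketch counts first-order transitions between requests and prefetches the most likely next document. It is a simplification under stated assumptions: the dissertation describes a Markov tree combined with k-means clustering, whereas this example uses a plain first-order Markov chain with no clustering. `NextRequestPredictor` and `prefetch` are hypothetical names.

```python
from collections import Counter, defaultdict


class NextRequestPredictor:
    """First-order Markov chain over request sequences (a simplified
    stand-in for the Markov tree model named in the abstract)."""

    def __init__(self):
        # Maps a request to a counter of the requests that followed it.
        self._transitions = defaultdict(Counter)

    def train(self, sessions):
        # Each session is an ordered list of requested document names.
        for session in sessions:
            for current, nxt in zip(session, session[1:]):
                self._transitions[current][nxt] += 1

    def predict(self, current, top_n=1):
        # Return up to top_n most frequent successors of `current`.
        counts = self._transitions.get(current)
        if not counts:
            return []
        return [doc for doc, _ in counts.most_common(top_n)]


def prefetch(predictor, current, fetch, cache):
    """Fetch predicted documents before they are requested."""
    for doc in predictor.predict(current):
        if doc not in cache:
            cache[doc] = fetch(doc)
```

After training on past sessions, calling `prefetch` with the document just requested fills the cache with the predicted successor, so a later request for it can be served without waiting on storage.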