07/19/2021
By Dalila Megherbi

Leith Namou will defend his master's thesis in Computer Engineering, titled "Design and Implementation of a Computationally Optimized and Resource Efficient Distributed Multi-soft-Processor FPGA Hardware Acceleration Architecture for Image Processing Applications," on Wednesday, July 28, 2021, at 2:30 p.m.

This will be a virtual defense via Zoom. Those interested in attending should contact the committee chair at Dalila_Megherbi@uml.edu and the candidate at leith_namou@student.uml.edu at least 24 hours before the defense to request access to the meeting.

Committee Chair: D. B. Megherbi, ECE Department (CMINDS), UML (Thesis Advisor)

Committee Members:

  • Kanti Prasad, ECE Department, UML
  • Xuejun Lu, ECE Department, UML

Abstract

This thesis focuses on designing, implementing, and optimizing a distributed multi-soft-processor FPGA-based software-hardware co-design architecture for image processing acceleration. The proposed methodology and techniques demonstrate an augmented on-chip and off-chip hardware architecture implementation that overcomes prohibitive memory, cost, size, and power constraints. The implementation achieves an optimized, accelerated execution time for image processing and digital filtering. We show experimentally that the proposed augmented hardware-accelerated optimization scheme achieves an 80-fold computational speedup over our initial hardware-software implementation.

The design is implemented on a DE1-SoC Altera Cyclone V FPGA. A multi-soft-core processor was instantiated in the FPGA fabric and interfaced with an external SDRAM through a 16-bit memory controller intellectual property (IP) core. An 8-bit image (up to 1280 x 1280 pixels, though the design is not limited to this size) was transferred from a PC to the FPGA over a serial link for Sobel processing.
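As a rough illustration of the Sobel processing mentioned above, the following C sketch applies the standard 3 x 3 Sobel operator to an 8-bit grayscale image. It is not taken from the thesis: the function name sobel_u8, the zeroed border, and the |gx| + |gy| magnitude approximation clamped to 255 are assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative software Sobel filter (hypothetical helper, not the thesis code).
 * src and dst are row-major 8-bit images of width x height pixels.
 * Border pixels are written as 0 because they lack a full 3x3 neighborhood. */
void sobel_u8(const uint8_t *src, uint8_t *dst, int width, int height)
{
    assert(src != NULL && dst != NULL && width > 0 && height > 0);

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (x == 0 || y == 0 || x == width - 1 || y == height - 1) {
                dst[y * width + x] = 0;
                continue;
            }

            /* p points at the top-left pixel of the 3x3 window. */
            const uint8_t *p = src + (y - 1) * width + (x - 1);

            /* Horizontal gradient: right column minus left column. */
            int gx = -p[0] + p[2]
                     - 2 * p[width] + 2 * p[width + 2]
                     - p[2 * width] + p[2 * width + 2];

            /* Vertical gradient: bottom row minus top row. */
            int gy = -p[0] - 2 * p[1] - p[2]
                     + p[2 * width] + 2 * p[2 * width + 1] + p[2 * width + 2];

            /* Assumed magnitude approximation, clamped to the 8-bit range. */
            int mag = abs(gx) + abs(gy);
            dst[y * width + x] = (uint8_t)(mag > 255 ? 255 : mag);
        }
    }
}
```

A sequential loop like this is the kind of baseline a soft core might run before any work is moved into the fabric. A hardware version would typically compute the same window arithmetic in parallel logic.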

Each soft core was memory-mapped to its own image segment to create a parallel multi-processing architecture. Multiple Sobel filters are written and synthesized in the fabric. The register-transfer level (RTL) code is written in the VHDL and Verilog hardware description languages (HDLs) to accelerate the design. Different numbers of soft-core processors, in conjunction with the accelerated hardware, are tested to compare the resulting computational efficiency. The access timings of the on-chip SRAM and the off-chip SDRAM can also be compared.
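One way to picture the segment-per-core mapping is as a split of the image rows into contiguous bands. The C sketch below is an assumption for illustration only: the row_band type, the band_for_core function, and the row-wise split are hypothetical, and the abstract does not say how the thesis actually divides the image.

```c
#include <assert.h>

/* Hypothetical descriptor for the band of rows assigned to one core. */
typedef struct {
    int first_row;  /* index of the first row in the band */
    int row_count;  /* number of rows in the band */
} row_band;

/* Split `height` rows as evenly as possible across `num_cores` cores.
 * When the division is not exact, the first (height % num_cores) cores
 * each take one extra row, so the bands cover every row exactly once. */
row_band band_for_core(int height, int num_cores, int core_id)
{
    assert(height >= 0 && num_cores > 0);
    assert(core_id >= 0 && core_id < num_cores);

    int base = height / num_cores;
    int extra = height % num_cores;

    row_band band;
    band.first_row = core_id * base + (core_id < extra ? core_id : extra);
    band.row_count = base + (core_id < extra ? 1 : 0);
    return band;
}
```

With a 1280-row image and four cores, for example, each core would receive 320 rows. Because a 3 x 3 filter reads one row above and one row below each output row, a core working on an interior band would also need read access to one neighboring row on each side.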

In the final analysis, the FPGA soft core programmed in C is quick to implement and flexible to interface with, but it is non-deterministic, resource-hungry, and executes sequentially. The FPGA fabric programmed in HDL, by contrast, is rigid and constraint-driven, but it has no software overhead, is deterministic, and executes as concurrent parallel processes. Ultimately, in this thesis, we show a more resource-efficient design that can compete against designs that are much larger, more expensive, equipped with faster memory, and more power-hungry.

All interested students and faculty members are invited to attend the online defense via remote access.