eScholarship
Open Access Publications from the University of California

UC Santa Barbara

UC Santa Barbara Electronic Theses and Dissertations

Hardware implementation and analysis of memory interfaces to integrate a vector accelerator into a manycore Network-on-Chip

Abstract

In recent years, demand for vector processors has grown with their increasing use in deep learning. At the same time, the strong need for energy efficiency and high performance has made heterogeneous architectures both more important and increasingly complex. How the memory hierarchy is connected to the vector processor in an SoC (System-on-Chip) is critical to the system’s performance [8]. This work presents a tile design based on OpenPiton and BYOC [4] [3]. The tile consists of Ariane [14], a 64-bit, single-issue, in-order RISC-V core, along with ARA [7] [13], a 64-bit vector processor that implements version 1.0 of the RISC-V V extension. This work makes the following contributions. First, it designs and implements an adapter (bridge) that converts memory requests from AMBA AXI to the OpenPiton NoC. This adapter enables ARA memory access and facilitates the integration of future accelerators into OpenPiton. Second, it presents a tile design that includes the ARA vector processor, the Ariane core, an L1.5 cache, an L2 cache, and the implemented bridge. The performance of the tile is evaluated using different versions of the bridge connected to the last-level cache (LLC) or to off-chip memory. The analysis indicates that a wider bridge data width does not necessarily improve performance significantly. Several factors, such as NoC traffic contention or fetching data that goes unused, can narrow the performance gap between narrow and wide bridges. Furthermore, the experiments demonstrate that memory exhibits advantages when dealing with large data widths, and that memory saturation also occurs during LLC access. Finally, the thesis proposes implementing MSHRs (Miss Status Handling Registers) and extending the design to manycore architectures to enhance performance.
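As a rough illustration of why unused data fetches can narrow the gap between narrow and wide bridges, the sketch below counts requests and wasted bytes for a single access. It is not taken from the thesis: the function name `bridge_fetch` is hypothetical, and the model assumes the bridge issues only width-aligned, width-sized requests, which may not match the actual design.

```python
from typing import Tuple


def bridge_fetch(addr: int, nbytes: int, width: int) -> Tuple[int, int]:
    """Return (requests, unused_bytes) for one memory access.

    Assumption (not from the thesis): the bridge can only issue
    requests that are `width` bytes wide and aligned to `width`.
    """
    if nbytes <= 0 or width <= 0:
        raise ValueError("nbytes and width must be positive")
    first = (addr // width) * width                 # aligned start of first request
    last = ((addr + nbytes - 1) // width) * width   # aligned start of last request
    requests = (last - first) // width + 1
    unused = requests * width - nbytes              # bytes fetched but never used
    return requests, unused


def strided_totals(n: int, elem: int, stride: int, width: int) -> Tuple[int, int]:
    """Sum requests and unused bytes over n strided element accesses."""
    total_req = total_unused = 0
    for i in range(n):
        req, unused = bridge_fetch(i * stride, elem, width)
        total_req += req
        total_unused += unused
    return total_req, total_unused
```

Under these assumptions, 16 accesses to 8-byte elements with a 64-byte stride need 16 requests through either an 8-byte or a 64-byte bridge; the wide bridge merely fetches 896 bytes that are never used, so its extra width brings no reduction in request count for this pattern.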
