A survey of field programmable gate array (FPGA)-based graph convolutional neural network accelerators: challenges and opportunities

PeerJ Computer Science


Introduction

  • Efficient processing of sparse matrices

  • Neural network inference is dominated by large matrix operations with varying degrees of sparsity, which lead to irregular memory accesses. Inefficient matrix operations severely limit the speed of model inference, so handling sparsity and exploiting data reuse effectively are critical for efficient sparse matrix processing.

  • Unbalanced workload

  • Because graph data varies in sparsity, and both the memory locations and the number of each node's neighbors are irregular, the workload is unevenly distributed across the nodes of the graph, reducing computational efficiency.

  • Execution order differences

  • A GCN model involves two steps: aggregation and combination. The aggregation phase gathers information from neighboring nodes, and the combination phase updates node features. Relative to the irregular aggregation phase, the combination phase can be regarded as a regular computation.

  • Quantization and preservation of accuracy

  • Compared with full-precision computing, fixed-point computing can significantly improve inference speed, but it incurs some loss of precision. Moreover, maintaining accuracy is challenging when many optimizations are applied to the model.
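The sparse aggregation challenge above can be sketched in software. The following is a minimal, illustrative CSR (compressed sparse row) sparse-dense multiply; the function name and data layout are assumptions for illustration, not from any surveyed accelerator. Note how the work per row is proportional to its degree and how the column indices drive irregular reads of the feature matrix:

```python
import numpy as np

def csr_spmm(indptr, indices, data, X):
    """Multiply a sparse adjacency matrix stored in CSR format by a
    dense feature matrix X. Each row touches only its nonzeros, so
    memory accesses into X follow the irregular column-index pattern
    that FPGA accelerators must cope with."""
    n = len(indptr) - 1
    out = np.zeros((n, X.shape[1]))
    for row in range(n):
        for k in range(indptr[row], indptr[row + 1]):
            # data[k] is the edge weight; indices[k] selects the neighbor's features
            out[row] += data[k] * X[indices[k]]
    return out
```

Dedicated accelerators replace this scalar loop with parallel multiply-accumulate units, but the access pattern it exposes is the same.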
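The load-imbalance problem can be illustrated by how rows are assigned to processing elements (PEs). Splitting rows evenly ignores degree skew; splitting by edge count keeps PEs busy. This is a hypothetical sketch (the function name and interface are ours, not from a specific accelerator) of an edge-balanced row partition over a CSR `indptr` array:

```python
import bisect

def edge_balanced_partition(indptr, num_pes):
    """Return row boundaries that give each PE roughly the same number
    of edges (nonzeros), rather than the same number of rows, so a few
    high-degree nodes cannot stall a single PE."""
    n = len(indptr) - 1
    total = indptr[-1]
    bounds = [0]
    for pe in range(1, num_pes):
        # first row whose cumulative edge count reaches this PE's share
        cut = bisect.bisect_left(indptr, pe * total / num_pes)
        bounds.append(min(max(cut, bounds[-1]), n))
    bounds.append(n)
    return bounds
```

For a graph where one node holds half the edges, an equal-rows split would leave one PE with most of the work, while this split hands that node its own PE.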
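The execution-order point can be made concrete with one GCN layer, which computes A(XW) or (AX)W. Both orders give the same result by associativity, but when the hidden dimension of W is smaller than the input feature dimension, performing the regular combination X @ W first shrinks the matrix that the irregular aggregation must process. This is a minimal dense sketch; real accelerators keep A sparse:

```python
import numpy as np

def gcn_layer(A, X, W, combine_first=True):
    """One GCN layer propagation: aggregation (multiply by adjacency A)
    and combination (multiply by weights W). The flag only changes the
    execution order, not the mathematical result."""
    if combine_first:
        return A @ (X @ W)   # regular combination first, then irregular aggregation
    return (A @ X) @ W       # aggregation first, then combination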
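Finally, the precision trade-off of fixed-point computing can be sketched with a simple symmetric quantizer. This is an illustrative example with assumed parameter names (`frac_bits`, `total_bits`), not the scheme of any particular accelerator; the rounding step is exactly where the precision loss mentioned above enters:

```python
import numpy as np

def quantize_fixed_point(x, frac_bits=8, total_bits=16):
    """Quantize floats to signed fixed-point: scale by 2**frac_bits,
    round to the nearest integer, and clip to the representable range.
    Returns the integer codes and their dequantized approximation."""
    scale = 2 ** frac_bits
    qmax = 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(x * scale), -qmax - 1, qmax).astype(np.int32)
    return q, q / scale
```

For values inside the representable range, the rounding error is bounded by half of the quantization step (1 / 2**frac_bits); accumulated over many layers, this is what accuracy-preserving designs must control.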

  1. To our knowledge, this is the first survey of FPGA-based inference accelerators for GCNs. We list current accelerators with excellent performance, describe their characteristics, compare their performance, and detail selected designs according to the challenges they address.

  2. We review three well-known convolution-based GCN models: GCN, GraphSAGE, and GAT. Many GCNs are based on convolution operations; in this article, we detail the inference process of these three representative models.

  3. We discuss the future directions and challenges of FPGA-based GCN accelerators. The complexity of graph data will continue to challenge GCN acceleration, and accelerators co-designed across software and hardware can often maximize performance. Given the unbalanced development between GCN algorithms and accelerators, maintaining generality while keeping pace with algorithmic progress is a major challenge for future FPGA-based GCN accelerators.

Survey methodology

Background

GNN and GCN models

GNN

GCN

GraphSAGE

GAT

FPGA-based hardware accelerators

Overview

Efficient operations on sparse matrices

Load balance

Execution order

Quantization and accuracy

Performance and discussion

Conclusion and discussion

Conclusion

Discussion

Additional Information and Declarations

Competing Interests

Ruiqi Chen is a visiting researcher for VeriMake Innovation Lab of Nanjing Renmian Integrated Circuit Co., Ltd.

Author Contributions

Shun Li conceived and designed the experiments, performed the experiments, analyzed the data, performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Yuxuan Tao conceived and designed the experiments, performed the experiments, prepared figures and/or tables, and approved the final draft.

Enhao Tang conceived and designed the experiments, prepared figures and/or tables, and approved the final draft.

Ting Xie performed the computation work, authored or reviewed drafts of the article, and approved the final draft.

Ruiqi Chen conceived and designed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Data Availability

The following information was supplied regarding data availability:

There is no data or code; this is a literature review.

Funding

The authors received no funding for this work.
