Accelerating Large-Scale GNN Training with Programmable SSDs

Problem setting

Our work focuses on training Graph Neural Networks (GNNs). A GNN is a neural network that operates on graph data, that is, on a set of nodes connected by edges. Each node has a feature vector and a label; you can imagine the nodes as users on Facebook and the features as their profiles. In practical terms, we can use a GNN to predict the category of a new node added to the network, using its profile to predict its label. The problem arises when the graph grows too large to fit entirely in memory, which makes training challenging.
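To get a feel for when a graph stops fitting in memory, here is a rough back-of-the-envelope sketch. It assumes float32 features and a CSR (compressed sparse row) layout with 64-bit row offsets and 32-bit neighbor IDs. The function names and the example sizes are hypothetical and are not taken from our actual datasets.

```cpp
#include <cstdint>

// Assumption: one dense float32 feature vector per node.
constexpr std::uint64_t feature_bytes(std::uint64_t num_nodes,
                                      std::uint64_t feature_dim) {
    return num_nodes * feature_dim * sizeof(float);
}

// Assumption: CSR topology with (num_nodes + 1) 64-bit row offsets
// and one 32-bit neighbor ID per edge.
constexpr std::uint64_t csr_bytes(std::uint64_t num_nodes,
                                  std::uint64_t num_edges) {
    return (num_nodes + 1) * sizeof(std::uint64_t) +
           num_edges * sizeof(std::uint32_t);
}

// Total footprint of features plus topology, in bytes.
constexpr std::uint64_t graph_bytes(std::uint64_t num_nodes,
                                    std::uint64_t num_edges,
                                    std::uint64_t feature_dim) {
    return feature_bytes(num_nodes, feature_dim) +
           csr_bytes(num_nodes, num_edges);
}
```

For a hypothetical graph with 100 million nodes, 1 billion edges, and 128-dimensional features, the features alone take about 51 GB and the topology about 4.8 GB under these assumptions, which already exceeds the RAM of many single machines.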

GNN and sampling

During training, there is a critical stage known as ‘sampling’, which selects an N-hop neighborhood around each target node. This step is repeated many times. When the graph exceeds memory capacity, sampling becomes highly time-consuming because neighbor lists have to be read from storage. According to the gSampler paper [1], preparing training data accounts for more than 90% of the total training time. Our project aims to address this bottleneck.
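As an illustration of what sampling does, here is a minimal host-side sketch of N-hop neighbor sampling with a fixed fanout per node. It is not our FPGA kernel. The `CsrGraph` struct, the `sample_n_hop` function, and the uniform-random fanout policy are assumptions made for this example; real samplers differ in how they choose neighbors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical CSR graph: the neighbors of node v are
// col_idx[row_ptr[v]] .. col_idx[row_ptr[v + 1] - 1].
struct CsrGraph {
    std::vector<std::size_t> row_ptr;
    std::vector<std::uint32_t> col_idx;
};

// Starting from `target`, pick up to `fanout` random neighbors of every
// frontier node, and repeat for `num_hops` hops.
// Returns all visited node IDs (including the target), sorted ascending.
std::vector<std::uint32_t> sample_n_hop(const CsrGraph& g, std::uint32_t target,
                                        int num_hops, std::size_t fanout,
                                        std::mt19937& rng) {
    std::unordered_set<std::uint32_t> visited{target};
    std::vector<std::uint32_t> frontier{target};

    for (int hop = 0; hop < num_hops; ++hop) {
        std::vector<std::uint32_t> next;
        for (std::uint32_t v : frontier) {
            assert(static_cast<std::size_t>(v) + 1 < g.row_ptr.size());
            auto first = g.col_idx.begin() +
                         static_cast<std::ptrdiff_t>(g.row_ptr[v]);
            auto last = g.col_idx.begin() +
                        static_cast<std::ptrdiff_t>(g.row_ptr[v + 1]);

            // std::sample returns every neighbor if there are fewer than `fanout`.
            std::vector<std::uint32_t> picked;
            std::sample(first, last, std::back_inserter(picked), fanout, rng);

            for (std::uint32_t u : picked) {
                if (visited.insert(u).second) {
                    next.push_back(u);  // first time we see u: expand it next hop
                }
            }
        }
        frontier = std::move(next);
    }

    std::vector<std::uint32_t> out(visited.begin(), visited.end());
    std::sort(out.begin(), out.end());
    return out;
}
```

Every hop touches a different, mostly random slice of `col_idx`. Once the graph no longer fits in memory, those reads hit storage, which is where the cost described above comes from.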

Our solution

The tool we are using is the Samsung SmartSSD, an SSD with an onboard FPGA. We are programming the FPGA in C++ so that it performs sampling on the device itself. We expect this to speed up the process, since it avoids involving the host CPU and memory and keeps the bulk of the graph data off the host PCIe link.

I will post more details soon.

Reference

[1] gSampler: General and efficient GPU-based graph sampling for graph learning, https://www.amazon.science/publications/gsampler-general-and-efficient-gpu-based-graph-sampling-for-graph-learning
