Advancing Graph Processing: A Hardware-Software Co-Design Approach

Rohan Mehta, Department of Computer Engineering, Indian Institute of Technology Bombay, India

Abstract

Graph processing is increasingly important in many domains, including recommender systems, neuroscience, cybersecurity, and social network analysis (Wu et al., 2023; Bullmore & Sporns, 2009; Wang et al., 2019; Yin et al., 2023; Luo et al., 2023; He et al., 2024). However, graph data are irregular and unstructured, which makes high performance difficult to achieve. This paper surveys recent advances in hardware-software co-design techniques that address these challenges and improve the efficiency of graph processing systems. We examine novel architectural approaches, memory management strategies, and software frameworks that together contribute to better performance.

Keywords

Graph processing, hardware-software co-design, FPGA acceleration

References

Bai, J. Y., Guo, J., Wang, C. C., Chen, Z. Y., He, Z., Yang, S., ... & Guo, Y. W. (2023). Deep graph learning for spatially-varying indoor lighting prediction. Science China Information Sciences, 66(3), Article 132106.

Ben-Nun, T., Sutton, M., Pai, S., & Pingali, K. (2017). Groute: An asynchronous multi-GPU programming model for irregular computations. ACM SIGPLAN Notices, 52(8), 235-248.

Bullmore, E., & Sporns, O. (2009). Complex brain networks: Graph theoretical analysis of structural and functional systems. Nature Reviews Neuroscience, 10(3), 186-198.

Chen, D., Gui, C. Y., Zhang, Y., Jin, H., Zheng, L., Huang, Y., & Liao, X. F. (2022). GraphFly: Efficient asynchronous streaming graphs processing via dependency-flow. In 2022 International Conference for High Performance Computing, Networking, Storage and Analysis.

Chen, D., He, H. H., Jin, H., Zheng, L., Huang, Y., Shen, X. Y., & Liao, X. F. (2023). MetaNMP: Leveraging Cartesian-like product to accelerate HGNNs with near-memory processing. In Proceedings of the 50th Annual International Symposium on Computer Architecture, Article 56.

Chen, D., Jin, H., Zheng, L., Huang, Y., Yao, P. C., Gui, C. Y., ... & Zheng, R. (2022). A general offloading approach for near-DRAM processing-in-memory architectures. In 2022 IEEE International Parallel and Distributed Processing Symposium (pp. 246-257).

Chen, X. Y., Chen, Y., Cheng, F., Tan, H. S., He, B. S., & Wong, W. F. (2022). ReGraph: Scaling graph processing on HBM-enabled FPGAs with heterogeneous pipelines. In 55th Annual IEEE/ACM International Symposium on Microarchitecture (pp. 1342-1358).

Chi, P., Li, S. C., Xu, C., Zhang, T., Zhao, J. S., Liu, Y. P., ... & Xie, Y. (2016). PRIME: A novel processing-in-memory architecture for neural network computation in ReRAM-based main memory. In 43rd Annual International Symposium on Computer Architecture (pp. 27-39).

Dai, G. H., Huang, T. H., Chi, Y. Z., Xu, N. Y., Wang, Y., & Yang, H. Z. (2017). ForeGraph: Exploring large-scale graph processing on multi-FPGA architecture. In 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (pp. 217-226).

Dong, W., Moses, C., & Li, K. (2011). Efficient k-nearest neighbor graph construction for generic similarity measures. In Proceedings of the 20th International Conference on World Wide Web (pp. 577-586).

Fang, P., Wang, F., Shi, Z., Feng, D., Yi, Q. X., Xu, X. H., & Zhang, Y. X. (2022). An efficient memory data organization strategy for application-characteristic graph processing. Frontiers of Computer Science, 16(1), 1-14.

Fey, M., & Lenssen, J. E. (2019). Fast graph representation learning with PyTorch Geometric. arXiv preprint arXiv:1903.02428.

Gui, C. Y., Zheng, L., He, B. S., Liu, C., Chen, X. Y., Liao, X. F., & Jin, H. (2019). A survey on graph processing accelerators: Challenges and opportunities. Journal of Computer Science and Technology, 34(2), 339-371.

Ham, T. J., Wu, L. S., Sundaram, N., Satish, N., & Martonosi, M. (2016). Graphicionado: A high-performance and energy-efficient accelerator for graph analytics. In 49th Annual IEEE/ACM International Symposium on Microarchitecture.

He, D. L., Yuan, P. P., & Jin, H. (2024). Answering reachability queries with ordered label constraints over labeled graphs. Frontiers of Computer Science, 18(1), 1-14.

Hu, M., Strachan, J. P., Li, Z. Y., Grafals, E. M., Davila, N., Graves, C., ... & Williams, R. S. (2016). Dot-product engine for neuromorphic computing: Programming 1T1M crossbar to accelerate matrix-vector multiplication. In 53rd Annual Design Automation Conference, Article 19.

Huang, Y., Zheng, L., Yao, P. C., Wang, Q. G., Liao, X. F., Jin, H., & Xue, J. L. (2020). A heterogeneous PIM hardware-software co-design for energy-efficient graph processing. In 2020 IEEE International Parallel and Distributed Processing Symposium (pp. 684-695).

Huang, Y., Zheng, L., Yao, P. C., Wang, Q. G., Liao, X. F., Jin, H., & Xue, J. L. (2022). Accelerating graph convolutional networks using crossbar-based processing-in-memory architectures. In 2022 IEEE International Symposium on High-Performance Computer Architecture (pp. 1029-1042).

How to Cite

Rohan Mehta. (2025). Advancing Graph Processing: A Hardware-Software Co-Design Approach. International Journal of Computer Science & Information System, 10(05), 1–6. Retrieved from https://scientiamreearch.org/index.php/ijcsis/article/view/161