TY - GEN
T1 - Multi-Task Processing in Vertex-Centric Graph Systems
T2 - 26th International Conference on Extending Database Technology, EDBT 2023
AU - Luo, Siqiang
AU - Zhu, Zichen
AU - Xiao, Xiaokui
AU - Yang, Yin
AU - Li, Chunbo
AU - Kao, Ben
N1 - Publisher Copyright: © 2023 OpenProceedings.org. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Vertex-centric (VC) graph systems are at the core of large-scale distributed graph processing. For such systems, a common usage pattern is the concurrent processing of multiple tasks (multi-processing for short), which aims to execute a large number of unit tasks in parallel. In this paper, we point out that multi-processing has not been sufficiently studied or evaluated in previous work; hence, we fill this critical gap with three major contributions. First, we examine the tradeoff between two important measures in VC systems: the number of communication rounds and message congestion. We show that this tradeoff is crucial to system performance; yet, existing approaches fail to achieve an optimal tradeoff, leading to poor performance. Second, based on extensive experimental evaluations on mainstream VC systems (e.g., Giraph, Pregel+, GraphD) and benchmark multi-processing tasks (e.g., Batch Personalized PageRanks, Multiple Source Shortest Paths), we present several important insights on the correlation between system performance and configurations, which are valuable to practitioners in optimizing system performance. Third, based on the insights drawn from our experimental evaluations, we present a cost-based tuning framework that optimizes the performance of a representative VC system. This demonstrates the usefulness of the insights.
AB - Vertex-centric (VC) graph systems are at the core of large-scale distributed graph processing. For such systems, a common usage pattern is the concurrent processing of multiple tasks (multi-processing for short), which aims to execute a large number of unit tasks in parallel. In this paper, we point out that multi-processing has not been sufficiently studied or evaluated in previous work; hence, we fill this critical gap with three major contributions. First, we examine the tradeoff between two important measures in VC systems: the number of communication rounds and message congestion. We show that this tradeoff is crucial to system performance; yet, existing approaches fail to achieve an optimal tradeoff, leading to poor performance. Second, based on extensive experimental evaluations on mainstream VC systems (e.g., Giraph, Pregel+, GraphD) and benchmark multi-processing tasks (e.g., Batch Personalized PageRanks, Multiple Source Shortest Paths), we present several important insights on the correlation between system performance and configurations, which are valuable to practitioners in optimizing system performance. Third, based on the insights drawn from our experimental evaluations, we present a cost-based tuning framework that optimizes the performance of a representative VC system. This demonstrates the usefulness of the insights.
UR - https://www.scopus.com/pages/publications/85137567368
U2 - 10.48786/edbt.2023.20
DO - 10.48786/edbt.2023.20
M3 - Conference contribution
AN - SCOPUS:85137567368
T3 - Advances in Database Technology - EDBT
SP - 247
EP - 259
BT - Proceedings of the 26th International Conference on Extending Database Technology, EDBT 2023
PB - OpenProceedings.org
Y2 - 28 March 2023 through 31 March 2023
ER -