TY - GEN
T1 - Persistent Memory Disaggregation for Cloud-Native Relational Databases
AU - Ruan, Chaoyi
AU - Zhang, Yingqiang
AU - Bi, Chao
AU - Ma, Xiaosong
AU - Chen, Hao
AU - Li, Feifei
AU - Yang, Xinjun
AU - Li, Cheng
AU - Aboulnaga, Ashraf
AU - Xu, Yinlong
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/3/25
Y1 - 2023/3/25
AB - The recent emergence of commodity persistent memory (PM) hardware has altered the landscape of the storage hierarchy. It brings multi-fold benefits to database systems, with its large capacity, low latency, byte addressability, and persistence. However, PM has not been incorporated into the popular disaggregated architecture of cloud-native databases. In this paper, we present PilotDB, a cloud-native relational database designed to fully utilize disaggregated PM resources. PilotDB possesses a new disaggregated DB architecture that allows compute nodes to be computation-heavy yet data-light, as enabled by large buffer pools and fast data persistence offered by remote PMs. We then propose a suite of novel mechanisms to facilitate RDMA-friendly remote PM accesses and minimize operations involving CPUs on the computation-light PM nodes. In particular, PilotDB adopts a novel compute-node-driven log organization that reduces network/PM bandwidth consumption and a log-pull design that enables fast, optimistic remote PM reads aggressively bypassing the remote PM node CPUs. Evaluation with both standard SQL benchmarks and a real-world production workload demonstrates that PilotDB (1) achieves excellent performance as compared to the best-performing baseline using local, high-end resources, (2) significantly outperforms a state-of-the-art DRAM-disaggregation system and the PM-disaggregation solution adapted from it, (3) enables faster failure recovery and cache buffer warm-up, and (4) offers superior cost-effectiveness.
KW - Cloud-native database
KW - Memory disaggregation
KW - Persistent memory
UR - https://www.scopus.com/pages/publications/85159317346
U2 - 10.1145/3582016.3582055
DO - 10.1145/3582016.3582055
M3 - Conference contribution
AN - SCOPUS:85159317346
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 498
EP - 512
BT - ASPLOS 2023 - Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
A2 - Aamodt, Tor M.
A2 - Jerger, Natalie Enright
A2 - Swift, Michael
PB - Association for Computing Machinery
T2 - 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2023
Y2 - 25 March 2023 through 29 March 2023
ER -