TY - GEN
T1 - Massively parallel genomic sequence search on the Blue Gene/P architecture
AU - Lin, Heshan
AU - Balaji, Pavan
AU - Poole, Ruth
AU - Sosa, Carlos
AU - Ma, Xiaosong
AU - Feng, Wu Chun
PY - 2008
Y1 - 2008
N2 - This paper presents our first experiences in mapping and optimizing genomic sequence search onto the massively parallel IBM Blue Gene/P (BG/P) platform. Specifically, we performed our work on mpiBLAST, a parallel sequence-search code that has been optimized on numerous supercomputing environments. In doing so, we identify several critical performance issues. Consequently, we propose and study different approaches for mapping sequence-search and parallel I/O tasks on such massively parallel architectures. We demonstrate that our optimizations can deliver nearly linear scaling (93% efficiency) on up to 32,768 cores of BG/P. In addition, we show that such scalability enables us to complete a large-scale bioinformatics problem - sequence searching a microbial genome database against itself to support the discovery of missing genes in genomes - in only a few hours on BG/P. Previously, this problem was viewed as computationally intractable in practice.
AB - This paper presents our first experiences in mapping and optimizing genomic sequence search onto the massively parallel IBM Blue Gene/P (BG/P) platform. Specifically, we performed our work on mpiBLAST, a parallel sequence-search code that has been optimized on numerous supercomputing environments. In doing so, we identify several critical performance issues. Consequently, we propose and study different approaches for mapping sequence-search and parallel I/O tasks on such massively parallel architectures. We demonstrate that our optimizations can deliver nearly linear scaling (93% efficiency) on up to 32,768 cores of BG/P. In addition, we show that such scalability enables us to complete a large-scale bioinformatics problem - sequence searching a microbial genome database against itself to support the discovery of missing genes in genomes - in only a few hours on BG/P. Previously, this problem was viewed as computationally intractable in practice.
UR - https://www.scopus.com/pages/publications/70350782517
U2 - 10.1109/SC.2008.5222005
DO - 10.1109/SC.2008.5222005
M3 - Conference contribution
AN - SCOPUS:70350782517
SN - 9781424428359
T3 - 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008
BT - 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008
T2 - 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2008
Y2 - 15 November 2008 through 21 November 2008
ER -