TY - GEN
T1 - MOON: MapReduce On Opportunistic eNvironments
T2 - 19th ACM International Symposium on High Performance Distributed Computing, HPDC 2010
AU - Lin, Heshan
AU - Ma, Xiaosong
AU - Archuleta, Jeremy
AU - Feng, Wu Chun
AU - Gardner, Mark
AU - Zhang, Zhe
PY - 2010
Y1 - 2010
N2 - MapReduce offers an ease-of-use programming paradigm for processing large data sets, making it an attractive model for distributed volunteer computing systems. However, unlike on dedicated resources, where MapReduce has mostly been deployed, such volunteer computing systems have significantly higher rates of node unavailability. Furthermore, nodes are not fully controlled by the MapReduce framework. Consequently, we found the data and task replication scheme adopted by existing MapReduce implementations woefully inadequate for resources with high unavailability. To address this, we propose MOON, short for MapReduce On Opportunistic eNvironments. MOON extends Hadoop, an open-source implementation of MapReduce, with adaptive task and data scheduling algorithms in order to offer reliable MapReduce services on a hybrid resource architecture, where volunteer computing systems are supplemented by a small set of dedicated nodes. Our tests on an emulated volunteer computing system, which uses a 60-node cluster where each node possesses a similar hardware configuration to a typical computer in a student lab, demonstrate that MOON can deliver a three-fold performance improvement to Hadoop in volatile, volunteer computing environments.
KW - Cloud computing
KW - MapReduce
KW - Volunteer computing
UR - https://www.scopus.com/pages/publications/78650029124
U2 - 10.1145/1851476.1851489
DO - 10.1145/1851476.1851489
M3 - Conference contribution
AN - SCOPUS:78650029124
SN - 9781605589428
T3 - HPDC 2010 - Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
SP - 95
EP - 106
BT - HPDC 2010 - Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
Y2 - 21 June 2010 through 25 June 2010
ER -