Adding the Easy Button to the Cloud with SnowFlock and MPI

Philip Patchin, H. Andres Lagar-Cavilla, Eyal de Lara, Michael Brudno

3rd Workshop on System-level Virtualization for High Performance Computing (HPCVirt 2009), Nuremberg, Germany, April 2009



Cloud computing promises to let researchers perform parallel computations using large pools of virtual machines (VMs) without the burden of owning or maintaining physical infrastructure. However, easy access to hundreds of VMs also brings an increased management burden. Cloud users today must manually instantiate, configure and maintain the virtual hosts in their cluster. They must learn new cloud APIs that are not germane to the problem of parallel processing. Those APIs usually take several minutes to perform their VM management tasks, forcing users to keep VMs idling and pay for unused processing time, rather than shut VMs down and power them on as needed. Furthermore, users must still configure their cluster management framework to launch their parallel jobs.

In this paper we show that all this management pain is unnecessary. We show how to combine a cloud API, SnowFlock, and a parallel processing framework, MPI, to truly realize the potential of the cloud. SnowFlock allows users to fork VMs as if they were processes, occupying multiple physical hosts in sub-second time. We exploit the synergy between this paradigm and MPI's job management to completely hide all details of cloud management from the user. Maintaining a single VM and starting unmodified applications with familiar MPI commands, a user can instantaneously leverage hundreds of processors to perform a parallel computation. Besides making the use of cloud resources trivial, we also eliminate the cost of idling: VMs exist only for as long as they are involved in computation.