SnowFlock leverages virtual machine (VM) technology to enable high-performance computing in cloud environments. Cloud computing has the potential to simplify the deployment of high-performance applications by shifting the significant fixed costs of provisioning and operating a data center to a third-party service provider, such as Amazon or Yahoo, that offers computation and storage for rent as a metered commodity. VM execution provides security, performance isolation, and the flexibility of running in a programmer-customized environment.
SnowFlock supports parallel execution on virtual clusters. In SnowFlock, a VM is swiftly cloned into multiple copies that execute simultaneously on different physical hosts and then disappear when the computation ends. SnowFlock simplifies the development of parallel applications and reduces the management burden by enabling the agile (within hundreds of milliseconds) instantiation of new stateful computing elements: workers that need no setup time because they inherit the application state accumulated up to the point of cloning. In contrast, provisioning additional elements in existing clouds takes minutes, and the new elements are stateless.
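The idea of stateful cloning can be illustrated with a loose, single-host analogy: a POSIX process fork. The sketch below is not SnowFlock's API and does not clone VMs; it only shows how workers created after expensive state has been built can start computing at once, with no setup step of their own. The function name `clone_and_sum` and the slicing scheme are hypothetical, and the code assumes a POSIX system where `os.fork` is available.

```python
import os
import pickle


def clone_and_sum(data, n_workers):
    """Fork n_workers children that each sum a slice of the inherited data.

    Process-level analogy only (an assumption for illustration): each child
    starts with a copy of the parent's memory, so `data` is already present
    in the worker and does not have to be reloaded or sent to it.
    """
    children = []
    for rank in range(n_workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: the state built before the fork is already in memory.
            os.close(read_fd)
            partial = sum(data[rank::n_workers])
            with os.fdopen(write_fd, "wb") as writer:
                pickle.dump(partial, writer)
            # The worker disappears as soon as its share of the work is done.
            os._exit(0)
        # Parent: keep only the read end and remember the child.
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as reader:
            total += pickle.load(reader)
        os.waitpid(pid, 0)  # reap the finished worker
    return total


if __name__ == "__main__":
    # State is built once, before cloning, and inherited by every worker.
    table = list(range(100))
    print(clone_and_sum(table, 4))
```

In this analogy the parent process plays the role of the original VM and each forked child plays the role of a short-lived clone. SnowFlock applies the same pattern at the level of whole VMs spread across different physical hosts.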
In addition to lending itself to the efficient processing of large remote datasets, SnowFlock is well suited to the cloud-based deployment of web services that leverage parallel execution to deliver interactive performance (seconds to a few minutes) for resource-intensive applications. Compute-intensive web services exist in diverse domains such as bioinformatics, finance, graphics rendering, and search. For example, the NCBI BLAST web service, perhaps the most widely used bioinformatics tool, accepts DNA or protein sequences as queries and leverages cluster computing so that biologists can quickly learn which known biological sequences are similar to their queries.