Opportunities and Challenges for Running Scientific Workflows on the Cloud
Yong Zhao, Xubo Fei, Ioan Raicu, Shiyong Lu
2011 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC)
Ying Lian
Computer Science, WSU
Overview
INTRODUCTION
OPPORTUNITIES
CHALLENGES
RESEARCH DIRECTIONS
CONCLUSIONS
INTRODUCTION
There is something in the air.
INTRODUCTION
Cloud computing is gaining tremendous momentum in both academia and industry.
“Cloud Computing”: a large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically-scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet.
Mostly applied to Web applications and business applications. A link is missing when it comes to supporting workflow applications.
INTRODUCTION
Manage and run workflow applications on the cloud (especially data-intensive scientific workflows)
Several scientific workflow management systems (SWFMSs) have been applied.
Cloud Workflow: specification, execution, and provenance tracking of scientific workflows, as well as the management of data and computing resources to enable the running of scientific workflows on the Cloud
Following sections: Meaning, challenges, research opportunities
OPPORTUNITIES
Keywords: Infinite computing resource
1. The scale of scientific problems that scientific workflows can address is now greatly increased. It was previously bounded by the size of a dedicated resource pool, with limited extension through resource sharing in the form of virtual organizations.
Data size (e.g. GenBank doubles every 9-12 months): vast storage space
Complexity of the applications (e.g. protein simulation by iterative algorithms with huge parameter sets): massive computing resources
OPPORTUNITIES
2. The on-demand resource allocation mechanism in the Cloud has a number of advantages over traditional cluster/Grid environments for scientific workflows:
Improved resource utilization: different stages require different amounts of resources.
Faster turnaround time for end users: dynamic scale out/in.
Enables a new generation of workflows: collaborative scientific workflows, in which user interaction and collaboration patterns are favored.
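The utilization argument can be made concrete with a small sketch. The stage names, node counts, and durations below are hypothetical values chosen for illustration, not figures from the paper; the point is only that a fixed pool sized for the peak stage sits mostly idle during the other stages.

```python
# Hypothetical per-stage demand: (stage name, nodes needed, hours).
stages = [("preprocess", 4, 2.0), ("simulate", 64, 1.0), ("merge", 2, 0.5)]


def on_demand_node_hours(stages):
    """Node-hours consumed when each stage allocates exactly what it needs."""
    return sum(nodes * hours for _, nodes, hours in stages)


def fixed_pool_utilization(stages, pool_size):
    """Fraction of reserved node-hours actually used by a fixed-size pool."""
    reserved = pool_size * sum(hours for _, _, hours in stages)
    return on_demand_node_hours(stages) / reserved


# A dedicated cluster has to be sized for the most demanding stage.
peak = max(nodes for _, nodes, _ in stages)
utilization = fixed_pool_utilization(stages, peak)
```

With these assumed numbers the fixed pool uses roughly a third of the node-hours it reserves, while on-demand allocation pays only for the node-hours consumed.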
OPPORTUNITIES
3. Much bigger room for trade-off between performance and cost.
Spectrum of resource investment: from dedicated private resources, through hybrid local and cloud resources, to full outsourcing on clouds.
Cloud computing brings the opportunity to improve the performance/cost ratio.
But optimizing this ratio and automating the trade-off remain challenging.
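One very simple form of this trade-off is choosing the cheapest deployment that still meets a deadline. The sketch below assumes per-option time and cost estimates are already available; the `Option` type, the function name, and all numbers are hypothetical, and a real optimizer would have to estimate these values rather than take them as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    hours: float  # estimated workflow makespan
    cost: float   # estimated total cost, arbitrary currency units


# Hypothetical estimates, for illustration only.
options = [
    Option("dedicated private", hours=40.0, cost=100.0),
    Option("hybrid local + cloud", hours=16.0, cost=180.0),
    Option("full cloud outsourcing", hours=6.0, cost=400.0),
]


def cheapest_within_deadline(options, deadline_hours):
    """Return the lowest-cost option that finishes in time, or None."""
    feasible = [o for o in options if o.hours <= deadline_hours]
    return min(feasible, key=lambda o: o.cost) if feasible else None
```

Tightening the deadline moves the choice along the spectrum from private resources toward full outsourcing, which is the trade-off the slide describes.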
CHALLENGES
Architectural challenges
Integration challenges
Computing challenges
Data management challenges
Language challenges
Service management challenges
Architectural Challenges
User interface customizability and support
Reproducibility support
Heterogeneous and distributed services and software tools integration
Heterogeneous and distributed data product management
High-end computing support
Workflow monitoring and failure handling
Interoperability
Reference Architecture for SWFMSs
Deploy the architecture: solutions
Operation layer in the Cloud:
SWFMS runs outside the Cloud
No concern about vendor lock-in
SWFMS itself cannot benefit from the scalability
Task management layer in the Cloud:
Not batch-scheduled: ready tasks are deployed immediately, without queuing
Cost of storing provenance and data products
Workflow management layer in the Cloud:
Presentation layer deployed at a client machine
Suitable for ad hoc domain-specific requirements
More dependent on the Cloud platform
All in the Cloud:
SWFMS inside the Cloud, accessed via a Web browser
Highly scalable: Software as a Service
Drawbacks: cost, dependency, vendor lock-in
Integration Challenges
How to integrate scientific workflow systems with Cloud infrastructure and resources?
Operation layer: applications, services, and tools are hosted in the Cloud, while the scheduling and management of a workflow stay outside the Cloud (e.g. Google Map services, with ad hoc scripts and programs used to glue the services together).
Task management layer: resource provisioning. (e.g. Nimbus)
Workflow management layer: Debugging, monitoring, and provenance tracking
All in the Cloud: porting issues. Needs a workflow engine at the Cloud end and a Web interface or thin client at the user end.
Language Challenges
MapReduce: a widely used computing model with two key functions, Map and Reduce (white-box).
SwiftScript: a general-purpose coordination language in which existing applications can be invoked without modification (black-box).
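To illustrate why MapReduce counts as white-box, here is a minimal single-process sketch of the model: the user has to write the application logic as a Map function and a Reduce function that the framework calls. This is a toy driver written for this note, not the API of Hadoop or any other MapReduce implementation, and the names `map_fn`, `reduce_fn`, and `map_reduce` are assumptions.

```python
from collections import defaultdict


def map_fn(line):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_fn(word, counts):
    """Reduce: combine all values emitted for one key."""
    return word, sum(counts)


def map_reduce(records, map_fn, reduce_fn):
    """Toy driver: run Map, group values by key, then run Reduce per key."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return dict(reduce_fn(key, values) for key, values in groups.items())


result = map_reduce(["cloud workflow", "Cloud data"], map_fn, reduce_fn)
```

A black-box coordination language takes the opposite approach: the application stays an unmodified executable, and the script only describes how its inputs and outputs are wired together.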
Language Challenges
Handle the mapping from input and output data into logical structures.
Support large-scale parallelism via either implicit parallelism, or explicit declaratives.
Support data partitioning and task partitioning.
Require a scalable, reliable, and efficient runtime system that can support Cloud-scale task scheduling and dispatching, provide error recovery and fault tolerance.
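Data partitioning combined with parallel execution of independent tasks can be sketched in a few lines. The example assumes a hypothetical `analyze` function standing in for a black-box application, and uses a local thread pool where a Cloud-scale runtime would dispatch tasks to remote nodes and handle failures.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_parts):
    """Split items into n_parts contiguous chunks of nearly equal size."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def analyze(chunk):
    """Stand-in for a black-box application run on one data partition."""
    return sum(x * x for x in chunk)


def run_partitioned(items, n_parts=4):
    """Run analyze on each partition in parallel and merge the partial results."""
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(analyze, partition(items, n_parts)))
    return sum(partials)
```

The caller never spells out which chunk runs where; the runtime decides, which is the kind of parallelism a workflow language is expected to provide at much larger scale.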
Computing Challenges
A workflow system may not be able to talk to Cloud resources directly; middleware services are needed (e.g. Nimbus or Falkon to handle resource provisioning and task dispatching).
It becomes more complicated when workflow resource requirements, data dependencies, and Cloud virtualization are taken into account.
A SWFMS will try to recover automatically when non-fatal errors happen. Smart-return: detailed execution information is logged so that the workflow can be restarted.
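The restart idea can be sketched as a log of completed tasks that a later run consults before executing anything. This is a minimal illustration under stated assumptions: the JSON log format, the function name `run_workflow`, and the sequential task list are all hypothetical, and task results must be JSON-serializable. It does not describe how any particular SWFMS implements recovery.

```python
import json
import os


def run_workflow(tasks, log_path):
    """Run (name, func) tasks in order, skipping those already in the log.

    Returns the names executed in this run and all results known so far.
    """
    done = {}
    if os.path.exists(log_path):
        with open(log_path) as f:
            done = json.load(f)
    executed = []
    for name, func in tasks:
        if name in done:
            continue  # finished in an earlier run, do not repeat it
        done[name] = func()  # may raise; earlier results are already saved
        executed.append(name)
        with open(log_path, "w") as f:
            json.dump(done, f)  # persist progress after every task
    return executed, done
```

If a task fails, calling `run_workflow` again with the same log path resumes from the failed task instead of recomputing the stages that already succeeded.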
Data Management Challenges
As data intensiveness increases, the management of data resources and of the dataflow between storage and compute resources becomes the bottleneck.
Data locality: as CPU cycles get cheaper and data sets grow, data location becomes the main challenge, rather than computational resources.
Combining compute and data management: the amount of data movement needs to be minimized. Otherwise, significant underutilization of raw resources will result.
Provenance: the derivation history of a data product. It must be tracked across service providers and across different abstraction layers. Secure access is another piece that is missing for now.
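A greedy placement rule shows what minimizing data movement means in practice: run each task on the node that already holds most of its input bytes. The sketch below is an assumption-laden simplification (it ignores node capacity, load, and network topology), and the dataset names, node names, and sizes in the usage example are hypothetical.

```python
def place_tasks(tasks, data_location, data_size):
    """Assign each task to the node holding most of its input bytes.

    tasks:         {task name: [input dataset names]}
    data_location: {dataset name: node name}
    data_size:     {dataset name: size in bytes}
    Returns (placement, total bytes that still have to be moved).
    """
    placement, moved = {}, 0
    for task, inputs in tasks.items():
        local_bytes = {}
        for d in inputs:
            node = data_location[d]
            local_bytes[node] = local_bytes.get(node, 0) + data_size[d]
        # Ties are broken by node name so the result is deterministic.
        best = max(sorted(local_bytes), key=local_bytes.get)
        placement[task] = best
        moved += sum(data_size[d] for d in inputs if data_location[d] != best)
    return placement, moved


placement, moved = place_tasks(
    tasks={"align": ["genome", "reads"], "plot": ["stats"]},
    data_location={"genome": "n1", "reads": "n2", "stats": "n2"},
    data_size={"genome": 100, "reads": 10, "stats": 1},
)
```

Here only the small `reads` dataset crosses the network; scheduling `align` on the other node would have moved the large `genome` dataset instead.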
Service Management Challenges
The engineering of the components of an SWFMS as services:
thousands of services developed and available for the myExperiment project
the LEAD system has developed a tool to wrap and convert ordinary science applications into services
The orchestration and invocation of services from an SWFMS:
managing the large number of service instances
data movements across different service instances
RESEARCH DIRECTIONS
Emphasis on the workflow reference architecture, directing research effort to the foregoing layers
Great leap in middleware development: resource management, monitoring, messaging
Many-task computing (MTC): preliminarily applied on Grids and supercomputers, expected to improve substantially on the Cloud
Scripting: mixture of semantics, combination of application of services…
Cost optimization: very challenging, but rewarding too
RESEARCH DIRECTIONS
SWFMS security
Access control: critical because of the nature of clouds (dynamic, with large-scale data and service sharing)
Information flow control: ensure that scientific-workflow-related information is propagated only to authorized ends
Secure electronic transaction protocol: for the pay-as-you-go pricing model
CONCLUSIONS
As more customers and applications migrate into the Cloud, the need for workflow systems to manage complex tasks will become more urgent.
For now, mash-ups and MapReduce-style task management have been acting in place of a workflow system in the Cloud.
The opportunities and challenges in bringing workflow systems into the Cloud are discussed.
The authors identify key research directions in realizing scientific workflows in Cloud environments.
Thank You!