WP2 Data and Compute Cloud
Platform
Marian Bubak, Piotr Nowakowski, Tomasz Bartyński, Jan Meizner,
Adam Belloum, Spiros Koulouzis, Eric Sarries, Stefan Zasada, David Chang
VPH-Share (No 269978)
WP2 in the VPH-Share framework
Mission: to develop a cloud platform which will enable easy access to compute and data resources by providing methods and services for:
1. Resource allocation management dedicated to VPH application scenarios;
2. An execution environment for flexible deployment of scientific software on the virtualized infrastructure;
3. Virtualized access to high performance computing (HPC) execution environments;
4. Cloud data access for very large objects and data transfer between services;
5. Data reliability and integrity to ensure sound use of biomedical data;
6. A security framework.
WP2: Objectives and tasks
A cloud platform enabling easy access to compute and data resources
• Scientific coordination of the development of VPH-Share cloud computing solutions (Task 2.0);
• Providing a means by which the Cloud resources available to the Project can be managed and allocated on demand (Task 2.1);
• Developing and deploying a platform which will manage such resources and deploy computational tasks in support of VPH-Share applications (Tasks 2.2 and 2.3);
• Ensuring reliable, managed access to large binary objects stored in various Cloud frameworks (Tasks 2.4 and 2.5);
• Ensuring secure operation of the platform developed in Tasks 2.1-2.5 (Task 2.6);
• Gathering user requirements, liaising with application teams, advising on migration to Cloud computational and storage resources, and testing WP2 solutions in the scope of real-world application workflows (Task 2.7);
• Collaborating with p-medicine for the purposes of sharing experience with Cloud computing technologies (Task 2.8).
Cloud computing
What is Cloud computing?
◦ "Unlimited" access to computing power and data storage;
◦ Virtualization technology (enables running many isolated operating systems on one physical machine);
◦ Lifecycle management (deploy/start/stop/restart);
◦ Scalability;
◦ Pay-per-use accounting model.
However, Cloud computing isn't:
◦ …a magic platform which automatically scales your application up from your PC;
◦ …a secure place where sensitive data can be stored (this is why we require security and data anonymization…).
WP2 offer for workflows
• Scale your applications in the Cloud ("unlimited" computing power/reliable storage);
• Utilize resources in a cost-effective way;
• Install/configure each Atomic Service once, then use it multiple times in different workflows;
• Many instances of Atomic Services can be instantiated automatically;
• Large-scale computation can be delegated from the PC to the cloud/HPC;
• Smart deployment: computation can be executed close to data (or the other way round);
• Multitudes of operating systems to choose from;
• Install whatever you want (root access to Cloud Virtual Machines).
WP2: Partner Roles

Partner / PM / Description
CYFRONET 119 Coordination of the work package. Architecture of the cloud platform. Cloud execution environment and access to computational resources.
ATOS 57 Integrated authentication and authorization framework. Integration with VPH semantics (WP4).
UCL 30 Providing high-level access to virtualized HPC applications as services.
USFD 8 Integration of application workflows with the cloud platform. Coordinator of all flagship workflows.
UvA 50 Multiple-protocol data transfer between services. Integration of the ViroLab workflow with the cloud platform.
IOR 2 Integration of the VPHOP workflow with the cloud platform.
KCL 4 Integration of the euHeart workflow with the cloud platform.
UPF 4 Integration with relational data access (WP3) and user access systems (WP6). Integration of the @neurIST workflow with the cloud platform.
3 (new) words…

Virtual Machine: a self-contained operating system image, registered in the Cloud framework and capable of being managed by VPH-Share mechanisms.

Atomic Service: a VPH-Share application (or a component thereof) installed on a Virtual Machine and registered with the WP2 cloud management tools for deployment.

Appliance: a running instance of an Atomic Service, hosted in the Cloud and capable of being directly interfaced, e.g. by the workflow management tools or VPH-Share GUIs.
[Diagram: a raw OS image becomes a Virtual Machine; installing a VPH-Share application (or component) with external APIs turns it into an Atomic Service; deployed on a Cloud host, it runs as an Appliance.]
WP2 vision at a glance (1/3)
Installing a VPH-Share application in the Cloud (developer action):
• Upon the application developer's request, the Atmosphere component (developed in T2.1 and T2.2) spawns a fresh Virtual Machine, which resides in the Cloud and contains all the features typically expected of an "out of the box" operating system (virtualized storage, standard libraries, root account, initial configuration etc.). If needed, many such VMs can be spawned, each encapsulating a single VPH-Share Atomic Service.
• It is the application developers' task to install components of their applications on these templates so that they can be wrapped as Atomic Services.
• WP2 tools can then further manage the Atomic Services and deploy their instances (also called Appliances) on Cloud resources as requested by the Workflow Composer.
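The developer-side lifecycle above (spawn a VM, install software, save it as an Atomic Service) can be sketched as a small in-memory model. All class and method names here are illustrative assumptions, not the actual Atmosphere API:

```python
from dataclasses import dataclass, field
import itertools

_ids = itertools.count(1)

@dataclass
class VirtualMachine:
    os_template: str
    ip: str
    installed: list = field(default_factory=list)

class Atmosphere:
    """Toy model of the T2.1/T2.2 component (names are assumptions)."""
    def __init__(self, os_templates):
        self.os_templates = os_templates      # browsable OS template registry
        self.atomic_services = {}             # VMs saved as Atomic Services

    def spawn_vm(self, template):
        """Spawn a fresh VM from a registered template; return it with its IP."""
        if template not in self.os_templates:
            raise ValueError(f"unknown OS template: {template}")
        return VirtualMachine(template, ip=f"10.0.0.{next(_ids)}")

    def save_as_atomic_service(self, vm, name):
        """Register the configured VM so WP2 tools can deploy Appliances from it."""
        self.atomic_services[name] = vm
        return name

atmo = Atmosphere(["ubuntu-10.04", "centos-5"])
vm = atmo.spawn_vm("ubuntu-10.04")
vm.installed.append("segmentation-tool")      # developer installs the application
atmo.save_as_atomic_service(vm, "segmentation-service")
```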
[Diagram: via the Atmosphere UI (T2.1) the developer (1) browses the available OS templates and (2) creates a VM with the selected OS; Atmosphere spawns the VM on the Cloud platform and returns its IP; the developer then (4) installs the required software and (5) saves the VM as an Atomic Service.]
Preparing a VPH-Share application for execution (user action):
• The user requests the Workflow Composer (via the WP6 Master UI) to execute an application.
• The Workflow Composer informs Atmosphere which Atomic Services to deploy in the Cloud so that the workflow may be executed.
• Atmosphere takes care of deploying the required assets and returns a list of service endpoints (typically IP addresses), whereupon workflow execution may commence.
• Atmosphere can be designed to "talk to" many different computing stacks, and thus interface both commercial and private Clouds; we are currently eyeing Eucalyptus, OpenStack, OpenNebula and Nimbus.
• Depending on the underlying Cloud computing stack we expect to be able to define deployment heuristics, enabling optimization of resource usage.
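The user-side flow, with the Workflow Composer naming the required Atomic Services and Atmosphere returning endpoints, can be sketched as follows; the class names, service names and address scheme are assumptions for illustration:

```python
import itertools

class Atmosphere:
    """Toy deployment broker: one appliance per required Atomic Service."""
    def __init__(self, registered_services):
        self.registered = set(registered_services)
        self._ips = (f"10.1.0.{n}" for n in itertools.count(1))

    def deploy(self, required_services):
        """Spawn appliances and return their service endpoints (here, IPs)."""
        missing = set(required_services) - self.registered
        if missing:
            raise LookupError(f"unregistered atomic services: {sorted(missing)}")
        return {svc: next(self._ips) for svc in required_services}

def compose_and_run(atmosphere, workflow):
    """Workflow Composer role: resolve endpoints, after which execution starts."""
    return atmosphere.deploy(workflow["atomic_services"])

atmo = Atmosphere(["segmentation", "mesh-generation"])
endpoints = compose_and_run(
    atmo, {"atomic_services": ["segmentation", "mesh-generation"]})
```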
[Diagram: the user (1) logs in to the VPH Master Interface and (2) executes an application via the Workflow Composer tool; the Composer (3a) asks Atmosphere (T2.2) for the required Atomic Services; Atmosphere (3b) spawns the Appliances on the Cloud platform, using its Atomic Service and Cloud resource registries, and (3c) returns the list of appliances, whereupon (4) the workflow is run.]
WP2 vision at a glance (2/3)
Managing binary data in Cloud storage (developer action):
• Atmosphere will contain a registry of binary data for use by VPH-Share applications (T2.5) and assume responsibility for maintaining such data in Cloud storage.
• Appliances may produce data and register it with Atmosphere.
• Atmosphere will provide a query API through which authorized applications may locate and retrieve Cloud-based data.
• If required, Atmosphere may also shift binary data between Cloud storage systems (not depicted).
• As an optional tool, we can develop a data registry browsing UI for integration with the VPH-Share Master Interface (not depicted).
• For access to the underlying Cloud storage resources, we intend to apply tools developed in Task 2.4.
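A minimal sketch of such a registry and its query API, with a hypothetical handle scheme and metadata fields (none of this is the real T2.5 interface):

```python
class BinaryDataRegistry:
    """Toy model of the Atmosphere binary-data registry (T2.5)."""
    def __init__(self):
        self._entries = {}

    def register(self, handle, location, owner):
        """Called by an appliance after saving data in Cloud storage."""
        self._entries[handle] = {"location": location, "owner": owner}

    def query(self, owner):
        """Authorized applications locate their Cloud-based data by owner."""
        return [h for h, meta in self._entries.items() if meta["owner"] == owner]

    def locate(self, handle):
        """Resolve a handle to the underlying Cloud storage location."""
        return self._entries[handle]["location"]

registry = BinaryDataRegistry()
registry.register("lob://scan-1", "s3://vph-bucket/scan-1", owner="aneurIST")
registry.register("lob://scan-2", "s3://vph-bucket/scan-2", owner="euHeart")
```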
[Diagram: a VPH-Share application (1a) generates data and (1b) saves it in Cloud storage, then (2) informs Atmosphere, passing a handle to the stored data, which is recorded in the binary data registry (T2.5); another VPH-Share application (3a) queries the Atmosphere API, (3b) retrieves the handle and (4) gets the data from Cloud storage.]
WP2 vision at a glance (3/3)
Problem description:
• If an Appliance is to be accessed by external users, its corresponding Atomic Service needs a remote interface through which its functionality can be invoked. According to the Technical Annex, we expect all such interfaces to assume the form of Web Services (cf. p. 41) exposed by the hosts on which the VPH-Share Appliances reside.
• While Atmosphere can manage Atomic Services, it falls to application developers to encapsulate the functionality of their applications (or parts thereof) in the form of Web Services. We believe this is a crucial task to be addressed by Task 2.7 as soon as feasible.
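What "wrapping a component as a Web Service" means for a developer can be sketched with the standard library alone. The component (`segment_image`), the endpoint and the JSON encoding are assumptions; the deck itself anticipates SOAP-style Web Services, so this is only the general shape of the task:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

def segment_image(size):
    """Stand-in for a VPH-Share application component (assumed logic)."""
    return {"regions": max(1, size // 100)}

class ServiceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Decode the request, invoke the wrapped component, return JSON.
        length = int(self.headers["Content-Length"])
        params = json.loads(self.rfile.read(length))
        body = json.dumps(segment_image(params["size"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

server = HTTPServer(("127.0.0.1", 0), ServiceHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]   # the appliance's service endpoint
```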
[Diagram: the VPH-Share Master Interface on the user host (1) executes the application; VPH-Share application components #1 and #2, each on its own Cloud host, (2) run calculations; a visualization component on a further Cloud host (3) prepares the visualization and (4) the output is displayed to the user.]
Issue: application interfaces
Issue: selection of software stacks for private Cloud installations

Problem description:
• Early studies suggest that we should adopt OpenNebula for private VPH-Share cloud installations due to its simple yet effective design; however, we are eager to discuss the matter with WP2 partners.
• For larger private cloud deployments OpenStack (or, at least, its Object Store) seems a good choice.

Eucalyptus
Advantages: excellent compatibility with EC2; advanced network features; comes with its own Cloud storage engine (Walrus).
Drawbacks: overly complex architecture given the features it offers; advanced networking modes require a dedicated network switch with VLANs; poor Walrus functionality and performance; heavyweight software and communication protocols.

OpenNebula
Advantages: simple yet effective design; standard communication protocols (SSH); lightweight technology (Ruby + bash scripts); standard shared storage (NFS, SAN + cluster FS); contextualization support.
Drawbacks: poor network autoconfiguration features; no dedicated storage engine; minor but irksome bugs in configuration scripts.

OpenStack
Advantages: advanced features, including a complex computing and (object) storage engine; relatively simple design given the rich feature set; support for various hypervisors (including KVM, Xen, MS Hyper-V); IPv6 support.
Drawbacks: advanced networking modes require a dedicated network switch with VLANs (much like Eucalyptus); highly complex architecture and convoluted deployment; questions regarding maturity.

Nimbus
Advantages: advanced storage engines (Cumulus- and LANTorrent-based); support for legacy technology (PBS); manageable architecture; EC2/S3 compatible.
Drawbacks: requires installation of heavyweight components on the HEAD node; largely a conglomerate of various technologies, which may cause maintenance issues.
T2.0 Scientific Management

Main goals: overseeing the scientific progress and synchronisation of tasks; interim (6-monthly) and annual reports.

Key issues:
• Software development should be based on top-quality research;
• Collaboration and exchange of research results within the WP, within VPH-Share, and with related projects;
• Encouraging publications (FGCS, IEEE Computer, Internet Computing, …);
• Participation in conferences (EuroPar, e-Science, CCGrid, ICCS, …);
• Organization of workshops;
• PhD and MSc research related to the project topics;
• Promotion of best software engineering practices and research methodologies;
• Hosting an e-mail list, wiki and telcos (one per month); managing P2P contacts and WP meetings (semiannually).

Partners involved and contact persons: CYFRONET, Marian Bubak, Maciej Malawski ({bubak,malawski}@agh.edu.pl) and task leaders.
T2.1 Cloud Resource Allocation Management

Main goal: multicriterial optimization of computing resource usage (private and public clouds as well as the HPC infrastructure provided by T2.3).
Key issues:
• Application (Atomic Service) and workflow characteristics;
• Component Registry (T6.5);
• Workflow execution engine interfaces (T6.5);
• Atomic Services Cloud Facade interface (T6.3);
• Security (T2.6).
Partners involved and contact persons:
• CYFRONET (Tomasz Bartyński; t.bartynski@cyfronet.pl);
• UCL (David Chang; d.chang@ucl.ac.uk);
• AOSAE (Enric Sarries; enric.sarries@atosresearch.eu).
T2.1 Deployment planning

Atmosphere will take into account application characteristics and infostructure status to find an optimal deployment and allocation plan which will specify:
• where to deploy Atomic Services (a partner's private cloud site, public cloud infrastructure or a hybrid installation);
• whether the data should be transferred to the site where the Atomic Service is deployed, or the other way around;
• how many Atomic Service instances should be started;
• whether it is possible to reuse predeployed Atomic Services (instances shared among workflows).

The deployment plan will be based on an analysis of:
• workflow and Atomic Service resource demands;
• volume and location of input and output data;
• load of available resources;
• cost of acquiring resources at private and public cloud sites;
• cost of transferring data between private and public clouds (also between "availability zones" such as the US and Europe);
• cost of using cheaper instances (whenever possible and sufficient; e.g. EC2 Spot Instances, or S3 Reduced Redundancy Storage for some noncritical (temporary) data);
• the public cloud provider's billing model (Amazon charges for each started hour; thus, five 10-minute tasks run on five separate instances cost 5 times as much as running them back to back on a single instance).
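The billing point in the last bullet can be made concrete with a little arithmetic. The 10-cent hourly rate is an assumption for illustration; only the rounding-up-to-full-hours behaviour matters:

```python
import math

RATE_CENTS_PER_HOUR = 10   # hypothetical per-instance-hour price

def billed_cents(minutes_per_instance):
    """Each list element is the total runtime on one instance;
    every instance's usage is rounded up to a full hour."""
    return sum(math.ceil(m / 60) * RATE_CENTS_PER_HOUR
               for m in minutes_per_instance)

five_parallel = billed_cents([10] * 5)   # five instances, 10 minutes each
one_sequential = billed_cents([50])      # one instance running all five tasks

# Five separate instances are billed 5 full hours; one instance only 1,
# so a deployment planner should prefer packing short tasks together.
```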
T2.1 in scope of the VPH-Share project

[Diagram: (1) Workflow Execution informs the Atmosphere API (T2.1) about the required Atomic Services and data; (2) Atmosphere gets metadata about the required appliances and data from the Component Registry (T6.5); (3) it collects computing, storage and networking statistics from the Cloud computing resources and Cloud storage; and (4) analyzes them to prepare an optimal deployment.]
Atmosphere will:
• receive requests from the Workflow Execution stating that a set of Atomic Services is required to process/produce certain data;
• query the Component Registry to determine the relevant Atomic Service and data characteristics;
• collect infostructure metrics;
• analyze the available data and prepare an optimal deployment plan.
T2.2 Cloud Execution Environment

Main goal: deployment of Atomic Services in the Cloud according to T2.1 specifications.
Key issues:
• Cloud usage (public providers, private setups contributed by partners; choice of Cloud computing platform);
• Interfacing the infostructure: public Cloud providers as well as private (partner-operated) Cloud platforms built using heterogeneous resources?
• Data Access services (T2.4);
• Moving Atomic Services across the infostructure;
• Atomic Services Invoker interface (T6.3);
• Security (T2.6).
Partners involved and contact persons:
• CYFRONET (Tomasz Bartyński; t.bartynski@cyfronet.pl);
• UCL (David Chang; d.chang@ucl.ac.uk);
• AOSAE (Enric Sarries; enric.sarries@atosresearch.eu).
T2.2 Deployment according to the plan from T2.1

[Diagram: the Atmosphere API (T2.2) deploys appliances and moves data, instantiating Atomic Service instances on Cloud computing resources and in Cloud storage.]
T2.2 will receive a deployment plan from T2.1. It will implement the plan by instantiating Atomic Services on private and/or public Clouds, and by moving data using T2.4 tools.
• It may be required to interface public and/or private clouds built upon different platforms, so the choice of Cloud API and client-side library is important;
• Cyfronet is currently investigating Amazon EC2, the Open Cloud Computing Interface, OpenStack Compute (Nova), and the S3 and Swift storage APIs.
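The client-side choice this implies is a pluggable adapter layer: one deployment routine, one adapter per cloud stack. The adapter names and methods below are illustrative assumptions, not the actual T2.2 interfaces; real adapters would call the EC2, OCCI or Nova APIs where the comments indicate:

```python
from abc import ABC, abstractmethod

class CloudAdapter(ABC):
    @abstractmethod
    def run_instance(self, image_id: str) -> str:
        """Start one appliance from an atomic-service image; return its id."""

class Ec2StyleAdapter(CloudAdapter):
    def run_instance(self, image_id):
        return f"i-{image_id}"        # would issue an EC2 RunInstances call here

class NovaStyleAdapter(CloudAdapter):
    def run_instance(self, image_id):
        return f"srv-{image_id}"      # would call OpenStack Nova here

def deploy(plan, adapters):
    """plan: list of (site, image_id) pairs as chosen by the T2.1 planner."""
    return [adapters[site].run_instance(img) for site, img in plan]

instances = deploy(
    [("public", "ami-123"), ("private", "vph-seg")],
    {"public": Ec2StyleAdapter(), "private": NovaStyleAdapter()},
)
```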
T2.2 Monitoring and scaling the infostructure

[Diagram: the Atmosphere API (T2.2) monitors and scales Atomic Service instances.]

Atmosphere will monitor the usage of Atomic Services. Atomic Services will be scaled:
• new instances will be started for overloaded services;
• underutilized instances will be shut down.
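The scale-up/scale-down rule just stated can be sketched as a simple decision function; the utilization metric and the 0.8/0.2 thresholds are assumptions for illustration:

```python
HIGH, LOW = 0.8, 0.2   # assumed utilization thresholds

def scaling_action(avg_utilization, n_instances):
    """Return ('start', k), ('stop', k) or ('hold', 0) for one Atomic Service."""
    if avg_utilization > HIGH:
        return ("start", 1)                       # overloaded: add an instance
    if avg_utilization < LOW and n_instances > 1:
        return ("stop", 1)                        # underutilized: remove one
    return ("hold", 0)                            # keep the last instance alive
```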
T2.3 High Performance Execution Environment

Main goals:
• Provide virtualised access to high performance execution environments;
• Seamlessly provide access to high performance computing for workflows that require more computational power than clouds can provide;
• Deploy and extend the Application Hosting Environment (AHE), which provides a set of web services to start and control applications on HPC resources.

Key issues:
• Cloud computing provides an infrastructure on which to run so-called capacity workloads;
• Some workflows require access to high performance computing resources;
• The Cloud computing paradigm has not found wide uptake in HPC: it introduces performance overhead, and applications need to be preinstalled and optimised on HPC resources;
• Should we seek to integrate HPC access tightly with cloud computing, or treat it separately?

Partners involved and contact persons:
• UCL: David Chang (d.chang@ucl.ac.uk); Stefan Zasada (stefan.zasada@ucl.ac.uk);
• Cyfronet: Tomasz Bartynski (t.bartynski@cyf-kr.edu.pl).
T2.3 High Performance Execution Environment: virtualizing access to scientific applications

Tasks:
1. Refactor the AHE client API to provide similar/same calls as Eucalyptus/cloud APIs, so that Grid/HPC resources are accessed in a similar way to clouds;
2. Integrate AHE (via its API) with the Resource Allocation Management system developed in T2.1; AHE will publish load information from HPC resources;
3. HPC typically uses pre-staged applications: UCL will build, optimise and host simulation codes in AHE;
4. Extend AHE to stage/access data from the cloud data facilities developed in T2.4;
5. Integrate AHE/ACD with the security framework developed in T2.6.

Application Hosting Environment:
• Based on the idea of applications as stateful WSRF web services;
• A lightweight hosting environment for running unmodified applications on grid and local resources;
• Community model: an expert user installs and configures an application and uses the AHE to share it with others;
• Launches applications on Unicore and Globus 4 grids by acting as an OGSA-BES and GT4 client;
• Uses advance reservation to schedule HPC into workflows.
T2.4 Data Access for Large Binary Objects

Main goals:
a) Federated cloud storage: uniform access to data storage resources to transparently integrate multiple autonomous storage resources; as a result, a file system will be provided as a service, optimizing data placement, storage utilization, speed, etc.
b) Transport protocols: efficient data transfers for replication, migration, and sharing; to avoid centralisation bottlenecks, connection services will be deployed near the data.

Key issues:
• Integrating the functionality of the security framework (T2.6) to provide user- and application-level access control, as well as ensuring storage privacy (a, b);
• Maintaining and synchronizing multiple replicas on different loosely coupled resources (a);
• Dealing with errors in storage systems, as well as providing a uniform metadata model (a);
• Providing an abstraction over a higher-level transport protocol for fast transfers, checkpoints and parallel streaming (b).

Partners involved and contact persons: UvA; Spiros Koulouzis (S.Koulouzis@uva.nl), Adam Belloum (A.S.Z.Belloum@uva.nl).
[Diagram: cloud clients and service clients reach the LOB federated storage access component (abstraction layer; operations layer with management and optimization) over SOAP/REST; connection services 1…N front the underlying Cloud storage.]
T2.4 Data Access for Large Binary Objects: Federated Cloud Storage – Goals

• Transparently integrate multiple autonomous storage resources;
• Provide virtually limitless capacity;
• Provide uniform access to data storage resources, or a file-system-like view;
• Optimize:
◦ data placement, based on access frequency, latency, etc.;
◦ storage utilization/cost, making sure that storage space is used in the most efficient way;
• Provide a file system as a service, giving an intuitive and easy way to access the federated storage space.
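The placement optimization named above can be sketched as picking the backend that minimizes a latency/cost utility for an object's access profile. The backends, weights and prices are all assumptions, chosen only to show the trade-off:

```python
# Hypothetical backends: a fast, expensive local store vs a slow, cheap one.
BACKENDS = {
    "local-nfs": {"latency_ms": 5,   "cost_per_gb": 0.20},
    "public-s3": {"latency_ms": 120, "cost_per_gb": 0.10},
}

def place(size_gb, accesses_per_day):
    """Frequently read objects favour low latency; cold data favours low cost."""
    def utility(b):
        # Assumed weighting: daily access latency plus a scaled storage cost.
        return accesses_per_day * b["latency_ms"] + size_gb * b["cost_per_gb"] * 100
    return min(BACKENDS, key=lambda name: utility(BACKENDS[name]))
```

A hot 1 GB object lands on the low-latency store, while a cold 500 GB archive lands on the cheap one; a real T2.4 utility function would fold in more targets (space usage, replica distance, transfer cost).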
[Diagram: cloud clients store data through the LOB federated storage access layers (abstraction layer; operations layer with management and optimization) into Cloud storage.]
T2.4 Data Access for Large Binary Objects: Federated Cloud Storage – Issues

• Clearing data handling requests with the security framework (T2.6):
◦ ensuring privacy of the storage locations;
◦ obtaining authorization, authentication, access control, etc.;
• Synchronizing multiple replicas on different loosely coupled resources;
• Determining the distance between replicas;
• Dealing with errors in storage systems beyond our control;
• Providing a uniform metadata model;
• Defining utility functions for optimising multiple targets, such as space usage, access latency, cost, etc.
T2.4 Data Access for Large Binary Objects: Transfer protocols – Goals

• Interconnect resources for efficient transport;
• Investigate the state of the art of protocols designed for large-scale data transfers, such as UDT and GridFTP;
• Provide a higher-level protocol capable of checkpointing, and of exploiting parallel streams to boost performance;
• Deploy connection services next to or near data resources;
• Use direct streaming;
• Take advantage of existing transfer protocols, such as HTTP(S);
• Take advantage of third-party transfer abilities offered by underlying protocols, e.g. GridFTP.
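The checkpointing goal above amounts to recording the last completed offset so a transfer can resume after a failure instead of restarting. This is a minimal sketch of that idea; the chunk size and in-memory stores are assumptions standing in for real protocol machinery:

```python
CHUNK = 4   # assumed chunk size (bytes) for the toy transfer

def transfer(src: bytes, dst: bytearray, checkpoint: dict, fail_at=None):
    """Copy src into dst chunk by chunk; checkpoint['offset'] is the durable
    progress marker that survives a dropped connection."""
    offset = checkpoint.get("offset", 0)
    while offset < len(src):
        if fail_at is not None and offset >= fail_at:
            raise ConnectionError("link dropped")          # simulated failure
        dst[offset:offset + CHUNK] = src[offset:offset + CHUNK]
        offset += CHUNK
        checkpoint["offset"] = offset                      # record progress

src = b"large-binary-object!"
dst = bytearray(len(src))
ckpt = {}
try:
    transfer(src, dst, ckpt, fail_at=8)   # first attempt fails mid-way
except ConnectionError:
    pass
transfer(src, dst, ckpt)                  # resumes from ckpt["offset"], not 0
```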
[Diagram: a service client and a cloud client transfer data between a local FS and Cloud storage via connection services, with control (e.g. checkpoints) and transport layers built over UDT, HTTP(S) and GridFTP, behind the LOB federated storage access layers.]
T2.4 Data Access for Large Binary Objects: Transfer protocols – Issues

• In the case of migrations, transfers, replications, etc. between different storage systems we have to consider:
◦ appliances in the case of IaaS;
◦ web services in the case of PaaS;
◦ these will act as connection services, enabling third-party transfers and the use of state-of-the-art transport protocols such as UDT and GridFTP, as well as torrent-like transport models.
• These connection services will have to:
◦ encrypt data while in transit from one provider to another;
◦ resume failed transfers;
◦ enable checkpoints in transfers to increase fault tolerance.
• We will use the framework provided by Atmosphere to maintain and deploy such services.
T2.4 Data Access for Large Binary Objects: Interactions with other WPs & Tasks

[Diagram: the LOB federated storage access component (abstraction and operations layers, SOAP/REST service interface) interacts with: the Security Framework (Task 2.6) for fine-grained ACL/authorization requests; Data Reliability and Integrity (Task 2.5) to perform integrity/checksum checks on data; the Cloud Execution Environment (Task 2.2) for deployment/transport of system images; the High Performance Execution Environment (Task 2.3) to move data from/to resources; Data Access and Mediation (Task 3.3) to provide access to raw data; and Workflow Execution (Task 6.5) to move data/tasks around resources.]
T2.5 Data Reliability and Integrity

Main goals: provide a mechanism which will keep track of binary data stored in the Cloud infrastructure and monitor its availability; advise Atmosphere when instantiating Atomic Services; and, when required, shift/replicate data between clouds (in collaboration with T2.4).
Key issues: establishing an API which can be used by applications and end users; deciding upon the supported set of cloud stacks.
Partners involved and contact persons:
• CYFRONET (Piotr Nowakowski; p.nowakowski@cyfronet.pl).
[Diagram: the T2.5 binary data registry within Atmosphere registers files, gets metadata, migrates LOBs and collects usage statistics through the T2.4 LOB federated storage access layer (sample protocol stacks: Amazon S3, Walrus (Eucalyptus), Object Storage (OpenStack), Cumulus (Nimbus)), which stores and marshals data in distributed Cloud storage; end-user features (browsing, querying, direct access to data) are exposed via a Data Storage GUI in the VPH Master Interface.]
T2.5 Data Reliability and Integrity

Enforcing data integrity by means of:
• access control (requires integration with T2.6);
• an access log (requires integration with T2.4).
Each operation performed by T2.4 tools can be logged in the Atmosphere registry for the purposes of establishing data provenance. Moreover, Atmosphere can enforce fine-grained data access security by means of policies defined in Task 2.6.
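The access-log idea can be sketched as an append-only log of operations on registered data, queryable per object to reconstruct provenance. The field names and handle scheme are illustrative assumptions:

```python
from datetime import datetime, timezone

access_log = []   # stands in for the log kept in the Atmosphere registry

def log_access(user, operation, data_handle):
    """Record one T2.4-level operation on a registered object."""
    access_log.append({
        "when": datetime.now(timezone.utc).isoformat(),
        "who": user,
        "op": operation,            # e.g. "read", "write", "migrate"
        "handle": data_handle,
    })

def provenance(data_handle):
    """All recorded operations on one registered object, in order."""
    return [e for e in access_log if e["handle"] == data_handle]

log_access("alice", "write", "lob://ct-scan-42")
log_access("bob", "read", "lob://ct-scan-42")
log_access("alice", "write", "lob://mri-7")
```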
[Diagram: operations on registered data, requested by users or workflow management tools via the VPH Master Interface, pass through the T2.5 LOB federated storage access layer, which logs access to registered data in the Atmosphere access log while storing and marshalling data in distributed Cloud storage.]
T2.6 Security Framework

Main goals: a policy-driven access system for the security framework; providing an open-source-based access control solution built on fine-grained authorization policies. Components: Policy Enforcement, Policy Decision, Policy Management, and a registry of condition and effect definitions and values.

Key issues:
• Security by design;
• Privacy & confidentiality of eHealthcare data;
• Expressing eHealth requirements & constraints in security policies (compliance);
• Deploying security software in clouds (potential threats and inconsistencies).

Partners involved and contact persons:
• AOSAE (Enric Sàrries; enric.sarries@atosresearch.eu);
• CYFRONET (Jan Meizner; jan.meizner@cyfronet.pl);
• UCL (Ali Haidar; ali.haidar@ucl.ac.uk).
[Diagram: the VPH user and VPH clients reach the VPH services across the Internet through the VPH Security Framework (2.6), which guards both perimeters; behind it sit the RDB, Compute and Storage services, the Computing Service Broker (2.1), Workflow Execution (6.5), the Cloud Execution Environment (2.2), the HPC Execution Environment (2.3) with HPC infrastructure (e.g. DEISA), Data Access for Large Binary Objects (2.4), public/private Clouds, the VPH Global Catalogue (4.2), Semantic Service Models (4.3), the Data Inference Service (4.4), Relational Data Access (WP3), and the WP6 components (Workflow Composer, Visualization Tools, Master Interface).]
T2.6 Security Framework

The Security Components will be located at the frontends of the VPH deployment. There are 3 perimeters in VPH-Share:
◦ User Interfaces;
◦ Applications (workflow execution);
◦ Infostructure.
However, the design of the VPH-Share architecture is still not clear. To prevent man-in-the-middle (impersonation) attacks, we must know where the boundaries lie, i.e. which perimeter to secure.

Security by design implies the following:
◦ Components must not bypass security (services/data not exposed to threats); each message must go to / come from the security framework;
◦ Components must be trusted (well specified, standard, "known") with respect to the security policies;
◦ Administrative access to the Security Framework to modify the access rights to a component (system admins can configure security for their software);
◦ Relying on known designs for the whole system.

The Security Components will provide the following features:
◦ Secure messaging between the frontends of each deployment/platform;
◦ Authentication of VPH users;
◦ Resource access authorization (based on access control policies).

Specifically, for policy-based access control, we will need the following components:
◦ Policy Enforcement Point: "executes" the policies. It composes the access control request and, once it gets the authorization response from the Policy Decision Point, enforces it by allowing/denying the requested access;
◦ Policy Decision Point: analyses the access control request, the policies and the environment, and issues an access control decision: either "Permit" or "Deny";
◦ Policy Administration Point: deploys or provisions policies to the PDP; enables reconfiguration of access rights to resources;
◦ Policy Information Point: provides the attribute-value pairs needed for the analysis of access control requests and policies.
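The PEP/PDP interplay described above can be sketched in a few lines. This is an XACML-flavoured toy, not the project's actual framework: the policy format, attributes and default-deny rule are simplified assumptions.

```python
POLICIES = [   # provisioned by the Policy Administration Point
    {"role": "clinician", "resource": "patient-data", "action": "read",
     "effect": "Permit"},
]

def pdp_decide(request):
    """Policy Decision Point: match request attributes against the policies."""
    for p in POLICIES:
        if all(request.get(k) == v for k, v in p.items() if k != "effect"):
            return p["effect"]
    return "Deny"                              # default-deny

def pep_enforce(user_attrs, resource, action, invoke):
    """Policy Enforcement Point: compose the access control request,
    obtain the decision, and allow or deny the invocation accordingly."""
    request = {**user_attrs, "resource": resource, "action": action}
    if pdp_decide(request) != "Permit":
        raise PermissionError("access denied by security framework")
    return invoke()

result = pep_enforce({"role": "clinician"}, "patient-data", "read",
                     invoke=lambda: "record-123")
```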
T2.6 Security Framework

[Diagram: a VPH user authenticates via a Security Token Service; the Policy Enforcement Point mediates access to the VPH Services (RDB, Compute and Storage services), consulting the Policy Decision Point, which is provisioned by the Policy Administration Point and fed attribute values by the Policy Information Point.]
T2.6 Security Framework

[Diagram: at the developer's request ("create a virtual machine and install a VPH-Share service in it"), Atmosphere instantiates a VM from the OS template registry on a Cloud host; the resulting VPH-Share application (or component) is exposed through secured external APIs behind a secure proxy.]

Deploying secure appliances in the cloud presents a challenge:
◦ Many different OS templates;
◦ Many different VPH-Share appliances.
The best way to address this is, in our opinion:
◦ Security Components deployed in the OS templates of Atmosphere;
◦ When deploying a VPH service, the Security Component is configured to "proxy" it.
T2.7 Requirements Analysis and Integration of VPH Workflow Services with the Cloud Infrastructure

Main goals: ensure integration with VPH-Share workflows and deployment of VPH-Share Atomic Services on Cloud and HPC resources provided by the partners.
Key issues: establishing workflow specification details and Atomic Service requirements.
Partners involved and contact persons:
• USFD (workflow coordination; cp. Richard Lycett; Rich.Lycett@gmail.com);
• KCL (Cardiovascular Modeling, T5.5; cp. TBD);
• UPF (Neurovascular Modeling, T5.6; cp. TBD);
• IOR (Orthopaedic Modeling, T5.4; cp. TBD);
• UvA (HIV Epidemiology, T5.5; cp. TBD);
• CYFRONET (Marek Kasztelnik; m.kasztelnik@cyfronet.pl).

A preliminary questionnaire was distributed at the kickoff meeting in Sheffield; results are due by mid-May 2011.
T2.8 Joint Strategy for Cloud Computing between p-medicine and VPH-Share

Main goals:
• Exchange of information on Cloud computing and storage environments, with a focus on how they may support distributed medical information systems;
• Exchange of technical knowledge pertaining to the exploitation of specific Cloud technologies;
• Joint assessment of the applicability of Cloud platforms to storing, processing and exposing data in the context of medical applications;
• Exchange of prototype software and detailed technical documentation thereof, with the possibility of cross-exploitation of synergies between both projects;
• Semiannual collaboration workshops of representatives of both projects to support the above;
• 1st VPH-Share/p-medicine meeting: about June 15, at UvA, to discuss (among others) D2.1 and D6.1 as well as plans for platform design.

Partners involved and contact persons:
• CYFRONET (cp. Marian Bubak; m.bubak@agh.edu.pl);
• USFD (cp. TBD);
• UvA (cp. Adam Belloum; A.S.Z.Belloum@uva.nl);
• UCL (cp. TBD).
WP2: Services
Service Name – Description:
•Atmosphere Computing Service Broker – A tool to instantiate new Virtual Machines in selected Cloud environments (enabling application developers to prepare atomic services), then store and manage the resulting service images; interface extensions for the VPH-Share Master Interface
•Atmosphere Atomic Service Deployment Tool – Deployment of available atomic services in the Cloud as requested by the workflow execution tools; interface extensions for the Workflow Execution tool (API)
•Atmosphere Data Management Tool – A registry of binary data belonging to VPH-Share applications, enabling storage, querying and direct retrieval (interface extensions for the VPH-Share Master Interface and an API for VPH-Share applications)
•Computing access extensions for Cloud stacks and HPC infrastructures – Pluggable support for interaction with various Cloud computing stacks and HPC infrastructures for the purposes of scheduling computations/deploying atomic service instances and retrieving results
•Data access extensions for Cloud stacks – Pluggable support for interaction with various Cloud computing stacks for the purposes of storing, processing and retrieving binary data objects
•Integrated security – Cross-domain component; ensures secure operation of all of the above
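The "pluggable support" idea behind the computing extensions can be illustrated with a short sketch: each Cloud stack (or HPC gateway) is wrapped in an adapter implementing a common interface, so the broker can dispatch deployment requests to any of them. All class and method names below are illustrative assumptions, not the actual WP2 API.

```python
# Hypothetical sketch of pluggable compute backends behind a service broker.
from abc import ABC, abstractmethod


class ComputeBackend(ABC):
    """Common contract for Cloud/HPC computing extensions (Tasks 2.2/2.3)."""

    @abstractmethod
    def instantiate(self, service_image: str) -> str:
        """Deploy an atomic service instance; return an instance identifier."""

    @abstractmethod
    def terminate(self, instance_id: str) -> None:
        """Release the resources held by an instance."""


class PrivateCloudBackend(ComputeBackend):
    """Toy in-memory adapter standing in for a real Cloud stack driver."""

    def __init__(self):
        self._instances = {}
        self._counter = 0

    def instantiate(self, service_image: str) -> str:
        self._counter += 1
        instance_id = f"vm-{self._counter}"
        self._instances[instance_id] = service_image
        return instance_id

    def terminate(self, instance_id: str) -> None:
        self._instances.pop(instance_id)


class ServiceBroker:
    """Minimal broker that dispatches deployment requests to named backends."""

    def __init__(self, backends: dict):
        self._backends = backends

    def deploy(self, backend_name: str, service_image: str) -> str:
        return self._backends[backend_name].instantiate(service_image)


broker = ServiceBroker({"private": PrivateCloudBackend()})
instance = broker.deploy("private", "vph-share/example-service")
print(instance)  # vm-1
```

New stacks are added by registering another `ComputeBackend` adapter with the broker; the broker itself never changes.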
Key WP2 interactions
[Diagram: numbered interactions between the Computing Service Broker (2.1), Cloud Execution Environment (2.2), HPC Execution Environment (2.3), Binary Data Access (2.4), Data Integrity Services (2.5), WP2 Security (2.6), Metadata Management Services (4.2), Atomic Service Cloud Facade (6.3), VPH Master Interface (6.4), Workflow Execution (6.5), and the underlying HPC infrastructure (e.g. DEISA), public Clouds and private Clouds]
1. Workflow execution requests
2. Atomic Service preparation requests
3. AS creation and management UI
4. Invocations of Atmosphere back-end services
5. Data management UI (possibly integrated with Task 6.3)
6. Sharing Atomic Service metadata (common/distributed registry)
7. Sharing LOB metadata (common/distributed registry)
8. Preparation of HPC resources
9. Binary data processing requests
10. Instantiation and management of Appliances based on AS templates (AS template repository not depicted)
11. Execution of computational jobs
12. Low-level access to Cloud resources
13. Enactment of workflows using preinstantiated Cloud and HPC resources (Appliances)
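Interactions 1, 10 and 13 form a simple sequence that can be sketched in a few lines: a workflow execution request names the atomic services it needs, appliances are instantiated from the corresponding templates, and the workflow is then enacted against the resulting endpoints. Function names and the endpoint URL scheme below are hypothetical; the real APIs are defined by Tasks 2.1 and 6.5.

```python
# Illustrative walk-through of interactions 1, 10 and 13 (names are invented).

def instantiate_appliances(templates):
    """(10) Instantiate one appliance per atomic service template."""
    return {t: f"https://cloud.example.org/{t}/endpoint" for t in templates}

def enact_workflow(steps, appliances):
    """(13) Execute workflow steps against pre-instantiated appliances."""
    return [f"invoked {appliances[step]}" for step in steps]

# (1) A workflow execution request lists the atomic services it needs.
requested = ["mesh-generator", "flow-solver"]
endpoints = instantiate_appliances(requested)
log = enact_workflow(requested, endpoints)
for line in log:
    print(line)
```

The point of the split is that step 10 (resource provisioning) is Atmosphere's responsibility, while step 13 (enactment) stays with the workflow engine, which only ever sees service endpoints.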
WP2: Interactions
Used by:
•Task 6.5, Workflow Execution – Requesting deployment of atomic service instances (appliances) whenever a workflow is to be executed; interacts with Atmosphere (by means of a dedicated API) and with data access extensions (separate APIs provided by Task 2.4)
•Task 6.4, Master Interface – Visual management of atomic services and instantiation of additional virtual machines as requested by VPH-Share application developers (via plugin); direct access to binary data (via plugin); interacts with Atmosphere by means of dedicated APIs
•Tasks 5.4-5.7, VPH-Share applications – Used directly by application developers to deploy their applications in the VPH-Share infrastructure; also used indirectly by application users whenever a workflow is to be instantiated and executed or whenever binary data needs to be read from the underlying Cloud storage
Uses:
•Tasks 2.2 and 2.3, Cloud and HPC computing extensions – Internal dependency; Atmosphere requires a means of communication with the underlying resources – hence the dependency of Task 2.1 on Tasks 2.2 and 2.3
•Task 2.4, Data access for large binary objects – Internal dependency; Atmosphere requires a means of manipulating data in Cloud storage – hence the dependency of Task 2.5 on Task 2.4
•Tasks 5.4-5.7, VPH-Share applications – Input is required from application teams to establish requirements (functional and non-functional) with regard to the underlying computational and storage resources (public/private Cloud infrastructures, preferred deployment environments, operating systems, processors, memory, network bandwidth, storage capacity, etc.)
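The Task 2.4 dependency described above can be sketched as a narrow storage adapter: the data management tool (Task 2.5 side) registers and retrieves binary objects only through that adapter, keeping the concrete Cloud storage stack swappable. All names below are illustrative assumptions, not the project's actual interfaces.

```python
# Minimal sketch of a data-access adapter and the registry built on top of it.
from abc import ABC, abstractmethod


class BlobStore(ABC):
    """Contract implemented by each data access extension (Task 2.4)."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class InMemoryBlobStore(BlobStore):
    """Stand-in for a real Cloud object store driver."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]


class DataManagementTool:
    """Registry facade (Task 2.5 side) built on top of the storage adapter."""

    def __init__(self, store: BlobStore):
        self._store = store
        self._registry = {}  # object name -> storage key

    def register(self, name: str, data: bytes) -> None:
        key = f"lob/{name}"
        self._store.put(key, data)
        self._registry[name] = key

    def retrieve(self, name: str) -> bytes:
        return self._store.get(self._registry[name])


tool = DataManagementTool(InMemoryBlobStore())
tool.register("scan-001", b"\x00\x01")
print(tool.retrieve("scan-001"))
```

Because the registry never touches a storage API directly, swapping the in-memory stand-in for a real Cloud store only requires another `BlobStore` implementation.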
WP2: Workflow Interactions (tbd)
Type – Description – WF #:
•Diagrams – Does the workflow have use case, data flow and component diagrams? If not, when can the workflow deliver them? (All)
•Development status – What is the current status of the workflow atomic services (concept, design, development of first prototype, first prototype, further development and versions, release, deployment)? When is the first version of the workflow planned to be released? (All)
•Data – What kind of data will the workflow store (format/size)? Does the workflow need streaming? (All)
•Computation – What computational resources (desktop, cluster, grid, cloud) are required? How long does the workflow take to serve one request on a target (suitable) computing platform? Are the workflow elements stateless? (All)
•Operating system, licensing – What kind of operating system do the workflow atomic services require? What kind of licence does the workflow (and all required atomic services and libraries) have? (All)
•Communication protocol – Is the application remotely accessible? What protocols are used? (All)
WP2: Measurable Objectives
Objective – Evaluated/Due:
1. Analysis of the state of the art, work package definition – D2.1 [PM03]
2. Architecture and design of the cloud platform – D2.2 [PM06], M1 [PM12]
3. 1st prototype of the cloud platform – Alpha Release – D2.3 and M3 [PM12]
4. 2nd prototype of the cloud platform – Beta Release – D2.4 and M7 [PM24]
5. Final deployment of the cloud platform – Candidate Release – D2.5 and M11 [PM36]
6. Full integration of four workflows with the VPH infostructure – M12 [PM42]
7. Comprehensive collection of data sources accessible through the Candidate Release – M13 [PM42]
8. Final evaluation of the cloud platform – Maintenance Release – D2.6, M15 and M16 [PM48]
See DoW VPH-Share (269978) 2010-11-08 (pp. 15-17)
WP2: Mapping to Global Objectives
WP2 goals:
•Analysis of the state of the art, work package definition;
•Execution environment for flexible deployment of scientific software on the virtualized infrastructure;
•Virtualized access to high performance (HPC) execution environments;
•Cloud data access for very large objects and data transfer between services;
•Security framework;
•Data reliability and integrity to ensure sound use of the biomedical data;
•Exploiting synergies and exchanging experience with other similar projects, in particular with p-medicine.
VPH-Share objectives:
•To develop and to deploy the VPH infostructure through which the VPH community will be able to store, share, reuse and integrate data, information, knowledge and wisdom on the physiopathology of the human body;
•To develop a process by which models are formulated, analysed and annotated for integration into a workflow, able to exploit the VPH infostructure;
•To reach out, firstly to the VPH community, but then to the wider clinical and medical records communities, to ensure access to the widest possible range of data and tools, made possible by the effective and easy-to-use annotation tools and compute service access that this project will provide and/or promote.
Upcoming WP2 deliverable – D2.1
D2.1: Analysis of the State of the Art and WP Definition (M3 – end of May 2011)
Requires contributions from technical developers w.r.t. solutions which will be considered for use in their tools.
Proposed TOC and responsible partners:
1. Introduction (incl. objectives and approach) – CYFRONET
2. High-level overview of WP2 (incl. generalized view of the WP2 architecture) – CYFRONET
3. Key challenges in developing a Cloud platform for VPH-Share – CYFRONET
4. Targeted SOTA for:
• Cloud Resource Allocation Management – CYFRONET
• Cloud Application Execution – CYFRONET
• Access to High-Performance Execution Environments – UCL
• Access to large binary data on the Cloud – UvA
• Data reliability and integrity – CYFRONET
• Cloud Security frameworks – AOSAE
5. Conclusions (incl. summary and references) – CYFRONET
As part of Section 4, we ask each contributing partner to conform to the following schema:
•Problem statement (Why is this aspect important for VPH-Share?)
•SOTA description (along with an in-depth discussion of the advantages and drawbacks of available technologies)
•Recommendations for VPH-Share (Which technologies to adopt? Is it necessary to extend them? If so – why and how?)
The deadline for contributions is May 6 (submit to p.nowakowski@cyfronet.pl).