[IEEE 2011 IEEE 13th Conference on Commerce and Enterprise Computing (CEC) - Luxembourg-Kirchberg, Luxembourg (2011.09.5-2011.09.7)] 2011 IEEE 13th Conference on Commerce and Enterprise

Easy Virtual Cluster with Cloud Storage for HPC Numerical Simulation

Hui-Shan Chen, Kuo-Yang Cheng, Zong-Shuo Jheng, Yi-Lun Pan, Chang-Hsing Wu and Hsi-En Yu Software Technology Department

National Center for High-Performance Computing Taichung, Taiwan

e-mail: {chwhs, kuoyang, zsjheng, serenapan, hsing, yun}@nchc.narl.org.tw

Abstract: As Cloud services burgeon and, by virtue of virtualization technology, become attractive to HPC developers, how to easily establish a dedicated virtual cluster has become an important problem. To address this issue, Easy Virtual Cluster with Cloud Storage (called VClouster) has been developed by the Pervasive Computing Team at the National Center for High-Performance Computing (NCHC). With simple operations, NCHC VClouster helps build a Computing Cloud for HPC numerical simulation. It satisfies various customized demands and offers high scalability for virtual machines (VMs), so the performance of using Cloud services can be improved. The NCHC VClouster system gives Cloud users a friendly way to send their requests; from the user's specification, it automatically creates a customized and dedicated virtual cluster on demand for each user. The system leverages a Cloud toolkit combined with Cloud storage services, so Cloud users need not worry about transferring huge amounts of application data. NCHC VClouster also introduces international Certificate Authority (CA) services so that authorized users can easily use their virtual machines and integrate more computing resources within the same CA organization.

Keywords: Cloud; Virtualization Technology; Virtual Cluster; Cloud Storage; HPC

I. INTRODUCTION

In the NCHC VClouster system, Cloud users can create and customize their dedicated “Virtual Cluster” simply by sending their requirements. After that, they can install specific software and configure variables according to their needs. Users do not have administrator privileges on the physical machines, and they need not worry that other users will interfere with their virtual cluster. Multiple virtual machines may be hosted on a single physical machine; however, each virtual machine is independent and does not interfere with other virtual machines or with the physical machine. In most situations, users who want to run scientific or engineering simulations are likely to spend a lot of time setting up their environment, and different simulations need different software and configurations. It is therefore extremely inconvenient for both administrators and users if many Cloud users share the same physical cluster simultaneously.

Cloud computing is an emerging service that provides access to remote resources anytime and anywhere. The NCHC Pervasive Computing Team introduces Nimbus Cloudkit [10][11] to help build a Computing Cloud for HPC numerical simulation. The proposed system is mainly devoted to providing Cloud services for HPC applications in a virtualized environment, so that users can use resources at will and easily retrieve data through Cloud storage services. Dropbox, iFolder and SparkleShare are all examples of Cloud-based file sharing services that synchronize files among different computers. NCHC VClouster can offer a high-quality research and experiment environment for academia, research centers, the business community, etc. User authorization is also indispensable, and therefore NCHC VClouster integrates NCHC CA [6] to provide certificate services. Moreover, NCHC CA is a member of the Asia Pacific Grid Policy Management Authority (APGrid PMA [5]), so authorization in the Cloud can be trusted.

The remaining sections of the paper are organized as follows. Section II gives a brief review of related studies. Section III proposes the system architecture of NCHC VClouster. Section IV describes the design and implementation in detail. Section V presents the performance evaluation. Finally, Section VI concludes the paper and outlines future plans.

II. RELATED WORKS

Many projects now support parallel computing in VM-based environments. The environment described in [12] consists of standard VM hosting hardware infrastructure, the Xen [8] virtualization software, and a collection of common and custom software applications that provide enhanced managed hosting features. The authors also integrate and deploy the managed hosting environment based on VM technology.

MPI-SWS proposed a design of a trusted Cloud computing platform (TCCP) [13]. TCCP enables Infrastructure as a Service (IaaS) providers such as Amazon EC2 to provide a closed box execution environment that guarantees confidential execution of guest virtual machines. They construct the cloud computing platform based on Eucalyptus.

Managing the security of virtual machine images in a cloud environment is described in [14]. That paper explains the risks facing administrators and users of a cloud's image repository. To address those risks, the authors propose an image management system that controls access to images, tracks the provenance of images, and provides users and administrators with efficient image filters and scanners that detect and repair security violations.

2011 IEEE Conference on Commerce and Enterprise Computing

978-0-7695-4535-6/11 $26.00 © 2011 IEEE

DOI 10.1109/CEC.2011.36

Figure 1. The System Architecture of NCHC VClouster

Figure 2. The Layer of System Architecture

Figure 3. The Designed NCHC EasyVCWeb

III. PROPOSED NCHC VCLOUSTER SYSTEM

A. Research Motivation and Objective

The key points of Cloud Computing, which lie in its component-based nature, are reusability, substitutability and user friendliness. By integrating virtualization technologies and Cloud storage, we provide an approach to acquiring Cloud services via EasyVCWeb in the NCHC VClouster system. We implemented the NCHC VClouster system to provide easy virtual clusters with Cloud storage based on a decentralized resource discovery architecture. The system architecture of NCHC VClouster is sketched in Figure 1. This research focuses on easily building virtual clusters with Cloud storage through an interactive graphical user interface. In addition, NCHC VClouster supports quick environment deployment for parallel HPC applications, which helps users proceed with numerical simulations. When NCHC EasyVCWeb receives a request for virtual clusters from a user via the web browser, the request is sent to the internal components of the NCHC VClouster system.

B. System Architecture

• Based on Nimbus Cloudkit

We implement an Infrastructure-as-a-Service (IaaS) system based on Nimbus Cloudkit. There are three key roles in NCHC VClouster: the service node, the virtual machine manager (VMM) nodes, and the virtual machines (virtual clusters), as shown in Figure 2. The service node manages a large number of VMMs and virtual clusters. Users' images can be uploaded to the service node and stored in the Nimbus VM image repository for booting VMs. The VMM nodes set up the VMs, manage their configuration and communicate with the service node. KVM and Xen are the hypervisors that NCHC VClouster supports by default. According to the user's requirements, a VMM lets Cloud users build dedicated virtual clusters with a specified number of CPUs and memory size to execute various applications.

The most important component is NCHC EasyVCWeb, shown in Figure 3, a graphical user interface that we implemented to simplify the establishment of virtual clusters. It adopts the cloud-client package provided by the Nimbus group and interacts with the internal components of our system. To identify users, each user must have a user certificate. Users can choose an image from which to boot a VM, and they can modify and save its status dynamically on demand. Cloud users can also set the number of virtual machines, the number of CPUs and the memory size through NCHC EasyVCWeb. The VMMs then set up the VMs from the specified images to execute computing jobs.
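The parameters a user sets through NCHC EasyVCWeb can be pictured as a small validated record. The following is only a minimal sketch: the class name `VirtualClusterRequest`, its field names, and the image name `mpich-default` are hypothetical and are not part of EasyVCWeb or Nimbus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualClusterRequest:
    """Hypothetical record of the values a user enters in the web form."""
    image_name: str    # VM image chosen by the user (illustrative name)
    num_vms: int       # number of virtual machines in the cluster
    cpus_per_vm: int   # number of CPUs per virtual machine
    memory_mb: int     # memory size per virtual machine, in MB

    def __post_init__(self) -> None:
        # Reject obviously invalid requests before they reach a VMM node.
        if not self.image_name:
            raise ValueError("image_name must not be empty")
        for field_name in ("num_vms", "cpus_per_vm", "memory_mb"):
            if getattr(self, field_name) < 1:
                raise ValueError(f"{field_name} must be a positive integer")

    def total_cpus(self) -> int:
        """Total CPUs requested across the whole virtual cluster."""
        return self.num_vms * self.cpus_per_vm


request = VirtualClusterRequest("mpich-default", num_vms=4, cpus_per_vm=2, memory_mb=2048)
print(request.total_cpus())
```

Validating the request at the front end is one possible design choice; it keeps malformed specifications from being forwarded to the internal components.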

• With Cloud Storage

The properties of Cloud Storage include flexibility, convenience, and automation. Large-scale numerical simulations may need to retrieve essential input files and manage generated output files, and NCHC VClouster introduces Cloud Storage to help resolve this problem. Establishing and maintaining scalable and highly reliable network storage is a complex and expensive task.

IV. DESIGN AND IMPLEMENTATION

A. Cloud Data Management

We support Cloud data management by integrating the Dropbox client, iFolder and SparkleShare into the NCHC VClouster system, which allows Cloud users to work with the Cloud Storage services they commonly use and to save their files directly to a local folder. These Cloud Storage services then automatically and transparently update the user files on a network server and deliver computing results to the user's other machines. This is a simple and secure storage solution that can increase users' productivity by enabling them to back up, access and manage personal data files from anywhere at any time.

B. Virtualization Technology

Virtualization Technology provides high scalability and gives users autonomy in managing their own clusters.

Figure 4. The Process of Dynamic Deployment

Figure 5. The Customized Virtual Cluster with Cloud Storage

for long-running MPI jobs. As a consequence, Cloud users do not need to worry about where the computing resources reside and can simply use the VMs allocated for them. Even though performance might be a bit lower in VMs than on physical machines, the sacrifice is acceptable.

• VM Setup

In Nimbus Cloudkit, the Workspace Service currently supports both Xen and KVM [2] in our NCHC VClouster system. The Workspace Service is deployed at each standalone site and acts as the VMM responsible for controlling the VMs on that VMM node. The virtual environment is constructed from VM images by the workspace-control program of the Workspace Service. VMM nodes use workspace-control mainly to start, stop and suspend VMs. Moreover, workspace-control can also handle the reconstruction and repositioning of VM images and create secure network connections for VMs.

• VM Control

VM management is an important issue because of the variety of VMs. The service node holds both customized VM images created and transferred by Cloud users and default VM images stored by our system. To conveniently meet the requirements for various types of VM images, the NCHC VClouster system has prepared VM images with frequently used software, such as MPICH [3], Globus Toolkit [1], Torque [4], etc.

C. Dynamic Deployment Mechanism

Cloud developers of HPC applications increasingly emphasize customization requests. The NCHC VClouster system provides a friendly design for users to send their requests. We introduce Torque to manage and control VMs for job computation. A Cloud user can specify the number of nodes and the number of cores in each node in a Torque job script in order to submit Cloud jobs. Users can also specify many properties of VMs through the Nimbus Workspace Service, such as memory, nodes, cores, and hours of use. Our system dynamically constructs the virtual computing environments on demand when users submit their jobs. Users can also store the status of their VMs if the VMs are customized images uploaded by themselves rather than default VM images. One scenario for dynamic deployment is depicted in Figure 4: (1) the user requests 8 cores, as 4 nodes with 2 cores each, for 1 hour of job computation; (2) the server processes the request automatically; (3) workspace-control creates 4 VMs, each with 2 cores, for 1 hour according to the requirements; (4) the server submits the jobs to the VMs after the VMs have been successfully created and deployed.
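To make the Figure 4 scenario concrete (4 nodes with 2 cores each for 1 hour), the sketch below generates a Torque job script using the standard `#PBS -l nodes=...:ppn=...` and `#PBS -l walltime=...` directives. The helper `build_torque_script`, the job name `laplace`, and the executable `./laplace_mpi` are assumptions made for illustration; the paper does not show the actual script used in NCHC VClouster.

```python
def build_torque_script(job_name: str, nodes: int, cores_per_node: int,
                        hours: int, command: str) -> str:
    """Return the text of a Torque job script for an MPI job (illustrative helper)."""
    if nodes < 1 or cores_per_node < 1 or hours < 1:
        raise ValueError("nodes, cores_per_node and hours must be positive")
    total_cores = nodes * cores_per_node
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",                          # job name
        f"#PBS -l nodes={nodes}:ppn={cores_per_node}",  # nodes and cores per node
        f"#PBS -l walltime={hours:02d}:00:00",          # hours of use
        "cd $PBS_O_WORKDIR",                            # directory the job was submitted from
        f"mpiexec -n {total_cores} {command}",          # one MPI process per requested core
    ]
    return "\n".join(lines) + "\n"


# The request in the scenario: 8 cores as 4 nodes x 2 cores, for 1 hour.
script = build_torque_script("laplace", nodes=4, cores_per_node=2,
                             hours=1, command="./laplace_mpi")
print(script)
```

A script like this would normally be handed to Torque with `qsub`; how the server maps the requested nodes onto newly created VMs is handled inside the system and is not modeled here.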

As shown in Figure 5, NCHC VClouster builds customized virtual clusters on demand for Cloud users. In addition, the system activates the Cloud Storage service to access Cloud data for numerical simulation applications. As soon as the virtual clusters are set up, the Cloud data needed by user jobs is stored and available in the virtual environment through the Cloud Storage service. Cloud users need not worry about transferring large data files before a job begins, nor about how to retrieve the application results back to the local site.

V. PERFORMANCE EVALUATION

A. Experiment Environment

We implement an IaaS Cloud and support the Cloud storage service for users in order to serve various communities and integrate a large number of computers. The experiments require a preliminary setup consisting of the physical computing environment and the virtual machines. Table I lists the experiment parameters.

TABLE I. PARAMETER LIST

Parameter    Description
R            Number of real nodes
V            Number of virtual machines
RV           Number of virtual machines created in each real node

B. Performance Results

The experimental results cover various combinations of physical nodes and virtual machines. Solutions of Laplace's equation are important in many fields of science, such as fluid dynamics and astronomy. Therefore, we adopt the Laplace equation for heat conduction, implemented as an MPI program over a matrix, as our benchmark to evaluate performance. Figure 6 shows the effects of the virtualization technology. We execute an HPC application on combinations with different proportions of physical machines and VMs.
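For readers unfamiliar with this kind of benchmark, the sketch below solves the steady-state heat conduction (Laplace) problem on a square grid with Jacobi iteration, where each interior point is repeatedly replaced by the average of its four neighbours. It is a serial, pure-Python illustration only: the paper's benchmark is an MPI program whose source is not given, and the grid size, boundary temperature and tolerance used here are assumptions.

```python
def jacobi_laplace(n: int, top: float = 100.0, tol: float = 1e-4,
                   max_iters: int = 10_000):
    """Solve Laplace's equation on an n x n grid with a hot top edge.

    Returns the final grid and the number of iterations performed.
    """
    if n < 3:
        raise ValueError("grid must be at least 3 x 3 to have interior points")
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [top] * n  # fixed boundary: top edge held at `top`, other edges at 0
    for iteration in range(1, max_iters + 1):
        new = [row[:] for row in grid]
        max_change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Five-point stencil: average of the four neighbours.
                new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                    + grid[i][j - 1] + grid[i][j + 1])
                max_change = max(max_change, abs(new[i][j] - grid[i][j]))
        grid = new
        if max_change < tol:
            return grid, iteration
    return grid, max_iters


grid, iterations = jacobi_laplace(10)
print(f"converged after {iterations} iterations")
```

In an MPI version the grid would typically be split into strips of rows, with neighbouring processes exchanging boundary rows after every sweep, which is why such a benchmark is sensitive to how VMs are placed on physical nodes.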

Figure 6. The Effects due to the Virtualization Technology

The results indicate that the execution time increases as the proportion of VMs increases. We also evaluate the performance for different distributions of VMs across real nodes, as shown in Figure 7. The total number of VMs is fixed at 8, the number of processor cores per VM is set to 1, and we vary the parameters “R” and “RV”. The purpose of this experiment is mainly to suggest to users how to distribute VMs across real nodes when creating a virtual cluster. The results show that the best performance occurs in the case “R8 RV1” (8 VMs on 8 real nodes). The turnaround time of the executed job increases as the number of VMs on a single real node increases, because a single real node has limited memory and CPU power. Overall performance drops greatly if the available resources of a real node run out. Evaluating the physical resources is therefore essential when creating VMs.
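The even placements for a fixed number of VMs follow from R × RV = V. The sketch below enumerates them for V = 8 and adds a simple oversubscription check in the spirit of the resource evaluation recommended above. The function names and the node capacity figures (4 cores, 4096 MB) are hypothetical; the paper does not report the hardware of its real nodes here.

```python
def placements(total_vms: int):
    """List every (R, RV) pair that spreads total_vms evenly: R * RV == total_vms."""
    return [(r, total_vms // r)
            for r in range(1, total_vms + 1)
            if total_vms % r == 0]


def fits_on_node(vms_per_node: int, cores_per_vm: int, mem_per_vm_mb: int,
                 node_cores: int, node_mem_mb: int) -> bool:
    """True if the VMs on one real node stay within its CPU and memory capacity."""
    return (vms_per_node * cores_per_vm <= node_cores
            and vms_per_node * mem_per_vm_mb <= node_mem_mb)


# V = 8 VMs with 1 core each, as in the experiment; node capacity is assumed.
for real_nodes, vms_per_node in placements(8):
    ok = fits_on_node(vms_per_node, cores_per_vm=1, mem_per_vm_mb=1024,
                      node_cores=4, node_mem_mb=4096)
    print(f"R{real_nodes} RV{vms_per_node}: within node capacity = {ok}")
```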

In summary, our experimental evaluations show that MPI applications in NCHC VClouster achieve acceptable performance and good efficiency. Moreover, VMs are scalable and flexible for users despite the drawback of a slight overhead.

VI. CONCLUSIONS AND FUTURE WORKS

The NCHC VClouster system allows authorized users to create and manage their own virtual clusters through NCHC EasyVCWeb in the Cloud instead of on local machines. The system can dynamically construct virtual computing environments with Cloud Storage on demand. Our experiments demonstrate that the system can not only provide users with a highly flexible and customized virtual cluster but also offer high performance for HPC numerical simulation jobs. NCHC VClouster thus promotes the effective and convenient use of VMs while achieving flexibility and scalability. In the future, we expect to integrate a lite migration mechanism to make the execution of HPC applications in the virtual environment more dependable. The snapshot and restore method of the lite migration mechanism will avoid recomputing interrupted jobs from the start after computing resources suddenly fail, and thus greatly improve reliability.

Figure 7. The Distribution of VMs in Real Nodes

REFERENCES

[1] Globus Toolkit. http://www.globus.org/toolkit/

[2] KVM. http://www.linux-kvm.org/page/Main_Page

[3] MPICH. http://www.mcs.anl.gov/research/projects/mpi/

[4] Torque. http://www.clusterresources.com/torque

[5] APGrid PMA, “Asia Pacific Grid Policy Management Authority”, http://www.apgridpma.org/

[6] NCHC CA PKI Service. https://myproxy.nchc.org.tw/

[7] NCHC CA Group, “NCHC Certification Authority Certificate Policy and Certification Practice Statement (CP/CPS)”, http://ca.goc.nchc.org.tw/nchcca/CPS.html, 2008.

[8] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, “Xen and the Art of Virtualization”, Proceedings of ACM Symp. Operating Systems Principles, pp. 164-177, Oct. 2003.

[9] H. S. Chen, C. H. Wu, Y. L. Pan, “Integrating Certificate Services with Grid Widgets for a Computational Grid Platform”, 2009 National Computer Symposium (NCS 2009), Taipei, Taiwan, Nov. 27-28, 2009.

[10] K. Keahey and T. Freeman, “Science Clouds: Early Experiences in Cloud Computing for Scientific Applications”, Cloud Computing and Its Applications 2008 (CCA-08), Chicago, IL, Oct. 2008.

[11] K. Keahey, M. Tsugawa, A. Matsunaga, J. Fortes, “Sky Computing”, IEEE Internet Computing, vol. 13, no. 5, Sep./Oct. 2009.

[12] D. Leverman and H. Tufo, “Experiences with Managed Hosting of Virtual Machines”, The 10th LCI International Conference on High-Performance Clustered Computing, Colorado, USA, Mar. 9-12, 2009.

[13] N. Santos, K. P. Gummadi, and R. Rodrigues, “Towards Trusted Cloud Computing”, Workshop on Hot Topics in Cloud Computing 2009 (HotCloud 09), San Diego, CA, Jun. 15, 2009.

[14] J. Wei, X. Zhang, G. Ammons, V. Bala, and P. Ning, “Managing security of virtual machine images in a cloud environment”, The ACM Cloud Computing Security Workshop 2009 (CCSW 2009), 16th ACM Conference on Computer and Communications Security (CCS), Chicago, IL, Nov. 13, 2009.
