National Institute of Advanced Industrial Science and Technology
Status report on the large-scale long-run simulation on the grid - Hybrid QM/MD simulation -
Grid Technology Research Center, AIST
Hiroshi Takemiya, Yoshio Tanaka

Slide 1: Title
- Status report on the large-scale long-run simulation on the grid - Hybrid QM/MD simulation -
- Grid Technology Research Center, AIST: Hiroshi Takemiya, Yoshio Tanaka

Slide 2: Goal of the experiment
- To verify the effectiveness of our programming approach for large-scale, long-run grid applications: flexibility, robustness, and efficiency.
- Friction simulation: a nano-scale probe moves on the Si substrate. [Figure: probe sliding over 525 fs at v = 0.009/fs]
- Requires hundreds of CPUs and a long simulation time of over a few months.
- The number of QM regions and the number of QM atoms change dynamically.
- Gridifying the application using GridRPC + MPI (a code sketch follows this slide outline).
- 2 QM regions with 72 + 41 QM atoms; 28,598 atoms in total.

Slide 3: Testbed for the Friction Simulation
- Used 11 clusters with 632 CPUs in total, spanning 8 organizations.
- PRAGMA clusters: SDSC (32 CPUs), KU (8 CPUs), NCSA (8 CPUs), NCHC (8 CPUs), Titech-1 (8 CPUs), AIST (8 CPUs).
- AIST Super Cluster: M64 (128 CPUs), F32-1 (128 CPUs + 128 CPUs).
- Japan clusters: U-Tokyo (128 CPUs), Tokushima-U (32 CPUs), Titech-2 (16 CPUs).
- [Figure: map of the testbed sites]

Slide 4: Result of the Friction Simulation
- Experiment time: 52.5 days; longest continuous calculation time: 22 days.
- Manual restarts: 2; execution failures: 165. Succeeded in recovering from all of these failures.
- Changed the number of CPUs in use 18 times; succeeded in adjusting the number of CPUs to the number of QM regions and QM atoms.

Slide 5: Summary and future work
- Our approach is effective for running large-scale grid applications for a long time.
- Need more grid services: obtaining information on available resources, resource reservation, and coordination with a resource manager/scheduler.
- Need a cleaner MPI: mpich quits leaving processes and IPC resources behind; plan to use GridMPI in place of mpich.
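The gridification mentioned on Slide 2 and the failure recovery reported on Slide 4 share one pattern: a GridRPC client drives the simulation and ships each QM region to an MPI-parallelized solver on a remote cluster, reissuing the call on another cluster when one fails. Below is a minimal sketch of that pattern, assuming the standard GridRPC client API (as provided by Ninf-G). The server host names, the remote entry name "qmmd/qm_force", the retry limit, and the argument layout are hypothetical placeholders, not the actual simulation code.

/*
 * Minimal sketch of a GridRPC + MPI client-side pattern (hypothetical names).
 * Each listed server is assumed to run the MPI-parallelized QM solver locally.
 */
#include <stdio.h>
#include "grpc.h"

#define MAX_RETRY 3   /* hypothetical retry limit, not taken from the slides */

static char *servers[] = { "m64.example.org", "f32.example.org", "sdsc.example.org" };
static const int n_servers = 3;

/*
 * Compute QM forces for one region on a remote cluster, retrying on another
 * cluster if the call fails (robustness) and sizing the request to the
 * current number of QM atoms (flexibility).
 */
static int call_qm_region(int region, int n_qm_atoms,
                          double *coords, double *forces)
{
    for (int attempt = 0; attempt < MAX_RETRY; attempt++) {
        grpc_function_handle_t handle;
        char *server = servers[(region + attempt) % n_servers];

        if (grpc_function_handle_init(&handle, server, "qmmd/qm_force")
            != GRPC_NO_ERROR)
            continue;                      /* try the next cluster */

        /* Synchronous RPC: send coordinates, receive forces.  The argument
         * list must match the remote interface definition. */
        grpc_error_t rc = grpc_call(&handle, n_qm_atoms, coords, forces);
        grpc_function_handle_destruct(&handle);

        if (rc == GRPC_NO_ERROR)
            return 0;                      /* this QM region is done */
        fprintf(stderr, "QM region %d failed on %s; retrying\n", region, server);
    }
    return -1;                             /* give up; caller restarts from a checkpoint */
}

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <client_config_file>\n", argv[0]);
        return 1;
    }
    if (grpc_initialize(argv[1]) != GRPC_NO_ERROR) {
        fprintf(stderr, "grpc_initialize failed\n");
        return 1;
    }

    /* ... MD time-stepping loop: classical forces computed locally, then one
     *     call_qm_region() per QM region, then integrate and repeat ... */

    grpc_finalize();
    return 0;
}

The same API also offers grpc_call_async() and grpc_wait_all(), which let several QM regions run concurrently on different clusters; deciding how many servers to bind each step is one way the number of CPUs can be matched to the current number of QM regions and QM atoms.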