
Opportunities in Parallel I/O for Scientific Data Management

Rajeev Thakur and Rob Ross
Mathematics and Computer Science Division

Argonne National Laboratory


2

Outline

• Brief review of our accomplishments so far

• Thoughts on component coupling

• Topics for future work


3

PVFS2

• Collaborative effort among ANL, Clemson, Northwestern, Ohio State, and others
• Very successful as a freely available parallel file system for Linux clusters
• Also deployed on IBM BG/L
• Used by many groups as a research vehicle for implementing new parallel file system concepts
• True open source software
  • Open source development
• Tightly coupled MPI-IO implementation (ROMIO); see the MPI-IO example below
• Forms the basis for higher layers to deliver high performance to applications

[Diagram: compute clients (C) connected over a communication network to I/O servers (IOS) running PVFS2]
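To illustrate the coupling with ROMIO, here is a minimal sketch of a parallel program writing to a PVFS2 volume through MPI-IO. The file path, the "pvfs2:" prefix, and the hint value are illustrative assumptions; ROMIO carries hints in an MPI_Info object and forwards the ones it recognizes to the file system driver.

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_File fh;
        MPI_Info info;
        MPI_Offset offset;
        double buf[1024];
        int rank, i;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (i = 0; i < 1024; i++) buf[i] = rank;

        /* Hints travel in an MPI_Info object; the value here is only an example. */
        MPI_Info_create(&info);
        MPI_Info_set(&info, "striping_factor", "8");

        /* The "pvfs2:" prefix selects ROMIO's PVFS2 driver; the path is hypothetical. */
        MPI_File_open(MPI_COMM_WORLD, "pvfs2:/pvfs2/testfile",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

        /* Each process writes its own contiguous block; _all makes the write collective. */
        offset = (MPI_Offset) rank * sizeof(buf);
        MPI_File_write_at_all(fh, offset, buf, 1024, MPI_DOUBLE, MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Info_free(&info);
        MPI_Finalize();
        return 0;
    }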


4

PVFS2 Performance


5

PVFS2 Performance

[Chart: Time to Create Files Through MPI-IO; average create time (ms) versus number of processes for GPFS, Lustre, and PVFS2]


6

PnetCDF

• Parallel version of the popular netCDF library

• Major contribution of the SDM SciDAC (funded solely by it)

• Collaboration between Argonne and Northwestern
  • Main implementers: Jianwei Li (NW) and Rob Latham (ANL)

• Addresses lack of parallelism in serial netCDF without the difficulty of parallelism in HDF

• Only minor changes to the standard netCDF API (see the example after this list)

• Being used in many applications
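As a rough illustration of how small those changes are, the sketch below writes a two-dimensional variable collectively with PnetCDF; apart from the communicator argument and the "ncmpi_" prefix, the calls mirror serial netCDF. The file name, dimension sizes, and variable name are made up for the example.

    #include <mpi.h>
    #include <pnetcdf.h>

    int main(int argc, char **argv)
    {
        int ncid, dimid[2], varid, rank, nprocs, i;
        MPI_Offset start[2], count[2];
        double data[1024];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Create the file collectively; only the communicator and prefix differ from netCDF. */
        ncmpi_create(MPI_COMM_WORLD, "output.nc", NC_CLOBBER, MPI_INFO_NULL, &ncid);
        ncmpi_def_dim(ncid, "x", (MPI_Offset) nprocs, &dimid[0]);
        ncmpi_def_dim(ncid, "y", 1024, &dimid[1]);
        ncmpi_def_var(ncid, "temperature", NC_DOUBLE, 2, dimid, &varid);
        ncmpi_enddef(ncid);

        /* Each process writes one row of the variable in collective mode. */
        for (i = 0; i < 1024; i++) data[i] = rank;
        start[0] = rank;  count[0] = 1;
        start[1] = 0;     count[1] = 1024;
        ncmpi_put_vara_double_all(ncid, varid, start, count, data);

        ncmpi_close(ncid);
        MPI_Finalize();
        return 0;
    }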


7

MPI-IO over Logistical Networking (LN)

• LN is a technology that many applications are using to move data efficiently over the wide area
• Implementing MPI-IO over LN enables applications to access their data directly from parallel programs
• We are implementing a new ROMIO ADIO layer for LN (Jonghyun Lee, ANL)
  • Nontrivial because the LN API is unlike a traditional file system API (see the sketch after this slide's diagram)
• Collaboration between Argonne and Univ. of Tennessee

[Diagram: the Application sits on MPI-IO, which sits on the ADIO layer, with ADIO drivers for PVFS and UFS (local storage) and LN (remote storage)]
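To convey why the port is nontrivial, here is a deliberately simplified, hypothetical driver table in the spirit of ROMIO's ADIO interface; the names and signatures are invented for illustration and do not match ROMIO's actual internal structures. Each entry would have to be implemented by translating file operations into Logistical Networking calls rather than POSIX I/O.

    #include <stddef.h>

    /* Hypothetical, simplified function table in the spirit of an ADIO driver. */
    typedef struct ln_adio_ops {
        int (*open)(const char *uri, int access_mode, void **handle);
        int (*write_contig)(void *handle, const void *buf,
                            long long offset, long long nbytes);
        int (*read_contig)(void *handle, void *buf,
                           long long offset, long long nbytes);
        int (*close)(void *handle);
    } ln_adio_ops;

    /* The MPI-IO layer dispatches through whichever table matches the file,
       so adding LN support means filling in such a table with functions that
       move byte ranges to and from network depots instead of local files. */
    static int write_through_driver(const ln_adio_ops *ops, void *handle,
                                    const void *buf, long long off, long long len)
    {
        if (ops == NULL || ops->write_contig == NULL)
            return -1;   /* driver does not support the operation */
        return ops->write_contig(handle, buf, off, len);
    }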


8

Fruitful Collaborations

• Key to our successes in this SciDAC have been strong collaborations with other participants in the Center
  • Northwestern University
    • PnetCDF, PVFS2
    • Jianwei, Avery, Kenin, Alok, Wei-keng
  • ORNL
    • Nagiza’s group
    • MPI-IO and PnetCDF for visualization (parallel VTK)
  • LBNL
    • Ekow (MPI-IO on SRM)

• Ongoing collaboration with Univ. of Tennessee for MPI-IO/LN


9

Component Coupling via Standard Interfaces

• We believe that well-defined standard APIs are the right way to couple different components of the software stack

• Having the right API at each level is crucial for performance

[Diagram: I/O software stack, with the Application on top of HDF5 and PnetCDF, which sit on MPI-IO, which sits on PVFS, Lustre, or GPFS]


Topics for Future Work


11

Guiding Theme

• How can we cater better to the needs of SciDAC applications?


12

Make Use of Extended Attributes on Files

• PVFS2 now allows users to store extended attributes along with files

• Also available in local Linux file systems, so a standard is emerging

• This feature has many applications (see the example below):
  • Store metadata for high-level libraries as extended attributes instead of directly in the file
    • Avoids the problem of unaligned file accesses
  • Store MPI-IO hints for persistence
  • Store provenance information

[Diagram: a file carrying an extended attribute with Name = “Mesh Size” and Value = “1K x 1K”]
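A minimal sketch of what storing such metadata could look like through the Linux extended attribute calls; the file name is hypothetical, the attribute name and value just echo the figure, and PVFS2 provides its own interface for the same idea.

    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/xattr.h>

    int main(void)
    {
        const char *path = "mesh.dat";        /* illustrative file name */
        const char *value = "1K x 1K";
        char buf[64];
        ssize_t len;

        /* Attach the attribute; "user." is the Linux user namespace prefix. */
        if (setxattr(path, "user.mesh_size", value, strlen(value), 0) != 0) {
            perror("setxattr");
            return 1;
        }

        /* Read it back without touching the file's data. */
        len = getxattr(path, "user.mesh_size", buf, sizeof(buf) - 1);
        if (len < 0) {
            perror("getxattr");
            return 1;
        }
        buf[len] = '\0';
        printf("mesh_size = %s\n", buf);
        return 0;
    }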


13

Next Generation High-Level Library

• HDF and netCDF were written 15-20 years ago as serial libraries

• Explore the possibility of designing a new high-level library that is explicitly built as a parallel library for modern times
  • What features are needed?
  • Can we exploit extended attributes?
  • Can the data span multiple files instead of one file, with a directory as the object?
  • What is the portable file format?
  • New, more efficient implementation techniques


14

Implement Using a Combination of Database and Parallel I/O

• Use a real database to store metadata and a parallel file system to store actual data (see the sketch below)
• Flexible and high performance
• Powerful search and retrieval capability
• Prototype implemented in 1999-2000 (published in SC2000 and JPDC)
  • Jaechun No, Rajeev Thakur, Alok Choudhary
• Needs more work; collaboration with application scientists
• Serializability/portability of data is a challenge
• What is the right API for this?

[Diagram: the Application calls an SDM layer, which stores metadata in a database (Berkeley DB, Oracle, DB2) and moves data through MPI-IO to a parallel file system]
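A rough sketch of the idea under simple assumptions (invented file names, a single key/value pair, Berkeley DB as the database; the actual prototype's design differed): the bulk array goes through MPI-IO to the parallel file system, while one process records descriptive metadata in the database.

    #include <string.h>
    #include <mpi.h>
    #include <db.h>      /* Berkeley DB */

    int main(int argc, char **argv)
    {
        int rank, i;
        double field[1024];
        MPI_File fh;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (i = 0; i < 1024; i++) field[i] = rank;

        /* Bulk data goes to the parallel file system through MPI-IO. */
        MPI_File_open(MPI_COMM_WORLD, "run42.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        MPI_File_write_at_all(fh, (MPI_Offset) rank * sizeof(field),
                              field, 1024, MPI_DOUBLE, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);

        /* Descriptive metadata goes to the database; rank 0 records it here. */
        if (rank == 0) {
            DB *dbp;
            DBT key, val;
            db_create(&dbp, NULL, 0);
            dbp->open(dbp, NULL, "metadata.db", NULL, DB_BTREE, DB_CREATE, 0664);

            memset(&key, 0, sizeof(key));
            memset(&val, 0, sizeof(val));
            key.data = (void *) "run42.dat:mesh_size";
            key.size = strlen("run42.dat:mesh_size") + 1;
            val.data = (void *) "1K x 1K";
            val.size = strlen("1K x 1K") + 1;
            dbp->put(dbp, NULL, &key, &val, 0);
            dbp->close(dbp, 0);
        }

        MPI_Finalize();
        return 0;
    }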


15

Parallel File System Improvements

• Autonomic
  • Self-tuning, self-maintaining, self-healing
• Fault tolerant
  • Tolerate server failures
• Scalability
  • Ten- to hundred-thousand clients
• Active storage
  • Run operations on the server, such as data reduction, filtering, and transformation


16

End-to-End Data and Performance Management

• Applications run and write data at one site (say NERSC)
• Scientists need to access the data at their home location, which is geographically distant
• Need high performance for, and management of, this whole process
• We intend to focus on ensuring that our “local access” tools (PVFS, MPI-IO, PnetCDF) integrate well with other tools that access data over the wide area (SRM, Logistical Networking, GridFTP)


17

Summary

• Despite progress on various fronts, managing scientific data continues to be a challenge for application scientists

• We plan to continue to tackle the important problems by
  • focusing on our strengths in the areas of parallel file systems and parallel I/O
  • collaborating with other groups doing complementary work in other areas to ensure that our tools integrate well with theirs