Rapid Raster Projection Transformation and Web Service Using High-performance Computing Technology
2009 AAG Annual Meeting
Las Vegas, NV
March 25th, 2009
Qingfeng (Gene) Guan
Michael P. Finn
E. Lynn Usery
David M. Mattli
Center of Excellence for Geospatial Information Science
U.S. Geological Survey
Rolla, MO
Contents
• Motivations
• Parallelizing raster projection transformation
  – Static load-balancing
  – Dynamic load-balancing
• pRPL – parallel Raster Processing programming Library
• Conclusions
Motivations
• Massive data volumes
  – High resolutions
  – Large areas
  – Easily 500 MB+
• Re-sampling method
  – Multiple projection transformations for each output pixel (center and corners)
• Example:
  – Global land-cover at 30-arc-second resolution, 21,600 × 43,200 pixels, 900 MB
  – Geographic → Mollweide
  – Laptop, Intel Pentium M 1.5 GHz, 1.25 GB RAM
  – 45 minutes 10 seconds!!
Motivations
• Problem: high computational intensity vs. demand for rapid projection (web) service
• Solution: high-performance computing technologies
  – Parallel computing
Parallel approach for raster data
• Raster is born to be parallelized
  – A raster dataset is essentially a matrix of values, each of which represents the attribute of the corresponding cell of the field
  – A matrix can be easily partitioned into sub-matrices and assigned to multiple processors so that the sub-matrices can be processed simultaneously
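The partitioning idea above can be sketched in a few lines of Python. This is only an illustration, not code from the presentation: `partition_rows` and `process_block` are hypothetical names, the blocks are simple contiguous row ranges, and the per-cell operation (doubling each value) is a placeholder for any cell-wise computation.

```python
def partition_rows(n_rows, n_parts):
    """Split row indices [0, n_rows) into n_parts contiguous, near-equal blocks."""
    base, extra = divmod(n_rows, n_parts)
    blocks = []
    start = 0
    for i in range(n_parts):
        # The first `extra` blocks get one additional row.
        size = base + (1 if i < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks


def process_block(raster, block):
    """Hypothetical per-cell operation: double every value in the block."""
    first, last = block
    return [[2 * v for v in row] for row in raster[first:last]]


# A small 10 x 4 raster stored as a list of rows.
raster = [[r * 4 + c for c in range(4)] for r in range(10)]
blocks = partition_rows(len(raster), 3)

# Each block could be handed to a different processor; here they run in turn
# and the outputs are concatenated in block order.
result = [row for b in blocks for row in process_block(raster, b)]
```

Because the blocks do not overlap and the operation looks at one cell at a time, the blocks can be processed in any order, or at the same time, without changing the result.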
Parallelizing Projection Transformation
• Output image is decomposed
• Minimum Bounding Rectangles (MBRs) of sub-input-images are computed using the MBRs of sub-output-images
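One possible way to derive an input MBR from an output MBR is sketched below. It is an assumption-laden illustration, not the authors' method: `input_mbr` and `inverse_sinusoidal` are hypothetical helpers, the sinusoidal projection on a unit sphere stands in for Mollweide because its inverse is a one-liner, and only points sampled along the block boundary are inverse-projected, so a real implementation would likely pad the resulting rectangle.

```python
import math


def inverse_sinusoidal(x, y):
    """Inverse sinusoidal projection on a unit sphere (radians).

    Returns (lon, lat), or None if (x, y) lies outside the projection's domain.
    """
    lat = y
    if abs(lat) > math.pi / 2:
        return None
    c = math.cos(lat)
    if c < 1e-12:
        return None
    lon = x / c
    if abs(lon) > math.pi:
        return None
    return (lon, lat)


def input_mbr(out_mbr, inverse, samples_per_edge=16):
    """Estimate the input-space MBR covering an output-space rectangle.

    out_mbr: (xmin, ymin, xmax, ymax) in output (projected) coordinates.
    inverse: callable (x, y) -> (u, v) in input coordinates, or None.
    """
    xmin, ymin, xmax, ymax = out_mbr
    pts = []
    for i in range(samples_per_edge + 1):
        t = i / samples_per_edge
        x = xmin + t * (xmax - xmin)
        y = ymin + t * (ymax - ymin)
        # One sample on each of the four edges of the rectangle.
        pts += [(x, ymin), (x, ymax), (xmin, y), (xmax, y)]
    mapped = [p for p in (inverse(x, y) for x, y in pts) if p is not None]
    if not mapped:
        return None  # the whole boundary falls outside the valid domain
    us = [u for u, _ in mapped]
    vs = [v for _, v in mapped]
    return (min(us), min(vs), max(us), max(vs))


mbr = input_mbr((0.0, 0.0, 1.0, 1.0), inverse_sinusoidal)
```

Each processor then needs only the part of the input raster inside its own input MBR, rather than the full input image.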
Parallelizing Projection Transformation
• Dynamic Load-balancing
  – Reduced granularity
  – Master reads sub-input-images and distributes them in response to requests
  – Workers/Slaves request new tasks (MBRs of sub-output-images and corresponding sub-input-images)
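To see why handing out small tasks on request can help, the toy simulation below compares the two strategies. It is a sketch under stated assumptions, not a measurement from the presentation: task costs and worker speeds are made-up numbers, communication and I/O costs are ignored, and `static_makespan` and `dynamic_makespan` are hypothetical names.

```python
import heapq


def static_makespan(task_costs, speeds):
    """Static: tasks are pre-assigned in contiguous, near-equal-count chunks."""
    base, extra = divmod(len(task_costs), len(speeds))
    finish = []
    start = 0
    for w, speed in enumerate(speeds):
        size = base + (1 if w < extra else 0)
        finish.append(sum(task_costs[start:start + size]) / speed)
        start += size
    return max(finish)


def dynamic_makespan(task_costs, speeds):
    """Dynamic: the worker that becomes idle first takes the next task."""
    idle = [(0.0, w) for w in range(len(speeds))]  # (time free, worker id)
    heapq.heapify(idle)
    end = 0.0
    for cost in task_costs:
        t, w = heapq.heappop(idle)
        t += cost / speeds[w]
        end = max(end, t)
        heapq.heappush(idle, (t, w))
    return end


tasks = [1.0] * 12          # twelve equal-cost tasks (assumed)
speeds = [1.0, 1.0, 2.0]    # one worker is twice as fast (assumed)

static_time = static_makespan(tasks, speeds)
dynamic_time = dynamic_makespan(tasks, speeds)
```

In this made-up setup the static split finishes in 4.0 time units, because the two slower workers hold everyone up, while the dynamic scheme finishes in 3.0, since the faster worker simply takes more tasks.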
pRPL: parallel Raster Processing Library
• An open-source general-purpose parallel Raster Processing programming Library
• Encapsulates complex parallel computing utilities and routines specifically for raster processing
  – Enables the implementation of parallel raster-processing algorithms without requiring a deep understanding of parallel computing and programming
• Possible usage
  – Massive-volume geographic raster processing
  – Image (including remote sensing imagery) processing
  – Cellular Automata (CA) and Agent-based Modeling (ABM)
• Freely downloadable and open source
  – http://sourceforge.net/projects/prpl/
pRPL (cont.)
• Object-Oriented programming style
  – Written in C++
  – Built upon the Message Passing Interface (MPI)
• Provides Transparent Parallelism
• Supports almost all types of raster-based processing
  – Local-scope
  – Neighborhood-scope
  – Regional-scope
  – Global-scope
pRPL 2.0 – under development
• Supports dynamic load-balancing for data parallelism
• Master-worker formation
• Master
  – Reads data dynamically
  – Distributes the initial subsets to the workers
  – Maintains the task farm, which contains the remaining subsets of data
  – Sends the subsets to the workers in response to requests
  – Receives completed output subsets from workers
• Workers/Slaves
  – Initially assigned some subsets of data
  – Request more data when they finish the assigned subsets
  – Receive new input subsets to process from the master
  – Submit completed output subsets to the master
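The master-worker exchange described above can be mimicked with threads and queues. This is a minimal sketch and not pRPL's API: pRPL is written in C++ on top of MPI, whereas here Python threads stand in for processes, `queue.Queue` stands in for message passing, the names `master` and `worker` are hypothetical, and squaring each value is a placeholder for the real raster operation. Returning a result doubles as the worker's request for more work.

```python
import queue
import threading


def worker(wid, inbox, to_master):
    """Process subsets until the master sends None (no work left)."""
    while True:
        task = inbox.get()
        if task is None:
            break
        idx, subset = task
        output = [v * v for v in subset]      # placeholder per-cell operation
        to_master.put((wid, idx, output))     # submit result and ask for more


def master(subsets, n_workers):
    """Hand out subsets on demand and collect outputs in input order."""
    to_master = queue.Queue()
    inboxes = [queue.Queue() for _ in range(n_workers)]
    threads = [threading.Thread(target=worker, args=(w, inboxes[w], to_master))
               for w in range(n_workers)]
    for t in threads:
        t.start()

    farm = list(enumerate(subsets))  # task farm: remaining (index, subset) pairs
    farm.reverse()                   # pop() then yields tasks in original order
    outstanding = 0
    for w in range(n_workers):       # initial distribution: one subset each
        if farm:
            inboxes[w].put(farm.pop())
            outstanding += 1

    results = [None] * len(subsets)
    while outstanding:
        wid, idx, output = to_master.get()    # receive a completed subset
        results[idx] = output
        outstanding -= 1
        if farm:                              # answer the implicit request
            inboxes[wid].put(farm.pop())
            outstanding += 1

    for box in inboxes:                       # tell every worker to stop
        box.put(None)
    for t in threads:
        t.join()
    return results


subsets = [[1, 2], [3], [4, 5, 6], [7]]
results = master(subsets, 2)
```

Tagging each subset with its index lets the master place outputs correctly no matter which worker finishes first.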
Conclusions
• Massive-volume raster projection transformation needs high-performance computing technology
• Dynamic load-balancing technique improves performance
  – Reduces the I/O overhead and the requirement for memory space
  – Improves the utilization rate (efficiency) of a heterogeneous parallel computing system
• pRPL reduces the development complexity of parallel raster-based processing