A Scalable Distributed Datastore for BioImaging
R. Cai, J. Curnutt, E. Gomez, G. Kaymaz, T. Kleffel, K. Schubert, J. Tafas
{jcurnutt, egomez, keith, jtafas}@r2labs.org
Renaissance Research Labs
Department of Computer Science
California State University
San Bernardino, CA 92407
Supported by NSF ITR #0331697




Background
CSUSB Institute for Applied Supercomputing
• Low Latency Communications
UCSB Center for BioImage Informatics
• Retinal images
• Texture map searches
• Distributed consortium (UCB, CMU)


Retina Images
Laser scanning confocal microscope images of the retina
• Normal (n)
• 3 month detachment (3m)
• 1 day detachment followed by 6 day reattached with increased oxygen (1d+6dO2)
• 3 day detachment (3d)


Environment
[Diagram of the system environment. Labels: UCSB; CSUSB; Raven Cluster; Hammer/Nail Cluster; BISQUE (search, features, analysis; external and internal); image and metadata servers; Local; LAN; WAN; Lustre]


Software
Open source
• OME
• PostgreSQL 7
• Bisque
Distributed datastore
• Clustering
• NFS
• Lustre
Benchmark: OSDB


Hardware - Raven
• 5 year old
• Dual processor
• 1.4 GHz Pentium 3
• 256 MB RAM
• 60 GB SCSI
• Compaq Proliant DL-360 servers
Raven has been latency tuned.


Hardware – Hammer/Nail
[Figure labels: UCSB, CSUSB]
• Hammer headnode
• 5 Nail nodes
• Quad CPUs
• 3.2 GHz Xeon
• 4 GB RAM
• 140 GB SCSI
• Dell servers
• Bandwidth tuned (default)


Outline
• Effect of node configuration
• Comparison of network file systems
• Effects of a wide area network (WAN)


Relative LAN Performance


NFS LAN/WAN Performance


Design Effects?
A few expert users
Metadata searches
• Small results to user
Texture searches
• Heavy calculation on cluster
• Small results to user
Latency tuning


Outline
• Effect of node configuration
• Comparison of network file systems
• Effects of a wide area network (WAN)


NFS / Lustre Performance
NFS
• Well known standard
• Configuration problems with OME
Performance comparison of the Lustre file system
Lustre
• Journaling
• Stripe across multiple computers
• Data redundancy and failover


Relative Performance on LAN
NFS/Lustre
• Compared to local DB
• 1 Gb/s LAN
• Two significant differences


Significant Differences
NFS caching
• Bulk deletes and bulk modifies
Lustre stripes across computers
• Increases the bandwidth
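The bandwidth gain from striping can be illustrated with a minimal sketch of round-robin striping. This is not Lustre's actual implementation; the function names, the 1 MiB stripe size, and the stripe count of 4 are hypothetical choices for illustration. The point is only that one large read is spread across several servers, so their bandwidths add, while a small read still lands on a single server.

```python
MIB = 1024 * 1024  # hypothetical stripe size used in the examples


def stripe_location(offset, stripe_size, stripe_count):
    """Map a file byte offset to (server index, offset inside that server's object).

    Assumes plain round-robin striping: stripe 0 goes to server 0,
    stripe 1 to server 1, and so on, wrapping around after stripe_count.
    """
    stripe_number = offset // stripe_size
    server = stripe_number % stripe_count
    local_offset = (stripe_number // stripe_count) * stripe_size + offset % stripe_size
    return server, local_offset


def servers_touched(offset, length, stripe_size, stripe_count):
    """Return the sorted server indexes that a read of `length` bytes hits."""
    if length <= 0:
        return []
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return sorted({s % stripe_count for s in range(first, last + 1)})


# A 4 MiB read over 4 servers involves every server in parallel,
# while a 100 byte read involves only one.
large_read = servers_touched(0, 4 * MIB, MIB, 4)
small_read = servers_touched(0, 100, MIB, 4)
```

Under these assumptions, striping helps the high bandwidth operations and does little for small, latency bound requests.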


Outline
• Effect of node configuration
• Comparison of network file systems
• Effects of a wide area network (WAN)


Effect on Wide Area Network (WAN)
Compared three connections
• Local
• Switched, high speed LAN (1 Gb/s)
• WAN between UCSB and CSUSB (~50 Mb/s)
NFS only
• UCSB didn’t have Lustre installed
• Active research prevented reinstalling OS


Local/LAN/WAN Performance


Effect on Wide Area Network (WAN)
Most significant effect
• Not bandwidth intensive operations
• Latency intensive operations
• Next generation WAN will not solve the problem
Frequently used data must be kept locally
• Database cluster
• Daily sync of remote databases
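A rough worked example shows why more bandwidth alone does not help a latency intensive workload. The link speeds (1 Gb/s LAN, ~50 Mb/s WAN) come from the slides; the round-trip times, the number of queries, and the result size are hypothetical values chosen only for illustration, and the model (one round trip per query plus payload transfer time) is a simplification.

```python
def transfer_time(round_trips, rtt_s, payload_bits, bandwidth_bps):
    """Estimate total time as latency cost plus bandwidth cost, in seconds."""
    return round_trips * rtt_s + payload_bits / bandwidth_bps


LAN_BPS = 1e9   # 1 Gb/s switched LAN (from the slides)
WAN_BPS = 50e6  # ~50 Mb/s WAN between UCSB and CSUSB (from the slides)

# Hypothetical workload: 1000 small database queries, 1000 bytes returned each.
QUERIES = 1000
PAYLOAD_BITS = QUERIES * 1000 * 8

# Hypothetical round-trip times: 0.2 ms on the LAN, 20 ms on the WAN.
lan = transfer_time(QUERIES, 0.0002, PAYLOAD_BITS, LAN_BPS)
wan = transfer_time(QUERIES, 0.020, PAYLOAD_BITS, WAN_BPS)

# Same WAN latency with ten times the bandwidth: the total barely changes,
# because almost all of the time is spent waiting on round trips.
faster_wan = transfer_time(QUERIES, 0.020, PAYLOAD_BITS, 10 * WAN_BPS)
```

With these assumed numbers the round-trip term dominates the WAN total, which is consistent with keeping frequently used data local rather than waiting for a faster link.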


Conclusions
Scientific researchers
• Latency tune network
• Don’t bandwidth tune
Latency of WAN is too large
• Replicate data and update
Bisque/OME
• NFS issues
• Lustre
High bandwidth operations
• Stripe Lustre across systems


Future directions: Agent based texture search engine
Loosely coupled cluster
• WAN connection
• Unreliable connection
• Fault tolerant
• Parallelize jobs
Open source components
• Scilab
• Convert NSF funded algorithms in Matlab
Simple interface
Superior caching scheme for Lustre
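One way to picture parallel jobs on a loosely coupled cluster with an unreliable connection is a dispatcher that runs jobs concurrently and retries any job whose connection drops. The sketch below is an assumption for illustration, not the planned agent design: `run_with_retries`, `run_all`, the retry count, and the use of `ConnectionError` to signal a dropped link are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_retries(job, arg, max_attempts=3):
    """Call job(arg), retrying when the (hypothetical) remote link drops."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return job(arg)
        except ConnectionError as err:
            last_error = err  # remember the failure and try again
    raise last_error


def run_all(job, args, workers=4, max_attempts=3):
    """Run job over all args in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_with_retries, job, a, max_attempts) for a in args]
        return [f.result() for f in futures]
```

Here `job` stands in for a texture search request sent to a remote node; a job that still fails after `max_attempts` tries raises its last error to the caller.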


Questions…