1 P2P Workshop for Automatic Synchronization and Distribution of Biological Databases and Software...


1

P2P Workshop for Automatic Synchronization and Distribution of Biological Databases and Software over Low-Bandwidth Networks

By Unitsa Sangket, Amornrat Phongdara, Wilaiwan Chotigeat, Darran Nathan, Woo-Yeon Kim, Tin Wee Tan, Jong Bhak, Chumpol Ngamphiw, Sissades Tongsima, Asif M. Khan and Honghuang Lin

2/17

Outline
- Background
- Objectives
- Methods
- Results
- Setup of the P2P node

3/17

Background: Bioinformatics and the need for network bandwidth
- Bioinformatics involves the collection, organization and analysis of large amounts of biological data.
- Databases are normally updated by file transfer over FTP.
- Network bandwidth within developing countries is still very low.
- The low reliability of connections means that breaks and aborts in downloads are common.

4/17

Background: The revolution in file-sharing technology
Peer-to-Peer (P2P) file exchange
- 1st generation: Napster, where files are exchanged among clients through a server
- 2nd generation: FastTrack/Kazaa, which continued to evolve and improve, with no server
- 3rd generation: Azureus, an advance over previous P2P protocols that uses BitTorrent; a large file is broken up into smaller fragments and reassembled
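The fragment-and-reassemble idea on the slide above can be sketched in a few lines of Python. This is a minimal illustration, not the Azureus implementation: the tiny piece size, the function names and the sample data are assumptions made for the example. BitTorrent does verify each piece against a SHA-1 hash, which is what lets a client accept pieces from any peer and in any order.

```python
import hashlib

PIECE_SIZE = 4  # bytes; deliberately tiny for illustration (real torrents use far larger pieces)


def split_into_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> list:
    """Cut the payload into fixed-size pieces; the last piece may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hashes(pieces: list) -> list:
    """SHA-1 digest of every piece, in piece order."""
    return [hashlib.sha1(p).digest() for p in pieces]


def reassemble(received: dict, hashes: list) -> bytes:
    """Verify each received piece against its hash, then join the pieces in index order."""
    out = []
    for index, expected in enumerate(hashes):
        piece = received[index]
        if hashlib.sha1(piece).digest() != expected:
            raise ValueError(f"piece {index} failed verification")
        out.append(piece)
    return b"".join(out)


original = b"ACGTACGTTTGACCA"  # hypothetical payload standing in for a database file
pieces = split_into_pieces(original)
hashes = piece_hashes(pieces)

# Simulate pieces arriving out of order from different peers.
received = {i: p for i, p in reversed(list(enumerate(pieces)))}
restored = reassemble(received, hashes)
```

Because each piece is checked on its own, a corrupted or interrupted piece only needs to be fetched again by itself, not the whole file.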

5/17

Background: The revolution in file-sharing technology

Fig. Traditional client/server distribution of files compared with P2P distribution of files using the BitTorrent protocol (source: http://no.wikipedia.org/wiki/BitTorrent)

6/17

Background: Problems that 3rd generation P2P technology solves
- Low international bandwidth: peers provide additional bandwidth and speed up the overall download rate
- Unreliable connections: data is downloaded automatically from the best connections
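A rough sketch of how a client might recover from a dropped connection: it requests only the pieces it is still missing, and for each one prefers the fastest peer that holds it. This is a simplified illustration and not the piece-selection strategy used by BitTorrent clients; the peer names, rates and the `plan_requests` helper are hypothetical.

```python
def plan_requests(have: set, total_pieces: int, peers: dict) -> dict:
    """Map each missing piece index to the fastest peer that currently holds it."""
    plan = {}
    for index in range(total_pieces):
        if index in have:
            continue  # already downloaded and verified before the interruption
        candidates = [name for name, info in peers.items() if index in info["pieces"]]
        if candidates:
            plan[index] = max(candidates, key=lambda name: peers[name]["rate_kbps"])
    return plan


# Hypothetical peers: measured rate and the set of piece indices each one offers.
peers = {
    "peer_a": {"rate_kbps": 40, "pieces": {0, 1, 2, 3}},
    "peer_b": {"rate_kbps": 120, "pieces": {2, 3}},
}
have = {0}  # pieces completed before the connection dropped
plan = plan_requests(have, total_pieces=5, peers=peers)
```

In this example piece 4 is absent from the plan because no connected peer offers it yet; the client would simply retry once another peer appears.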

7/17

Background: Applied in three areas, the distribution of
- biological software (file sizes under about 1 GB)
- courseware (file sizes of about 1 GB to 10 GB)
- databases (file sizes of about 10 GB to 100 GB)

8/17

Objectives
- To extend a P2P client application based on a 3rd generation P2P protocol for use in the distribution of biological software, courseware, and databases.
- To set up nodes in countries in the Asia-Pacific region and test their performance.

9/17

Methods
The Azureus P2P suite (http://azureus.sourceforge.net/) was selected because it
- is open source
- runs on Java
- has a well-documented plugin interface

10/17

Methods
An RSSFeed Scanner Plugin was used to trigger automatic synchronization of data at regular intervals.
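The feed-driven synchronization described above can be illustrated with a short Python sketch that scans an RSS feed for .torrent enclosures not yet seen. It is not the RSSFeed Scanner Plugin's own code: the feed content, the `tracker.example.org` URLs and the `new_torrents` helper are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed published by a node whenever a new database release is available.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example database releases</title>
    <item>
      <title>exampledb release 1</title>
      <enclosure url="http://tracker.example.org/exampledb_r1.torrent"
                 type="application/x-bittorrent" length="0"/>
    </item>
    <item>
      <title>exampledb release 2</title>
      <enclosure url="http://tracker.example.org/exampledb_r2.torrent"
                 type="application/x-bittorrent" length="0"/>
    </item>
  </channel>
</rss>"""


def new_torrents(feed_xml: str, seen: set) -> list:
    """Return .torrent URLs in the feed that have not been processed before."""
    root = ET.fromstring(feed_xml)
    urls = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue
        url = enclosure.get("url", "")
        if url.endswith(".torrent") and url not in seen:
            urls.append(url)
    return urls


seen = {"http://tracker.example.org/exampledb_r1.torrent"}  # already synchronized
fresh = new_torrents(SAMPLE_FEED, seen)
seen.update(fresh)  # remember them so the next scan does not repeat the download
```

Run at a regular interval (for example from a timer or a scheduled job), such a scan would hand each new URL to the P2P client, so every node picks up new releases without manual intervention.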

11/17

Methods
Four trial nodes have been set up:
- Prince of Songkla University (PSU, Thailand)
- Korean Bioinformation Center (KOBIC, Korea)
- National University of Singapore (NUS, Singapore)
- National Center for Genetic Engineering and Biotechnology (BIOTEC, Thailand)

Node    Tracker URL
PSU     http://biotracker.psu.ac.th:6969
KOBIC   http://ftp.kobic.re.kr:6969
BIOTEC  http://protcluster.biotec.or.th:6969

12/17

Results

Fig. Total data size downloaded

13/17

Acknowledgements
- the International Development Research Centre (IDRC), Canada
- the Asia-Pacific Bioinformatics Network (APBioNet)
- MOST and KADO of MIC in Korea

14/17

P2PWiki: http://everest.bic.nus.edu.sg/p2p/index.php/Main_Page

15/17

Setup of the P2P node
1. Install Azureus and Java
2. Set up Azureus
   - Set up the Azureus client, to download and upload data
   - Set up the Azureus server, to create the .torrent files and publish them
3. Install and set up the RSSFeed Scanner Plugin, to synchronize data automatically
4. Install and set up the Advanced Statistics Plugin, to test performance
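To give a feel for what step 2 produces, the sketch below builds the contents of a single-file .torrent in Python. In the workshop setup Azureus creates these files itself, so this is only an illustration of the format. The bencoding rules and the `announce`, `info`, `name`, `length`, `piece length` and `pieces` keys follow the BitTorrent metainfo format; the tracker URL (including its `/announce` path), the file name and the payload are assumptions.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, strings/bytes, lists and dicts using BitTorrent's bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_metainfo(name: str, data: bytes, announce: str, piece_length: int = 16384) -> dict:
    """Build a single-file metainfo dictionary with concatenated SHA-1 piece hashes."""
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {
        "announce": announce,
        "info": {
            "name": name,
            "length": len(data),
            "piece length": piece_length,
            "pieces": pieces,
        },
    }


payload = b"A" * 40000  # hypothetical stand-in for a database dump
meta = make_metainfo(
    "exampledb.fasta",                           # hypothetical file name
    payload,
    "http://tracker.example.org:6969/announce",  # hypothetical tracker announce URL
)
torrent_bytes = bencode(meta)  # these bytes would be written to exampledb.fasta.torrent
```

Publishing then amounts to making the resulting .torrent file available to the other nodes while the server keeps seeding the data.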

16/17

1. Install Azureus and Java
   - ftp://192.168.1.2
   - http://azureus.sourceforge.net/download.php

17/17
