34
Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve Tuecke Vas Vasiliadis Raj Kettimuthu

Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview

Steve Tuecke Vas Vasiliadis Raj Kettimuthu

Page 2: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Presentations and other useful information available at

globus.org/events/sc14/tutorial

goo.gl/s2Ew50

2

Page 3: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Thank you to our sponsors!

U . S . D E PA RT M E N T O F

ENERGY

3

Page 4: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Agenda

•  Research data management scenarios and challenges (15 min)

•  Introduction to Globus (10 min) •  Demonstrations and Exercises

– Accessing Globus and Transferring Files (10 min) – File sharing and Group Management (20 min) – Data Publication and Discovery (20 min) – Globus Command Line Interface (10 min)

•  Globus: today and tomorrow (5 min) 4

Page 5: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Research data management scenarios and challenge

5

Page 6: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

“I need to easily, quickly, & reliably move or mirror portions of my data to other places.”

Research Computing HPC Cluster

Lab Server

Campus Home Filesystem

Desktop Workstation

Personal Laptop

XSEDE Resource Public Cloud

6

Page 7: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

“I need to get data from a scientific instrument to my analysis server.”

Next Gen Sequencer

Light Sheet Microscope

MRI Advanced Light Source

7

Page 8: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

“I need to easily and securely share my data with my colleagues at other institutions.”

8

Page 9: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

“I need a good place to store / backup / archive my (big) research data, at a reasonable price.”

Public Cloud Archive Mass Store Campus Store 9

Page 10: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

“I need to publish my data so that others can find it and use it.”

Scholarly Publication

Reference Dataset

Active Research Collaboration

10

Page 11: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Globus introduction and demonstration

11

Page 12: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Globus is…

Research data management…

…delivered via SaaS

12

Page 13: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Globus delivers…

Big data transfer, sharing, publication, and discovery…

…directly from your own storage systems

13

Page 14: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

It’s about the user experience…

…for your photos

…for your e-mail

…for your entertainment

…for your research data

14

Page 15: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Globus is SaaS

•  Web, command line, and REST interfaces •  Reduced IT operational costs •  New features automatically available •  Consolidated support & troubleshooting •  Easy to add your laptop, server, cluster,

supercomputer, etc. with Globus Connect

15

Page 16: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Reliable, secure, high-performance file transfer and replication

•  “Fire-and-forget” transfers

•  Automatic fault recovery

•  Seamless security integration

•  Powerful GUI and APIs

Data Source

Data Destination

User initiates transfer request

1

Globus moves and replicates files

2

Globus notifies user

3

16

Page 17: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Simple, secure sharing off existing storage systems

Data Source

User A selects file(s) to share, selects user or group, and sets permissions

1

Globus tracks shared files; no need to move files to cloud storage!

2

User B logs in to Globus and

accesses shared file

3

•  Easily share large data with any user or group

•  No cloud storage required

17

Page 18: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Curated publication of data, with relevant metadata for discovery

•  Identify •  Describe •  Curate •  Verify •  Access •  Preserve

Researcher assembles data set; describes it using metadata (Dublin core and domain-specific)

1Peers, public

search and discover data sets; transfer using Globus

3

Published Data Store

Curator reviews and approves; data set published on campus or other storage

2Metadata

18

Page 19: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Demonstration: Accessing Globus

File Transfer

19

Page 20: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

1.  Go to: www.globus.org/signup 2.  Create your Globus account 3.  Validate e-mail address 4.  Optional: Login with your campus/

InCommon identity 5.  Install Globus Connect Personal 6.  Move file(s) from esnet#lbl-diskpt1 to your

laptop

Exercise 1: Sign up & transfer files

20

Page 21: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Demonstration: File Sharing

Group Management

21

Page 22: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Exercise 2: File sharing and Group Management

1.  Join the “SC14 Tutorial Users” –  Go to “Groups” and search for “SC14”

2.  Create a shared endpoint on your laptop 3.  Grant your neighbor permissions on your

shared endpoint 4.  Access your neighbor’s shared endpoint 5.  Optional: Create group, and grant share

access

22

Page 23: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Demonstration: Data Publication

Discovery (sneak preview)

23

Page 24: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Argonne Storage System

Univ. of Chicago Argonne IIT UIUC

Typical publication workflow

3. Assemble Dataset (Transfer Data)

Argonne Curator

2. Describe Submission

Scientist

Shared Endpoint

4. Curate Dataset

1. Publish Data 6. Download

5. Search

24

Page 25: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Exercise 3: Publish a data set

1.  Create a new data set 2.  Enter some metadata 3.  Assemble data set from the endpoint

sc14#tutorial_data (or another endpoint) 4.  Complete the workflow and submit 5.  Curators (a.k.a. presenters) will “review”

your submission and publish 6.  Browse the published data set

25

Page 26: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Demonstration: Globus Command Line

Interface (CLI)

26

Page 27: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Exercise 4: Globus CLI

1.  Optional: Generate SSH key 2.  See globus.org/account/ManageIdentities 3.  Add your SSH key to your Globus identity 4.  SSH to cli.globusonline.org!5.  Check on status of earlier transfer(s) 6.  Optional: Transfer a file using the scp

command

27

Page 28: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Globus: today and tomorrow

28

Page 29: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

~70PB moved to date 29

Page 30: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Our challenge:

Sustainability

We are a non-profit, delivering a production-grade service to the non-profit research community

30

Page 31: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Globus Subscriptions •  Provider

–  Managed endpoints –  Host shared endpoints –  Management console –  Data publication* –  Amazon S3 endpoints –  Usage reports –  Priority support –  Mass Storage System optimization –  Transfer integration support

•  Branded Web Site

•  Alternate Identity Provider (InCommon is standard)

globus.org/provider-plans 31

Page 32: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Some early Globus supporters

32

Page 33: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

Globus Platform-as-a-Service

Identity, Group, and Profile Management

Globus Toolkit

Glo

bus

API

s

Glo

bus

Con

nect

Data Publication & Discovery

File Sharing

File Transfer & Replication

33

Page 34: Delivering Research Data Management Services with Globus ... · Delivering Research Data Management Services with Globus and a Science DMZ: Introduction and Service Overview Steve

End: Introduction and Service Overview

34