NIST Big Data Public Working Group Overview of Big Data Reference Architecture Software and Demonstration Dr. Gregor von Laszewski Assistant Director of Community Grids Lab, Adjunct Associate Professor Indiana University NIST Campus Gaithersburg, Maryland June 1, 2017


NIST Big Data Public Working Group

Overview of Big Data Reference Architecture Software and Demonstration

Dr. Gregor von Laszewski
Assistant Director of Community Grids Lab, Adjunct Associate Professor
Indiana University

NIST Campus
Gaithersburg, Maryland
June 1, 2017


Presentation Overview

• Volume Presentation Outline
• Volume 1, Definitions (Nancy Grady, SAIC)
• Volume 2, BD Taxonomies (Nancy Grady, SAIC)
• Volume 3, Use Cases and General Requirements (Geoffrey Fox, Indiana University)
• Volume 6, Reference Architecture (David Boyd, InCadence Corp.)
• Volume 4, Security and Privacy (Arnab Roy, Fujitsu; Mark Underwood, AVP, Strategic Initiatives, Controls and Countermeasures)
• Volume 8, Reference Architecture Interface (Gregor von Laszewski, Indiana University)
• Reference Architecture Software Implementation Environment and Demonstration (Gregor von Laszewski, Indiana University)
• Volume 7, Standards Roadmap (Russell Reinsch, Center for Government Interoperability)
• Volume 9, Adoption and Modernization (Russell Reinsch, Center for Government Interoperability)


NBDIF Volume Overview

• Vol. 1 BD Definitions: Defines common language
• Vol. 3 Use Cases & Vol. 5 Arch Survey: Info gathered; requirements extracted
• Vol. 2 BD Taxonomies: Hierarchy of NBDRA components
• Vol. 4 S&P: Interwoven topics of S&P examined
• Vol. 7 Standards Roadmap: Examine standards wrt NBDRA
• Vol. 6 NBDRA: Developed NBDRA
• Vol. 8 NBDRA Interfaces: Implementation of NBDRA
• Vol. 9 Adoption & Modernization


Volume Presentation Outline

• For each volume
  – Scope of the volume
  – Brief recap of version 1
  – Highlights of version 2 accomplishments
  – Summary of version 2 areas needing contributions
  – Topics that could be considered for version 3


Reference Architecture Software Implementation Environment and Demonstration

• Cloudmesh provides a first reference implementation
  – Features include IaaS, Hadoop, and software stack deployment.
  – It was tested with an application from the Use Case document.
  – The code is hosted on GitHub and is available.
• Focus on the Cloudmesh command shell and REST service, as it is
  – Scriptable
  – Interpretable into other frameworks
  – Accessible through other frameworks via REST.
• Disclaimer: we are moving from our original cm implementation to cms to distinguish the two efforts. The new implementation can use the NIST specification and generates a REST service automatically.


Cloudmesh Architecture

• Abstraction essential to Cloudmesh design
• Abstractions at different levels and interaction points
  – IaaS
  – Container
  – HPC
  – PaaS
• Virtual Cluster
• Integration with Providers
  – IU OpenStack, NSF Chameleon cloud, NSF Comet, AWS, Azure, SLURM/XSEDE, …
• Used by hundreds of users


Cloudmesh Layered Architecture

• Easy extensibility
• Developed with command shell in mind
• Developed with REST in mind
• Horizontal Integration
  – Access
  – Data
  – Compute
• Vertical Integration
  – Security
  – Choreography


Deployment Abstractions

• Possible interaction with different DevOps frameworks
• Leveraging large DevOps community
• Warning: we found that there are many DevOps “templates”, but not all of them are usable. Some
  – lack generality
  – do not work
  – are too complex
  – are not properly documented


Continuous Improvement vs. Continuous Deployment via DevOps

• DevOps is integrated
• Leads to improvement when targeting not only the application but also the deployment environment.


Simple Interface Usecase: Boot a vm on


Simple Interface Usecase: Boot & Provision


Phase 1: Interface Objects


Specification -> Reference implementation

1. Specification 2. Cloudmesh schema 3. Schema 4. REST service

1. Vol. 8 specification
2. Cloudmesh schema generates …
3. … a valid schema from the specification
4. The schema is used to automatically generate a REST service
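
The four steps above can be illustrated with a minimal sketch. It is not the Cloudmesh generator: the `profile` resource, its field names, and the `routes_for` and `validate` helpers are all hypothetical, chosen only to show how one declarative schema can drive both the REST routes and the input validation of a service.

```python
# Hypothetical schema for one resource; the field names are illustrative
# only and are not taken from the Volume 8 specification.
SCHEMA = {
    "profile": {
        "username": {"type": str, "required": True},
        "email": {"type": str, "required": True},
        "uuid": {"type": str, "required": False},
    }
}


def routes_for(schema):
    """Derive a conventional set of REST routes from the resource names."""
    routes = []
    for resource in sorted(schema):
        routes.append(("GET", f"/{resource}"))
        routes.append(("POST", f"/{resource}"))
        routes.append(("GET", f"/{resource}/<id>"))
        routes.append(("DELETE", f"/{resource}/<id>"))
    return routes


def validate(schema, resource, record):
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    fields = schema[resource]
    for name, rule in fields.items():
        if name not in record:
            if rule["required"]:
                problems.append(f"missing required field: {name}")
        elif not isinstance(record[name], rule["type"]):
            problems.append(f"wrong type for field: {name}")
    for name in record:
        if name not in fields:
            problems.append(f"unknown field: {name}")
    return problems
```

Adding a second resource to `SCHEMA` would add its routes and validation rules without any new code, which is the property that makes generating the service from the specification attractive.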


Showcase document

• https://laszewski.github.io/papers/NIST.SP.1500-8-draft.pdf


Account Management


Account Management Example: Extension to Architecture

• Accounting across hybrid services
• Integration of accounting records for individuals (in case the group account does not provide this feature)
• User Management issues
  – Removal of “Dracula Users”: I suck you dry and consume all your hours, as I will willfully ignore your policies (yes, they do exist)
  – Removal of “Uninformed Users”: let them know what an experiment costs upfront, before they start it.
• Provider Management Issues
  – Provide feedback to providers: we found that some providers gave us incomplete information regarding their accounting practice
  – Comparison of cost between providers
• Application Benchmarking
  – If we make it too easy, some will ignore alternatives; expose benchmarking results to the community
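
Two of the points above, telling users what an experiment costs upfront and comparing cost between providers, come down to simple arithmetic. The sketch below assumes a flat charge per node-hour; the provider names and rates are made-up placeholders, not the prices or accounting rules of any real provider.

```python
# Hypothetical charge rates in service units per node-hour.
RATES = {"cloud-a": 1.0, "cloud-b": 0.75, "cloud-c": 1.5}


def estimate_cost(rate, nodes, hours):
    """Cost of running `nodes` nodes for `hours` hours at a flat rate."""
    return rate * nodes * hours


def compare_providers(rates, nodes, hours):
    """Return (provider, cost) pairs sorted from cheapest to most expensive."""
    costs = [(name, estimate_cost(rate, nodes, hours)) for name, rate in rates.items()]
    return sorted(costs, key=lambda pair: pair[1])


def can_afford(balance, rate, nodes, hours):
    """Check the estimate against the remaining allocation before starting."""
    return estimate_cost(rate, nodes, hours) <= balance
```

Calling `can_afford` before a run starts is one way to inform a user ahead of time instead of after the allocation has already been consumed.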


Account Management


Account management

• Register
• Deposit
• Use
• Deactivation
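
The four lifecycle steps above can be sketched as a small state-carrying object. This is an illustration only: the `Account` class, its method names, and its rules (no overdraft, no activity after deactivation) are assumptions and do not come from the Cloudmesh code or the NIST specification.

```python
class Account:
    """Hypothetical account that follows the four lifecycle steps."""

    def __init__(self, owner):
        # Register: create an active account with an empty balance.
        self.owner = owner
        self.balance = 0.0
        self.active = True

    def deposit(self, amount):
        """Deposit: add service units to the account."""
        self._require_active()
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount

    def use(self, amount):
        """Use: charge the account, refusing to overdraw it."""
        self._require_active()
        if amount > self.balance:
            raise ValueError("insufficient balance")
        self.balance -= amount

    def deactivate(self):
        """Deactivation: block any further deposits or usage."""
        self.active = False

    def _require_active(self):
        if not self.active:
            raise RuntimeError("account is deactivated")
```

Refusing to overdraw in `use` is one possible policy; a real service might instead allow a grace amount or notify the group owner.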


Fingerprint Example

Gregor von Laszewski

Badi’ Abduhl-Wahid


Fingerprint Application

• Requires
  – Application knowledge
  – Deployment/DevOps knowledge
• What if the application user could also do the deployment?
  – Use newest software
  – Use newest hardware
  – Benchmark different setups


Use Case Fingerprint: Deployment is complex


Cloudmesh Shell – Make Booting Simple

• cloudmesh.yaml
  $ emacs cloudmesh.yaml
• Prepare defaults
  $ cms default cloud=NAME
  $ cms default image=NAME
  $ cms default flavor=NAME
• Boot
  $ cms vm boot
• Login
  $ cms vm login
• Management …
  $ cms vm delete


Cloudmesh Shell – Manage Hybrid Clouds

• Boot Cloud A
  $ cms aws boot
  $ cms vm boot
• Boot Cloud B
  $ cms default cloud=chameleon
  $ cms vm boot
• Boot Cloud C
  $ cms default cloud=IUCloud
  $ cms vm boot
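
Because the shell is scriptable, the pattern on this slide (set the default cloud, then boot) can be generated for any list of clouds. The sketch below only builds the command lines and does not run them; `boot_commands` is a hypothetical helper, and it uses only the `cms default cloud=NAME` and `cms vm boot` commands shown in the slides.

```python
def boot_commands(clouds):
    """Build the `cms` command lines that boot one VM on each cloud.

    Each command is a list of arguments. Nothing is executed here, so the
    sketch works on a machine without Cloudmesh installed.
    """
    commands = []
    for cloud in clouds:
        commands.append(["cms", "default", f"cloud={cloud}"])
        commands.append(["cms", "vm", "boot"])
    return commands
```

On a machine with Cloudmesh configured, each returned list could be passed to `subprocess.run(command, check=True)` to execute it.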


Cloudmesh Shell – Create a Hadoop Cluster

• Set cloud
  $ cm default cloud=chameleon
• Define cluster
  $ cm cluster define --count=10 --flavor=m1.large
• Define Hadoop cluster
  $ cm hadoop define spark
• Sync definition to db
  $ cm hadoop sync # ~30 sec
• Deploy the cluster
  $ cm hadoop deploy # ~7 min


Cloudmesh Shell – Create a Hadoop Cluster

• Set cloud
  $ cm default cloud=IUCloud
• Define cluster
  $ cm cluster define --count=10 --flavor=m1.large
• Run NIST use case
  $ cm nist fingerprint # ~30 min

Additional resources: https://github.com/cloudmesh/classes/blob/master/docs/source/notebooks/fingerprint_matching.ipynb