Standards, Platforms, and IT Ram D. Sriram National Institute of Standards and Technology, U.S.A [email protected] Slides Courtesy: Mary Brady, Alden Dima, Eric Lin, E. (Sub) Subrahmanian

Source: ncgia.buffalo.edu/OntologyConference/PPT/Sriram.pdf · 2016-07-05




Session 7 Questions

• Q1. How can we evolve unified formats for microstructure representation?

• Q2. How can the flow of information through the chain of simulations be managed both effectively and accurately?

• Q3. What are the foundational principles for platformization of approaches for the integrated realization of engineered materials and products?

• Q4. How does one build an enhanced materials innovation infrastructure to enable academic and industrial researchers to effectively exploit the capabilities of advanced computers and networks?

• Q5. What information models can be flexible in the development of ontologies and inter-operability of disparate systems? Specifically for applications in evolution of design and manufacturing information and design processes?

• Q6. Computational Tools: What is the best approach to provide scalable and maintainable computational infrastructures for ICME?

• Q7. How can user-friendly computational platforms be developed to manage, store and retrieve information which is useful for various types of users –engineers, designers and researchers?

• Q8. Security and privacy issues: How can security and privacy related barriers to collaboration be reduced?


MGI @ NIST


• Communities are diverse, and efforts are organic rather than top-down

• Data, metadata, and toolsets may be unique to each community

• Need new methods for rapid definition, discovery and automatic integration


Scope: Goals of the Initiative

Goal 1: NIST establishes essential materials data and model exchange protocols

Goal 2: NIST establishes the means to ensure the quality of materials data and models

Goal 3: NIST establishes new methods, metrologies and capabilities necessary for accelerated materials development

• Digital Data

• Collaborative Networks

• Computational Tools


MATERIALS DATA AND INFORMATICS

[Diagram. Box labels: Material Performance Criteria; Designed New Material; Processing Modeling Tools; Microstructure Prediction Tools; Materials Properties Prediction Tools; Data Informatics & Tools; First Principles; CALPHAD; Atomistic Simulations (Tools: e.g. CMS); DATA (Experimental and Calculated); Reference Data]


Data Tools and Informatics for Materials Data


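MODELING TOOLS FOR ADV COMPOSITES

[Figure: multiscale modeling chain. Length axis runs from Å and nm up to m; time axis runs from ns through ms to s. Levels shown: Molecular Modeling (Atomistic, Coarse-Grained), Particle Properties, Suspension Rheology, Macroscopic Properties. Annotation: Rh / Rg = 1.661]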


NISTIR 7898


Topics Covered

• Data Representation and Challenges
– Types of data and most efficient representation
– Metadata
– Standards

• Data Management
– Organization and maintenance of digital data repositories
– Methods for access and use that respect intellectual property rights
– Long-term sustainability

• Data Quality
– Uncertainty properly quantified both from experiments and simulations

• Data Usability
– Ease of use


Two Primary Breakout Groups

• Data Challenges at Different Length Scales
– Atomic, Nano, Micro, and Macro: what kind of data infrastructure is needed

• Specific Domains
– Electrochemical Storage
– High Temperature Alloys
– Catalysis
– Lightweight Structural Materials


Materials Genomics: From CALPHAD to Flight, Olson and Kuehmann, Scripta Materialia 70 (2014)



REPRESENTATION: ROLE OF ONTOLOGIES


• Advanced materials often consist of several components (n > 5) and multiple phases.

• The material properties are dependent on the microstructure.

• The microstructure changes as a function of processing and service conditions.

• The properties of the multiphase microstructures can be predicted from the properties of the individual phases and their interactions in the microstructure.

Examples of phase based properties:

• Molar Volume and Thermal Expansion

• Specific heat

• Heat capacity

• Enthalpy

• Entropy

• Intrinsic diffusivity

• Electrical conductivity and resistivity

• Thermal conductivity

• Bulk modulus

• Magnetic properties (e.g. Curie Temperature)

• Optical properties

Example: γ/γ′ Superalloys (Ni or Co based)

[Micrograph: γ/γ′ structure in a Co-Al-W alloy at 900 °C for 1000 h]

PHASE-BASED MATERIALS DATA
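The statement that multiphase properties can be estimated from the properties of the individual phases can be illustrated with the simplest textbook estimate, the Voigt and Reuss rule-of-mixtures bounds. This is only a sketch: it ignores the phase interactions the slide mentions, it is not the method used by the presenters, and the phase fractions and property values below are hypothetical.

```python
from typing import Sequence, Tuple


def voigt_reuss_bounds(fractions: Sequence[float],
                       values: Sequence[float]) -> Tuple[float, float]:
    """Return (upper, lower) rule-of-mixtures bounds for a phase-based property.

    fractions: volume fraction of each phase (must sum to 1)
    values:    property value of each phase (all positive, same units)
    """
    if len(fractions) != len(values):
        raise ValueError("one property value is required per phase")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("phase fractions must sum to 1")
    # Voigt (arithmetic mean): phases assumed to act in parallel
    upper = sum(f * v for f, v in zip(fractions, values))
    # Reuss (harmonic mean): phases assumed to act in series
    lower = 1.0 / sum(f / v for f, v in zip(fractions, values))
    return upper, lower


# Hypothetical two-phase alloy: 60 % matrix, 40 % precipitate,
# with made-up property values of 200 and 250 (arbitrary units).
upper, lower = voigt_reuss_bounds([0.6, 0.4], [200.0, 250.0])
print(f"effective property between {lower:.1f} and {upper:.1f}")
```

The true effective value depends on morphology and on interactions between phases, which is why microstructure data are needed in addition to single-phase data.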


• Melting Temperatures
• Critical Temperatures (Phase Changes)
• Lattice Parameters
• Tracer Diffusivities
• Phase Fractions and Compositions
• Heats of Formation
• Activation Energies
• Composition Profiles
• Enthalpies of Mixing
• Heat Capacities
• Crystal Structures
• Micrographs/Morphologies
• 3-D Atom Probe Tomography

For each assessment:
– Evaluated data file (e.g., POP, DOP)
– Functional descriptions for phase quantity (e.g., TDB)

Emphasis on binary and ternary data to predict multicomponent properties. Data can be experimental or computational.

[Plot: Composition Profiles. Mass Fraction (0 to 0.12) versus Distance (mm, -1000 to 1000) for Cr, Co, W, Ta, Al, Ti, Mo, Re, Hf, Nb]


Tool Chain

[Diagram. Data sources: Lab Equipment, Computation, Text files, Spreadsheets, Diagrams, Articles, Databases. Steps: Extract CALPHAD Data; Transform (XSLT); Reasoners. Representations: Ontologies, XML Schemas, Low-Level Domain Language. Targets: Tool Formats; Other Formats (MatML, ThermoML); RDBMS; NoSQL Databases (MongoDB, Neo4J). Inset: Section of XML Schema]

Materials Ontology currently being developed. Note this is a work in progress.


What Is An Ontology

• An ontology is an explicit description of a domain:
– concepts
– properties and attributes of concepts
– constraints on properties and attributes
– individuals (often, but not always)

• An ontology defines
– a common vocabulary
– a shared understanding
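The four ingredients listed above (concepts, properties, constraints, individuals) can be sketched as a small data structure. This is a minimal illustration, not an ontology language such as OWL; the class names, the `violations` helper, and the example phase with its lattice parameter value are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Concept:
    """A concept with typed properties; the type acts as a simple constraint."""
    name: str
    parent: Optional["Concept"] = None
    properties: Dict[str, type] = field(default_factory=dict)

    def all_properties(self) -> Dict[str, type]:
        # Properties are inherited from more general concepts.
        inherited = self.parent.all_properties() if self.parent else {}
        return {**inherited, **self.properties}


@dataclass
class Individual:
    """A concrete instance of a concept with property values."""
    name: str
    concept: Concept
    values: Dict[str, Any] = field(default_factory=dict)


def is_a(concept: Concept, ancestor: Concept) -> bool:
    """True if `concept` is `ancestor` or one of its specializations."""
    current: Optional[Concept] = concept
    while current is not None:
        if current is ancestor:
            return True
        current = current.parent
    return False


def violations(ind: Individual) -> List[str]:
    """List the constraint violations of an individual (empty if valid)."""
    allowed = ind.concept.all_properties()
    problems = []
    for key, value in ind.values.items():
        if key not in allowed:
            problems.append(f"{key}: not defined for {ind.concept.name}")
        elif not isinstance(value, allowed[key]):
            problems.append(f"{key}: expected {allowed[key].__name__}")
    return problems


# Hypothetical vocabulary and sample values, for illustration only.
phase = Concept("Phase", properties={"crystal_structure": str})
fcc_phase = Concept("FccPhase", parent=phase,
                    properties={"lattice_parameter_nm": float})
sample = Individual("sample_1", fcc_phase,
                    {"crystal_structure": "fcc", "lattice_parameter_nm": 0.36})
print(is_a(fcc_phase, phase), violations(sample))
```

Because both tools and people read the same concept definitions, the structure doubles as the common vocabulary the slide refers to.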


Example: A biological ontology is:

• A machine interpretable representation of some aspect of biological reality
– what kinds of things exist?
– what are the relationships between these things?

[Diagram: the terms eye, ommatidium, sense organ, and eye disc, linked by the relations is_a, part_of, and develops from]


The Foundational Model of Anatomy


Engineering Ontology

[Diagram: an Upper Ontology layered over a Domain Ontology. Upper Ontology terms: Thing, Individual, Collection, Spatial Thing, Temporal Thing, Event. Domain Ontology terms: Mechanical Device, Pump, Hydraulic Pump, Fuel Pump, Aircraft Engine Driven Pump, Engine, Jet Engine, Fuel Filter, Hydraulic System, Fuel System, Pumping. Two link types are shown: Generalization and Other Relationships (has-part, part-of, connected-to, done-by, supplies-fuel-to)]


Other Ontologies

Process Specification Language (PSL)

Product Specification Representation Language (PSRL)

Gene Ontology (GO)

DAML ontologies

Cyc

Standard Upper Ontology

Ontolingua

TOVE

Enterprise Ontology

STEP


Ontology Spectrum

[Diagram: spectrum running from weak semantics to strong semantics]

• Taxonomy: Is Sub-Classification of
• Thesaurus: Has Narrower Meaning Than
• Conceptual Model: Is Subclass of
• Logical Theory: Is Disjoint Subclass of with transitivity property

Formalisms placed along the spectrum: Relational Model, XML; ER; Extended ER; DB Schemas, XML Schema; RDF/S; XTM; UML; Description Logic, DAML+OIL, OWL; First Order Logic; Modal Logic

Interoperability levels along the spectrum: Syntactic Interoperability, Structural Interoperability, Semantic Interoperability

Courtesy: Leo Obrst, MITRE


Ontology Spectrum: Application

More Expressive Semantic Models Enable More Complex Applications

[Diagram: expressivity increases from weak (term-based) to strong (concept, i.e. referent category, based)]

• Taxonomy: Categorization, Simple Search & Navigation, Simple Indexing
• Thesaurus: Synonyms, Enhanced Search (Improved Recall) & Navigation, Cross Indexing
• Conceptual Model: Enterprise Modeling (system, service, data), Question-Answering (Improved Precision), Querying, SW Services
• Logical Theory (Ontology): Real World Domain Modeling, Semantic Search (using concepts, properties, relations, rules), Machine Interpretability (M2M, M2H semantic interoperability), Automated Reasoning, SW Services

Courtesy: Leo Obrst, MITRE


Kinds of Ontologies: Alternate View

[Diagram: kinds of ontologies arranged along a spectrum from informal to formal]

Items along the spectrum: Terms; 'ordinary' Glossaries; Data Dictionaries (EDI); ad hoc Hierarchies (Yahoo!); Thesauri; structured Glossaries; XML DTDs; Principled, informal hierarchies; DB Schema; XML Schema; Data Models (UML, STEP); formal Taxonomies; Frames (OKBC); Description Logics (DAML+OIL); General Logic

Groupings:
• Glossaries & Data Dictionaries
• Thesauri, Taxonomies
• MetaData, XML Schemas, & Data Models
• Formal Ontologies & Inference


Benefits from an Ontological Approach

• Semantic Unification
– The unification of lexically different representations that have the same semantics
– Example: fcc phase in steels can be referred to as fcc, austenite or γ.

• Ontology-based Data Integration
– Using ontologies to unify data that share some common semantics but originate from unrelated sources
– Example: Are property data from two experiments consistent enough to be combined?
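Semantic unification of the fcc / austenite / γ example can be sketched as a synonym table that maps each label to one canonical term before records are grouped. This is a minimal illustration under stated assumptions: the table, the spelled-out "gamma" entry, the record layout, and the source names are hypothetical, and the mapping holds only in the steels context the slide names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# Labels that denote the fcc phase in steels (per the slide).
# "gamma" is an added assumption for files that spell out the Greek letter.
PHASE_SYNONYMS: Dict[str, str] = {
    "fcc": "fcc",
    "austenite": "fcc",
    "γ": "fcc",
    "gamma": "fcc",
}


def canonical_phase(label: str) -> Optional[str]:
    """Map a free-text phase label to its canonical term, or None if unknown."""
    return PHASE_SYNONYMS.get(label.strip().lower())


def unify(records: Iterable[dict]) -> Tuple[Dict[str, List[dict]], List[dict]]:
    """Group records by canonical phase; return (groups, unrecognized records)."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    unknown: List[dict] = []
    for record in records:
        phase = canonical_phase(record["phase"])
        if phase is None:
            unknown.append(record)
        else:
            groups[phase].append(record)
    return dict(groups), unknown


# Hypothetical records from three unrelated sources plus one unmapped label.
records = [
    {"phase": "Austenite", "source": "lab A"},
    {"phase": "fcc", "source": "database B"},
    {"phase": " γ ", "source": "paper C"},
    {"phase": "bcc", "source": "lab A"},
]
groups, unknown = unify(records)
print(len(groups["fcc"]), len(unknown))
```

A lookup table only handles the lexical part. Deciding whether two measurements are consistent enough to combine also needs metadata such as composition, temperature, and uncertainty.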


Core CALPHAD Tabular Data (HDF5)

Measurement Units (UnitsML)


Sources

• Prototype MGI Ontology

• ThermoML

• MatML

• MatSeek

• UnitsML

• ChemML

Tools

• UML (Unified Modeling Language)

• Semantic Web (RDF, OWL)

Note: This is a generalized model depicting overall structure


Broad concepts covered in materials data files (data have many types)

– Objects, Materials, and Events

– Physical Properties

– Documents

– Data Objects & Types

– People & Organizations

– Software

– Relations among these

An ontology provides a shared vocabulary and taxonomy that model a domain by defining its objects and/or concepts, their properties, and the relations among them.


Slight Digression: Methodology for Image to Diagnosis Through Disease Ontology

Input: Image

• Image segmentation
• Feature Vectors Computation
• Mapping calculated feature vectors into disease ontology
• Highlighting regions with deviation from normal conditions (example regions labeled: problem, normal, normal)

Output: Suggestion of the potential diagnosis (Suggested diagnosis: Ulcer...)


Fragment of Disease Ontology in Protégé-OntoViz


UML Representation of Inflammatory Bowel Disease

Idiopathic inflammatory bowel disease
+endoscopic feature 1 : <unspecified> = ulcers
+location : string(idl) = distal small intestine / proximal colon / distal colon / rectum

Crohn's disease
-endoscopic feature 2 : <unspecified> = strictures
-endoscopic feature 3 : <unspecified> = cobblestoning

Ulcerative colitis
-endoscopic feature 2 : <unspecified> = pseudopolyps
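The UML fragment can be mirrored as a small class hierarchy. The sketch below assumes that Crohn's disease and ulcerative colitis are specializations of idiopathic inflammatory bowel disease, which the flattened diagram suggests but does not state. The `candidates` helper is hypothetical and only illustrates how observed features could be matched against the hierarchy; it is not a diagnostic procedure.

```python
from dataclasses import dataclass, fields
from typing import List, Set


@dataclass
class IdiopathicInflammatoryBowelDisease:
    endoscopic_feature_1: str = "ulcers"
    location: str = ("distal small intestine / proximal colon / "
                     "distal colon / rectum")


@dataclass
class CrohnsDisease(IdiopathicInflammatoryBowelDisease):
    endoscopic_feature_2: str = "strictures"
    endoscopic_feature_3: str = "cobblestoning"


@dataclass
class UlcerativeColitis(IdiopathicInflammatoryBowelDisease):
    endoscopic_feature_2: str = "pseudopolyps"


def endoscopic_features(disease: IdiopathicInflammatoryBowelDisease) -> Set[str]:
    """Collect the endoscopic feature values, including inherited ones."""
    return {
        getattr(disease, f.name)
        for f in fields(disease)
        if f.name.startswith("endoscopic_feature")
    }


def candidates(observed: Set[str]) -> List[str]:
    """Names of subclasses whose feature sets contain every observed feature."""
    return [
        cls.__name__
        for cls in (CrohnsDisease, UlcerativeColitis)
        if observed <= endoscopic_features(cls())
    ]


print(candidates({"ulcers", "strictures"}))
```

Features declared on the parent class are shared by both subclasses, so an observation of ulcers alone does not discriminate between them.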


Improving the Process…NLP and Text Mining


NIST Materials Science Informatics

NIST Materials Science Domain
• Advanced Materials Group
• CALPHAD Assessments
• Standard Reference Data & Materials

Heterogeneous data sources & types
• Tool-specific code, inputs, outputs
• Database files
• Research literature
• Related data & calculations
• Assessment-specific artifacts

Applications
• Semantic search
• Automatic summarization
• Semantic visualization
• Knowledge Representation
• Knowledge Discovery
• Thematic Clustering

Tools
• Semantic Medline
• Carrot2
• And more…

Research Corpora
• Self-Diffusion Research Corpus: 760+ PDFs
• CALPHAD Research Corpus: 200+ PDFs
• … and more …


Semantic Medline (SM)

[Diagram: Corpus (PDFs) → Text (raw) → Text (clean) → Text (sentences) → SM-DB, with steps labeled convert, denoise, process. Semantic Medline accesses SM-DB and provides Semantics, Knowledge Representation, Semantic Search, Auto Summarization, Visualization, and Search Results]

Natural Language Processing (NLP):
• Stochastic Part-of-speech Tagging
• Word Sense Disambiguation
• Semantic Predication

Semantics:
• Categories
• Relations
• Groups

Knowledge Representation
• Meta Thesaurus
• Semantic Network

Semantic Search
• Thematic
• Iterative

Quick Facts:
• NIH semantic engine for PubMed
• Retargeting from biomedical domain to materials science
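The denoise and sentence-splitting steps of the text pipeline can be sketched with regular expressions. This is an assumption-laden illustration, not the Semantic Medline implementation: the slide does not say what its denoise step does, so the rules below (rejoining words hyphenated across line breaks, collapsing whitespace) and the function names are hypothetical, and the sample text is made up.

```python
import re
from typing import List


def denoise(raw: str) -> str:
    """Clean text extracted from a PDF (hypothetical rules)."""
    # Rejoin words that were hyphenated across a line break: "sta-\nble" -> "stable"
    text = re.sub(r"-\n(?=\w)", "", raw)
    # Collapse remaining line breaks and repeated spaces into single spaces
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def split_sentences(clean: str) -> List[str]:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", clean) if s]


# Made-up raw text standing in for the output of a PDF-to-text conversion.
raw = ("The fcc phase is sta-\nble at high\ntemperature.  "
       "Diffusion data were\nassessed.")
sentences = split_sentences(denoise(raw))
print(sentences)
```

A splitter this simple breaks on abbreviations such as "e.g." and on decimal points followed by spaces, which is one reason production pipelines use trained sentence segmenters.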


Informatics Task (discovery): Investigate Aluminum Alloy

Traditional Approach: Google Search (aluminum alloy)
• PageRank criteria
• Spurious search results
• Out-of-context browsing

Our Approach: Semantic Medline Search (aluminum alloy)
• Domain-specific representation
• Graph & knowledge-based browsing
• In-context browsing


Informatics Task (research): Research Ag Alloys for CALPHAD Assessment

Traditional Approach: Web of Science Search (Ag alloy, CALPHAD assessment)
• Term-only criteria
• Relies on pre-processed, structured corpora

Our Approach: Carrot2 Search (Ag alloy, CALPHAD assessment)
• Domain-specific representation
• Tunable clustering
• Can process less structured, noisy corpora
• Thematic visualization & browsing


CALPHAD Corpus Results (125 files)

Property | Statistic | Log Statistic
Total Files | 125 | 2.096910013
Total Sentences | 57,414 | 4.759017805
Min Sentence Length | 1 | 0
Max Sentence Length | 7,565 | 3.878808932
Avg Sentence Length | 98 | 1.989147886
Total Concepts | 195,053 | 5.290152634
Total Unique Concepts | 5,732 | 3.758306182
Total Words | 195,053 | 5.290152634
Total Unique Words | 10,907 | 4.037705313
Total Relations | 1,245 | 3.095169351

[Charts: "CALPHAD Corpus Statistics" (Statistic, axis 0 to 250,000) and "CALPHAD Corpus Statistics (Log Scale)" (Log Statistic, axis 0 to 6)]
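The log-scale column is the base-10 logarithm of each statistic, which can be checked with a few lines. The dictionary below copies the values reported on the slide; the variable names are illustrative.

```python
import math

# Corpus statistics as reported on the slide.
stats = {
    "Total Files": 125,
    "Total Sentences": 57414,
    "Min Sentence Length": 1,
    "Max Sentence Length": 7565,
    "Avg Sentence Length": 98,
    "Total Concepts": 195053,
    "Total Unique Concepts": 5732,
    "Total Words": 195053,
    "Total Unique Words": 10907,
    "Total Relations": 1245,
}

# Base-10 logarithm of every statistic, as plotted on the log-scale chart.
log_stats = {name: math.log10(value) for name, value in stats.items()}

for name, value in log_stats.items():
    print(f"{name:<24}{value:.9f}")
```

The tabulated log value for Avg Sentence Length (1.989147886) corresponds to roughly 97.5 rather than 98, so it was presumably computed from the unrounded average.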


Improving the Process…Semantic Tools


C. Campbell and A. Dima, MGI and Big Data


Reference Data Workflow

[Workflow diagram. Stages and labels: Generate (Computation, Experiment); Literature Search; Property Databases; Relevant Data (A, B, C); Analysis & Selection; Curation; Curated Data (A′, B′, C′; Metadata, Data); Reference Data]


Reference Data Workflow + Informatics

[Workflow diagram. Reference data workflow stages: Generate (Computation, Experiment); Literature Search; Property Databases; Relevant Data (A, B, C); Analysis & Selection; Curation; Curated Data (A′, B′, C′; Metadata, Data); Reference Data. Added informatics components: Workflow System, Reasoning, Data Repository, Semi-Automated Curation System, Semantic Search]


BUILDING MATERIALS INNOVATION INFRASTRUCTURE:

COLLABORATIVE NETWORKS


HubZero

Open source software platform

Create websites supporting
• Scientific research
• Education

Web Application Workspaces
• In-browser Linux desktop
• Can share screens
• Upload/download files
• Use development tools
• Launch jobs on the grid

[Chart: Growth in NanoHUB users]

“Facebook for scientists -- but built to facilitate serious research rather than socializing”


Some Existing Hubs


iRODS

• Integrated Rule-Oriented Data System

• “Intelligent Cloud” for sharing data

• Features include:
– High-performance network data transfer
– A unified view of disparate data
– Support for a wide range of physical storage
– Easy backup and replication
– Manages metadata
– Controlled access
– Policies, Rules and Micro-services
– Workflows
– Management of large collections

https://www.irods.org


C. Campbell and A. Dima (NIST), MGI and Big Data


Explosion of Database Systems


Popular Open Source NoSQL Databases

• MongoDB
– Written in: C++
– Main point: Retains some friendly properties of SQL (query, index)
– Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB.

• Riak
– Written in: Erlang & C, some JavaScript
– Main point: Fault tolerance
– Best used: If you need very good single-site scalability, availability and fault-tolerance

• HBase
– Written in: Java
– Main point: Billions of rows × millions of columns
– Best used: If you already use the Hadoop/HDFS stack, or anywhere scanning huge, two-dimensional join-less tables is a requirement

• Redis
– Written in: C/C++
– Main point: Blazing fast
– Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory)

• CouchDB
– Written in: Erlang
– Main point: DB consistency, ease of use
– Best used: For accumulating, occasionally changing data on which pre-defined queries are to be run, and in places where versioning is important

• Neo4j
– Written in: Java
– Main point: Graph database, connected data
– Best used: For graph-style, rich or complex, interconnected data

http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis


PUTTING THINGS TOGETHER


FUTURE MATERIALS INFORMATICS

[Diagram labels: XML Schema; JSON; Various Database Platforms; Data Tools: Statistics, Machine Learning]
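Since the diagram pairs XML Schema with JSON and multiple database platforms, a short sketch of moving one record between the two serializations may help. It uses only the Python standard library; the record layout, field names, and the `to_xml` and `from_xml` helpers are hypothetical and are not part of MatML, ThermoML, or any NIST schema. The alloy and heat-treatment values echo the Co-Al-W example given earlier in the deck.

```python
import json
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical flat record; field names are illustrative only.
record = {"alloy": "Co-Al-W", "temperature_C": 900, "hold_time_h": 1000}


def to_xml(rec: Dict[str, object], root_tag: str = "record") -> str:
    """Serialize a flat record as XML, one child element per field."""
    root = ET.Element(root_tag)
    for key, value in rec.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(xml_text: str) -> Dict[str, str]:
    """Parse the XML produced by to_xml; all values come back as strings."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


as_json = json.dumps(record)      # suits document stores such as MongoDB
as_xml = to_xml(record)           # could be validated against an XML Schema
print(as_json)
print(as_xml)
print(from_xml(as_xml))
```

JSON keeps the numeric types on a round trip, while this plain XML form returns strings. Restoring types on the XML side is one of the jobs a schema takes on.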


Toward a Multilevel Representation

• Form (Micro, Meso, Macro)

• Function

• Behavior (Qualitative, Quantitative)

• Meta Information (Rationale, etc.)

• …


A Note About Standards

• Standards in general

– Enhance compatibility

– Facilitate interoperability

– Reduce technology risk

– Spur creativity and innovation (“The chromatic scale [of music], and its formal notation system, spawned two centuries of the most prolific and original composition” --Buckingham and Coffman)

Source: Information Rules, Shapiro and Varian


Imagine!

• It took about 30,000 people to build the Taj Mahal

• It took about 100,000 people to build the Great Pyramid

• About 300,000-400,000 people were involved in putting a man on the moon

• Now, imagine what the combined intelligence of millions of people on the Internet can achieve!


Session 7 Questions

• Q1. How can we evolve unified formats for microstructure representation?

• Q2. How can the flow of information through the chain of simulations be managed both effectively and accurately?

• Q3. What are the foundational principles for platformization of approaches for the integrated realization of engineered materials and products?

• Q4. How does one build an enhanced materials innovation infrastructure to enable academic and industrial researchers to effectively exploit the capabilities of advanced computers and networks?

• Q5. What information models can be flexible in the development of ontologies and inter-operability of disparate systems? Specifically for applications in evolution of design and manufacturing information and design processes?

• Q6. Computational Tools: What is the best approach to provide scalable and maintainable computational infrastructures for ICME?

• Q7. How can user-friendly computational platforms be developed to manage, store and retrieve information which is useful for various types of users –engineers, designers and researchers?

• Q8. Security and privacy issues: How can security and privacy related barriers to collaboration be reduced?
