Federated Service Oriented Information Management
Ahmet Sayar
2
Introduction
- Aim: Develop a general, Grid-architecture-based approach to managing distributed heterogeneous data, information and knowledge, which are provided by different repositories and producers, in an efficient and robust manner.
- Challenges: representing, transforming, integrating and displaying data, information and knowledge for decision makers in scientific application domains.
- Methodology:
  - Create a "Federated Service Oriented Information Management architecture" for the GIS domain based on OGC (Open Geospatial Consortium) specifications.
  - Determine the requirements for generalizing the architecture to other domains (Chemistry).
3
Motivation
- SOA based on Grid or Web Services.
- We use DIKW to describe the hierarchy of Data-Information-Knowledge-Wisdom that we are attempting to support.
- "Filter Services" are information sources:
  - A service inputs DIKW from other Grids or Services and outputs DIKW, perhaps converting data to information, etc.
  - Web Services are easy to extend and federate, and easy to publish, locate and bind. Predictable input/output interfaces are defined by metadata.
  - A repository or sensor has or gets DIKW from "outside the Grid" and outputs DIKW; these are "just" filters whose output is Grid-compatible DIKW as messages or message streams.
- Information management through the ASIS (Application Specific Information System) framework in science domains.
  - Data and metadata concepts and formats.
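The idea that every service, repository or sensor is "just" a filter that takes DIKW in and gives DIKW out can be sketched as function composition. This is a minimal illustration under stated assumptions, not the architecture's implementation: the `Message` type, the `chain` helper and the two toy filters are hypothetical names introduced here.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, List


@dataclass(frozen=True)
class Message:
    """A DIKW message passed between filters (hypothetical type)."""
    level: str       # "data", "information", "knowledge" or "wisdom"
    payload: object


# A filter is any callable that takes a DIKW message and returns one.
Filter = Callable[[Message], Message]


def chain(filters: List[Filter]) -> Filter:
    """Compose filters left to right into a single filter."""
    return lambda msg: reduce(lambda m, f: f(m), filters, msg)


def sensor_filter(msg: Message) -> Message:
    """Toy repository/sensor filter: drops missing readings."""
    readings = [r for r in msg.payload if r is not None]
    return Message("data", readings)


def summary_filter(msg: Message) -> Message:
    """Toy filter that converts data to information (a mean value)."""
    values = msg.payload
    return Message("information", sum(values) / len(values))


pipeline = chain([sensor_filter, summary_filter])
result = pipeline(Message("data", [2.0, None, 4.0]))
print(result.level, result.payload)
```

Because each filter has the same input and output type, a chain of filters is itself a filter, which is what makes federation by composition possible.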
4
GIS – OGC (Motivation Domain) (1)
- A Geographic Information System (GIS) is a system for creating and managing spatial data and associated attributes.
- OGC (Open Geospatial Consortium)
  - The goal is to make geographic information and services neutral and available across any network, application, or platform.
- Challenges (valid for any science domain)
  - Distributed nature of geospatial data.
  - Proprietary data formats and service methodologies.
  - Lack of interoperable services.
  - Assembling data from distributed sources.
  - Format conversions.
  - Amount of resources needed for geoprocessing.
5
GIS – OGC (Motivation Domain) (2)
- GML: Geography Markup Language
- WFS: Web Feature Server
  - Provides vector data such as rivers, state and city boundaries in GML.
- WCS: Web Coverage Server
  - Provides coverage (raster) data: gridded data, pixel info.
- WMS: Web Map Server
  - Provides data in the form of jpeg, svg, png etc., as defined in its capabilities file.
- WMS': Cascading Web Map Server
  - Provides data in the form of layers in images. It is cascading because it provides other WMS layers as if they were its own.
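As a rough illustration of how a client asks a WMS for a map image, the sketch below builds a GetMap request in the key-value-pair form defined by the WMS 1.1.1 specification. The deck's own services are invoked through SOAP, so this shows the plain HTTP GET variant only; the host `example.org`, the layer names and the `build_getmap_url` helper are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getmap_url(base_url, layers, bbox, width, height,
                     image_format="image/png", srs="EPSG:4326"):
    """Build a WMS 1.1.1 GetMap URL using key-value-pair encoding."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",                             # default style per layer
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),   # minx,miny,maxx,maxy
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical server and layers, for illustration only.
url = build_getmap_url(
    "http://example.org/wms",
    ["rivers", "state_boundaries"],
    (-125.0, 32.0, -114.0, 42.0),
    800,
    600,
)
print(url)

# Decode the query string again to inspect what the server would receive.
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
print(query["LAYERS"], query["FORMAT"])
```

The value passed as `FORMAT` should be one of the formats the server advertises in its capabilities file.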
6
Information Management Architecture in the GIS Domain (Sample Scenario)
- Query: no standard. Filter specification; query on vector data by WFS using SQL.
- Data encodings: GML, images.
- Metadata: structured capability document in XML.
- No event notification; WS-Context for asynchronous runs.
- Registry: WRS (we call it MD).
[Figure: sample scenario diagram. Labels: WFS (vector data), WCS and WMS (raster data), MD, WMS' shown as a Filter Container (Filtering Module, Core Service), Interactive Decision Support with interactive tools; arrows labeled "Data" and "capability"; hosts labeled (Nasa), (CGL), (CGL), (Minnesota); data labels: Data:a; Data:b; Data:b, Data:c; Data:a, Data:b, Data:c.]
7
From Raw Data to Information / Knowledge
- Raw Data -> GML (WFS in Filter - ASFS)
- GML -> Map image (WMS in Filter - ASVS)
- Each filter provides data in a consistent format.
  - Formats should be consistent with the system's data model, GML.
- Any Data -> Common Data Model
  - The data model is XML-based hierarchical data.
  - Portable across languages and operating systems.
[Figure: two filter diagrams, one with core ASFS and one with core ASVS. Each core is surrounded by the labels Capabilities Meta-data Publishing, Data Handling, Service Interfaces, Discovering, Data Modeling and Domain Knowledge, with "AnyData" and "StructuredData" on either side. Source labels: SS, Data base, Raw Data or Any Data.]
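The "any data to common data model" step can be sketched as a small converter that turns raw tabular rows into XML-based hierarchical data. This is only an illustration: the element names below are loosely GML-like but are not real GML (no namespaces or schema), and the feature names and coordinates are made up for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up raw input; in practice this would come from a database or sensor.
RAW = """name,lon,lat
fault_a,-120.5,35.9
fault_b,-122.1,37.7
"""


def raw_to_features(raw_text):
    """Convert CSV rows into a small hierarchical XML document."""
    root = ET.Element("FeatureCollection")
    for row in csv.DictReader(io.StringIO(raw_text)):
        feature = ET.SubElement(root, "Feature")
        ET.SubElement(feature, "name").text = row["name"]
        point = ET.SubElement(feature, "Point")
        ET.SubElement(point, "coordinates").text = f'{row["lon"]},{row["lat"]}'
    return root


doc = raw_to_features(RAW)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Once every source is wrapped this way, downstream filters only need to understand the one XML data model rather than each raw format.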
8
Interactive Decision Support Tools
- Interactive query
- Interactive display, movie and animation
- Integration with application science simulations
http://virtualsky.org (R. Williams et al.)
9
Application Use Domains
- ServoGrid Projects (GIS)
  - Pattern Informatics (PI)
  - GeoFest
  - Virtual California (VC)
- Los Alamos National Labs (LANL)
  - IEISS (The Interdependent Energy Infrastructure Simulation System)
  - Models infrastructure networks (e.g. electric power systems and natural gas pipelines) and simulates their physical behavior and the interdependencies between systems.
- Chemistry and Astronomy (Future)
  - CML (Chemical Markup Language) representation of molecules.
  - VOTable (Virtual Observatory Table format)
10
Problem Recognition
[Figure: many databases (DB) and sources labeled SS deliver heterogeneous outputs to interactive tools: vector data, bitmap data, netCDF, HDF5, bar graphs, coverage data, JPEG images, XML data, statistics data, plot images and binary data. A side scale reads Raw Data, Data, Information, Knowledge, Wisdom / Decisions.]
11
Problem Recognition (cont.)
- Services like discovery and notification do not need to be made application specific.
- BUT if the domain changes, then the choices, database requirements, data format, core service requirements, attributes, and metadata context all CHANGE!
- What are the common concepts and characteristics for data, metadata, query language, services, and communication language that are needed to derive information/knowledge from heterogeneous data/information sources in any application domain?
12
Generalization of Service Oriented Information Management Architecture
GIS has some specifications based on standards such as OGC and ISO/TC 211, but many other domains do not.

GIS              ASIS (Science Domain)   Role
GML              ASL                     Representing
WFS              ASFS                    Storing (resource)
WMS              ASVS                    Displaying
Capa.xml         Metadata                Integrating
SOAP over HTTP   SOAP over HTTP          Communication protocol
13
Generalization - Overall Structure Solution
- ASL: Application Specific Language. XML-based hierarchical data representation format. Cross-language, cross-platform and cross-operating-system.
- ASVS: Application Specific Visualization System. Last filter before the decision maker. Provides information/knowledge in human-readable formats.
- ASFS: Application Specific Feature Service. Stores and provides the common data model (ASL).
- Treat binary and common data (in ASL) differently.
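One way to picture the split between ASFS and ASVS is as two small abstract interfaces. The sketch below is an assumption-laden illustration rather than the deck's design: the method names mirror the routine names used elsewhere in the deck (GetCapability, GetData, GetVis), while the classes `InMemoryASFS` and `TextASVS` and the XML strings are hypothetical.

```python
from abc import ABC, abstractmethod


class ASFS(ABC):
    """Application Specific Feature Service: stores and serves ASL."""

    @abstractmethod
    def get_capability(self) -> str: ...

    @abstractmethod
    def get_data(self, name: str) -> str: ...


class ASVS(ABC):
    """Application Specific Visualization System: human-readable output."""

    @abstractmethod
    def get_capability(self) -> str: ...

    @abstractmethod
    def get_vis(self, name: str) -> str: ...


class InMemoryASFS(ASFS):
    """Hypothetical ASFS backed by a dictionary of ASL documents."""

    def __init__(self, datasets):
        self._datasets = dict(datasets)

    def get_capability(self) -> str:
        names = "".join(f"<Data>{n}</Data>" for n in sorted(self._datasets))
        return f"<Capabilities>{names}</Capabilities>"

    def get_data(self, name: str) -> str:
        return self._datasets[name]


class TextASVS(ASVS):
    """Hypothetical ASVS: last filter before the decision maker."""

    def __init__(self, source: ASFS):
        self._source = source

    def get_capability(self) -> str:
        # Re-publish what the upstream feature service offers.
        return self._source.get_capability()

    def get_vis(self, name: str) -> str:
        return f"[{name}] {self._source.get_data(name)}"


asfs = InMemoryASFS({"A": "<asl>fault line</asl>"})
asvs = TextASVS(asfs)
print(asvs.get_capability())
print(asvs.get_vis("A"))
```

A real ASVS would return images or other rendered formats; plain text is used here only to keep the example self-contained.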
[Figure: overall structure. Labels: AS "Sensor", AS Repository, ASFS, AS Tool (generic), AS Service (user defined), AS Tool (generic), ASVS Display; links labeled "Message using ASL".]
14
ASFS and ASVS in SOA
Interfaces, querying, metadata and data model

ASFS                                      ASVS
Routine         Return type               Routine              Return type
GetCapability   Capability file (XML)     GetCapability        Capability file (XML)
DescribeData    XML schema                GetVis               Images (svg, png, ...)
GetData         ASL                       GetDataInformation   HTML, text, XML

Each routine is published in the WSDL. It is invoked with a request that follows a predefined request schema, and the request is placed in the SOAP body.

<request>
  <GetCapability/>
</request>

<SOAP:Envelope>
  <SOAP:Body>
    <request>
      <GetCapability/>
    </request>
  </SOAP:Body>
</SOAP:Envelope>
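Wrapping a routine request in a SOAP body can be sketched with the Python standard library. This is an illustration under assumptions: the namespace URI is the SOAP 1.1 envelope namespace, the `wrap_in_soap` helper is a hypothetical name, and a real request would carry the parameters defined by the service's request schema.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
ET.register_namespace("SOAP", SOAP_NS)


def wrap_in_soap(routine: str) -> str:
    """Place a <request> naming one routine inside a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, "request")
    ET.SubElement(request, routine)   # e.g. GetCapability, GetData, GetVis
    return ET.tostring(envelope, encoding="unicode")


message = wrap_in_soap("GetCapability")
print(message)

# Parse the message back to confirm it is well-formed.
parsed = ET.fromstring(message)
print(parsed.tag)
```

Building the envelope with an XML library rather than string concatenation guarantees that every opening tag has its matching close.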
15
Sample Capabilities File (highly simplified) – GIS Domain
<?xml version='1.0' encoding="UTF-8" standalone="no" ?>
<!DOCTYPE WMT_MS_Capabilities SYSTEM "http://toro.ucs.indiana.edu:8086/xml/capabilities.dtd">
<Capabilities version="1.1.1" updateSequence="0">
  <Service>
    <Name>CGL_Mapping</Name>
    <Title>CGL_Mapping WMS</Title>
    <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple"
        xlink:href="http://toro.ucs.indiana.edu:8086/WMSServices.wsdl" />
    <ContactInformation>
      …..
    </ContactInformation>
  </Service>
  <Capability>
    <Request>
      <GetCapabilities>
        <Format>WMS_XML</Format>
        <DCPType><HTTP><Get>
          <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple"
              xlink:href="http://toro.ucs.indiana.edu:8086/WMSServices.wsdl" />
        </Get></HTTP></DCPType>
      </GetCapabilities>
      <GetMap>
        <Format>image/GIF</Format>
        <Format>image/PNG</Format>
        <DCPType><HTTP><Get>
          <OnlineResource xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple"
              xlink:href="http://toro.ucs.indiana.edu:8086/WMSServices.wsdl" />
        </Get></HTTP></DCPType>
      </GetMap>
    </Request>
    <Layer>
      <Name>California:Faults</Name>
      <Title>California:Faults</Title>
      <SRS>EPSG:4326</SRS>
      <LatLonBoundingBox minx="-180" miny="-82" maxx="180" maxy="82" />
    </Layer>
  </Capability>
</Capabilities>
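A client typically reads a capabilities document to learn which layers and image formats it may request. The sketch below parses a trimmed copy of the sample above with the Python standard library; the trimmed document and the `summarize` helper are assumptions made for the example, and a production client would also handle nested layers and the DTD.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample capabilities file (illustration only).
CAPABILITIES = """<Capabilities version="1.1.1">
  <Service><Name>CGL_Mapping</Name></Service>
  <Capability>
    <Request>
      <GetMap>
        <Format>image/GIF</Format>
        <Format>image/PNG</Format>
      </GetMap>
    </Request>
    <Layer>
      <Name>California:Faults</Name>
      <SRS>EPSG:4326</SRS>
      <LatLonBoundingBox minx="-180" miny="-82" maxx="180" maxy="82"/>
    </Layer>
  </Capability>
</Capabilities>"""


def summarize(capabilities_xml):
    """Extract service name, GetMap formats and layer bounding boxes."""
    root = ET.fromstring(capabilities_xml)
    formats = [f.text for f in root.findall("Capability/Request/GetMap/Format")]
    layers = {}
    for layer in root.findall("Capability/Layer"):
        box = layer.find("LatLonBoundingBox").attrib
        layers[layer.findtext("Name")] = tuple(
            float(box[key]) for key in ("minx", "miny", "maxx", "maxy")
        )
    return {
        "service": root.findtext("Service/Name"),
        "formats": formats,
        "layers": layers,
    }


summary = summarize(CAPABILITIES)
print(summary)
```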
16
Sample Scenario for ASIS
- Static linking of filters. Capabilities are aggregated in a cycle through the "GetCapabilities" interfaces of the filters.
- Each filter publishes its data through its capability file.
- Client tools send a GetCapability request at startup. Later requests are created based on the returned aggregated capabilities.
- The client needs to visualize Data A and E, and makes a GetVis request to the ASVS with specific attributes for querying. GetVis is defined in a schema file.
- Successive requests are then made without involving the user. These request chains are created based on the capabilities that the filters published earlier.
[Figure: interactive tools in front of a network of three ASVS filters and three ASFS filters. Every filter has the same structure: a core (ASVS or ASFS) surrounded by Capabilities Meta-data Publishing, Data Handling, Service Interfaces, Discovering, Data Modeling and Domain Knowledge, with "AnyData" and "StructuredData" on either side. Data labels: Data A; Data B,C; Data D; Data E; Data F. Capability labels along the links: A; B,C; A,B,C; D; E; F; E,F; A,B,C,D,E,F. Request labels: GetVis(A,E), GetData(A), GetData(A), GetVis(E).]
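The scenario above, where capabilities are aggregated upward and a single GetVis request fans out into a chain of requests, can be sketched as follows. This is a simplified illustration and not the exact wiring of the slide's figure: the classes `FeatureFilter` and `VisFilter`, the two-level topology and the string payloads are all hypothetical.

```python
class FeatureFilter:
    """Toy ASFS: serves named data sets."""

    def __init__(self, datasets):
        self.datasets = dict(datasets)

    def get_capability(self):
        return set(self.datasets)

    def get_data(self, name):
        return self.datasets[name]


class VisFilter:
    """Toy ASVS: aggregates upstream capabilities and routes requests."""

    def __init__(self, upstream):
        self.upstream = list(upstream)
        self.log = []   # successive requests issued by this filter

    def get_capability(self):
        # Aggregate what every upstream filter publishes.
        caps = set()
        for f in self.upstream:
            caps |= f.get_capability()
        return caps

    def get_vis(self, names):
        return {name: self._render(name) for name in names}

    def _render(self, name):
        # Pick the first upstream filter that advertises the data set.
        for f in self.upstream:
            if name in f.get_capability():
                if isinstance(f, VisFilter):
                    self.log.append(f"GetVis({name})")
                    return f.get_vis([name])[name]
                self.log.append(f"GetData({name})")
                return f"<image of {f.get_data(name)}>"
        raise KeyError(name)


asfs_a = FeatureFilter({"A": "asl-A"})
asfs_ef = FeatureFilter({"E": "asl-E", "F": "asl-F"})
inner = VisFilter([asfs_ef])
front = VisFilter([asfs_a, inner])

print(sorted(front.get_capability()))
images = front.get_vis(["A", "E"])
print(images)
print(front.log, inner.log)
```

The client only talks to the front filter; the routing decisions are driven entirely by the capability sets, so no user involvement is needed after the first request.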
17
Overall Structure Solution (cont.)
- Common data (ASL) is kept in ASFS with query capability.
- In a given domain every filter speaks ASL.
- Filters (ASVS, ASFS) keep their metadata locally.
- ASVS both visualize information and provide a way of navigating ASFS and their underlying DB.
- ASVS can itself be federated and present an output interface.
- Dynamic metadata update via MD services or P2P metadata exchange.
- Utilizing data/information at the application level via filters:
  - ASFS provide ASL.
  - ASVS provide human-readable information such as text, graphs (scalable vector (svg) or portable (png)) and images.
- Filters have common ports and interfaces:
  - Enable chaining for more complex data and information creation.
  - Filters are easily published, located and invoked over the Internet.
18
Applicability to Different Science Domains

How well do the service definitions in the proposed architecture match general science domains?

Domain      ASL             ASFS (filter)   ASVS (filter)              Metadata
GIS         GML             WFS             WMS                        capability.xml schema
Astronomy   VOTable, FITS   SkyNode         VOPlot, TopCat             VOResource
Chemistry   CML             NO              NO standard (JChemPaint)   NO
19
Research Issues (1)
- Requirements for the domain metadata in the capability file: what do capabilities do, and what do they need to contain, to federate filters?
- Requirements for the ASL (such as CML, GML): what does the ASL need to have to federate the filters?
- Concept of data (such as feature, coverage): is a common representation possible? To what extent?
- A common information management framework that can be applied to any domain: a set of instructions stating, for any field, what needs to be done.
20
Research Issues (2)
- Application-level data/information federation.
  - Integrating the system with application science simulations.
- Creating interactive decision support tools utilizing integrated filter services.
  - Tools for map animation, map movies, images.
  - Interactive query support to get further information on the image and/or animation.
- Enabling binding of services into pipelines, with or without human intervention, through metadata.
- Caching and load balancing to handle large scientific data in an efficient and robust manner (application based).
21
Related Work: SRB (Storage Resource Broker)
- SRB
  - Uniform access to distributed heterogeneous data resources by attributes.
  - Catalog service is MCAT (Metadata Catalog Service).
  - Resource and data location transparency.
  - Remote authentication and authorization; user groups.
  - Not just for access, but also for transferring and replicating.
  - Sample projects using SRB: BIRN and IVOA.
- Summary
  - Other important related efforts are digital library projects and the NGAS (Next Generation Archive System) from ESO.
  - We will research these important activities further, identify key architecture ideas and incorporate lessons.
  - SRB can be leveraged in ASIS.
22
Related Work (cont.): OGSA-DAI
- OGSA-DAI
  - Open Grid Services Architecture, Data Access and Integration.
  - Access to heterogeneous data via common interfaces on the grid.
  - Catalog service is MCS (Metadata Catalog Service).
  - OGSI-compliant Grid.
  - Components are Grid services. Resources should be registered.
  - Sample projects using OGSA-DAI: LEAD, MyGrid.
- Summary
  - OGSA-DAI emphasizes the database layer, whereas we are tackling application-specific DIKW.
  - OGSA-DAI can be leveraged in ASIS.
23
Contributions
- Instructions on how to build the ASL and the capability metadata for the application sciences.
- Instructions on how to build an application specific information system (ASIS) that federates multiple filters speaking ASL.
- Formalization of the information grid (ASIS) through capabilities metadata, defining all data/information sources as interacting Web Service filters with standard metadata service ports.
- Optimizing and enhancing distributed heterogeneous information management.
25
APPENDIX
26
Literature Survey
OGSA-DAI
SRB
27
Discussions on SRB & OGSA-DAI
- SRB
  - Monolithic: does too much.
  - MCAT dependent.
  - MCAT has limited support for application-level metadata.
    - Different domains need different metadata, and applications need extensions.
  - Not standard based; not open source.
  - Does not handle data based on the DIKW hierarchy.
- OGSA-DAI
  - Works at the data and database level.
  - MCS dependent.
  - MCS has limited support for application-level metadata.
    - Different domains need different metadata, and applications need extensions.
  - For Grid applications: GGF standards.
  - Data only in relational and XML databases or ordinary files.
  - Does not handle data based on the DIKW hierarchy.
28
Our Work Compared to SRB & OGSA-DAI (1)
- Each filter has its own metadata.
  - Distributed metadata handling: peer to peer, or through MD services.
  - They provide heterogeneous data access and federation through central metadata services: SRB MCAT and OGSA-DAI MCS.
- Our main motivation is sharing and interpreting data and information, and extracting knowledge from them.
  - Their motivation is storing, accessing and updating heterogeneous data.
- We leverage their power and usability in our federated service oriented information management architecture.
  - They are not competitors but complements.
29
Our Work Compared to SRB & OGSA-DAI (2)
[Figure: two layered diagrams side by side. One shows resources (R R R) behind MasterSRB / Ogsa-GDSF, with SRB Agents / Ogsa GDS, MCAT and GDSReg. The other shows resources (R R R) behind ASFS filters feeding ASVS filters and Interactive Tools. Layer labels: Data access and query; Information/knowledge; Wisdom decisions.]

Wisdom decisions, knowledge and information extraction by the user
- Reusable components: Filter Services with specific ports and interfaces
- Distributed DIKW abstraction
- Metadata in capability document
- Metadata aggregators
- New metadata for different domains
- Smart data querying
- Web Services based SOA (advantages)

Wisdom decisions, ready to use information and knowledge
- Central data access abstraction. Uniform access to heterogeneous data sources
- Metadata: SRB/MCAT, OGSA-DAI/MCS
- Both provide an extensible metadata architecture for different domains
- SRB has a "zone" concept that addresses similar issues but in a different way
30
Why are we different? Federated Service Oriented Information Management
- SOA (Service Oriented Architecture)
  - Easy to extend
  - Reusable components
  - Cross platform and language
- XML-based hierarchical data representation
  - Easy data integration
  - Easy querying
- Human-readable information
  - Easy to access data: no command line
  - Interactive tools
  - On-the-fly query creation
- Not only accessing data but also transforming it along its path to end users.
- Ports to integrate application simulations into the application specific information system (ASIS)
  - Integrating application simulation data/information with ASIS outputs
31
An Example of Other Domains: Astronomy Domain (IVOA Standards)
- FS-1: VOPlot. Integrating, interacting visualization tools.
- FS-2: SkyNode. ADQL-based SOAP interface returning VOTable-based results.
- FS-3: SIA. 2D sky projection, logically a grid of pixels encoded as a FITS image.
- FS-4: SSA. URL-based, returning a dataset "document" (VOTable).

- Query: ADQL (extension of SQL)
- Data encoding: VOTable, FITS
- Metadata: UCD, VOResource
- Event notification: VOEvent
- Registry: VORegistry
- Queryable data in: SSAP and SIAP, VOStore
[Figure: filter services FS-1, FS-2, FS-3 and FS-4 with their databases (DB), a metadata registry (MD), and a PORTAL providing Interactive Decision Support; arrows labeled "Data" and "capability".]