
A Framework for Modeling, Naming, and Authoring Distributed Metadata Ozgur Balsoy 31 October 2009



04/21/23 XML Modeling

Overview

Introduction
– Research Objectives

Metadata and Metadata Modeling
– XML as a Modeling Language
– Metadata Models
– Metadata Repositories

A Framework for Distributed Metadata
– Modeling, Naming, and Authoring

Conclusions


Research Objectives - I

Investigate and propose a metadata solution for distributed and collaboration systems. Why?

In today’s computing systems,
– Data is produced in massive amounts.
– Resources are spread out.
– People collaborate online to teach and solve problems.

Effects on data management:
– Organizations are overwhelmed with data.
– Individual units try to solve their own problems.
– Enterprise-wide disunity arises in data management.


Research Objectives - II

Solutions?
– Identify data and resources in common terms or with clear descriptions; allow flexibility.
– Plan metadata generation as early in data generation as possible: encourage people, and automate and simplify the process.
– Make resources available; control access if necessary.


Metadata

Metadata is data that describes other data.
– Hence, the same rules apply to it: it needs to be identified, modeled, stored, and shared.

Metamodeling is modeling for metadata, and it is everywhere:
– Designing your DBMS tables
– Building a directory structure for your application
– Planning what information your favorite MP3 player will display in a playlist
– Deciding what types of folders to keep in your file cabinet


Modeling for the Web - I

The Web is a giant library of shared resources.
A resource is a Web page, an image, an entire site, an application, a workstation, a supercomputer: any computing component that has and provides value.
In distributed computing environments and collaboration systems, resources are shared through networks.
All shared resources need identification for better integration. This means metadata needs to be modeled for the Web.


Modeling for the Web - II

On the Web,
– Textual information is presented and shared in markup languages.
– A document-centric approach to content is suitable for humans, but applications require fine-grained, structured forms of data.
– The XML specification brings these together: a textual representation of tree-structured data.


XML as a Modeling Language

Benefits
– Simple; easy; verbose; has widely available tools

Application areas
– Defining new markup languages, specifications
– Simplifying configuration file processes
– Data presentation: styling, document or data format transformations
– Communications
– Data persistence: document-centric content, variable tree-structured data
– Object modeling: models that represent real-life objects in XML


Metadata Models - I

Attribute-value pairs
– Placed inside their containers, which they describe.
– Simplest and most commonly used method.
– The Dublin Core Metadata Set for documents: title, creator, date, subject, description, language, etc.

<head>
  <title>Ozgur Balsoy-Current Status</title>
  <meta http-equiv="content-type" content="text/html; charset=ISO-8859-9">
  <meta name="author" content="Ozgur Balsoy">
  <meta name="description" content="This is the info to describe this page.">
  <meta name="keywords" content="metadata,xml,web">
</head>

[Slide callouts mark the meta elements as attribute & value pairs.]
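The attribute-value model above can be read back by a program with very little machinery. The following is a minimal sketch, assuming Python's standard `html.parser` module; the class name `MetaPairs` and the helper `extract_meta` are illustrative names, not part of the presentation. It collects the `name`/`content` pairs of the slide's `<head>` block and skips the `http-equiv` entry.

```python
from html.parser import HTMLParser


class MetaPairs(HTMLParser):
    """Collects name/content pairs from <meta> elements (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.pairs = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Only <meta name=... content=...> is treated as a descriptive pair;
        # http-equiv entries are skipped in this sketch.
        if "name" in attributes and "content" in attributes:
            self.pairs[attributes["name"]] = attributes["content"]


def extract_meta(html_text):
    """Return a dict of attribute -> value found in the markup."""
    parser = MetaPairs()
    parser.feed(html_text)
    return parser.pairs


# The <head> block shown on the slide.
SAMPLE = """<head>
<title>Ozgur Balsoy-Current Status</title>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-9">
<meta name="author" content="Ozgur Balsoy">
<meta name="description" content="This is the info to describe this page.">
<meta name="keywords" content="metadata,xml,web">
</head>"""

pairs = extract_meta(SAMPLE)
keywords = pairs["keywords"].split(",")
```

The flat dictionary is the whole model here: the metadata lives inside the document it describes and carries no structure beyond the pairs themselves.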


Metadata Models - II

Metadata statements
– Resource Description Framework (RDF)
– Triples (subject, predicate, object) describe resources identified by URIs.
– Metadata lives outside the content.
– Can model document relationships in the form of labeled directed graphs.
– RDF Schema (RDFS): defines languages (gives meanings, i.e., semantics, to relationships within RDF).
– Semantic Web and ontology languages: RDF statement semantics form groups or domains of documents, called ontologies. This simplifies queries on resources that are related within an ontology.
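To make the triple model concrete, here is a small sketch of statements held as a labeled directed graph and queried by pattern. It is not an RDF library: `TripleStore` and its methods are hypothetical names, the `example.org` resource is invented for illustration, and predicates are written as short `dc:` strings rather than full URIs.

```python
class TripleStore:
    """A tiny in-memory set of (subject, predicate, object) statements."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return statements matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple
            for triple in self._triples
            if all(want is None or want == have
                   for want, have in zip(pattern, triple))
        )

    def edges_from(self, subject):
        """Outgoing labeled edges of one node: a list of (predicate, object)."""
        return [(p, o) for _, p, o in self.match(subject=subject)]


store = TripleStore()
store.add("http://www.w3.org/Home/Lassila", "dc:creator", "Ora Lassila")
# Hypothetical second resource, added only to make the query interesting.
store.add("http://example.org/report", "dc:creator", "Ora Lassila")
store.add("http://example.org/report", "dc:language", "en")

# "Which resources were created by Ora Lassila?"
created = [s for s, _, _ in store.match(predicate="dc:creator", obj="Ora Lassila")]
```

Because the statements sit outside the content, new predicates can be added for a resource without touching the resource itself.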


Metadata Models - III

[Figure: one RDF statement drawn as a labeled edge.]
Subject (resource): http://www.w3.org/Home/Lassila
Predicate (property): Creator
Object (value): Ora Lassila

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description about="http://www.w3.org/Home/Lassila">
    <dc:creator>Ora Lassila</dc:creator>
  </rdf:Description>
</rdf:RDF>
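The mapping from this markup to the (subject, predicate, object) statement can be sketched with Python's standard `xml.etree.ElementTree`. This is an illustration, not a conforming RDF/XML parser: it only handles `rdf:Description` elements with literal-valued child properties, and `triples_from_rdfxml` is a hypothetical helper name. The sample copies the slide's markup, which uses the unqualified `about` attribute; the sketch also accepts the namespace-qualified `rdf:about` form used in current RDF/XML.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Markup as shown on the slide (unqualified "about" attribute).
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description about="http://www.w3.org/Home/Lassila">
    <dc:creator>Ora Lassila</dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def triples_from_rdfxml(text):
    """Extract (subject, predicate, object) tuples from simple RDF/XML."""
    root = ET.fromstring(text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        # Accept both rdf:about and the older unqualified about.
        subject = description.get(f"{{{RDF_NS}}}about") or description.get("about")
        for prop in description:
            # ElementTree names are "{namespace}local"; joining the two parts
            # gives the predicate URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            triples.append((subject, predicate, (prop.text or "").strip()))
    return triples


triples = triples_from_rdfxml(RDF_XML)
```

For real data, a dedicated RDF toolkit would be the safer choice, since RDF/XML allows many more syntactic forms than this sketch covers.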


Metadata Models

Schema-based structured models
– Allow traditional data models, e.g., E-R diagrams and OOD models, to be easily represented in XML.
– Simplify mapping persistent data to in-memory objects and vice versa (XML data binding).
– Allow existing XML tools, storage, management, and query systems to be used for schema-based metadata as well.
– Examples: XML Schema, Schema for Object-Oriented XML (SOX), RELAX NG.
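XML data binding, as mentioned above, means moving between a persistent XML form and an in-memory object. A hand-written sketch of the idea follows; real binding tools typically generate such code from a schema, which is not shown here. The `Document` class and the `document`, `title`, and `creator` element names are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Document:
    """In-memory object bound to a <document> element (hypothetical layout)."""
    title: str
    creator: str
    language: str


def from_xml(text):
    """Unmarshal: XML text -> Document."""
    element = ET.fromstring(text)
    return Document(
        title=element.findtext("title", default=""),
        creator=element.findtext("creator", default=""),
        language=element.get("language", "en"),
    )


def to_xml(document):
    """Marshal: Document -> XML text."""
    element = ET.Element("document", {"language": document.language})
    ET.SubElement(element, "title").text = document.title
    ET.SubElement(element, "creator").text = document.creator
    return ET.tostring(element, encoding="unicode")


original = Document(title="Current Status", creator="Ozgur Balsoy", language="en")
round_tripped = from_xml(to_xml(original))
```

The round trip preserving equality is the property a binding layer is expected to provide; a schema adds validation on top of it.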


Metadata Repositories

Current research on repositories
– Annotea: a Web-based shared annotation system.
– Sesame: a middleware with Repository Abstraction Layers.
– RDFSuite: a suite of tools for RDF validation, storage, and query.
– SAM: Scientific Annotation Middleware, an electronic notebook.
– CMCS: Collaboratory for Multi-Scale Chemical Science.
– IKM: Indigenous Knowledge Management.

Common features
– Modeling
– Naming and discovery
– Generation, authoring, and presenting
– Persistence
– Queries


A Common Architecture of Metadata Repositories

[Figure: layered architecture diagram. Labels recoverable from the slide: Clients; Admin. Interfaces; Other Repository Systems; Applications; Content Management Systems; Electronic Notebooks; Services; Discovery; Authoring; Modeling; Rendering; Search; Security; Versioning; Metadata Retrieval; Content Retrieval; Retrieval; Caching; Filtering; Network (p2p, pub/sub) Interfaces; Data Bindings; Mappings; Object Models; Document Centric; Document Access; Content Access; System Access and Rendering; Enterprise Data Stores; Metadata; Metadata & Content; Content.]


A Common Architecture of Metadata Repositories - II

Clients
– Users or other applications; various device capabilities

Applications
– Interface to clients; composed of services

Services
– Modeling, discovery, authoring, rendering, search, security, versioning

Data retrieval
– Data access and retrieval (e.g., JDBC, RMI, UDP, WS, HTTP), memory binding; caching, filtering; authorization and authentication to data

Data stores
– Various forms of data stores: RDBMSs, native XML DBs, file systems, hybrid systems


A Framework for Distributed Metadata

Modeling for Collaboration and Grids

Metamodeling
– An extensible object specification: GXOS
– Metaobjects and TreeObject

Naming and Discovery
– A naming and directory interface: GNDI
– Metaobject retrieval

Authoring

Event Model


Modeling for Collaboration

Metaobjects are metamodels for all objects, including resources, e.g., users, organizations, courses.
Common characteristics of metaobjects needed for distributed and collaboration systems:
– Identification (name, type)
– Relationships (children)
– Duration, timings (start, end, and update times)
– Realization and references (internal content, rendering methods, links to external content)
– Security (access rights, owners, groups, roles)
– Extensibility (attribute-value pairs, schemaless XML, type extensions)


A Simple Object Model for TreeObject

[Figure: class diagram. Classes and members recoverable from the slide:]
TreeObject: Name, Type, StartTime, EndTime, UpdateTime, Children, Profile, Contents, Accessibility, Extensions
TimeObject: TimeSyntax, Clock
NodeChildren: NodeChild
InternalND: InternalAddress, HelpDirectory
ExternalND: ExternalURL, ExternalFile, ExtlComputer
NodeContents: NodeContent
ObjectRealization: Strategy, ContentType, LinkGXOS, ExternalGXOS, LocalGXOS
LocalObject: LocalBody
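A rough sketch of how a TreeObject might look as an in-memory object is given below. It keeps only a subset of the members named on the slide (name, type, the three times, children, extensions), stores times as plain strings, and serializes to an XML layout invented for this example; none of this is the actual GXOS schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TreeObject:
    """Subset of the slide's TreeObject members (illustrative only)."""
    name: str
    type: str
    start_time: Optional[str] = None   # ISO 8601 strings in this sketch
    end_time: Optional[str] = None
    update_time: Optional[str] = None
    children: List["TreeObject"] = field(default_factory=list)
    extensions: Dict[str, str] = field(default_factory=dict)

    def add_child(self, child):
        # Assumed rule: child names are unique within their parent.
        if any(existing.name == child.name for existing in self.children):
            raise ValueError(f"duplicate child name: {child.name}")
        self.children.append(child)
        return child

    def to_element(self):
        """Serialize to a hypothetical XML layout (not the GXOS schema)."""
        element = ET.Element("TreeObject", {"Name": self.name, "Type": self.type})
        for label, value in (("StartTime", self.start_time),
                             ("EndTime", self.end_time),
                             ("UpdateTime", self.update_time)):
            if value is not None:
                ET.SubElement(element, label).text = value
        if self.extensions:
            holder = ET.SubElement(element, "Extensions")
            for key in sorted(self.extensions):
                ET.SubElement(holder, "Attribute",
                              {"name": key, "value": self.extensions[key]})
        if self.children:
            holder = ET.SubElement(element, "Children")
            for child in self.children:
                holder.append(child.to_element())
        return element


course = TreeObject("xml-course", "Collection", start_time="2009-10-31T09:00:00Z")
course.add_child(TreeObject("lecture1", "Document", extensions={"room": "A1"}))
xml_text = ET.tostring(course.to_element(), encoding="unicode")
```

The `extensions` dictionary stands in for the attribute-value style of extension; type extensions would instead be expressed as subclasses.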


Metaobjects for Collaboration

[Figure: types derived from TreeObject, listed with their corresponding elements.]
CollectionType: Collection
DocumentType: Document
MeetingObj.Type: MeetingObject
SharedletType: Sharedlet
UserObjectType: UserObject
DeviceObj.Type: DeviceObject
EventObj.Type: EventObject
ProgramObjType: ProgramObject
StreamType: Stream
VirtualEnv.Type: VirtualEnvironment


Naming and Discovery

Metaobjects are collections of metaobjects.
Each is represented by a unique name within its parent collection, or context.
Collections form a tree structure.
An object is identified by the full path from the root to itself, the node.
URIs are a convenient way to name metaobjects.


Metaobjects Directory Tree

[Figure: directory tree of metaobjects, as far as it can be recovered from the slide.]
root (Collection)
– devices (Collection): community (Device), grids (Device)
– events (Collection): messages (Stream), holding Msg001 (Event)
– users (Collection): root (User), jdoe (User)

Example names: gxos://root/devices/community, gxos://root/users/root
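The naming rules (unique names within a parent, identity as the path from the root) are enough to sketch a resolver for `gxos://` names. The code below is an assumption-laden illustration and not the GNDI interface: `Metaobject`, `add`, `uri`, and `resolve` are hypothetical names, and the placement of `messages` and `Msg001` in the tree is inferred from the slide's figure.

```python
class Metaobject:
    """A named node that is also a collection of child nodes."""

    def __init__(self, name, kind):
        self.name = name
        self.kind = kind
        self.parent = None
        self.children = {}

    def add(self, child):
        # Names must be unique within the parent collection (context).
        if child.name in self.children:
            raise ValueError(f"name already bound: {child.name}")
        child.parent = self
        self.children[child.name] = child
        return child

    def uri(self):
        """Full path from the root down to this node."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "gxos://" + "/".join(reversed(parts))


def resolve(root, uri):
    """Walk the tree along the path in the URI; raises KeyError if unbound."""
    prefix = "gxos://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a gxos URI: {uri}")
    names = uri[len(prefix):].split("/")
    if names[0] != root.name:
        raise KeyError(uri)
    node = root
    for name in names[1:]:
        node = node.children[name]
    return node


# Tree from the slide (events/messages/Msg001 placement is inferred).
root = Metaobject("root", "Collection")
devices = root.add(Metaobject("devices", "Collection"))
events = root.add(Metaobject("events", "Collection"))
users = root.add(Metaobject("users", "Collection"))
devices.add(Metaobject("community", "Device"))
devices.add(Metaobject("grids", "Device"))
users.add(Metaobject("root", "User"))
users.add(Metaobject("jdoe", "User"))
events.add(Metaobject("messages", "Stream")).add(Metaobject("Msg001", "Event"))

community = resolve(root, "gxos://root/devices/community")
```

Note that the user named `root` and the tree root do not clash: uniqueness is only required among siblings.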


Resolving URIs


Components of the Framework


Metadata Authoring

Generating metadata is tedious and time consuming when left to users, so it is mostly ignored.
Automation of metadata generation is limited:
– Size, date, owner, content type, etc. can be detected, but
– writing a summary (a short description) or detecting the document type (a novel, an article, a talk) requires human intervention.

The solution is to automate the process as much as possible:
– Detect information from content, and
– Provide simplified and guided interfaces for users.
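The split between detectable and human-supplied fields can be sketched as a prefilled authoring record. This is a minimal illustration using only Python's standard library; the function names and field names are assumptions, and owner detection is left out because it is platform dependent.

```python
import mimetypes
import os
from datetime import datetime, timezone

# Fields this sketch assumes a person has to supply.
REQUIRES_HUMAN = ("summary", "document_type")


def detect_metadata(path):
    """Collect the fields that can be read off the file without a person."""
    info = os.stat(path)
    content_type, _encoding = mimetypes.guess_type(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "name": os.path.basename(path),
        "size": info.st_size,
        "modified": modified.isoformat(),
        "content_type": content_type or "application/octet-stream",
    }


def authoring_record(path):
    """Detected fields prefilled; human-only fields left empty for a form."""
    record = detect_metadata(path)
    for field_name in REQUIRES_HUMAN:
        record[field_name] = None
    return record


def missing_fields(record):
    """Names of the fields a guided interface still has to ask for."""
    return [name for name, value in record.items() if value is None]
```

A guided interface would call `authoring_record` when a file is added and prompt only for what `missing_fields` returns, which keeps the user's part of the work small.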


Metadata Authoring Architecture


Distributed Event Model

In a distributed system, communication between components is based on messages.
Events are objects that carry messages with time stamps; hence, they are also metaobjects.
Events as metaobjects can be integrated into the overall framework:
– Events can be named, stored, and retrieved like other metaobjects.
– Because the model time-stamps them, they can be regenerated in order of occurrence.
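The two properties above, named retrieval and replay in order of occurrence, can be sketched with a small in-memory log. The names `Event` and `EventLog` are illustrative, the clock is injected so the example is deterministic, and a sequence counter is added as an assumption to break ties between equal time stamps.

```python
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Event:
    """A message with a time stamp; ordering uses (timestamp, sequence) only."""
    timestamp: float
    sequence: int
    name: str = field(compare=False)
    message: str = field(compare=False)


class EventLog:
    """Stores events by name and replays them in order of occurrence."""

    def __init__(self, clock):
        self._clock = clock                  # callable returning the current time
        self._counter = itertools.count()    # tie-breaker for equal time stamps
        self._events = {}

    def publish(self, name, message):
        # The log, not the sender, assigns the time stamp.
        event = Event(self._clock(), next(self._counter), name, message)
        self._events[name] = event
        return event

    def get(self, name):
        """Retrieve an event by name, like any other metaobject."""
        return self._events[name]

    def replay(self):
        """All stored events, oldest first."""
        return sorted(self._events.values())


# Fake clock: events arrive with out-of-order time stamps.
ticks = iter([3.0, 1.0, 2.0])
log = EventLog(clock=lambda: next(ticks))
log.publish("gxos://root/events/messages/Msg001", "join")
log.publish("gxos://root/events/messages/Msg002", "speak")
log.publish("gxos://root/events/messages/Msg003", "leave")

replayed = [event.message for event in log.replay()]
```

In a real deployment the clock would be a shared or synchronized time source, since replay order is only as good as the time stamps.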


Conclusions

We described metadata and metadata solutions for different application areas.
We see that these solutions are also needed for our distributed and collaboration systems research.
We identified common architectural features of metadata repositories.
We proposed a framework with an event model for modeling, naming, and authoring distributed metadata.


Current Work

Studied other required object types for collaboration, and completed the metaobject models for TreeObject and other necessary extensions for collaboration.

Implemented and demonstrated a simple model of the naming and discovery service on file systems.
– Showed that similar services can be built on other storage systems.

Studied the requirements of automating metadata generation and building interfaces for metadata authoring.
– Built an automated metadata authoring interface system based on schema-based metaobject models.

Studied the needs of modular event architectures to simplify integration of new applications into distributed systems.
– Designed and implemented the model on a publish/subscribe messaging system and SMTP, and demonstrated its modularity with newsgroup and training registry applications.


Q&A