Web Information Systems Engineering

Flavius Frasincar
ISA, Department of Mathematics and Computer Science, TU/e Eindhoven University of Technology
April 17, 2003


Contents

• What is a Web Information System (WIS)?
• WIS Features
• Problem: Data Management in WIS
• Solution: Model-Driven Methodology (with Tasks Separation)
• Methodologies for WIS:
  – Strudel Methodology
  – Hera Methodology
• Summary

World Wide Web

• 1990: Tim Berners-Lee invents the World Wide Web

• The Web's success is based on:
  – hypermedia (link) nature: links allow natural and flexible access to information, in line with the associative nature of the human mind
  – global availability
  – interoperability
  – simplicity
  – free use, etc.

Web Information Systems (WISs)

• 1998: Tomas Isakowitz et al. coined the term Web Information Systems for “information systems that are based on Web technology”

• WISs differ from traditional information systems in that they “have the potential of reaching a wider audience” through different platforms

• The need to integrate data is even greater, because the data sources are distributed over the Web and possibly heterogeneous

Three Generations of WISs

• First generation: based on hand-crafted HTML
  – Difficult to maintain (update)

• Second generation: generate HTML on demand by automatically filling templates
  – Data is machine readable/transformable
  – Difficult to make the data machine understandable

• Third generation: Semantic Web Information Systems (SWISs) are WISs based on Semantic Web technology (RDF, OWL, etc.)
  – Data is machine understandable

Present the Deep Web

Deep Web vs. Surface Web:
• 500 times larger
• 1000 times better quality

WIS Features

• Data-intensive: integrate data from multiple heterogeneous sources
• Pervasive: support different platforms, e.g. network (T1, 128K, 56K), display (PC, Palm, WAP phone)
• User adaptable: consider the user’s preferences and the user’s state of mind while interacting with the system
• Flexible: support semistructured data
• Automatic: need little or no human intervention
• User interactive: e.g. online shops (Amazon)

Problem: Data Management

• WISs are hard to specify and implement
• Methodologies exist for manual WIS design, but few of them target automation
• Difficult tasks to perform:
  – Multiplatform support
  – Automatic updates
  – Automatic site reconstruction (WIS Adaptation)
  – Optimize WIS performance (WIS Optimization)
  – Enforce WIS integrity constraints (WIS Analysis)
  – Achieve flexibility, extensibility, etc.

Semistructured Data

• It is characterized by:
  – Irregular structure: missing or additional attributes, multiple attributes
  – Few type constraints: attributes with different types in different objects, heterogeneous collections
  – Rapidly evolving schema or missing schema

• It is typically modeled by a DLG (Directed Labeled Graph)

• Examples: HTML, XML, RDF, LaTeX Bib, etc.
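
The characteristics above can be illustrated with a minimal Python sketch of a directed labeled graph. The `Graph` class and its method names are assumptions made for illustration and are not taken from Strudel or Hera; the sample values are illustrative too.

```python
from collections import defaultdict


class Graph:
    """Directed labeled graph: each edge is a (source, label, target) triple."""

    def __init__(self):
        # source node -> list of (label, target) pairs
        self.out = defaultdict(list)

    def add_edge(self, source, label, target):
        self.out[source].append((label, target))

    def targets(self, source, label):
        """All targets reachable from `source` over edges named `label`."""
        return [t for (l, t) in self.out[source] if l == label]


g = Graph()
# pub1 has two authors and a journal; pub2 has one author and a booktitle.
g.add_edge("pub1", "author", "Mary Fernandez")
g.add_edge("pub1", "author", "Dan Suciu")
g.add_edge("pub1", "year", 2000)
g.add_edge("pub1", "journal", "VLDB")
g.add_edge("pub2", "author", "Mary Fernandez")
g.add_edge("pub2", "year", "1998")  # same attribute, different type
g.add_edge("pub2", "booktitle", "SIGMOD")

print(g.targets("pub1", "author"))   # multiple attributes
print(g.targets("pub2", "journal"))  # missing attribute gives an empty list
```

No schema is declared anywhere: each object simply carries whatever edges it has, which is what makes the model tolerant of irregular structure.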

Solution: Tasks Separation

• Isolate and automate common tasks for WIS design:
  – Choose and access the data to be presented (data integration and retrieval)
  – Design the navigational structure for this data
  – Design the visual aspects of the presentation

• Use a model-driven approach for task specification (the fairy says it brings “wisdom” [theory], “richness” [money], and “beauty” [judge it yourself] – Stefano Ceri)

WIS Presentation Generation Strategies

• Static (eager approach): presentations are materialized completely, each page is precomputed

• Dynamic or On-demand (lazy approach): after each link “click” the next page to be presented is computed
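
The difference between the two strategies can be sketched in a few lines of Python. The `render` function is a hypothetical stand-in for real template filling, and the page identifiers are made up for the example.

```python
def render(page_id):
    """Hypothetical page renderer; stands in for real template filling."""
    return f"<html><body>Page {page_id}</body></html>"


def generate_static(page_ids):
    """Eager approach: materialize every page up front."""
    return {pid: render(pid) for pid in page_ids}


class DynamicSite:
    """Lazy approach: compute a page only when a link to it is followed."""

    def __init__(self):
        self.rendered = 0  # counts how many pages were actually computed

    def follow_link(self, page_id):
        self.rendered += 1
        return render(page_id)


pages = ["root", "year-1998", "year-2000"]

static_site = generate_static(pages)          # all three pages computed now
dynamic_site = DynamicSite()
html = dynamic_site.follow_link("year-2000")  # only this page is computed
```

The eager variant pays the full generation cost once, while the lazy variant pays per click and always reflects the current data.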

Methodologies

• Dexter-based: HDM (Hypermedia Design Method)
• ER-based: RMM (Relationship Management Methodology)
• OMT-based: OOHDM
• UML-based: OO-H (Conallen), UWE (UML-based Web Engineering), W2000 (HDM extension)
• RDF-based: XWMF (eXtensible Web Modeling Framework), Hera
• Other: Strudel, Araneus, WebML (Web Modeling Language), Autoweb, Trellis, XAHM (XML-based Adaptive Hypermedia Model), WSDM, W3DT, etc.

Strudel Methodology

http://www.research.att.com/~mff/strudel

AT&T

Strudel Architecture

[Figure: Strudel architecture. Labels: Relational Database, Object-Oriented Database, XML Database; Uniform Data Model; STRUQL; Site Graph; HTML Templates; HTML Presentation]

Input Data

<publications>
  <pub id="pub1">
    <title>Declarative spec…</title>
    <author>Mary Fernandez</author>
    <author>Dan Suciu</author>
    <year>2000</year>
    <journal>VLDB</journal>
    <abstract>Strudel is a …</abstract>
    <category>Languages</category>
    <category>Methods</category>
    …
  </pub>
  …
  <pub id="pub2">
    <title>Catching the …</title>
    <author>Mary Fernandez</author>
    <author>Daniela Florescu</author>
    <year>1998</year>
    <booktitle>SIGMOD</booktitle>
    <abstract>The Strudel …</abstract>
    <category>WIS</category>
    …
  </pub>
</publications>

Semistructured Data Model

Directed Labeled Graph (DLG)

Root
  publications
    pub → pub1
      year → 2000
      author → M. Fernandez
      …
    pub → pub2
      year → 1998
      author → M. Fernandez
      …

STRUQL (Site TRansformation Und Query Language)

where Root -> "publications" -> r, r -> "pub" -> x, x -> l -> v
{ where l = "year"
  link YearPage(v) -> "year" -> v,
       YearPage(v) -> "paperPage" -> x,
       RootPage() -> "yearPage" -> YearPage(v)
  collect RootPage{RootPage()}, YearPage{YearPage(v)}
}
…
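
The effect of this query can be approximated in Python as nested loops over the edges of the data graph. This is only a sketch of the idea and does not implement STRUQL semantics in full. The triple encoding, the intermediate node name `pubs`, and the helper `out` are assumptions made for the example.

```python
# Data graph as (source, label, target) triples, following the sample input.
data = [
    ("Root", "publications", "pubs"),
    ("pubs", "pub", "pub1"),
    ("pubs", "pub", "pub2"),
    ("pub1", "year", 2000),
    ("pub2", "year", 1998),
    ("pub1", "author", "Mary Fernandez"),
    ("pub2", "author", "Mary Fernandez"),
]


def out(node, label=None):
    """Edges leaving `node`, optionally restricted to a single label."""
    return [(l, t) for (s, l, t) in data
            if s == node and (label is None or l == label)]


site = set()  # edges of the site graph
collections = {"RootPage": {"RootPage()"}, "YearPage": set()}

# where Root -> "publications" -> r, r -> "pub" -> x, x -> l -> v
for _, r in out("Root", "publications"):
    for _, x in out(r, "pub"):
        for l, v in out(x):
            if l == "year":  # nested where clause
                year_page = f"YearPage({v})"
                # link clause: three new edges per matching binding
                site.add((year_page, "year", v))
                site.add((year_page, "paperPage", x))
                site.add(("RootPage()", "yearPage", year_page))
                # collect clause
                collections["YearPage"].add(year_page)
```

Each binding of the variables in the where clause contributes edges to the site graph, which is why the query is declarative: it states which pages and links exist, not how to traverse the data.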

Site Graph

RootPage()
  "yearPage" → YearPage(2000)
    "year" → 2000
    "paperPage" → PaperPage(pub1)
    "paperPage" → …
  "yearPage" → YearPage(1998)
    "year" → 1998
    "paperPage" → PaperPage(pub2)
    "paperPage" → …

STRUDEL Template Language

• RootPage collection:
  <html>
    <sfor p in yearPage order=ascend key=year>
      <sfmt @p [email protected]>
    </sfor>
  </html>

• YearPage collection:
  <h1><sfmt year></h1>
  <ul>
    <sfor p in paperPage>
      <li><sfmt @p></li>
    </sfor>
  </ul>

• PaperPage collection:
  <i>
    <sif booktitle> <sfmt booktitle>
    <selse> <sfmt journal>
    </sif>
  </i><br>
  <sfor p in author> <sfmt @p>, </sfor><br>
  <sfmt year><br>

STRUDEL +/-

+ :
  Tasks separation (content and presentation)
  Declarative specifications (enables presentation content adaptation)
  Verification of integrity constraints (e.g. “All paper pages are reachable from RootPage”)

- :
  Intermixes schema and content definition in the data graph
  Does not separate navigation from visual details of the presentation
  Does not use standard technologies
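
The example constraint “All paper pages are reachable from RootPage” amounts to a graph reachability test. The sketch below checks it on a small materialized site graph with a breadth-first search. This is a simplifying assumption for illustration: it does not reproduce how Strudel itself verifies constraints, and the adjacency-list encoding is hypothetical.

```python
from collections import deque

# Site graph as an adjacency list (edge labels omitted for brevity).
site = {
    "RootPage()": ["YearPage(1998)", "YearPage(2000)"],
    "YearPage(1998)": ["PaperPage(pub2)"],
    "YearPage(2000)": ["PaperPage(pub1)"],
    "PaperPage(pub1)": [],
    "PaperPage(pub2)": [],
}


def reachable(graph, start):
    """Return the set of nodes reachable from `start` (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def unreachable_paper_pages(graph, root="RootPage()"):
    """Paper pages that violate the reachability constraint."""
    paper_pages = {n for n in graph if n.startswith("PaperPage(")}
    return paper_pages - reachable(graph, root)


violations = unreachable_paper_pages(site)  # empty set: constraint holds
```

An empty result means the constraint holds; any page in the result is an orphan that no chain of links from the root leads to.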

Hera Methodology

http://wwwis.win.tue.nl/~hera

TU/e

Hera Architecture

[Figure: Hera architecture. Labels: Relational Database with RDB-XML Wrapper, Object-Oriented Database with ODB-XML Wrapper, XML Database; Mediator/Integrator; Query; Information Retrieval; Hypermedia Presentation; Logical Presentation; User/Platform Adaptation; Logical-HTML, Logical-WML, and Logical-SMIL Presentation; HTML, WML, and SMIL Presentation]

Hera Presentation Methodology

[Figure: Hera presentation methodology. Labels: Conceptual Design, Application Design, Presentation Design, Adaptation Design; Conceptual Model, Application Model, Presentation Model; Transformation (twice)]

Conceptual Model (CM)

• Provides a uniform semantic view over different data sources that are integrated within a given Web application

• Consists of hierarchies of concepts relevant within the given domain

• Concept relationships are:
  – Attribute relationships: refer to literal values that characterize a concept
  – Reference relationships: refer to other concepts

Example: CM

[Figure: example conceptual model. Concepts: Technique, Artifact, Creator, Painting, Painter. Literal types: String, Image, Integer. Properties: name, description, exemplifies, exemplified_by, creates, created_by, year, picture, biography, paints, painted_by. Legend: Property, subClassOf, subPropertyOf]

Example: CM in RDF/XML

<rdfs:Class rdf:ID="Artifact"/>
<rdfs:Class rdf:ID="Painting">
  <rdfs:subClassOf rdf:resource="#Artifact"/>
</rdfs:Class>

<rdf:Property rdf:ID="year">
  <rdfs:domain rdf:resource="#Artifact"/>
  <rdfs:range rdf:resource="#Integer"/>
</rdf:Property>

<rdf:Property rdf:ID="picture">
  <rdfs:domain rdf:resource="#Painting"/>
  <rdfs:range rdf:resource="#Image"/>
</rdf:Property>

<rdfs:Class rdf:ID="Creator"/>
<rdfs:Class rdf:ID="Painter">
  <rdfs:subClassOf rdf:resource="#Creator"/>
</rdfs:Class>

<rdf:Property rdf:ID="creates" sys:cardinality="multiple" sys:inverse="created_by">
  <rdfs:domain rdf:resource="#Creator"/>
  <rdfs:range rdf:resource="#Artifact"/>
</rdf:Property>
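
Because the model is serialized as RDF/XML, generic XML tooling can already read parts of it. The sketch below uses Python's standard `xml.etree.ElementTree` to recover the concept hierarchy from the class declarations. The enclosing `rdf:RDF` element and its namespace declarations are added here as an assumption, since the slide shows only a fragment; the `sys:` attributes are left out.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Class declarations from the example, wrapped in an assumed rdf:RDF root.
cm = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:ID="Artifact"/>
  <rdfs:Class rdf:ID="Painting">
    <rdfs:subClassOf rdf:resource="#Artifact"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Creator"/>
  <rdfs:Class rdf:ID="Painter">
    <rdfs:subClassOf rdf:resource="#Creator"/>
  </rdfs:Class>
</rdf:RDF>
"""

root = ET.fromstring(cm)

# Map each concept to its direct superconcept, if it declares one.
superclass = {}
for cls in root.findall(f"{{{RDFS}}}Class"):
    name = cls.get(f"{{{RDF}}}ID")
    parent = cls.find(f"{{{RDFS}}}subClassOf")
    if parent is not None:
        superclass[name] = parent.get(f"{{{RDF}}}resource").lstrip("#")

print(superclass)
```

A dedicated RDF library would be the more robust choice for real models, since RDF/XML allows several equivalent serializations of the same graph.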

Application Model (AM)

• Captures the logical (navigational) aspects of the presentation
• Based on the concept of a slice, which contains attributes and possibly other slices
  – A slice is a meaningful presentation unit
  – A slice is associated with a concept from the CM
• Slice relationships are:
  – Aggregation relationships: embed a set of slices (abstraction for index, tour, indexed guided tour, etc.)
  – Reference relationships: link abstraction with an anchor specified

Example: AM

[Figure: example application model. Labels: slices technique (main), painting (main), painting picture; attributes name, description, picture, year, painter name; relationships exemplified_by (Set), painted_by]

Example: AM in RDF/XML

<rdfs:Class rdf:ID="Slice.technique.main" slice:owner="CM#Technique" slice:main="Yes">
  <rdfs:subClassOf rdf:resource="#Slice"/>
</rdfs:Class>

<rdfs:Class rdf:ID="S.painting.picture" slice:owner="CM#Painting" slice:attr-ref="CM#picture">
  <rdfs:subClassOf rdf:resource="#Slice"/>
</rdfs:Class>

<rdf:Property rdf:ID="media">
  <rdfs:domain rdf:resource="#S.p.picture"/>
  <rdfs:range rdf:resource="#Image"/>
</rdf:Property>

<rdfs:Class rdf:ID="Slice.painting.main" slice:owner="CM#Painting">
  <rdfs:subClassOf rdf:resource="#Slice"/>
</rdfs:Class>

<rdf:Property rdf:ID="slice-ref">
  <slice:prop-ref rdf:resource="CM#ex_by"/>
  <rdfs:domain rdf:resource="#S.t.main"/>
  <rdfs:range rdf:resource="#S.p.picture"/>
</rdf:Property>

<rdf:Property rdf:ID="link_1">
  <rdfs:subPropertyOf rdf:resource="#link"/>
  <rdfs:domain rdf:resource="#S.p.picture"/>
  <rdfs:range rdf:resource="#S.p.main"/>
</rdf:Property>

Adaptation

• Captures two kinds of adaptation:
  – Adaptability takes into account the device capabilities and user preferences (UAProf = User Agent Profile)
  – Adaptivity means that the presentation changes itself according to the “state of the user’s mind” while being browsed (UM = User Model)

• Adaptation based on conditioning the appearance of slices using UAProf and/or UM

• Adaptivity uses AHAM (Adaptive Hypermedia Application Model) update rules for updating UM
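
Conditioning the appearance of slices can be sketched as evaluating a small condition per slice against the device profile and the user model. The tuple encoding of conditions, the function `visible`, and the slice names are hypothetical; only the condition forms `prf:ImageCapable = Yes` and `um:Technique < 10` come from the deck, and which slice each one guards is an assumption.

```python
def visible(condition, profile, user_model):
    """Evaluate one slice visibility condition (hypothetical encoding)."""
    source, attribute, op, value = condition
    # "prf" conditions read the device profile, "um" conditions the user model.
    actual = (profile if source == "prf" else user_model).get(attribute)
    if op == "=":
        return actual == value
    if op == "<":
        return actual is not None and actual < value
    raise ValueError(f"unsupported operator: {op}")


# Slice name -> condition guarding its appearance.
slices = {
    "painting.picture": ("prf", "ImageCapable", "=", "Yes"),
    "technique.description": ("um", "Technique", "<", 10),
}

profile = {"ImageCapable": "No"}  # e.g. a text-only device (adaptability)
user_model = {"Technique": 3}     # value maintained while browsing (adaptivity)

shown = [name for name, cond in slices.items()
         if visible(cond, profile, user_model)]
```

With this profile the picture slice is suppressed and the description slice is kept. Updating `user_model` during browsing, which the deck attributes to AHAM update rules, would change the result without touching the slice definitions.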

Adapted Application Model

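[Figure: the example application model annotated with appearance conditions. Conditions shown: um:Technique < 10, um:Painting < 10, prf:ImageCapable = Yes. Other labels as in the example application model: slices technique (main), painting (main), painting picture; attributes name, description, picture, year, painter name; relationships exemplified_by (Set), painted_by]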

Presentation Model

• Defines the physical appearance of the presentation
• Based on the concept of a region, which contains attributes and possibly other regions:
  – Each region has an associated rectangular area
  – Slices are translated to regions; one slice can be mapped to several regions
• Slice relationships are materialized with:
  – Navigational relationships
  – Spatial relationships
  – Temporal relationships

Presentation Model

[Figure: bookcase regions and their screen rendering. Regions: bookcase, shelf, painting. Attributes: P.picture, P.name (associated to a certain painting P). Relationships: navigational; spatial (right, below) with priorities 0, 1, 2 (priority 0 is always fulfilled). Screen rendering: paintings P1 to P7 on axes x and y, with P1 shown as ‘Stone Bridge’ 1638 …]

Presentation in Browsers

• HTML: HyperText Markup Language
• SMIL: Synchronized Multimedia Integration Language
• WML: Wireless Markup Language

Implementation

• Models are represented in RDF and they are serialized in RDF/XML

• User Agent Profile (UAProf): a Composite Capability/Preference Profiles (CC/PP) vocabulary to model device capabilities and user preferences

• XSLT processor for transforming between different model instances (stylesheet-based transformation)
  – Xalan (XSLT 1.0)
  – Saxon (XSLT 2.0): multiple output files support

Data Transformations

• Step 0: Preparation
  – Substep 0.1: Application Model Unfolding creates the skeleton of an AM instance
  – Substep 0.2: Application Model Adaptation adds slice visibility conditions to the previous skeleton
  – Substep 0.3: Main Transformation Specification Generation builds the specification for the next step

• Step 1: Main Transformation populates the AM with the input CM instance

• Step 2: Presentation Generation produces code for different browsers (HTML, WML, SMIL)
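
The shape of this pipeline can be sketched with plain Python functions standing in for the XSLT stylesheets. Everything here is an assumption made for illustration: the dictionary encodings, the function names, and the file name `bridge.jpg` are hypothetical, and generating a specification in substep 0.3 is modelled as returning a function.

```python
def unfold(application_model):
    """Substep 0.1: create the skeleton of an AM instance."""
    return {name: {"attributes": list(attrs), "visible": True}
            for name, attrs in application_model.items()}


def adapt(skeleton, hidden_slices):
    """Substep 0.2: record slice visibility in the skeleton."""
    for name in hidden_slices:
        skeleton[name]["visible"] = False
    return skeleton


def build_spec(skeleton):
    """Substep 0.3: build the specification used by the main transformation."""
    def spec(cm_instance):
        # Step 1: populate the visible slices with values from the CM instance.
        return {name: {a: cm_instance.get(a) for a in s["attributes"]}
                for name, s in skeleton.items() if s["visible"]}
    return spec


def to_html(am_instance):
    """Step 2: generate presentation code (HTML only in this sketch)."""
    items = "".join(f"<li>{k}: {v}</li>"
                    for values in am_instance.values()
                    for k, v in values.items())
    return f"<ul>{items}</ul>"


am = {"painting.main": ["name", "year"], "painting.picture": ["picture"]}
painting = {"name": "Stone Bridge", "year": 1638, "picture": "bridge.jpg"}

spec = build_spec(adapt(unfold(am), hidden_slices=["painting.picture"]))
html = to_html(spec(painting))
```

Step 0 depends only on the models and the profile, so its result can be reused for every input instance; only steps 1 and 2 have to run again when the data changes.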

Data Transformations

[Figure: data transformation flow.
Vocabularies (rdfs): CC/PP user/platform, UAProf, user profile, conceptual model, application model, system media.
Models and instances: conceptual model (rdfs), conceptual model instance (rdf), application model (rdfs), user/platform profile (rdf), application model unfolded (rdf), application model unfolded, adapted (rdf), application model instance (rdf).
Stylesheets (xsl): adaptation, rdf2xsl, cmi2ami, ami2html, ami2wml, ami2smil.
Outputs: HTML, WML, RT, SMIL.
Step labels: (0.1), (0.2), (0.3), (1), (2).
Legend: reference, instantiation, XSLT transf.; application independent, application dependent, input dependent]

Hera +/-

+ :
  Tasks separation (content, navigation, and presentation)
  Model-based specifications (enables presentation content adaptation)
  Uses standard technology: RDF, RDF/XML, XSLT

- (Future Work):
  Specifications are semi-formal (difficult to check integrity constraints)
  Does not (yet) support user interaction

Summary

• What is a Web Information System (WIS)
• Features of WIS: data intensive, pervasive, etc.
• Design methodologies for WIS:
  – Strudel (from industry)
  – Hera (from university)
• Model-based approach for WIS design
• WIS design tasks separation:
  – Data Selection
  – Navigation
  – Presentation