ISA/department of mathematics and computer science
TU/e eindhoven university of technology
April 17, 2003
Web Information Systems Engineering
Flavius Frasincar [email protected]
Contents
• What is a Web Information System (WIS)?
• WIS Features
• Problem: Data Management in WIS
• Solution: Model-Driven Methodology (with Tasks Separation)
• Methodologies for WIS:
  – Strudel Methodology
  – Hera Methodology
• Summary
World Wide Web
• 1990: Tim Berners-Lee invents the World Wide Web
• The Web's success is based on:
  – its hypermedia (link) nature: links allow natural and flexible access to information, in line with the associative nature of the human mind
  – global availability
  – interoperability
  – simplicity
  – being free, etc.
Web Information Systems (WISs)
• 1998: Tomas Isakowitz et al. coined the term Web Information Systems for “information systems that are based on Web technology”
• WISs differ from traditional information systems in that they “have the potential of reaching a wider audience” through different platforms
• The need to integrate data is even greater, because the data sources are distributed over the Web and may be heterogeneous
Three Generations of WISs
• First generation: based on hand-crafted HTML
  – Difficult to maintain (update)
• Second generation: generate HTML on demand by automatically filling templates
  – Data is machine readable/transformable
  – Difficult to make the data machine understandable
• Third generation: Semantic Web Information Systems (SWISs) are WISs based on Semantic Web technology (RDF, OWL, etc.)
  – Data is machine understandable
Present the Deep Web
Deep Web vs. Surface Web:
• 500 times larger
• 1000 times better quality
WIS Features
• Data-intensive: integrate data from multiple heterogeneous sources
• Pervasive: support different platforms, e.g. network (T1, 128K, 56K), display (PC, Palm, WAP phone)
• User-adaptable: consider the user’s preferences and the user’s state of mind while interacting with the system
• Flexible: support semistructured data
• Automatic: need little or no human intervention
• User-interactive: e.g. online shops (Amazon)
Problem: Data Management
• WISs are hard to specify and implement
• Methodologies exist for manual WIS design, but few of them target automation
• Difficult tasks to perform:
  – Multiplatform support
  – Automatic updates
  – Automatic site reconstruction (WIS Adaptation)
  – Optimize WIS performance (WIS Optimization)
  – Enforce WIS integrity constraints (WIS Analysis)
  – Achieve flexibility, extensibility, etc.
Semistructured Data
• It is characterized by:
  – Irregular structure: missing or additional attributes, multiple attributes
  – Few type constraints: attributes with different types in different objects, heterogeneous collections
  – Rapidly evolving schema or missing schema
• It is typically modeled by a DLG (Directed Labeled Graph)
• Examples: HTML, XML, RDF, LaTeX Bib, etc.
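The characteristics listed above can be made concrete with a small sketch. It stores a directed labeled graph as (source, label, target) edges; the objects `p1` and `p2` and all their values are hypothetical and serve only to illustrate irregular structure and loose typing.

```python
from collections import defaultdict

# Hypothetical objects, stored as edges (source, label, target) of a
# directed labeled graph. No schema is declared anywhere.
EDGES = [
    ("p1", "name", "Ann"),
    ("p1", "phone", "555-0100"),
    ("p1", "phone", "555-0101"),   # multiple attributes with the same label
    ("p1", "age", 42),             # age stored as an integer
    ("p2", "name", "Bob"),
    ("p2", "age", "unknown"),      # same label, different type
    ("p2", "city", "Eindhoven"),   # additional attribute; phone is missing
]

def attributes(node):
    """Group the outgoing edges of a node by label."""
    grouped = defaultdict(list)
    for source, label, target in EDGES:
        if source == node:
            grouped[label].append(target)
    return dict(grouped)

# Labels present on one object but not on the other (irregular structure).
only_p1 = set(attributes("p1")) - set(attributes("p2"))
only_p2 = set(attributes("p2")) - set(attributes("p1"))
print(only_p1, only_p2)
```

Because every object is just a set of labeled edges, adding or dropping an attribute needs no schema change, which is what makes the model suitable for rapidly evolving data.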
Solution: Tasks Separation
• Isolate and automate common tasks for WIS design:
  – Choose and access the data to be presented (data integration and retrieval)
  – Design the navigational structure for this data
  – Design the visual aspects of the presentation
• Use a model-driven approach for task specification (the fairy says it brings “wisdom” [theory], “richness” [money], and “beauty” [judge it yourself] – Stefano Ceri)
WIS Presentation Generation Strategies
• Static (eager approach): presentations are materialized completely, each page is precomputed
• Dynamic or On-demand (lazy approach): after each link “click” the next page to be presented is computed
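The difference between the two strategies can be sketched in a few lines. The `SITE` dictionary, the page identifiers, and the `render` function are hypothetical stand-ins for a real presentation generator.

```python
# Hypothetical site: page id -> (heading, ids of linked pages).
SITE = {
    "root": ("Publications", ["y2000", "y1998"]),
    "y2000": ("Papers from 2000", []),
    "y1998": ("Papers from 1998", []),
}

def render(page_id):
    """Compute the HTML for a single page."""
    heading, links = SITE[page_id]
    anchors = "".join(f'<a href="{t}.html">{t}</a>' for t in links)
    return f"<html><h1>{heading}</h1>{anchors}</html>"

def generate_static():
    """Eager approach: materialize every page before any request arrives."""
    return {page_id: render(page_id) for page_id in SITE}

class DynamicSite:
    """Lazy approach: compute a page only when a link to it is clicked."""

    def __init__(self):
        self.rendered = []  # pages computed so far, in request order

    def click(self, page_id):
        self.rendered.append(page_id)
        return render(page_id)

pages = generate_static()   # all three pages exist up front
site = DynamicSite()
html = site.click("root")   # only the root page has been computed
```

The eager variant pays the full generation cost once and serves fixed pages; the lazy variant can take the current user and platform into account at each click, at the price of computing on every request.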
Methodologies
• Dexter-based: HDM (Hypermedia Design Method)
• ER-based: RMM (Relationship Management Methodology)
• OMT-based: OOHDM
• UML-based: OO-H (Conallen), UWE (UML-based Web Engineering), W2000 (HDM extension)
• RDF-based: XWMF (eXtensible Web Modeling Framework), Hera
• Other: Strudel, Araneus, WebML (Web Modeling Language), Autoweb, Trellis, XAHM (XML-based Adaptive Hypermedia Model), WSDM, W3DT, etc.
Strudel Methodology
http://www.research.att.com/~mff/strudel
AT&T
Strudel Architecture
[Figure: data sources (relational database, object-oriented database, XML database, …) are mapped to a uniform data model; STRUQL queries over it produce the site graph; HTML templates turn the site graph into the HTML presentation.]
Input Data
<publications>
  <pub id="pub1">
    <title>Declarative spec…</title>
    <author>Mary Fernandez</author>
    <author>Dan Suciu</author>
    <year>2000</year>
    <journal>VLDB</journal>
    <abstract>Strudel is a …</abstract>
    <category>Languages</category>
    <category>Methods</category>
    …
  </pub>
  …
  <pub id="pub2">
    <title>Catching the …</title>
    <author>Mary Fernandez</author>
    <author>Daniela Florescu</author>
    <year>1998</year>
    <booktitle>SIGMOD</booktitle>
    <abstract>The Strudel …</abstract>
    <category>WIS</category>
    …
  </pub>
</publications>
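Semistructured Data Model
[Figure: Directed Labeled Graph (DLG). Root has a “publications” edge to a node with “pub” edges to pub1 and pub2. pub1 has edges “year” → 2000 and “author” → M. Fernandez, …; pub2 has edges “year” → 1998 and “author” → M. Fernandez, …]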
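One way to obtain such a graph from the XML input is sketched below with Python's standard `xml.etree.ElementTree`. This is only an illustration, not Strudel's wrapper: the shortened XML excerpt, the `to_edges` helper, and the `&publications` identifier for the inner node are assumptions.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt of the input data.
XML = """<publications>
  <pub id="pub1">
    <author>Mary Fernandez</author>
    <author>Dan Suciu</author>
    <year>2000</year>
  </pub>
  <pub id="pub2">
    <author>Mary Fernandez</author>
    <year>1998</year>
  </pub>
</publications>"""

def to_edges(xml_text):
    """Return (source, label, target) edges for a two-level XML document."""
    root = ET.fromstring(xml_text)
    collection = "&" + root.tag                  # hypothetical id of the inner node
    edges = [("Root", root.tag, collection)]     # Root -publications-> collection
    for obj in root:
        obj_id = obj.get("id")
        edges.append((collection, obj.tag, obj_id))        # collection -pub-> pub1
        for attr in obj:
            edges.append((obj_id, attr.tag, attr.text))    # pub1 -year-> "2000"
    return edges

edges = to_edges(XML)
for edge in edges:
    print(edge)
```

Element names become edge labels and text content becomes leaf values, so repeated elements such as `author` simply yield several edges with the same label.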
STRUQL (Site TRansformation Und Query Language)
where Root -> "publications" -> r, r -> "pub" -> x, x -> l -> v
{ where l = "year"
  link YearPage(v) -> "year" -> v,
       YearPage(v) -> "paperPage" -> x,
       RootPage() -> "yearPage" -> YearPage(v)
  collect RootPage{RootPage()}, YearPage{YearPage(v)}
}
…
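To show what the query computes, the following sketch emulates it in plain Python over a small edge list. It is not a STRUQL interpreter: the `DATA` excerpt, the node name `r`, and the helper `out` are assumptions, and page identifiers such as `YearPage(2000)` are represented as strings.

```python
# Data graph as (source, label, target) edges; a small hypothetical excerpt.
DATA = [
    ("Root", "publications", "r"),
    ("r", "pub", "pub1"),
    ("r", "pub", "pub2"),
    ("pub1", "year", "2000"),
    ("pub1", "author", "M. Fernandez"),
    ("pub2", "year", "1998"),
    ("pub2", "author", "M. Fernandez"),
]

def out(node, label=None):
    """(label, target) pairs of edges leaving `node`, optionally filtered by label."""
    return [(l, t) for s, l, t in DATA if s == node and (label is None or l == label)]

def build_site_graph():
    links, root_pages, year_pages = set(), set(), set()
    # where Root -> "publications" -> r, r -> "pub" -> x, x -> l -> v
    for _, r in out("Root", "publications"):
        for _, x in out(r, "pub"):
            for l, v in out(x):
                if l == "year":                   # nested: where l = "year"
                    year_page = f"YearPage({v})"  # one page per distinct value of v
                    # link clause: three edges of the site graph
                    links.add((year_page, "year", v))
                    links.add((year_page, "paperPage", x))
                    links.add(("RootPage()", "yearPage", year_page))
                    # collect clause: register the pages in their collections
                    root_pages.add("RootPage()")
                    year_pages.add(year_page)
    return links, root_pages, year_pages

links, root_pages, year_pages = build_site_graph()
```

Using sets mirrors the declarative reading of the query: a page such as `YearPage(2000)` is created once, no matter how many publications share that year.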
Site Graph
[Figure: site graph. RootPage() has “yearPage” edges to YearPage(2000) and YearPage(1998). YearPage(2000) has edges “year” → 2000 and “paperPage” → PaperPage(pub1), …; YearPage(1998) has edges “year” → 1998 and “paperPage” → PaperPage(pub2), …]
STRUDEL Template Language
• RootPage collection:
<html>
  <sfor p in yearPage order=ascend key=year>
    <sfmt @p [email protected]>
  </sfor>
</html>
• YearPage collection:
<h1><sfmt year></h1>
<ul>
  <sfor p in paperPage>
    <li><sfmt @p></li>
  </sfor>
</ul>
• PaperPage collection:
<i>
  <sif booktitle> <sfmt booktitle>
  <selse> <sfmt journal>
  </sif>
</i><br>
<sfor p in author> <sfmt @p>, </sfor><br>
<sfmt year><br>
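The YearPage and PaperPage templates can be read as simple functions over the site graph. The sketch below is a Python analogue, not the Strudel template engine: the `SITE` dictionary is a hypothetical excerpt, and embedding the paper page inline inside the list item is an assumption about how `<sfmt @p>` is rendered.

```python
# Hypothetical excerpt of the site graph: page -> {edge label: [targets]}.
SITE = {
    "YearPage(2000)": {"year": ["2000"], "paperPage": ["PaperPage(pub1)"]},
    "PaperPage(pub1)": {
        "journal": ["VLDB"],
        "author": ["Mary Fernandez", "Dan Suciu"],
        "year": ["2000"],
    },
}

def render_paper(page):
    attrs = SITE[page]
    # <sif booktitle> ... <selse> ... </sif>: prefer booktitle, else journal
    venue = attrs["booktitle"][0] if "booktitle" in attrs else attrs["journal"][0]
    # <sfor p in author> <sfmt @p>, </sfor>: every author followed by a comma
    authors = "".join(f"{a}," for a in attrs["author"])
    return f"<i>{venue}</i><br>{authors}<br>{attrs['year'][0]}<br>"

def render_year(page):
    attrs = SITE[page]
    # <sfor p in paperPage>: one list item per paper (embedded inline here)
    items = "".join(f"<li>{render_paper(p)}</li>" for p in attrs["paperPage"])
    return f"<h1>{attrs['year'][0]}</h1><ul>{items}</ul>"

print(render_year("YearPage(2000)"))
```

Each template only reads edge labels of its own page object, which is why the same site graph can be rendered with different template sets.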
STRUDEL +/-
+ :
  – Tasks separation (content and presentation)
  – Declarative specifications (enables presentation content adaptation)
  – Verification of integrity constraints (e.g. “All paper pages are reachable from RootPage”)
- :
  – Intermixes schema and content definition in the data graph
  – Does not separate navigation from visual details of the presentation
  – Does not use standard technologies
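The example constraint “All paper pages are reachable from RootPage” amounts to a graph reachability test. The sketch below checks it on a materialized site graph with a breadth-first search; this is a simplification chosen for illustration, with hypothetical `LINKS` data, and it says nothing about how Strudel itself verifies constraints.

```python
from collections import deque

# Hypothetical site graph as (source, label, target) links.
LINKS = [
    ("RootPage()", "yearPage", "YearPage(2000)"),
    ("RootPage()", "yearPage", "YearPage(1998)"),
    ("YearPage(2000)", "paperPage", "PaperPage(pub1)"),
    ("YearPage(1998)", "paperPage", "PaperPage(pub2)"),
]

def reachable(start, links):
    """All nodes reachable from `start` by following links (breadth-first)."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for source, _, target in links:
            if source == node and target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

def unreachable_paper_pages(links, paper_pages):
    """Paper pages that violate the constraint; an empty set means it holds."""
    return set(paper_pages) - reachable("RootPage()", links)

print(unreachable_paper_pages(LINKS, ["PaperPage(pub1)", "PaperPage(pub2)"]))
```

A page that no query clause links to, such as a hypothetical `PaperPage(pub3)`, would show up in the returned set.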
Hera Methodology
http://wwwis.win.tue.nl/~hera
TU/e
Hera Architecture
[Figure: Hera architecture. Information retrieval: relational, object-oriented, and XML databases feed, through RDB-XML and ODB-XML wrappers, a mediator/integrator that answers the query. Hypermedia presentation: the logical presentation is refined into logical-HTML, logical-WML, and logical-SMIL presentations, which yield the HTML, WML, and SMIL presentations. A User/Platform Adaptation component is also shown.]
Hera Presentation Methodology
[Figure: Conceptual Design produces the Conceptual Model, Application Design the Application Model, and Presentation Design the Presentation Model; transformations connect successive models, and Adaptation Design is shown alongside the design phases.]
Conceptual Model (CM)
• Provides a uniform semantic view over different data sources that are integrated within a given Web application
• Consists of hierarchies of concepts relevant within the given domain
• Concept relationships are:
  – Attribute relationships: refer to literal values that characterize a concept
  – Reference relationships: refer to other concepts
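Example: CM
[Figure: conceptual model. Concepts: Technique, Artifact, Creator, Painting (subClassOf Artifact), Painter (subClassOf Creator). Attributes: name and description (String), year (Integer), picture (Image), biography (String). Concept relationships: exemplifies / exemplified_by, creates / created_by, paints / painted_by. Legend: Property, subClassOf, subPropertyOf.]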
Example: CM in RDF/XML
<rdfs:Class rdf:ID="Artifact"/>
<rdfs:Class rdf:ID="Painting">
  <rdfs:subClassOf rdf:resource="#Artifact"/>
</rdfs:Class>
<rdf:Property rdf:ID="year">
  <rdfs:domain rdf:resource="#Artifact"/>
  <rdfs:range rdf:resource="#Integer"/>
</rdf:Property>
<rdf:Property rdf:ID="picture">
  <rdfs:domain rdf:resource="#Painting"/>
  <rdfs:range rdf:resource="#Image"/>
</rdf:Property>
<rdfs:Class rdf:ID="Creator"/>
<rdfs:Class rdf:ID="Painter">
  <rdfs:subClassOf rdf:resource="#Creator"/>
</rdfs:Class>
<rdf:Property rdf:ID="creates" sys:cardinality="multiple" sys:inverse="created_by">
  <rdfs:domain rdf:resource="#Creator"/>
  <rdfs:range rdf:resource="#Artifact"/>
</rdf:Property>
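A model serialized this way can be inspected with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on an excerpt of the example; the enclosing `rdf:RDF` element with its namespace declarations is an assumption (the slide shows only the fragment), and `read_model` and `properties_of` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
ID, RESOURCE = f"{{{RDF}}}ID", f"{{{RDF}}}resource"

# Excerpt of the conceptual model, wrapped in rdf:RDF (an assumption).
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:ID="Artifact"/>
  <rdfs:Class rdf:ID="Painting">
    <rdfs:subClassOf rdf:resource="#Artifact"/>
  </rdfs:Class>
  <rdf:Property rdf:ID="year">
    <rdfs:domain rdf:resource="#Artifact"/>
    <rdfs:range rdf:resource="#Integer"/>
  </rdf:Property>
  <rdf:Property rdf:ID="picture">
    <rdfs:domain rdf:resource="#Painting"/>
    <rdfs:range rdf:resource="#Image"/>
  </rdf:Property>
</rdf:RDF>"""

def target(element, tag):
    """Local name referenced by the rdf:resource of an rdfs child, or None."""
    child = element.find(f"{{{RDFS}}}{tag}")
    return None if child is None else child.get(RESOURCE).lstrip("#")

def read_model(text):
    root = ET.fromstring(text)
    superclass = {c.get(ID): target(c, "subClassOf")
                  for c in root.findall(f"{{{RDFS}}}Class")}
    properties = {p.get(ID): (target(p, "domain"), target(p, "range"))
                  for p in root.findall(f"{{{RDF}}}Property")}
    return superclass, properties

def properties_of(cls, superclass, properties):
    """Properties whose domain is `cls` or one of its ancestors."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = superclass.get(cls)
    return sorted(p for p, (domain, _) in properties.items() if domain in chain)

superclass, properties = read_model(DOC)
print(properties_of("Painting", superclass, properties))
```

Following `subClassOf` upward shows why a Painting carries both its own `picture` attribute and the `year` attribute declared on Artifact.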
Application Model (AM)
• Captures the logical (navigational) aspects of the presentation
• Based on the concept of slice, which contains attributes and possibly other slices
  – A slice is a meaningful presentation unit
  – A slice is associated with a concept from the CM
• Slice relationships are:
  – Aggregation relationships: embed a set of slices (abstraction for index, tour, indexed guided tour, etc.)
  – Reference relationships: link abstraction with an anchor specified
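The slice concept and its two relationship kinds can be captured in a small data structure. The sketch below is a hypothetical in-memory representation (Hera itself represents the AM in RDF); the `Slice` class, its field names, and the `outline` helper are assumptions, while the slice and property names follow the painting example used in this deck.

```python
from dataclasses import dataclass, field

@dataclass
class Slice:
    name: str                                        # e.g. "Slice.technique.main"
    owner: str                                       # CM concept the slice belongs to
    attributes: list = field(default_factory=list)   # CM attributes shown in the slice
    aggregates: list = field(default_factory=list)   # (cm_property, Slice, is_set)
    references: list = field(default_factory=list)   # target slices; this slice is the anchor

painting_main = Slice("Slice.painting.main", "Painting", ["name", "year", "picture"])
painting_picture = Slice("Slice.painting.picture", "Painting", ["picture"])
technique_main = Slice("Slice.technique.main", "Technique", ["name", "description"])

# Aggregation: a technique embeds the set of pictures of paintings exemplifying it.
technique_main.aggregates.append(("exemplified_by", painting_picture, True))
# Reference: each picture is the anchor of a link to the painting's main slice.
painting_picture.references.append(painting_main)

def outline(s, depth=0):
    """Indented text outline of a slice and everything it embeds or links to."""
    pad = "  " * depth
    lines = [f"{pad}{s.name} ({s.owner}): {', '.join(s.attributes)}"]
    for prop, child, is_set in s.aggregates:
        lines.append(f"{pad}  {prop} -> {'Set of ' if is_set else ''}{child.name}")
        lines.extend(outline(child, depth + 2))
    for linked in s.references:
        lines.append(f"{pad}  link -> {linked.name}")
    return lines

lines = outline(technique_main)
print("\n".join(lines))
```

Aggregated slices are expanded in place, whereas referenced slices are only named, which reflects the difference between embedding content and linking to it.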
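Example: AM
[Figure: application model. The technique main slice contains name, description, and, via exemplified_by, a Set of painting picture slices. The painting main slice contains name, year, picture, and, via painted_by, the painter name slice.]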
Example: AM in RDF/XML
<rdfs:Class rdf:ID="Slice.technique.main" slice:owner="CM#Technique" slice:main="Yes">
  <rdfs:subClassOf rdf:resource="#Slice"/>
</rdfs:Class>
<rdfs:Class rdf:ID="S.painting.picture" slice:owner="CM#Painting" slice:attr-ref="CM#picture">
  <rdfs:subClassOf rdf:resource="#Slice"/>
</rdfs:Class>
<rdf:Property rdf:ID="media">
  <rdfs:domain rdf:resource="#S.p.picture"/>
  <rdfs:range rdf:resource="#Image"/>
</rdf:Property>
<rdfs:Class rdf:ID="Slice.painting.main" slice:owner="CM#Painting">
  <rdfs:subClassOf rdf:resource="#Slice"/>
</rdfs:Class>
<rdf:Property rdf:ID="slice-ref">
  <slice:prop-ref rdf:resource="CM#ex_by"/>
  <rdfs:domain rdf:resource="#S.t.main"/>
  <rdfs:range rdf:resource="#S.p.picture"/>
</rdf:Property>
<rdf:Property rdf:ID="link_1">
  <rdfs:subPropertyOf rdf:resource="#link"/>
  <rdfs:domain rdf:resource="#S.p.picture"/>
  <rdfs:range rdf:resource="#S.p.main"/>
</rdf:Property>
Adaptation
• Captures two kinds of adaptation:
  – Adaptability takes into account the device capabilities and user preferences (UAProf = User Agent Profile)
  – Adaptivity means that the presentation changes itself according to the “state of the user’s mind” while being browsed (UM = User Model)
• Adaptation is based on conditioning the appearance of slices using UAProf and/or UM
• Adaptivity uses AHAM (Adaptive Hypermedia Application Model) update rules for updating UM
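Conditioning the appearance of slices can be sketched as a lookup of per-slice conditions evaluated against the two profiles. Everything below is a hypothetical illustration: the condition table, the slice names, the attribute values, and the counting update rule in `visit` are assumptions, not the UAProf vocabulary or actual AHAM rules.

```python
import operator

OPS = {"<": operator.lt, "=": operator.eq}

# Hypothetical appearance conditions: slice -> (profile, attribute, operator, value).
CONDITIONS = {
    "painting.picture": ("prf", "ImageCapable", "=", "Yes"),   # adaptability
    "technique.description": ("um", "Technique", "<", 10),     # adaptivity
}

def visible(slice_name, prf, um):
    """A slice appears if it has no condition or if its condition holds."""
    condition = CONDITIONS.get(slice_name)
    if condition is None:
        return True
    source, attribute, op, value = condition
    profile = prf if source == "prf" else um
    return attribute in profile and OPS[op](profile[attribute], value)

def visit(concept, um):
    """Hypothetical update rule: count each visit to a concept in the user model."""
    um[concept] = um.get(concept, 0) + 1

prf = {"ImageCapable": "No"}   # static device profile, fixed for the session
um = {"Technique": 0}          # user model, changes while the user browses

shown_before = visible("technique.description", prf, um)
for _ in range(10):
    visit("Technique", um)
shown_after = visible("technique.description", prf, um)
```

The device profile is read once and never changes (adaptability), while the user model is updated on every visit, so the same slice can appear early in a session and disappear later (adaptivity).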
Adapted Application Model
[Figure: the application model of the previous example (technique and painting main slices), with slice appearance conditions added: um:Technique < 10, um:Painting < 10, prf:ImageCapable = Yes.]
Presentation Model
• Defines the physical appearance of the presentation
• Based on the concept of region, which contains attributes and possibly other regions:
  – Each region has an associated rectangular area
  – Slices are translated to regions; one slice can be mapped to several regions
• Slice relationships are materialized with:
  – Navigational relationships
  – Spatial relationships
  – Temporal relationships
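Presentation Model
[Figure: bookcase regions and their screen rendering. A bookcase region contains shelf regions holding P.picture regions, where P is a certain painting (P1 … P7 in the rendering). A navigational relationship leads to a painting region containing P.picture, P.name, … (e.g. P1, ‘Stone Bridge’, 1638, …). Spatial relationships (right, below) on the x and y axes carry priorities 0, 1, 2; priority 0 is always fulfilled. Legend: Region, Attribute, Navigational Relationship, Spatial Relationship.]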
Presentation in Browsers
• HTML: HyperText Markup Language
• SMIL: Synchronized Multimedia Integration Language
• WML: Wireless Markup Language
Implementation
• Models are represented in RDF and serialized in RDF/XML
• User Agent Profile (UAProf): a Composite Capability/Preference Profiles (CC/PP) vocabulary to model device capabilities and user preferences
• XSLT processor for transforming between different model instances (stylesheet-based transformation):
  – Xalan (XSLT 1.0)
  – Saxon (XSLT 2.0): multiple output files support
Data Transformations
• Step 0: Preparation
  – Substep 0.1: Application Model Unfolding creates the skeleton of an AM instance
  – Substep 0.2: Application Model Adaptation adds slice visibility conditions to the previous skeleton
  – Substep 0.3: Main Transformation Specification Generation builds the specification for the next step
• Step 1: Main Transformation populates the AM with the input CM instance
• Step 2: Presentation Generation produces code for different browsers (HTML, WML, SMIL)
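The order of the steps can be illustrated with a toy pipeline. Hera performs these steps with XSLT stylesheets over RDF/XML; the sketch below replaces them with plain Python functions over dictionaries, so all data structures, function names, and the file name `stone_bridge.jpg` are assumptions made for illustration only.

```python
# Hypothetical application model: slice -> attributes it shows.
AM = {"technique.main": ["name", "description"], "painting.main": ["name", "picture"]}
# Hypothetical visibility conditions per attribute, evaluated on the platform profile.
CONDITIONS = {"picture": lambda prf: prf.get("ImageCapable") == "Yes"}
# Hypothetical CM instance: objects available for each slice.
CM_INSTANCE = {"painting.main": [{"name": "Stone Bridge", "picture": "stone_bridge.jpg"}]}

def unfold(am):
    """Substep 0.1: create the skeleton of an AM instance."""
    return {s: [{"attr": a, "visible": True} for a in attrs] for s, attrs in am.items()}

def adapt(skeleton, prf):
    """Substep 0.2: evaluate visibility conditions on the skeleton."""
    for slots in skeleton.values():
        for slot in slots:
            test = CONDITIONS.get(slot["attr"])
            if test is not None:
                slot["visible"] = test(prf)
    return skeleton

def make_transformation(skeleton):
    """Substep 0.3: build the specification (here a function) used by step 1."""
    def transform(cm_instance):
        # Step 1: populate the skeleton with the input CM instance.
        return {s: [{slot["attr"]: obj[slot["attr"]] for slot in slots if slot["visible"]}
                    for obj in cm_instance.get(s, [])]
                for s, slots in skeleton.items()}
    return transform

def to_html(am_instance):
    """Step 2: generate presentation code for one target format (HTML)."""
    parts = []
    for slice_name, objects in am_instance.items():
        for obj in objects:
            fields = "".join(f"<p>{k}: {v}</p>" for k, v in obj.items())
            parts.append(f'<div class="{slice_name}">{fields}</div>')
    return "<html>" + "".join(parts) + "</html>"

def run(prf):
    transform = make_transformation(adapt(unfold(AM), prf))
    return to_html(transform(CM_INSTANCE))

print(run({"ImageCapable": "No"}))
```

Only step 1 touches the input data, so the preparation substeps can be reused for every CM instance, and supporting another browser only requires another step 2 generator.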
Data Transformations
[Figure: data transformation pipeline. Vocabularies (rdfs): CC/PP user/platform, UAProf, user profile, conceptual model, application model, system media. Models: conceptual model (rdfs), application model (rdfs), user/platform profile (rdf), conceptual model instance (rdf). Step 0.1 yields the unfolded application model (rdf); step 0.2 applies adaptation (xsl) to yield the unfolded, adapted application model (rdf); step 0.3 applies rdf2xsl (xsl) to yield cmi2ami (xsl). Step 1 applies cmi2ami (xsl) to yield the application model instance (rdf). Step 2 applies ami2html, ami2wml, and ami2smil (xsl). Output labels: HTML, WML, RT, SMIL. Legend: reference, instantiation, XSLT transf.; application independent, application dependent, input dependent.]
Hera +/-
+ :
  – Tasks separation (content, navigation, and presentation)
  – Model-based specifications (enables presentation content adaptation)
  – Uses standard technology: RDF, RDF/XML, XSLT
- (Future Work):
  – Specifications are semi-formal (difficult to check integrity constraints)
  – Does not (yet) support user interaction
Summary
• What is a Web Information System (WIS)
• Features of WIS: data-intensive, pervasive, etc.
• Design methodologies for WIS:
  – Strudel (from industry)
  – Hera (from university)
• Model-based approach for WIS design
• WIS design tasks separation:
  – Data Selection
  – Navigation
  – Presentation