25
This article was downloaded by: [The University of Manchester Library] On: 22 December 2014, At: 09:39 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK New Review of Hypermedia and Multimedia Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tham20 Embedding information retrieval in adaptive hypermedia: IR meets AHA! Lora Aroyo , Paul De Bra , Geert-Jan Houben & Richard Vdovjak a Technische Universiteit Eindhoven – Computer Science , PO Box 513, 5600, MB, Eindhoven, the Netherlands E-mail: Published online: 05 Jan 2007. To cite this article: Lora Aroyo , Paul De Bra , Geert-Jan Houben & Richard Vdovjak (2004) Embedding information retrieval in adaptive hypermedia: IR meets AHA!, New Review of Hypermedia and Multimedia, 10:1, 53-76, DOI: 10.1080/13614560410001728146 To link to this article: http://dx.doi.org/10.1080/13614560410001728146 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms- and-conditions

Embedding information retrieval in adaptive hypermedia: IR meets AHA!

  • Upload
    richard

  • View
    214

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

This article was downloaded by: [The University of Manchester Library]On: 22 December 2014, At: 09:39Publisher: Taylor & FrancisInforma Ltd Registered in England and Wales Registered Number: 1072954 Registeredoffice: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

New Review of Hypermedia andMultimediaPublication details, including instructions for authors andsubscription information:http://www.tandfonline.com/loi/tham20

Embedding information retrieval inadaptive hypermedia: IR meets AHA!Lora Aroyo , Paul De Bra , Geert-Jan Houben & Richard Vdovjaka Technische Universiteit Eindhoven – Computer Science , PO Box513, 5600, MB, Eindhoven, the Netherlands E-mail:Published online: 05 Jan 2007.

To cite this article: Lora Aroyo , Paul De Bra , Geert-Jan Houben & Richard Vdovjak (2004)Embedding information retrieval in adaptive hypermedia: IR meets AHA!, New Review ofHypermedia and Multimedia, 10:1, 53-76, DOI: 10.1080/13614560410001728146

To link to this article: http://dx.doi.org/10.1080/13614560410001728146

PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the information (the“Content”) contained in the publications on our platform. However, Taylor & Francis,our agents, and our licensors make no representations or warranties whatsoever as tothe accuracy, completeness, or suitability for any purpose of the Content. Any opinionsand views expressed in this publication are the opinions and views of the authors,and are not the views of or endorsed by Taylor & Francis. The accuracy of the Contentshould not be relied upon and should be independently verified with primary sourcesof information. Taylor and Francis shall not be liable for any losses, actions, claims,proceedings, demands, costs, expenses, damages, and other liabilities whatsoeveror howsoever caused arising directly or indirectly in connection with, in relation to orarising out of the use of the Content.

This article may be used for research, teaching, and private study purposes. Anysubstantial or systematic reproduction, redistribution, reselling, loan, sub-licensing,systematic supply, or distribution in any form to anyone is expressly forbidden. Terms &Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions

Page 2: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

Embedding information retrieval in

adaptive hypermedia: IR meets AHA!

LORA AROYO+, PAUL DE BRA, GEERT-JANHOUBEN and RICHARD VDOVJAK

Technische Universiteit Eindhoven �/ Computer Science, PO Box 513, 5600 MB Eindhoven,

the Netherlands; Email: {laroyo,debra,houben,richardv}@win.tue.nl

This paper concentrates on the retrieval aspect in adaptive hypermedia (AH).

Traditionally, AH research concentrates on applications that are ‘closed’, in the sense

that they assume fixed content elements. Certain applications ask for an extension of the

contents considered, with data obtained through information retrieval (IR). This paper

addresses this issue of ‘opening up’ AH applications, and gives insight into research that

applies techniques from IR and from the Semantic Web (SW) for the embedding of IR in

AH. We look at this issue in the context of an abstract reference model (AHAM) and a

concrete implementation framework (AHA!). The goal of this research is to define a

framework for AH with extended IR functionality. We address the relevant issues for

this framework, characterized by the application of concepts from the SW paradigm

leading to an enriched notion of concept relevancy.

Keywords: Adaptive hypermedia; Information retrieval; Ontology; Semantic Web;

Metadata

1. Introduction

A typical assumption made in major parts of the research on hypermedia is that

the content of the application is a fixed set of fragments. Moreover, it is assumed

that the author of the application knows these fragments. As a consequence, the

author can make decisions in the application design based on the knowledge of

the data fragments and their properties. It may, however, be the case that a

fragment itself is not considered but only the design reasons in terms of the

fragment’s properties, and it leaves the actual rendering of the fragment to a

runtime component. A model such as Dexter (Halasz and Schwartz 1994)

effectively supports this separation of concerns. In the research on adaptive

hypermedia (AH) this assumption is also used. Providing adaptation in a

hypermedia application means that the designer decides on using different

+Corresponding author.

New Review of Hypermedia and Multimedia, Vol. 10, No. 1, June 2004, 53 �/76

New Review of Hypermedia and Multimedia

ISSN 1361-4568 (print)/ISSN 1740-7842 (online) # 2004 Taylor & Francis Ltd

http://www.tandf.co.uk/journals

DOI: 10.1080/13614560410001728146

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 3: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

methods and techniques to make the desired adaptive application out of the

available fragments. In that context, it is even more important for the designer

to know explicitly the data fragments and their properties: specifying the

adaptation is then much easier. For adaptive hypermedia, the AHAM model

(de Bra et al. 1999, Wu 2002), like Dexter, leaves the actual rendering of the

fragments to a runtime component, therefore making it not an explicit part of the

design process. In other words, in the classical approach of AH design an

abstraction is made of the content, by assuming fixed, known fragments.

Certain areas of (adaptive) hypermedia research identify the need to consider

content that is outside the application, outside meaning that the data fragments

are not under direct control of the application. Often, this is interpreted in a way

that the application can have access to those data fragments to include them in

the presentation, but that there is less knowledge available to (the designer of) the

application about which fragments to consider when designing the application

(and its adaptation). Applications in the area of web information systems often

take an approach like this, not knowing the actual fragments that exist at the level

of instances, and only having information at the (schema) level of classes. Several

research projects, for example Hera (Frasincar and Houben 2002, Frasincar et al.

2003), focus on the control of adaptation in the context of data that are available

only at the schema level and that are instantiated at runtime. While this approach

allows for a limited notion of ‘outside’ data, there is a need to go further. A

natural next step is to include the concept of information retrieval (IR). Allowing

the application to use data fragments that are obtained through IR mechanisms

has an impact on the design in the sense that the designer of the application has a

limited insight into the (retrievable and retrieved) fragments and their properties.

With this step it becomes interesting to consider semantic issues in the retrieval of

the content. Being able to consider in the retrieval request elements that are

considered from different (semantic) viewpoints can bring a significant improve-

ment over the keyword-based retrieval: hence it is possible with the advent of

Semantic Web (SW) technology to improve on the way in which AH is extended

with IR (similar to the SW-based approach of Vdovjak et al. 2003). The major

challenge with this extension is to understand the consequences for the design of

the AH applications of including IR in AH applications: for the design and

specification of AH applications it is interesting to consider the influence of the

notion of IR and the role of SW techniques in it.

This challenge leads to a number of relevant research questions.

. First, it is interesting to see how we can capture this IR extension in terms of

existing notions of AH, specifically in a reference model. We want to describe

how IR is embedded in AH applications and try to identify how (Dexter and)

AHAM can help in providing this insight. It is also interesting to see how this

issue can be dealt with in concrete application frameworks such as AHA!

54 L. Aroyo et al.

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 4: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

(de Bra and Calvi 1998, de Bra et al . 2002), for example how they deal with

data that are the result of an IR request.

. Second, if SW technology is used in the retrieval specification, then it is an

interesting research question to see how the concepts from the SW paradigm

can lead to an enriched notion of concept relevancy (for the sake of retrieval).

. Third, AH is typically based on a user model. Given the limited knowledge

available on data obtained through IR, the question is how this influences the

user modelling in AH. We address this issue by considering our ontology-

based solution.

The remainder of the paper starts in Section 2 with defining the main concepts from

the field of AH, and specifically the main concepts from AHAM and AHA!. In

Section 3 we address the characteristics of the IR extension to AH applications.

Section 4 explains the nature of our SW-based approach to integrate data from

multiple resources. Section 5 distinguishes different scenarios for the IR extensions.

In Section 6 we consider the influence of IR on the presentation aspect of our

applications, while in Section 7 we consider the influence on the user model.

Finally, Section 8 positions some related work, before we conclude in Section 9.

2. Adaptive hypermedia

AH systems (Brusilovsky 2001) maintain a user model to store users’ ‘features’,

and use these to provide adaptive content and adaptive navigation support. The

aim is to improve the usability of hypermedia by making them more personalized,

by adapting to the user’s goals. Traditionally the user modelling is based on

observing the user’s behaviour, which is mainly browsing (following a sequence of

hypermedia links). In other words, AH works with a user model in order to make

the hypermedia adapt to that model.

2.1 AHAM

Reference models or frameworks support the specification of AH systems; for

example, by distinguishing different components of the system each modelled by

a separate model. The user model (UM) is usually an overlay model of the

domain model (DM) and there is a separate module (engine) applying the

adaptation rules, specified in the application model (AM). A good example of

such a reference model is AHAM (de Bra et al. 1999, Wu 2002), in which the

UM, DM and AM cooperate to perform the adaptation.

The DM represents a semantic structure of concepts and relationships between

the concepts: it identifies the concepts considered in the application and their

descriptive attributes. The UM considers the same concepts and associates user-

attributes to them, represented as attribute/value pairs: these attributes express, for

example, the knowledge-about or interest-in the concept for the specific

Embedding IR in AH 55

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 5: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

user. Thus, the concepts represent various topics in the subject domain and the

knowledge/interest attribute values indicate the user’s level of knowledge

or interest in these concepts. The AM specifies relationships between attribute

values of concepts, thus indicating the knowledge and interest values in relation-

ship to other neighbouring concepts. An important part of the AM is the set of

rules to link theDMandUM to generate the final presentation of content. A set of

requirements is given by these rules to express constraints for presenting content

fragments (and the related concepts) on the basis of the current user model. The

system reasons on the basis of the values of the attributes to select and present

content and the content format to the user. This results in the system’s selection of

a set of hyperlinks to be presented to the user in a specific order and in a specific

colour and presentation format (e.g. text, video or audio). Another set of rules

defines the knowledge propagation mechanism (based on the concept attributes

and the links between the concepts) for the current user model. Those rules can

indicate to the system that if the user reads content fragment A, then the following

links should be to content fragment B and C and the knowledge value of conceptA and concept D (related to A) should, for instance, be increased by 50%.

2.2 AHA!

Several software platforms exist that support AH applications. Here we mention

Interbook (Brusilovsky et al. 1998) and AHA! (de Bra et al. 2002). Like Interbook,

AHA! offers an adaptive solution to develop online courses, where the main

browsing aspect is knowledge. Currently AHA! is also moving towards exploring

other aspects of browsing, such as interest-, context-, goal- or learning style-driven.

The current version of AHA! is implemented in Java as a web-based server-side

(Java Servlets) adaptive application. It offers adaptation to the web pages (local or

remote) requested by the user and does so on the basis of the user model, where an

update of attribute values appears each time the user accesses a page. The user

model, as well as the specific domain model structure of concepts and attributes, is

designed preliminary as an overlay of the existing generic domain model. The

adaptation rules are generically constructed and designed by the author. The most

recent version of AHA! contains a user-friendly authoring tool for editing all three

models (DM, UM, AM). Note that in AHAM a separation between DM and

AM is advocated, although in AHA! the two models are working closely together.

The main interaction between the user and an adaptive application

empowered by AHA! results in the presentation and access of ‘desired’ (by the

user) links on a web page. The notion of desired is deduced from the values of

the user model attributes. A link (and the content and concepts behind it) could be

interesting , not interesting or already visited. This is usually indicated

with different colours (called good , bad or neutral). An example of this interaction

is given in the implementation of the course ‘2L690: Hypermedia Structures

and Systems’ (http://wwwis.win.tue.nl/2L690/) given to fourth-year students in

56 L. Aroyo et al.

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 6: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

Eindhoven. The course uses a local collection of learning material stored in

HTML pages and data in XML files for the concept structure and the user model.

Currently, the research on AHA! is concentrating on opening up AHA! for

external sources, making authoring of DM and AM more user-friendly, and

providing a more flexible inclusion of fragments or objects. This paper is part of

the attempt to include external sources that are not known beforehand.

3. Retrieving content from outside

3.1 Content inside

In traditional adaptive hypermedia applications, for example existing applica-

tions running on AHA!, the content is typically known to the author of the

application and is under his or her control. In fact, the content is stored, at least

conceptually, inside (with) the application, and as AHAM describes the storage

part of the application is closely related to the part that realizes the adaptation.

Usually the author’s task is to structure the content that is to be presented and to

fix the storage of the content within the system. Figure 1 shows what the

architecture of the application looks like in a situation where the content is

available within the system: the Content layer captures the data as they are stored

in the application’s repository and it captures the description of the data as they

are made available to the Application layer that provides the (adaptive)

hypermedia structure to the application’s users.

Figure 1. Content layer inside the system.

Embedding IR in AH 57

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 7: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

3.2 Content outside

Placing the content outside of the application is a move that is already visible in the

context of a large class of information systems known as web-based information

systems (WIS) (Isakowitz et al . 1998). They are information systems, often

database-driven, that exploit the web paradigm and use web technologies to

retrieve data from sources reachable through the web and deliver them to their users

(see Figure 2 for Hera’s perspective on WIS (Frasincar and Houben 2002, Frasincar

et al. 2003). Typically, a WIS delivers the data to the users in a web or hypermedia

presentation, requiring that in comparison to a handcrafted (adaptive) hypermedia

application, a WIS needs to generate (automatically) a hypermedia presentation for

the data to be delivered. We mention (general) methodologies or frameworks for

WIS engineering, such as Araneus (Mecca et al. 1998), WebML (Ceri et al. 2002),

UWE (Koch et al. 2001) and Hera (Frasincar and Houben 2002, Frasincar et al.

2003), of which only a few explicitly include the adaptation aspect. Most of these

approaches consider the outside content to be known at the schema (class) level,

such that the design is based on that level, leaving the actual runtime retrieving and

rendering of the data at instance level a separate and less prominent issue.

As indicated above, most of the research on hypermedia or multimedia genera-

tion (Van Ossenbruggen et al. 2001) concentrates on the output of the application.

There has been less attention paid to the input aspect. Some research has addressed

Figure 2. WIS architecture (Hera view).

58 L. Aroyo et al.

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 8: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

the fact that the content that is to be retrieved on demand can be at different, distri-

buted places, which implies a process of integration based on a complicated

alignment of the semantics associated with the different sources (Vdovjak and

Houben 2002). However, this still assumes that besides the semantic alignment the

main part of the process is hard-wired. Because, typically, a schema-level definition

is known, the application can rely on schema properties but not on the properties of

the concrete instances that exist at the present time. As an example, in Frasincar et

al. (2003) an approach based completely on RDF(S) is chosen in which mappings

(XSLT transformations) at instance level are generated automatically from schema

mappings.

As a further extension to dynamic, distributed content, a WIS may choose to

access the content via information retrieval (IR). Specifically, when the data are

gathered using IR techniques, not all the properties of the content are available

for the application. Only some metadata of the content are available for the

application to perform its task. In comparison to the database-generated content

that is often implemented using fixed query dialogues, content accessed through

IR implies communication on the basis of a limited and dynamic set of metadata.

3.3 IR extension to the AH model and application

In the case of retrieving content from outside the application, we may want to

distinguish between different situations. It is possible to just have one ‘outside’

fragment, while it is also possible to have a search (retrieval) request that results

in an entire set of fragments that have to be made accessible (we address these

issues further in the next section). In each of the situations, we are dealing with

outside contents that have to be connected to the given AH application.

In principle, both situations lead to a different implementation of what is called

the resolver function in models like Dexter or AHAM. The purpose of this

function is to obtain the content that corresponds to the desired conceptual

fragment. In the case of outside content this function has to be related to the

outside sources in order to identify the data that have to be obtained.

This identification of outside content requires that the resolver function has

knowledge about the terms or concepts used in the sources. It is no longer

sufficient just to have some internal identifier for the stored data: the resolver will

be more like a retrieval or search operation that mentions some terms or

keywords. These search terms reflect the concept associated with the page (e.g.

‘Niagara ’). However, this might include a mapping from the terms known as

concepts in the application to terms as they are referred to in the outside space.

Often there is an ontology available for that outside retrieval space, and then an

ontology mapping or articulation (Vdovjak and Houben 2002) can give the term

or terms to be used (e.g. ‘Niagara Falls ’). Usually, the author will want to include

additional terms to state the context of that retrieval (e.g. ‘Canada ’, ‘waterfalls ’).

Embedding IR in AH 59

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 9: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

This would allow the author to adapt the retrieval request on the basis of their

perception and knowledge of the search space (or source).

According to the AHAM reference model the author captures the relevant

metadata concerning the content in the DM, and subsequently the UM and AM

are based on this representation. The challenge for the author when the content is

outside the application (i.e. not only inside) and IR is used to obtain the data is to

capture the relevant details of the content in terms of metadata. We assume that

for the purpose of IR we have an ontology available that specifies (the structure

of) the search space (see figure 3).

As this ontology basically is the only means for the application to ‘reason’

about the (outside) contents, we extend AHAM by relating this ontology (O) to

the DM, UM and AM. We add a retrieval model (RM) that captures the relation

between the retrieval space and DM. This RM will consist of a set of mappings

or articulations (M) between concepts from DM (representing the application)

and terms from O (representing the retrieval space). The ontology O describes the

metadata of the content as they are available for the application, and hence it acts

as the ‘contract’ on which the application interacts with the content. In any AH

application the metadata that goes with the content comprise the main ‘tool’ that

the author has available to manipulate the content and is able to apply effectively

in the application. If the content is outside and therefore not directly available to

the author, this tool is even more important as it is the only means to reason

about the content that is to be used in the application.

The RM describes the relationship between O and DM, and is in fact quite

similar to the integration model of Vdovjak and Houben (2002). Each mapping in

RM relates a concept of the conceptual model (DM in AHAM) to a collection of

terms in the ontology that describes the sources. This mapping might include

metadata on quality dimensions or weights that should be taken into account in the

Figure 3. Content available through ontology.

60 L. Aroyo et al.

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 10: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

search or retrieval process. The mapping is called for whenever the AM decides that

the resolver function has to obtain the fragment that goes with the DM concept.

It appears that with this extension to AHAM we are able to specify the necessary

associations between DM and O. Implementing this in the same way as the

integration process was implemented in Frasincar et al. (2003), an elegant and

effective RDF(S)-based solution opens up the way to combine AH approaches with

the SW paradigm. In the next section we provide more detail on the implementation

of this RDF(S)-based extension.

4. Semantic Web-based retrieval specification

When WIS deals with data from multiple sources, organizing them in one way or

another, the retrieval (query) specification becomes an issue, especially when not

only text-like data but also multimedia data are considered. In this case semantic

issues become crucial. Consider, for instance, the following user query: ‘Give me

all pictures of Niagara Falls taken from the Falls Avenue in Niagara Falls, Canada

with a telelens ’. The first part of the query denotes the subject (waterfalls) being

photographed. The second identifies the position of the photographer. Note the

ambiguity of the Niagara Falls collocation, first denoting the waterfalls and then

being a part of the position description as a town name. Moreover, there is the

third part of the query, imposing an additional lens constraint, which basically

says that we are only interested in those images that provide enough detail and a

narrow perspective achievable only by using a lens of that focal length. It is

evident that queries of this kind are not likely to be satisfactorily answered by

keyword-based search engines. By translating this query into a set of keywords

and trying a keyword-based search engine we either obtained an empty set of

results (e.g. Google Image Search) or a countless number of irrelevant pictures

featuring big photo lenses (e.g. AltaVista Image Search retrieved over 1.4 million

images and the top scored ones were indeed mostly showing long lenses and other

photo equipment). Examples like this show a clear need for something more

powerful than keywords. The use of ontologies, taxonomies of classes within a

certain domain linked by properties indicating different relations among those

classes, would enable us to enhance queries and improve both the precision and

the recall. As ontologies become increasingly the essence of web portals, we have

more possibilities to offer integrated views over various domains.

Vdovjak et al. (2003) show that there are three important ingredients to

successfully deploy such ontology-based, organization-oriented systems. First, one

needs a design methodology identifying different design phases and their under-

lying models/ontologies, which serve as a framework for the designer. Second, there

must be a set of tools that are able to execute the design, instantiating the models

with data coming from various web sources. Third, there must also be an entry

point for the end-user, where they are able to explore what the system can do for

them and where they can formulate their queries. In the paper by Vdovjak et al. , a

Embedding IR in AH 61

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 11: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

methodology and tools from the Hera research project are proposed for this

purpose.

4.1 RDF, RDFS, RQL and Hera

The resource description framework (RDF) (Klyne and Caroll 2003) is a general-

purpose data language issued as a W3C standard. An RDF model consists of

resources, named properties, and property values. RDF schema (RDFS), an

extension to RDF that is itself expressed in RDF terms, provides a support for

creating vocabularies at the type (schema) level. RDFS defines a modelling

language by assigning a special semantics to several (system) resources and

properties including rdfs, rdfs:Property, rdfs:subClassOf, rdfs:subPropertyOf. In

Hera, RDF(S) is the main format used for the different models in the design

phases. One of the reasons for choosing RDF(S) was that it is flexible (supporting

schema refinement and description enrichment) and extensible (allowing the

definition of new resources/properties). Moreover, it comes with the promise of

web application interoperability. Hera model instances are represented in plain

RDF validated against their associated models (schemas) represented in RDFS.

As RDF(S) is used in Hera as the vehicle for capturing semantics of different

domains within the information system being designed, a queries language was

also needed that would enable the information of interest to be retrieved from

these knowledge bases. The most advanced RDF(S) query language to date is

RQL (Karvounarakis et al . 2001). It covers queries over both RDF schemas and

RDF instances. RQL queries that we will consider consist, similarly to SQL

queries, of SELECT-FROM-WHERE clauses. The SELECT clause specifies

resources (variables) that are of interest. The FROM part is the core of the query

and specifies one or more path expressions (i.e. subgraphs of the entire schema

graph) where the variables are being bound. Finally, the WHERE clause contains

filtering conditions that are being applied on the variables bound in the FROM

statement. Figure 4 shows the ‘Niagara’ query in RQL.

SELECT PHOTO FROM {PHOTO:GeoPhoto}depictsTheme{THEME:Universe}, {PHOTO}takenFromPlace{PLACE:AddressablePlace}.country{COUNTRY}, {PLACE}town{TOWN},{PLACE}street{STREET}, {PHOTO}takenWithSettings.usedLens{LENS:TeleLens} WHERE THEME like "*Niagara Falls" and COUNTRY like "Can*" and TOWN like "Niagara*" and STREET like "Falls Av*"

Figure 4. User query in RQL.

62 L. Aroyo et al.

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 12: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

4.2 Data retrieval

The data retrieval in our context implies that we connect the application’s DM

with several autonomous sources by creating channels through which the data

will populate on request the concepts from DM. This involves identifying the

right concepts occurring in the source ontologies and relating them to their

counterparts in DM.

The DM (or conceptual model as it is called in Hera) provides a uniform semantic

view over multiple data sources. It is composed of concepts and concept properties

that together define the domain ontology. There are two types of concept

properties: concept attributes that associate media items to the concepts and

concept relationships that define associations between concepts. The running

example used here in relation to the ‘Niagara’ query is an organized multimedia

information system (MIS) implementing a photo library portal that allows the user

to create on-the-fly photo exhibitions (browsable presentations of images of

interest), study the photo technique behind these images and then rent or buy the

necessary photo equipment to achieve similar results.

The data are assembled on demand, based on the query, from the photos coming

from different (online) photostock agencies, annotated with relevant descriptions.

The photo equipment data are gathered from (online) catalogues and matched with

(online) photo rental offers. All these data are accessible from a single entry point,

semantically represented by the DM. Figure 5 presents an excerpt of this DM that is

composed of several subontologies covering the semantics of different subdomains:

. The photo subontology consists of terms coming from the photo domain. It

describes things like different kinds of Light, various photo Techniques,

different camera Settings, etc. The cornerstone of this ontology is the class

Photo, which is linked by its properties with the aforementioned classes.

Figure 5. Domain model (excerpt).

Embedding IR in AH 63

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 13: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

There is also a property called depictsTheme that connects the Photo class

from the photo ontology with the Entity class from the general ontology.

. The equipment subontology focuses on the photo hardware, describing and

classifying different kinds of lenses, cameras and other related accessories.

. The general subontology consists of all terms one can possibly take a picture

of, the most general term being the Entity class.

. The ternary subontology serves as a means to describe a story captured on the

photograph, where there are two or more actors that perform (either send or

receive) an action. For instance, a photo depicting a man biting a dog is

certainly a different story than that depicting a dog biting a man.

The problem of relating concepts from the source ontologies (as in Figure 6) to

those from the DM is addressed by the retrieval model (RM), or integration

model (IM) in the terms of Vdovjak et al. (2003). This problem can also be seen

as the problem of merging or aligning ontologies. The approaches to automate

the solution to this problem are usually based on lexical matches, relying mostly

on dictionaries to determine synonyms and hyponyms; this is, however, often not

enough to yield good results. Even when the structure of ontologies is taken into

account the results are often not satisfactory, especially in the case of

uncoordinated development of ontologies across the web (Noy and Musen

2001). For this reason and because every mistake in the integration phase will

propagate and become magnified in all the subsequent phases, we currently rely

on the designer or a domain expert to articulate DM concepts in the semantic

language of sources. What we offer the designer is an integration ontology by the

instantiation of which he or she specifies the links between the DM and the

sources.

The integration model ontology (IMO) depicted in figure 7 is a meta-ontology

(in the sense that its instances are dealing with concepts from other ontologies

such as the source ontologies and the DM) describing integration primitives that

are used both for ranking the sources within a cluster and for specifying links

between them and the DM. The IMO is expressed in RDFS, allowing the

Figure 6. Integrated sources: Photo Equipment Catalogue, Photo Equipment Rental.

64 L. Aroyo et al.

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 14: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

designer to tailor it for a particular application. The main concepts in the IMO

are Decoration and Articulation.

Decorations serve as a means to label ‘appropriateness’ of different sources

(and their concepts) grouped within one semantically close cluster. By having a

literal property value they offer a simple way of ranking otherwise equivalent

sources from several different points of view. There are some general decoration

classes that are predefined in the framework (e.g. ResponseTime). However,

which ordering criteria are of interest depends mostly on the application. That is

why the concept of Decoration is meant to be extended (specialized) by the

designer. In this way the designer can capture in the RM (IM) his or her (mostly

background) knowledge regarding the sources. For instance, in our photo portal

example sources in the Cluster2 are graded based on the reliability of the

equipment they offer for rent. Hence the Reliability decoration is

introduced, as shown in figure 7.

Figure 7. Integration model ontology and its specialization.

Embedding IR in AH 65

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 15: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

Articulations (see figure 8) describe actual links between the DM and the

source ontologies and also clarify the notion of the concept’s uniqueness, which is

necessary to perform joins from several sources. Articulations are expressed using

path expressions. A path expression is a chain of concepts (represented by the

class Node) connected by their properties (represented by the class Edge). These

path expressions represent paths in the RDF graph. An articulation contains two

path expressions: the target path expression To pointing into the CM and the

path expression called From pointing to a source (note the srcAddress

property, the value of which is the source URL). A transformation is possible via

the Transformer instruction (e.g. a piece of Java code, an XSLT transforma-

tion, an RQL query or a combination of these).

While the integration, that is the configuration of the RM, is carried out only

once, prior to the asking of a query, the actual data retrieval phase is performed for

every query. In this phase, the query is split into several subqueries that are then

routed to the appropriate sources. Subsequently, the results are gathered and

transformed into a DM instance. In this process the mediator is responsible for

finding the answer to the query by consulting the available sources based on the

integration model instance. The mediator locates for every variable in the SELECT

clause of this query an articulation(s) that contains this variable. From this

articulation the mediator determines the name of the concept occurring in the

Figure 8. Articulations, integration model instance.

66 L. Aroyo et al.

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 16: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

source and also the way to obtain that concept; that is, the necessary transformer(s)

for the concept values, the address of the source, and the path expression to the

concept of interest within the source schema. This path expression can be seen as a

query executed on a particular source. Hence, consulting articulations in the

IM instance in fact mean query unfolding, as it is known in the Global-As-View

(GAV) approach. The acknowledged disadvantage of the GAV approach is that,

in principle, it requires changing the definition of the global schema (in our case

the DM) each time a new source is added. This is, however, clearly not the case in

our framework, as the only thing that changes when a new source is added

or removed is the IM instance (new articulations are added or removed). From

this point of view, we keep the DM independent from the sources, similarly to

the Local-As-View (LAV) approach. Details concerning these approaches

are beyond the scope of this paper; for details see Ullman (1997). If there

are more articulations found for a given variable, that means there are several

competing sources offering values for this variable. In this case the decorations

attached to each articulation are used to decide the order in which the sources will

be consulted.

For the actual query in our example the EROS interface supporting this part of

the Hera retrieval process (Frasincar and Houben 2002, Frasincar et al. 2003)

produces a result as depicted in figure 9.

Figure 9. The resulting presentation.

Embedding IR in AH 67

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 17: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

At this point we should remark on the viability of this vision. We do emphasize

here that the amount of work needed for the designer to configure this retrieval

process by making the proper integration specification is not to be under-

estimated. However, we do observe that in domains where the data are organized

in an appropriately structured way, this effort is affordable and worthwhile. It

means that the vision can be implemented in rather restricted contexts where the

designer has the possibility to use the annotations provided by others, such as

domain experts, and as a main contribution controls the connection between the

different sources. While it does not solve the completely unstructured situation, in

a domain with a collection of data sources that have been tagged with metadata,

this approach helps to get the most out of this tagging by combining these

sources into an integrated application.

5. IR extension alternatives

As indicated in Section 3 we can distinguish between two concrete situations: one

in which the IR extension solely implies the inclusion of one outside fragment,

and another in which an entire navigational structure over outside content

retrieved through search is added to the application.

5.1 Outside fragment

We start by looking at the simplest situation of embedding outside contents

inside the navigation structure that has been designed for the application. In

standard AH applications the content is fixed in and known to the application,

and the adaptive part of the application communicates with the storage part

through some internal layer that delivers data fragments when given fragment

identifiers. Now let us assume for an AH application that within the given

navigation structure we decide to include content (data fragments) that is stored

elsewhere (outside) and that is retrieved on-the-fly. A typical result of embedding

retrieval in the standard navigation is to have a link to a page that:

. includes a fragment that is retrieved from outside the application and not from

the internal storage layer, and

. is like any other page in the application in the sense that it contains links for

further navigation (typically this navigation is contained in fragments

surrounding the embedded outside fragment).

In this way the page is included in the navigation structure as designed by the

author (by supplying the outgoing links to continue browsing in the application),

maintaining the main ideas reflected in that design. The advantage is that for the

user there might not even be a visible difference, except for the fact that the

included fragment might contain links that are not under control of the AH

68 L. Aroyo et al.

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 18: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

application. For the author the main difference is the need to specify explicitly

how to obtain that fragment.

Example A concrete example is a (descriptive) page on Niagara Falls in our

application, where the data on Niagara Falls are included dynamically via the

web. For this purpose the fragment can be addressed through a web address

specifying the site of the Canadian Tourist Office. In this case (choosing the

solution of the outside fragment) we might want to make sure that the

presentation of the information from the Tourist Office is such that the reader

recognizes that the information is made available through the application, but is

not part of the application, and that the navigation from that information is

recognized as outside navigation.

5.2 Retrieval through search

If we take the search approach, we do not state one address to obtain a

dynamically generated fragment. Then, we are confronted with the problem that

the retrieval request results in a set of fragments. For this set of fragments an

access structure needs to be defined, and that access structure has to be

embedded in the existing application. The part of the embedding of the access

structure is the same as discussed in the previous section on outside fragments.

What is essentially different is the composition of this fragment offering the

access structure with links to the retrieved fragments: design questions are, for

example, how the different links are ordered and annotated.

Example Taking the same example as before, offering information on Niagara

Falls in our example application, the situation is different if the information on

Niagara Falls is retrieved through a request to a search engine. The system will

obtain a set of links to interesting documents about Niagara Falls, and the

application has to find a way to present that list. Not only the first presentation

needs to be specified, but also the effect of navigation within that additional part

of the navigation space.

6. Presenting retrieval results

One important part of using retrieval is the specification of the retrieval request:

the description of what needs to be retrieved. However, after the retrieval request

has been issued, data will be retrieved and then the application should know how to

present the retrieval results, that is how to embed the retrieval results in the

application. It is obvious that this part of the retrieval process, the access structure

to the retrieval results, needs to be included in the author specification as well.

One of the first things the author has to decide is whether the retrieved data

can be captured in one fragment or are better made accessible through a

Embedding IR in AH 69

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 19: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

fragment that only gives an access structure (e.g. a list of links) to other

fragments/pages. There are other variants possible, but these two represent the

situations of single or multiple retrieval results.

Furthermore, if links are used in the access structure, it is has to be made clear

how they are presented and annotated, and how this addition to the navigation

space relates to the original navigation structure of the application. In this respect

we refer to web engineering methodologies (like Hera) that are designed to specify

such dynamically generated access structures and dynamically composed

fragments.

Let us consider the prime decision of which (part) of the retrieval results is

presented. If the retrieved data are a (possibly large) set, then the author has to

define the composition of the result fragment. This includes the definition of which

elements are selected (e.g. top-3), and how they are ordered (e.g. based on relevance,

or ‘first internal links, then external ones’). This design aspect asks that choices can

be made by the author either at a generic level (for all cases) or at a specific level per

fragment or anchor. It is clear that what is needed depends on the nature of the

application. In the case of a hand-crafted application, such as a course, the author

can, without exactly knowing what is going to be retrieved, specify how the retrieval

result is offered. In the case of a WIS with dynamically generated content, such as a

shopping catalogue, it is much harder for the author to specify the result

presentation, as in this case the environment surrounding the embedding is only

known at a generic (schema) level. For the latter case methods such as Hera have

been designed: for an effective combination of these specifications at schema and

instance level RDF(S)-based approaches prove useful (Vdovjak et al . 2003).

7. Retrieval and user model

In the case of adaptation, by definition the user model (UM in AHAM) plays an

essential role. For the specification of the retrieval, the author not only has to

consider the actual retrieval request and the result presentation but also has to

determine the update of UM. In principle, the retrieval request contains a

number of search terms that can be used to determine the UM update.

Let us consider the different origin of a couple of search terms for their effect

on UM:

. Some of the search terms are taken from the source ontology (O), as they are

the terms that correspond to concepts mentioned in the DM of the

application. If, for example, on the basis of the source ontology the search

term ‘Niagara ’ is replaced by the term ‘Niagara Falls ’, which happens to be a

DM concept, then that concept should be updated in UM. Note that it is

possible that the mapping between ontology terms and DM concepts does not

have to be one-to-one, and therefore it is possible that the use of an ontology

(search) term requires an update to other DM concepts as well.

70 L. Aroyo et al.

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 20: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

. Often, terms (from the ontology O) are added to the request that help to

specify and refine the context of the search; for example, ‘waterfalls ’ or

‘Canada ’ could be added to the above-mentioned request for information

about Niagara Falls. These terms often do not require UM updates, although

the author might take into account that by adding these terms general

information might be retrieved that could, overall, add to the general

knowledge.

. Terms can also be added from the UM, and in that case the corresponding

UM terms could be updated. If, on the basis of UM, ‘Ontario ’ is added to the

term ‘Niagara Falls ’, then the concept ‘Ontario ’ might be updated in UM.

. An interesting second (part of) UM can be formed by the search history,

remembering which search terms have been used previously. If, for example, a

pattern in the search history can be detected (e.g. the user keeps on searching

for a certain term), this can lead to a reaction by the application (e.g. a certain

adaptation or even a modification to the application).

Note that in the above we limit ourselves to the situation where the UM update is

only based on the (extended) retrieval request. We neglect the fact that the

retrieved data might not match those search terms, and without any additional

interpretation software we cannot improve on the accuracy of the UM update.

Example Suppose content is obtained through retrieval for the DM concept

‘Niagara ’. Let us assume that the author, on the basis of the ontology, has defined

that the content is retrieved with the request ‘Niagara AND waterfalls AND

Canada ’. First, it is possible that data are retrieved that do not contain proper

information on ‘Niagara ’ in the sense of the DM concept: this could be the

consequence of a wrong annotation in the outside search space. Second, it is

possible that the content that is retrieved contains information on other DM

concepts besides ‘Niagara ’: the retrieved data could give a survey of waterfalls that

nicely explains other waterfalls in the world that are much smaller than those of

Niagara Falls.

The two situations described in this example ask for a decision by the author.

The author should decide how the retrieved content contributes to the knowledge

about this and other concepts as reflected in the UM. We assume for the moment

that the mappings are fixed and known to the author, but what if they are not?

Because the information might not be accurately annotated, the author could

specify the UM update in such a way that the value associated with the concept

reflects this uncertainty. If UM contains concept attributes such as knowledge and

read, then the author could choose to give a value such as possibly-known, or

certainly-read, to reflect his or her expectation of the accuracy of the retrieval.

Note that the abstract reference model AHAM only specifies attributes and

attribute values. In a concrete AH system like AHA!, it is possible to use numeric

values between 0 and 100, and that opens up the possibility of representing the

Embedding IR in AH 71

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 21: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

expected value of the property. In most current AHA! applications these numeric

values are used to specify the contribution of (the knowledge about) concepts to

(the knowledge about) other concepts; for example, a knowledge of 80 on

adaptation represents the fact that 80% of the knowledge on the concept of

adaptation is obtained. For retrieval, the existing AHA! mechanism can be used to

represent the fact that the author expects that 80% of the knowledge is obtained.

The other anomaly is that content is retrieved that (also) contributes to the

knowledge about other concepts from DM. Besides the fact that the author has to

specify his or her expectation as described above, the author can also include the

extended retrieval request to look for clues on how to update UM. In the example

that we gave for the concept ‘Niagara ’, a retrieval request was constructed with

three terms: ‘Niagara AND waterfalls AND Canada ’. If the terms ‘waterfalls ’ and

‘Canada ’ occur in DM as well (maybe after some renaming), then the author should

consider updating these concepts: probably not with a major increase of knowledge,

but it is foreseeable that adding these terms leads to some knowledge about these

concepts as well. For those terms mentioned in the retrieval request the author can

update UM accordingly; for concepts that are covered coincidentally, the author

cannot foresee a UM update.

We see here that the possibilities that AHA! offers to deal with uncertainty in

the values that represent the different knowledge aspects in the application

support an effective retrieval process. The designer can choose to configure the

process in such a way that the decisions concerning representation, retrieval and

use of the information can be handled in a reasonable way, without the need for

unsatisfactory yes-or-no decisions.

As a final issue in connection with the UM update, we mention that in the

situation of retrieval through search, an additional navigation structure was

added (on the basis of retrieved outside content). The UM can also play a role in

annotating the links in that navigation structure based on the user’s exploration

of that structure: this would require that (dynamically) that navigation structure

is made adaptive itself by adding it to the AH application.

8. Related work

Many research fields are clearly showing excellent progress in topics such as

schema matching and wrapping. It is important that we state that we do not

exclude that research from our approach. In this paper we chose to restrict the

discussion to those aspects that relate information retrieval to the adaptive

hypermedia, starting from the perspective of the latter.

We mention here briefly a number of related research fields. Within the area of

open hypermedia (OH) there is major interest in the relation between the metadata

descriptors of the document’s content and ways to retrieve this document and apply

it for reaching specific educational or other goals. In this way OH builds upon the

existing WWW infrastructure, provides a powerful framework to aid navigation

72 L. Aroyo et al.

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 22: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

and authoring, and solves some of the issues of distributed information manage-

ment (de Roure et al. 1999). The interesting issues in this context relate to flexibility,

which OH architectures allow in providing opportunities for adding various kinds

of links to the documents and creating a user-specific navigational overlay (Carr et

al. 1995; Carr et al. 1996 , Bailey et al. 2002). OH systems make it possible that the

(hyper)links, which traditionally are embedded in the web documents, could be

abstracted from the documents (multimedia data) and stored and managed

separately, and this also includes the searching of links in the same paradigm as

for documents. In this way OH systems could contribute in the direction of

presentation of the retrieval results and the integration of IRwithin the adaptation

process.

In connection with this move towards more open hypermedia we also mention

the Knowledge Sea approach presented earlier in this journal (Brusilovsky and

Rizzo 2002). The authors of Knowledge Sea show how in the context of adaptive

educational hypermedia systems (AEHS) the move from closed corpus to open

corpus material can use automatic linking techniques and map-based navigation.

Brusilovsky and Rizzo (2002) show how maps and landmarks can be used to bridge

the gap between closed and open corpus hyperspace. Their use of self-organized

map technology illustrates how they (more than we do here) rely on automatic

linking techniques. The ‘learning’ nature of that approach appears to target a

slightly different kind of application than we do in our more designer-controlled

setting. It seems that as we progress in ‘opening up’ the AH systems, it may depend

on the situation how constrained and controlled the approach should be. Therefore,

in the end we need a number of approaches that require the involvement of the

designer and/or a learning process as it is suited to the specific application. We feel

that these approaches do not conflict, but might complement each other effectively.

An interesting angle to the Knowledge Sea is the use of map-based navigation. With

a two-dimensional semantic map of the resources, theyoffer an interface that allows

the userof the education resources to navigate over this open corpus. In our work we

chose a more traditional approach, making use of the more usual AH interface.

Related to the above research on AEHS we would also like to mention the work

on the KBS Hyperbook system in connection with open corpus content. In this

issue Henze and Nejdl (2004) provide a logical characterization for AEHS on the

basis of identifying the document space, the user model, observations (on the user’s

interactions) and adaptation components. The authors illustrate their approach by

characterizing a number of well-known systems (including AHA!). This gives an

effective basis to also tackle the ‘open corpus problem’. As they conclude, the

document space plays a decisive role in the way adaptation over the open corpus can

be realized. We agree with this observation on the basis of our own experience, and

therefore we have chosen in this research to start extending the document space

without attempting the integration of adaptive functionality at the same time.

Another step towards more conceptualization and semantics in information

management within hypermedia systems is the effort of conceptual hypermedia

Embedding IR in AH 73

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 23: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

systems (CHS) to provide a well-defined conceptual schema for the hypertext

structure and navigation (e.g. Bruza 1990, Nanard and Nanard 1991, Tudhope and

Taylor 1997). A major research effort is being carried out within the context of the

TourisT (Bullock and Goble 1998) prototype of a Tourism Public Access System.

CHS gives us a powerful mechanism to support dynamic link and document

management, based on metadata descriptors of the real content. It builds upon the

notion of hypermedia and open hypermedia and makes a bridge with the SW.

Another example in this research is the COHSE project (Goble et al. 2001), which

aims to build a conceptual hypermedia system to enable documents to be linked on

the basis of the metadata content descriptor. This is realized by integrating an

ontology reasoning service (domain modelling) and an open hypermedia link

service (link providing).

For reasons of lack of space we only mention three references with respect to

adaptive information retrieval. Combining text and semantic markup directly

influences IR: we mention IR performed on the basis of knowledge representation

languages, such as RDF(S) and DAML�/OIL (OWL), which can help us to achieve

more flexible and precise information retrieval (Shah et al. 2000). Another example

of adaptive IR, or context-dependent IR, is given by systems like AIMS

(Aroyo et al. 2000, Dicheva and Aroyo 2000), where the successful IR depends

on the modelling of user tasks and learning goals in relation to the domain model,

on the conceptual visualization of the IR results in a semantic network of domain

concepts, and on including the user characteristics within the search query. Another

interesting example is given by METIOREW (Bueno et al. 2001), a collaborative

and content-based web-recommending system, where the IR is objective oriented,

considering an evolving user’s information need, and uses two user models, an

initial one corresponding to the initial user objective and search keywords, and

another one of a user with a similar objective. Thus, for its successful recommenda-

tions, it maintains simultaneously two models until the new user’s model becomes

significant.

9. Conclusion

This paper has addressed our vision on the extension of AH to include aspects of

IR. In terms of our research on the AH implementation framework AHA! (and the

associated reference model AHAM) we have identified a number of issues in

connection with ‘opening up’ AH for the retrieval of content from outside the

application. We have seen how the relationships between internal (DM) concepts

are made to outside contents, by means of descriptive metadata that relate the

outside content to terms from a source ontology: we have introduced the ontology

and retrieval model to the perspective of AHAM. We have shown how SW

technology can be used to considerably improve the query and retrieval capabilities,

and we have illustrated the process in terms of our Hera implementation. An

important aspect of retrieving multiple fragments at a time is the design of the

74 L. Aroyo et al.

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 24: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

presentation and access structure for the collected fragments. As the user model is a

central issue in AH, the desire to optimally consider that user model in dealing with

outside content also has significant consequences for the design and specification of

the application.

The current implementation of this vision leads to a more extensive use of the

SW-based approach from Hera in AHA!, realizing a more ‘open’ approach to

retrieval in the current AHA! system, and at the same time to a more advanced

notion of adaptation in the retrieval and integration part and the hypermedia

presentation part of the Hera system. As we indicated earlier in the paper, the

vision in its current state does not provide a solution for the embedding of

retrieval from a completely unstructured repository of content into AH. Two

aspects that play a limiting factor in the current state are the availability of the

domain expertise expressed in proper annotations of the content and the effort

that is needed and is affordable to combine content from different sources. In our

progress to ‘open up’ AH, as in our AHA! system, we can observe in our current

experiments that the implementation of the vision expressed here, albeit not in

the ultimate, completely open context but in a more constrained setting, already

implies a significant step forwards for AH, where the challenge for the

implementation effort is in maintaining the strong points of an AH system like

AHA!, while supporting a more advanced concept of IR.

References

L. Aroyo, D. Dicheva and I. Velev, ‘‘Conceptual visualisation in a task-based information

support system’’, in EdMedia , AACE, 2000, pp. 125�/130.

C. Bailey, W. Hall, D.E. Millard and M.J. Weal, ‘‘Towards open adaptive hypermedia’’, in AH

2002 , LNCS 2347, 2002, pp. 36�/46.

P. Brusilovsky, ‘‘Adaptive hypermedia’’, User Modeling and User-adapted Interaction , 11, pp.

87�/110, 2001.

P. Brusilovsky and R. Rizzo, ‘‘Using maps and landmarks for navigation between closed and

open corpus hyperspace in web-based education’’, The New Review of Hypermedia and

Multimedia , 9, pp. 59�/82, 2002.

P. Brusilovsky, J. Eklund and E. Schwarz, ‘‘Web-based education for all: a tool for developing

adaptive courseware’’, Computer Networks and ISDN Systems, 30, pp. 291�/300, 1998.

P.D. Bruza, ‘‘Hyperindices: a novel aid for searching in hypermedia’’, in ACM Hypertext ,

1990, pp. 109�/122.

D. Bueno, R. Conejo and A.A. David, ‘‘METIOREW: an objective oriented content based and

collaborative recommending system’’, in AH 2001 , LNCS 2266, 2002, pp. 310�/314.

J. Bullock and C. Goble, ‘‘TourisT: the application of a description logic based semantic

hypermedia system for tourism’’, in ACM Hypertext , 1998, pp. 132�/141.

L. Carr, D. de Roure, W. Hall and G. Hill, ‘‘The distributed link service: a tool for publishers,

authors and readers’’, World Wide Web Journal , 1, pp. 647�/656, 1995.

L. Carr, H. Davis, D. de Roure, W. Hall and G. Hill, ‘‘Open information services’’, Computer

Networks and ISDN Systems, 28, pp. 1027�/1036, 1996.

S. Ceri, P. Fraternali and M. Matera, ‘‘Conceptual modeling of data-intensive web

applications’’, IEEE Internet Computing , 6, pp. 20�/30, 2002.

Embedding IR in AH 75

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14

Page 25: Embedding information retrieval in adaptive hypermedia: IR meets AHA!

P. de Bra and L. Calvi, ‘‘AHA! An open adaptive hypermedia architecture’’, The New Review of

Hypermedia and Multimedia , 4, pp. 115�/139, 1998.

P. de Bra, G.J. Houben and H. Wu, ‘‘AHAM: a Dexter-based reference model for adaptive

hypermedia’’, in ACM Hypertext , 1999, pp. 147�/156.

P. de Bra, A. Aerts, D. Smits and N. Stash, ‘‘AHA! Version 2.0, More adaptation flexibility for

authors’’, in AACE ELearn , 2002, pp. 240�/246.

D. de Roure, S. El-Beltagy, N. Gibbins, L. Carr and W. Hall, ‘‘Integrating link resolution

services using query routing’’, in 5th Open Hypermedia Workshop , 1999.

D. Dicheva and L. Aroyo, ‘‘An approach to intelligent information handling in web-based

learning environments’’, in ICAI , 2000, pp. 327�/333.

F. Frasincar and G.J. Houben, ‘‘Hypermedia presentation adaptation on the semantic web’’, in

Adaptive Hypermedia and Adaptive Web-based Systems (AH2002) , LNCS 2347, 2002,

pp. 133�/142.

F. Frasincar, G.J. Houben and R. Vdovjak, ‘‘Engineering semantic web information systems in

Hera’’, Journal of Web Engineering , 2, pp. 3�/26, 2003.

C. Goble, S. Bechhofer, L. Carr, D. de Roure and W. Hall, ‘‘Conceptual open hypermedia�/the

semantic web?’’, in SemWeb’01 , 2001.

F. Halasz and M. Schwartz, ‘‘The Dexter hypertext reference model’’, CACM , 37, pp. 30�/39,

1994.

N. Henze and W. Nejdl, ‘‘A logical characterization of adaptive educational hypermedia’’, The

New Review of Hypermedia and Multimedia , pp. 77�/113, 2004.

T. Isakowitz, M. Bieber and F. Vitali, ‘‘Web information systems’’, CACM , 41, pp. 78�/80, 1998.

G. Karvounarakis, V. Christophides, D. Plexousakis and S. Alexaki, ‘‘Querying RDF

descriptions for community web portals’’, in 17iemes Journees Bases de Donnees

Avancees, 2001, pp. 133�/144.

G. Klyne and J.J. Caroll, ‘‘Resource description framework (RDF): concepts and abstract

syntax’’, W3C proposed recommendation, 15 December 2003.

N. Koch, A. Kraus and R. Hennicker, ‘‘The authoring process of the UML-based web

engineering approach’’, in First Int. Workshop on Web-Oriented Software Technology,

2001.

G. Mecca, P. Atzeni, A. Masci, P. Merialdo and G. Sindoni, ‘‘The Araneus web-based

management system’’, in ACM SIGMOD, 1998, pp. 544�/546.

J. Nanard and M. Nanard, ‘‘Using structured types to incorporate knowledge in hypertext’’, in

ACM Hypertext , 1991, pp. 329�/342.

N.F. Noy and M.A. Musen, ‘‘Anchor-prompt: using non-local context for semantic matching’’,

in Workshop on Ontologies and Information Sharing at IJCAI , 2001.

U. Shah, T. Finin and A. Joshi, ‘‘Information retrieval on the semantic web’’, in ICIKM , 2000,

pp. 461�/468.

D. Tudhope and C. Taylor, ‘‘Navigation via similarity: automatic linking based on semantic

closeness’’, Information Processing & Management , 33, pp. 233�/242, 1997.

J.D. Ullman, ‘‘Information integration using logical views’’, in ICDT’97, LNCS 1186, 1997,

pp. 19�/40.

J. Van Ossenbruggen, J. Geurts, F. Cornelissen, L. Hardman and L. Ruthledge, ‘‘Towards

second and third generation web-based multimedia’’, in World Wide Web Conference,

2001, pp. 479�/488.

R. Vdovjak and G.J. Houben, ‘‘Providing the semantic layer for WIS design’’, in CAiSE 2002 ,

LNCS 2348, 2002, pp. 584�/599.

R. Vdovjak, P. Barna and G.J. Houben, ‘‘Designing a federated multimedia information

system on the semantic web’’, in CAiSE 2003 , LNCS 2681, 2003, pp. 357�/373.

H. Wu, ‘‘A reference architecture for adaptive hypermedia applications’’, PhD thesis,

Technische Universiteit Eindhoven, 2002.

76 L. Aroyo et al.

Dow

nloa

ded

by [

The

Uni

vers

ity o

f M

anch

este

r L

ibra

ry]

at 0

9:39

22

Dec

embe

r 20

14