
PersOnalized Smart Environments to increase Inclusion of people with DOwn’s syNdrome

Deliverable D6.1
Evaluation Protocols

Call: FP7-ICT-2013-10
Objective: ICT-2013.5.3 ICT for smart and personalised inclusion
Contractual delivery date: 30.04.2014 (M6)
Actual delivery date: 15.04.2015
Version: v2
Editors: Andreas Braun (FhG), Anna Zirk (BIS)
Contributors: Silvia Rus (FhG), Katrine Prince Moe (Karde)
Reviewers: Juan Carlos Augusto (MU), Terje Grimstad (Karde)
Dissemination level: Public
Number of pages: 18


Contents

1 Executive Summary
2 Introduction
3 Evaluation in POSEIDON
   3.1 POSEIDON Timeline
4 Technical evaluation of POSEIDON
   4.1 Best practice of system evaluation
   4.2 POSEIDON requirements
   4.3 Evaluating requirements
      4.3.1 Requirement adherence
      4.3.2 Risk estimation
   4.4 Evaluation process
      4.4.1 Organization
      4.4.2 Managing potential requirement revision
   4.5 Result Template
5 User experience evaluation of POSEIDON
   5.1 Guidelines for Pilots
      5.1.1 Ethical approval
      5.1.2 Recruitment procedures
      5.1.3 Eligibility criteria for people with DS
      5.1.4 Eligibility criteria for caregiver
      5.1.5 Exit strategy
      5.1.6 Pre-pilots
   5.2 Methods
      5.2.1 Controlled tasks versus free usage
      5.2.2 Observation
      5.2.3 Video recording
      5.2.4 Interviews
      5.2.5 Logs
      5.2.6 Questionnaires
      5.2.7 Diaries
      5.2.8 Immediate feedback
      5.2.9 Scores
      5.2.10 Limited Evaluation of POSEIDON
6 Conclusion
7 References


1 Executive Summary

The deliverable D6.1 - Evaluation Protocols - is the first deliverable of Work Package 6 - Validation. The deliverable is linked to task T6.1 - Technical Assessment and has a number of connections to deliverables of Work Packages 2 and 5. It is based on deliverables D2.1 - Report on requirements and D2.3 - Report on Design of HW, Interfaces, and Software, which build the foundation for the technical Work Packages. These requirements are driving development in Work Packages 3, 4 and 5. For this document we also consider D5.1 - Development framework, which describes the different components and their specific requirements. Additionally, Section 3.2.1 of Part B of the POSEIDON DoW lists different expected outcomes and the measures to assess their impact.

The document gives an introduction to the system evaluation, restating the timeline of POSEIDON and the requirement-driven development process. We show relevant best practice of system evaluation and how it relates to the methods used for the POSEIDON evaluation. The process of the technical evaluation is outlined, stating both methods and management layout. The evaluation will be based on assessing the adherence to requirements and the risk associated with not meeting them.

Additionally, the process of user evaluation is described. Guidelines, including ethical approval, recruitment procedures, eligibility criteria, exit strategy and pre-pilots, are presented. Furthermore, different methods for assessing data, such as controlled tasks and free usage, are presented. Instruments and approaches for collecting quantitative and qualitative data are mentioned; they include observation, video recording, interviews, computer logs, questionnaires, diaries and immediate feedback. The idea is to present methods and instruments which are useful for gaining comprehensive user insights.


2 Introduction

In the scope of POSEIDON a variety of different technical systems will be developed and integrated into a common platform. It is therefore crucial to monitor and evaluate the performance of the system, including, but not limited to, hardware, interfaces, software, integration routines, user experience and usability aspects, as well as contributions to safety, privacy and ethics. This task is part of Work Package 6 - Validation and is intended to put in place the mechanisms envisaged to produce a highly reliable and useful product. In this scope, compliance with the requirements will be double-checked, a link between requirements and testing will be established, and the validation and pilots, including the design revisions, will be monitored.

3 Evaluation in POSEIDON

3.1 POSEIDON Timeline

The interface strategy strongly depends on the requirements gathered during the requirements analysis. The interviews with the primary users, the online questionnaires for the secondary and tertiary users and the first user workshop have all contributed to this process. From these requirements an interface strategy is extracted, immediately followed by an implementation in the form of an integrated prototype. In Fig. 1, (b) and (d) represent the requirements gathering phase and the first user workshop, followed by (f), when the first interfaces and interactive technology are set up. The created prototype (g) is evaluated in the second user workshop (h) and the outcome of the workshops is subsequently analysed. The feedback is taken into account and the interface is adjusted in (l), when the interfaces are completely defined. This iterative process for the interfaces is finished in step (r), when the improved interfaces are set up.

Fig. 1 Project and work package milestones and events

From this timeline we can identify a set of milestones of the technical systems developed in the scope of POSEIDON. The relevant events determining the points at which the system is tested are the user workshops and the pilots of the integrated prototypes. Therefore, we have the following short list of events that are directly associated with the system evaluation in WP6.


Table 1 Important events related to evaluation in POSEIDON

Event | Description | Time in project | Planned date
1st User Group Workshop | Marks the end of the requirements gathering phase - requirements refinement | M3 | January 2014
2nd User Group Workshop | Indicates successful implementation of first set of requirements | M10 | September 2014
First Pilot | Testing of integrated prototype over a longer time | M20-21 | from July 2015
Second Pilot | Testing of second integrated prototype over a longer time | M30-31 | from May 2016
Final Workshop | Final check of requirements and post-project planning | M36 | October 2016

Of those dates, the first user group workshop has a special role, as it marks the end of the requirements gathering process, and adherence to requirements cannot yet be validated at that point. Instead, the primary purpose is to finalize the list of requirements and elicit new ones according to the feedback gathered from the users.

The next three events are the most important ones in the scope of the process described in this deliverable. During those events the systems are tested by the users and the evaluators can determine how well the requirements set for this specific stage have been fulfilled. Afterwards, the set of requirements to be fulfilled for the next stage can be adapted and refined according to the results of the testing.

The last event is primarily intended to prepare the potential market launch of the POSEIDON system and can also act as a final check of adherence for all requirements, in order to verify that the full system is running with all intended functionality and at the intended level of stability.

The process of evaluation closely follows the requirements gathering process, as specified in WP2. In the next section we present some best practice of system evaluation in general and the necessary aspects of requirements engineering specifically.

4 Technical evaluation of POSEIDON

4.1 Best practice of system evaluation

A large body of literature has been written on testing systems for conformity, adherence, robustness and feature-completeness. The goal is always to assure that a certain level of quality, defined early in the project, is reached [1]. POSEIDON follows an approach that is based on defining and testing a set of requirements that are to be fulfilled at different stages of the project. Thus, we are using an approach that is known in software development as Requirements Engineering.

Requirements engineering describes the process of formulating, documenting and maintaining software requirements, and refers to the subfield of Software Engineering concerned with this process [2]. Typically we can distinguish a set of seven different steps in this process, namely [3]:

1. Requirements inception or requirements elicitation - gathering the initial requirements from stakeholders
2. Requirements identification - identifying new requirements
3. Requirements analysis and negotiation - checking requirements and resolving stakeholder conflicts


4. Requirements specification (Software Requirements Specification) - documenting the requirements in a requirements document
5. System modeling - deriving models of the system, often using a notation such as the Unified Modeling Language
6. Requirements validation - checking that the documented requirements and models are consistent and meet stakeholder needs
7. Requirements management - managing changes to the requirements as the system is developed and put into use

As we are not solely developing software in POSEIDON, this process has to be adapted to also cater to the requirements of the different hardware systems that will be developed. These adaptations are discussed in the following section.

4.2 POSEIDON requirements

The interface strategy is closely related to D2.1 - Report on requirements and to D2.3 - Report on Design of HW, Interfaces, and Software. The general design principles for interfaces have been presented in D2.3. Therefore, parts of the following section regarding the design principles are taken from there, while section 3.2.2 regarding the requirements analysis presents the requirements applicable to the interface strategy.

In order to assure compatibility between the process of requirements engineering outlined in the previous section and the development in POSEIDON, the requirements have to be adapted appropriately. We briefly describe the adaptation for each step.

1. The requirements inception is performed according to the specifications in the DoW, the initial requirements gathering phase and the results of the first user group workshop
2. The requirements identification is performed iteratively after each user workshop and the pilot phases of the integrated system
3. The requirements analysis and negotiation is led by a core group of developers in close collaboration with the representatives of the user organizations
4. The specification of the requirements adds a number of labels to distinguish functional and non-functional requirements, identify the target system iteration and the associated system components
5. The combined software and hardware architecture is initially designed at a very high level and will be detailed according to the requirements of the current prototype iteration
6. The requirements validation is the primary concern of this document and is detailed in the next section
7. POSEIDON is using a shared Wiki system to manage the requirements and their different iterations

4.3 Evaluating requirements

In the scope of WP2 the requirements of the POSEIDON system are defined and refined throughout the project. While some requirements can be easily quantified, non-functional requirements in particular need additional information that helps determine how well they meet the criteria set for them. In this section we show how to determine the adherence to requirements and, based on that, how to calculate a risk level that helps specify how the requirements should be modified in the later stages of development and how severe the expected impact on the development timeline will be.


4.3.1 Requirement adherence

The first important factor in the requirements evaluation is to determine how well the requirements adhere to the specified metrics. Here we can distinguish two different types of requirements - quantitative requirements and non-quantitative requirements. The first category can be easily measured, as the requirement is given as a discrete value that has to be reached. In the following tables, we use a four-level grading system for the different components of the requirement evaluation. This grading will simplify the notation of adherence in the later project stages. The grades used are A, B, C and F, where A indicates the highest grade, i.e. full adherence, and F indicates failure. Table 3 shows grading, adherence level and description for quantitative requirements.

Table 3 Quantitative requirements adherence levels, descriptions and associated score

Grade | Adherence Level | Description
A | Full | Achieving at least 100% of the specified value
B | With Limitations | Achieving between 80% and 100% of the specified value
C | Low | Achieving between 50% and 80% of the specified value
F | Failure | Achieving less than 50% of the specified value

A similar grading can be used for non-quantitative requirements and their adherence. In this case the level is subjective by definition and should therefore be determined by multiple persons, e.g. using a majority vote.

Table 4 Non-quantitative requirements adherence levels, descriptions and associated score

Grade | Adherence Level | Description
A | Full | Full adherence is achieved if the requirement is completely fulfilled in the given evaluation
B | With Limitations | If a requirement is only partially fulfilled, but the deviation is not critical, the level “with limitations” should be given
C | Low | If a requirement is partially fulfilled and the deviation is high, the adherence level is “low”
F | Failure | If a requirement is not fulfilled at all, the level “failure” has to be attributed
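For quantitative requirements the grading of Table 3 can be made reproducible by expressing the thresholds directly in code. The following Python snippet is a minimal illustrative sketch; the function name and the ratio-based input are assumptions and not part of the POSEIDON specification.

def adherence_grade(achieved: float, target: float) -> str:
    """Map a measured value against its target to an adherence grade (Table 3)."""
    if target <= 0:
        raise ValueError("target must be positive")
    ratio = achieved / target
    if ratio >= 1.0:
        return "A"  # full adherence
    if ratio >= 0.8:
        return "B"  # with limitations
    if ratio >= 0.5:
        return "C"  # low adherence
    return "F"      # failure

# Example: a requirement asking for 10 supported devices, of which 7 are achieved, is graded C.
print(adherence_grade(7, 10))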

4.3.2 Risk estimation

An important step in estimating the potential impact of not fulfilling a requirement is performing a risk assessment based on the level of adherence and some other factors. Risk management is a whole field of study dedicated to identifying, assessing and prioritizing risks [4]. Apart from the risks specified in the DoW we are only using a minimal routine to estimate the risk or impact of not meeting a certain requirement, based on the following factors:

Risk Score = Adherence Level Score * (Criticality Level Score + Estimated Time-to-Fix Score)

The adherence level score is derived from Table 4, shown in the previous section. The criticality level can be defined according to the following table.

Table 5 Criticality of non-adherence to requirements

Grade | Criticality Level | Description
A | Not critical | Not fulfilling this specific requirement will not interfere with the overall functionality of the system - mostly suited for optional requirements
B | Potentially critical | If not fulfilled, this requirement has some impact on the system or user experience, but it is considered low
C | Very critical | If the requirement is not fulfilled, the system is expected to behave unexpectedly or not adhere to the minimum functionality required
F | Fatal | This breaks the system experience and prevents main functions from working

The last component is the estimated time required to fix the requirement so that it adheres to the specified level. This is important to get an estimate of the resources that will be required to fix the problem and how they can be mapped into the remaining project timeline and the development queue until the next iteration is due. The time required to fix is quantized similarly to the previous factors, as shown in the following table. It should be noted that the estimated time should include testing of the fixed requirement:

Table 6 Score associated with the estimated time to fix a requirement that is not met

Grade | Estimated time to fix
A | < 1 hour
B | < 1 day
C | < 1 week
F | > 1 week

In order to calculate the risk score we have to map the grades that were given to numeric values. We are using the simple mapping of the following table:

Table 7 Association between grades and numeric values

Grade | Numeric value
A | 0
B | 1
C | 3
F | 5

All components required to calculate the risk score are now in place. For example, if there is a requirement with low adherence level that is not critical and will take less than one day to fix and test, the resulting risk score is:

Risk Score = 3 * (0 + 1) = 3

A second example, for a requirement with low adherence level that is very critical and takes a long time to fix, is:

Risk Score = 3 * (3 + 5) = 24

The risk score is considerably higher than before, as the impact of not meeting this requirement on the project development roadmap can potentially be very high, primarily due to the long time required for a fix. The risk score for a fully met requirement is always 0, and there is obviously no need to note criticality and time to fix in that case.


Finally, we can group the risks into distinct risk level groups based on their risk scores, as shown in the following table:

Table 8 Association between risk levels and risk score

Risk level | Risk score
High | > 15
Medium | 6 – 15
Low | 1 – 5
None | 0

This risk level should drive the discussion about how to manage changes to requirements and about the impact of not meeting a specific requirement on the development schedule specified in the development roadmap. The risk level should be noted for any requirement that does not achieve full adherence, or alternatively for all requirements, in which case the risk level for fully met requirements is automatically set to “None”.
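As a cross-check of the procedure above, the grade-to-number mapping of Table 7, the risk score formula and the grouping of Table 8 can be combined in a few lines of code. The following Python sketch is for illustration only; the function and variable names are assumptions and not part of the deliverable.

# Numeric values associated with grades (Table 7)
GRADE_SCORE = {"A": 0, "B": 1, "C": 3, "F": 5}

def risk_score(adherence: str, criticality: str, time_to_fix: str) -> int:
    """Risk Score = Adherence Level Score * (Criticality Score + Time-to-Fix Score)."""
    return GRADE_SCORE[adherence] * (GRADE_SCORE[criticality] + GRADE_SCORE[time_to_fix])

def risk_level(score: int) -> str:
    """Group a risk score into the risk levels of Table 8."""
    if score == 0:
        return "None"
    if score < 6:
        return "Low"
    if score <= 15:
        return "Medium"
    return "High"

# First example from the text: low adherence (C), not critical (A), less than one day to fix (B)
print(risk_score("C", "A", "B"), risk_level(risk_score("C", "A", "B")))  # 3 Low
# Second example: low adherence (C), very critical (C), more than one week to fix (F)
print(risk_score("C", "C", "F"), risk_level(risk_score("C", "C", "F")))  # 24 High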

4.4 Evaluation process

Figure 1 Requirement implementation and evaluation cycle

The evaluation process follows the cycle outlined in Figure 1. Five different phases can be distinguished:

1. During the piloting & testing phase the adherence to the different requirements is tested with regard to the specification. The duration depends on the setup of the pilot/workshop.
2. In the analysis phase these results are used to calculate derived information, such as the risk score. This phase should not last longer than 14 days, in order to guarantee a thorough review while not affecting the overall development too much.


3. The consolidation phase is a physical or virtual meeting in which a consensus on the ranking of the requirements has to be found. This meeting should also be used to determine if an adaptation of requirements is needed.
4. In the requirement revision phase the adaptations specified in the consolidation phase have to be detailed and integrated into the development roadmap.
5. The development phase will implement the system in order to fulfill the requirements needed for the next iteration of the prototype.

4.4.1 Organization

The organizational structure of the personnel responsible should be kept small, in order to minimize overhead. We envision a system of three roles operating within the following organizational structure.

Figure 2 Organizational structure of technical evaluation

The different roles have the following tasks:

The Technical Evaluation Coordinator leads the process, monitors its different stages, distributes tasks to the Technical Evaluation Committee and leads the consolidation meeting.

The Technical Evaluation Committee members execute the different tasks related to the evaluation process, such as performing the technical evaluation, analyzing the results, participating in the consolidation meeting and adapting the requirements.

The Requirements Advisory Board comprises the social science specialists within the consortium who perform the workshops and pilots; they contribute to the process during the consolidation meeting and the analysis phase, giving input on potential adaptations of requirements for the next iteration.

4.4.2 Managing potential requirement revision

In POSEIDON we are using a Wiki-based system to track the requirements. The integrated versioning allows us to keep track of the different revisions. Figure 3 shows a screenshot of the system. The system is in the process of being set up - a process that will be completed in time for the first pilot. The wiki follows the structure of requirements as specified in D2.3.


Figure 3 Screenshot of requirement tracking wiki page

The wiki system will be updated by all members of the Technical Evaluation Committee, with the Technical Evaluation Coordinator being responsible for periodically checking adherence to the specified standards.

In the future we will investigate wiki add-ons or other systems that will allow us to automatically create the result template presented in the following section from a subset of requirements.

4.5 Result Template

The results of the requirement evaluation should be noted in a specific template that allows recording the information required in the later stages of the process. The following table can be expanded and printed out with the current list of requirements to be tested in the scope of the next event. Two example requirements are filled in to show how the table works; they are not related to any actual evaluation.

Table 9 Template of requirement evaluation sheet

Label | Requirement | Category | Quantitative | Qualitative | Adh. Level | Criticality | Est. time-to-fix
Fun5 | Should keep track of user's position when traveling outdoors | Functional | - | - | A | - | -
Fun6a | Should provide basic outdoors navigation services | Functional | - | - | B | F | C
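The rows of such a sheet can also be kept as simple structured records, which makes it straightforward to derive the risk score of Section 4.3.2 automatically. The following Python sketch is illustrative only, assuming a hypothetical CSV layout whose column names mirror Table 9; it is not part of the POSEIDON tooling.

import csv
import io

# Example rows mirroring Table 9 (Fun5 and Fun6a); '-' marks fields that are not applicable.
SHEET = """label,requirement,category,adherence,criticality,time_to_fix
Fun5,Should keep track of user's position when traveling outdoors,Functional,A,-,-
Fun6a,Should provide basic outdoors navigation services,Functional,B,F,C
"""

GRADE_SCORE = {"A": 0, "B": 1, "C": 3, "F": 5}

for row in csv.DictReader(io.StringIO(SHEET)):
    if row["adherence"] == "A":
        score = 0  # fully met requirements always have a risk score of 0
    else:
        score = GRADE_SCORE[row["adherence"]] * (
            GRADE_SCORE[row["criticality"]] + GRADE_SCORE[row["time_to_fix"]]
        )
    print(row["label"], score)  # Fun5 -> 0, Fun6a -> 1 * (5 + 3) = 8 (medium risk)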


5 User experience evaluation of POSEIDON

5.1 Guidelines for Pilots

To ensure POSEIDON achieves its overall aim, the development process follows a user-centred approach and involves primary and secondary users in two one-month field trials.

The field trials will be conducted in a standardised manner. A comprehensive set of field work guidelines will be produced first in English and then, after review and improvement by the relevant partners, translated into all other project languages.

5.1.1 Ethical approval

Each participating country will apply for the relevant ethical approval in its area, if ethical approval is required.

5.1.2 Recruitment procedures

Three people with DS will be recruited in each of Norway, the UK and Germany for each pilot. Each person with DS (primary user) will be paired with a caregiver (secondary user). People with DS will be recruited through the respective DSAs. Where possible we will aim to recruit a balance in terms of gender and of the relationship between primary and secondary end users.

Potential participants will receive an easy-to-read information sheet describing the project and the pilot. Those interested will be contacted by the researcher. The researcher will then arrange a visit to explain the project in more detail and to answer any questions. A screening questionnaire will be used to make sure the participant is eligible. Participants will be encouraged to discuss the project with family, friends or a relevant professional before agreeing to take part. If participants agree to take part, they will be asked to sign and return the consent form.

5.1.3 Eligibility criteria for people with DS

- Must be older than 16 years
- Have the motivation and ability to participate
- Have a general interest in IT, not scared of IT
- WiFi access at home

5.1.4 Eligibility criteria for caregiver

- Must be a parent, relative or professional care worker of people with DS
- Have the motivation and ability to participate
- Have a general interest in IT, not scared of IT
- WiFi access at home

5.1.5 Exit strategy

Participants may leave the study at any time. For quality assurance purposes they will be asked for the reason for leaving; however, they do not have to give a reason if they do not wish to. Those who leave will be asked to return the equipment (interactive table) to the respective provider so that it can be used with another participant if necessary.

5.1.6 Pre-pilots

Pre-pilot studies will be conducted by the developers as part of their formal software development process, so that the hardware and software used during the pilots are fit for purpose and fulfil the requirements.

Page 13: Evaluation Protocols · 2017-08-30 · The deliverable D6.1 - Evaluation Protocols - is the first deliverable of Work Package 6 - Validation. The deliverable is linked to task T6.1

FP7 ICT Call 10 STREP POSEIDON Contract no. 610840

13

5.2 Methods

The pilots will combine several methods and approaches. Data will be collected quantitatively and qualitatively with the help of different methods and instruments. Gathering quantitative and qualitative, objective and subjective data, as well as observing and asking primary and secondary users, ensures a comprehensive view of the development process. This approach also helps to validate subjective data gained from the primary users, who might, for instance, have problems describing their experiences in interviews.

5.2.1 Controlled tasks versus free usage

On the one hand, participants will be given the chance to explore POSEIDON and all its functions on their own during the one-month pilot. That means all primary users will get a smartphone and an interactive table, which enables them to use POSEIDON whenever they want to or when help is needed. On the other hand, several controlled tasks will be conducted. This approach enables the researcher to gain standardized information about how people with DS are using POSEIDON. These controlled tasks will be used to make sure that the main functions of POSEIDON are used and evaluated by people with DS during the pilots. Participants will therefore be asked to fulfil different tasks, consisting of several subtasks, covering all of the features tested during the trials. These tasks will take place at different stages during the one-month testing period. People with DS will be instructed by a researcher, who will explain the different tasks and ask the person with DS to fulfil them with the help of POSEIDON.

5.2.2 Observation

Observation will take place when conducting the controlled tasks. For each subtask the researcher notes down:

- Wrong turns, i.e. taps by the participant that will not complete the subtask.

For each subtask and “wrong turn” the following should be recorded:

- Level of hesitation or confusion shown (scale of 1-5, with 1 being very little hesitation) and a description of the hesitation or confusion. Did the person look distressed or distracted, etc.?
- Comments and questions voiced by the participant - a record of what was asked and when.
- Hints or help given - what hints or tips were given and when. These could be coded with the help of prompting guidance.
- A subjective measure of ease of task completion on a scale of 1-5.

Task XY

Subtask | Wrong turns | Hesitation or confusion | Comments or questions asked | Hints or tips given | Ease of task (1-5)
Subtask 1 | | | | |
Subtask 2 | | | | |
Subtask 3 | | | | |
Subtask 4 | | | | |
Subtask 5 | | | | |
Subtask 6 | | | | |
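For researchers who prefer to capture these notes digitally rather than on paper, a per-subtask record could be represented by a small data structure such as the Python sketch below. The class and field names are illustrative assumptions, not a prescribed format.

from dataclasses import dataclass, field
from typing import List

@dataclass
class SubtaskObservation:
    """One row of the observation sheet for a controlled task."""
    subtask: str
    wrong_turns: int = 0                                   # taps that will not complete the subtask
    hesitation: int = 1                                    # 1-5, with 1 being very little hesitation
    comments: List[str] = field(default_factory=list)      # comments/questions voiced, with time
    hints_given: List[str] = field(default_factory=list)   # hints or tips given, with time
    ease_of_task: int = 3                                   # subjective ease of completion, 1-5

# Example record for one subtask of a hypothetical navigation task
obs = SubtaskObservation(
    subtask="Start a route to the workplace",
    wrong_turns=2,
    hesitation=3,
    comments=["Asked where the start button is (minute 2)"],
    hints_given=["Pointed to the route list (minute 3)"],
    ease_of_task=4,
)
print(obs)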

5.2.3 Video recording

When conducting the controlled tasks, participants will be recorded. This ensures that all important aspects of the interaction are captured and that there is enough time afterwards to analyse the video material. Videos will be analysed using a coding scheme for detecting usability and fun problems developed by Barendregt and Bekker [5], based on the Detailed Video Analysis (DEVAN) method.

5.2.4 Interviews

Interviews should take place at several stages of the pilots, with both the primary and secondary users. On the one hand they will be used to assess the experiences gained in the controlled tasks; on the other hand they can be used to assess how participants like interacting with POSEIDON in their daily routine. The researchers are asked to make appointments with the respective participant and parent/carer to conduct these interviews three times during the testing period. People with DS may have problems focussing on or understanding some of the questions. Therefore, the following topics and questions are a guide and can be asked in different ways depending on the participant.

General questions

- How do you feel using POSEIDON?
- How often do you use POSEIDON?
- When did you use POSEIDON?
- When was it helpful to use POSEIDON? Why?
- When did you have problems using POSEIDON? Why?
- What do you like best?
- What don’t you like?
- Do you need help in using POSEIDON?
- Did you enjoy using POSEIDON?
- Would you use it again?
- What was easy?
- What was difficult?

5.2.5 Logs

Statistics of use and preferences will be collected through logs embedded in the technology (which inclusive services users use, in which contexts, when, how often, for how long, etc.). This enables the researchers to draw conclusions about which features of POSEIDON are particularly helpful and where improvements are needed. These data can be captured continuously during the testing period without disturbing users or interrupting them in their daily activities. The advantage of collecting these data is that they are not biased by subjective perception.
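The exact logging format is defined by the individual POSEIDON components. Purely as an illustration, a single usage log entry could be stored as one JSON object per line, as in the Python sketch below; the field names (service, context, duration_s, etc.) are assumptions, not a specified schema.

import json
from datetime import datetime, timezone

# Hypothetical structure of one usage log entry; all field names are assumptions.
log_entry = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "user_id": "P03",           # pseudonymised participant identifier
    "service": "navigation",    # which POSEIDON service was used
    "context": "outdoors",      # usage context, if known
    "duration_s": 540,          # how long the interaction lasted, in seconds
    "completed": True,          # whether the activity was finished
}

# Appending one JSON object per line keeps the log easy to collect and analyse later.
with open("poseidon_usage.log", "a", encoding="utf-8") as f:
    f.write(json.dumps(log_entry) + "\n")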

5.2.6 Questionnaires

Feedback through questionnaires will be collected during and at the end of the pilot. Secondary users will be asked to fill in the following questionnaire once a week, to gain insights from their point of view on how POSEIDON is used by the primary users. Furthermore, changes in usage over time can be assessed. This questionnaire will be made available online and secondary users will receive a reminder once a week to fill it in. Filling in the questionnaire should happen in cooperation between the primary and secondary user.

In the tables below please circle or write in the answer that best describes your experience.

General

Is {primary user} using POSEIDON? | Yes/No - If not, why not?
I think {primary user} enjoys using it. | Strongly disagree 1 – 2 – 3 – 4 – 5 Strongly agree
I think it is useful for the {primary user}. | Strongly disagree 1 – 2 – 3 – 4 – 5 Strongly agree
I think the system is easy for {primary user}. | Strongly disagree 1 – 2 – 3 – 4 – 5 Strongly agree
About how often do you think the {primary user} uses it? | More than twice a day / Once or twice a day / A few times a week / Once a week / Less than once a week / Never
Do you help the {primary user} using POSEIDON? | Yes/No - Please explain:
Do you use it together? | Yes/No - Please expand:
Which features of POSEIDON does {primary user} use the most? | Navigation app / Calendar app / How to handle money app
What problems have you encountered? |


5.2.7 Diaries

Diaries can provide comprehensive insights into user behaviour, experienced problems and feelings of success. In contrast to questioning, they do not represent a snapshot but rather adapt to the events experienced within the period of time. Secondary users will receive a short questionnaire every second day and will be asked to give short feedback on the usage of POSEIDON over the last two days.

Date:

If you or the person with Down’s Syndrome used POSEIDON today or yesterday, please tell us:
- Something that was good about the system - or a way in which it helped the primary user
- Something that wasn’t so good

5.2.8 Immediate feedback

Impressions from primary users will be collected immediately after they interact with POSEIDON and use the different systems. For this purpose a modified version of the Smileyometer [6] will be used. The Smileyometer was developed with the help of children and has proved to be very informative. It will be displayed automatically after an interaction is finished and participants will be asked: „How much fun was it to do that activity?“



(Smileyometer response labels: Brilliant / Good / Awful)

Additionally, the “again-again table” will be used [6]. It lists activities on the left-hand side and has three headed columns. With this instrument we try to answer whether there is a desire to do things again. The “again-again table” will be displayed automatically after an interaction is finished and participants will be asked: “Can you imagine using XX in the future?”

5.2.9 Scores

Scores can be used to assess immediate and long-term learning effects. Participants will receive scores, for instance, when training with the stationary navigation system. If the score increases over time, we can conclude that learning success increases as well.

5.2.10 Limited Evaluation of POSEIDON

Whilst 18 users engaging over a month will provide rich data on all elements of POSEIDON, more evaluation is needed. For that reason, we will evaluate a limited version of POSEIDON with at least 30 users across all countries. The DSAs will help with the recruitment; they can send out information about the project and the app and ask whether anyone is interested in testing it out. Conferences and workshops in which the POSEIDON project takes part will be used to introduce the POSEIDON system to large user groups. People interested in the POSEIDON technology will be encouraged to try it out. Researchers will observe and identify problems in usage. Afterwards, users will be interviewed and asked about their experiences.


6 Conclusion

On the previous pages we have introduced the process and protocols that allow a technical evaluation of the systems that will be developed in the scope of POSEIDON. We introduced the rationale of the technical evaluation by restating the POSEIDON timeline.

An overview of relevant best practice of system evaluation was given, relating it to the requirement-driven design of the POSEIDON development process. Based on this, it was possible to introduce a method to quantify the technical capabilities of a system given a set of requirements. This assessment is based on two factors - the adherence to the specified requirements and the risk associated with not fulfilling them, which can be determined using a risk score calculation. We outlined the process and the persons required to perform this evaluation and briefly introduced the tools necessary to track the requirements.

Additionally, the methods and instruments that will be used to gain comprehensive insights into usability and user experience were presented.


7 References

[1] B. Beizer, Software System Testing and Quality Assurance. Van Nostrand Reinhold Co., 1984.

[2] B. Nuseibeh and S. Easterbrook, “Requirements engineering: a roadmap,” in Proceedings of the Conference on the Future of Software Engineering, 2000, pp. 35–46.

[3] I. Sommerville and P. Sawyer, Requirements Engineering: A Good Practice Guide. John Wiley & Sons, Inc., 1997.

[4] K. Dowd, Beyond Value at Risk: The New Science of Risk Management, vol. 3. Wiley, Chichester, 1998.

[5] W. Barendregt and M. M. Bekker, “Developing a coding scheme for detecting usability and fun problems in computer games for young children,” Behavior Research Methods, vol. 38, no. 3, pp. 382–389, 2006. http://doi.org/10.3758/BF03192791

[6] J. C. Read, S. J. MacFarlane, and C. Casey, “Endurability, engagement and expectations: Measuring children’s fun,” in Interaction Design and Children, 2002, pp. 189–198.