PersOnalized Smart Environments to increase Inclusion of people with DOwn’s syNdrome
Deliverable D6.1
Evaluation Protocols
Call: FP7-ICT-2013-10
Objective: ICT-2013.5.3 ICT for smart and
personalised inclusion
Contractual delivery date: 30.04.2014 (M6)
Actual delivery date: 29.07.2014
Version: v1
Editor: Andreas Braun (FhG)
Contributors: Silvia Rus (FhG)
Reviewers: Juan Carlos Augusto (MU)
Terje Grimstad (Karde)
Dissemination level: Public
Number of pages: 20
FP7 ICT Call 10 STREP POSEIDON Contract no. 610840
Contents
Contents .................................................................................................................................................. 2
1 Executive Summary ......................................................................................................................... 3
2 Introduction ..................................................................................................................................... 4
3 System evaluation in POSEIDON ..................................................................................................... 5
3.1 POSEIDON Timeline ................................................................................................................. 5
3.2 Best practice of system evaluation ......................................................................................... 6
3.3 Measuring the impact of POSEIDON ....................................................................................... 7
3.4 POSEIDON requirements ......................................................................................................... 8
4 Technical evaluation of POSEIDON ............................................................................................... 10
4.1 Evaluating requirements ....................................................................................................... 10
4.1.1 Requirement adherence ................................................................................................ 10
4.1.2 Risk estimation .............................................................................................................. 11
4.2 Evaluation process ................................................................................................................. 13
4.2.1 Organization .................................................................................................................. 14
4.2.2 Managing potential requirement revision .................................................................... 14
4.3 Result Template ..................................................................................................................... 15
5 User experience evaluation of POSEIDON .................................................................................... 16
5.1 Evaluation of user experience requirement .......................................................................... 16
5.2 Evaluation process ................................................................................................................. 16
5.2.1 Organization .................................................................................................................. 17
5.2.2 User experience requirements ...................................................................................... 17
6 Conclusion ..................................................................................................................................... 19
7 References ..................................................................................................................................... 20
1 Executive Summary
The deliverable D6.1 - Evaluation Protocols - is the first deliverable of Work Package 6 - Validation.
The deliverable is linked to task T6.1 - Technical Assessment and has a number of connections to
deliverables of Work Packages 2 and 5. It is based on deliverables D2.1 - Report on requirements and
D2.3 - Report on Design of HW, Interfaces, and Software that build the foundation for the technical
Work Packages. These requirements are driving development in Work Packages 3, 4 & 5. For this
document we are also considering D5.1 - Development framework that describes the different
components and their specific requirements. Additionally, Section 3.2.1 of Part B of the POSEIDON
DoW lists different expected outcomes and the measures to assess their impact.
The document gives an introduction to the system evaluation, restating the timeline of POSEIDON
and the requirement driven development process. We show relevant best practice of system
evaluation and how it relates to the methods used for the POSEIDON evaluation.
The process of the technical evaluation is outlined, stating both methods and management layout.
The evaluation will be based on assessing the adherence to requirements and the associated risk of
not meeting them. Additionally, the process of the user evaluation is briefly recapitulated and an
introduction of the management is given. A template for the evaluation including the process to
estimate the mentioned risk concludes this document.
2 Introduction
In the scope of POSEIDON a variety of different technical systems will be developed and integrated
into a common platform. Thus it is crucial to monitor and evaluate the performance of the system
including, but not limited to, hardware, interfaces, software and integration routines, as well as
contributions to safety, privacy and ethics. This task is integrated into Work Package 6 - Validation, as
Task 6.1 - Technical Assessment. This task is intended to materialize the mechanisms that are
envisaged to produce a highly reliable and useful product. In this scope, compliance with the
requirements will be double-checked, a link between requirements and testing will be established,
and the validation and pilots, including the design revisions, will be monitored.
This deliverable D6.1 Evaluation Protocols outlines the different steps involved in the testing process.
This will range from analyzing the best practice in this domain and the prerequisites of POSEIDON, to
the creation of evaluation protocols themselves - that will be used in the validation phase to quantify
the performance of the technical systems.
The system evaluation of POSEIDON is driven in part by the project specific timeline of pilots and
studies. The most important factor is to verify adherence to the previously defined requirements
gathered in Work Package 2. The technical systems that are to be developed in WP3, WP4 and WP5
should be tested according to the process specified in this document. The evaluation process is
strictly based on the requirements that are defined and can be detailed in the scope of the
validation.
3 System evaluation in POSEIDON
3.1 POSEIDON Timeline
The interface strategy strongly depends on the requirements gathered during the requirement
analysis. The interviews of the primary user, the online questionnaires of the secondary and tertiary
user and the first user workshop have all contributed to this process. From these requirements an
interface strategy is extracted, immediately followed by an implementation in the form of an
integrated prototype. In Fig. 1, steps (b) and (d) represent the requirements gathering phase and the
first user workshop, followed by step (f), in which the first interfaces and interactive technology are
set up. The created prototype (g) is evaluated in the second user workshop (h) and the outcome of
the workshops is subsequently analyzed. The feedback is taken into account and the interface is
adjusted in step (l), when the interfaces are completely defined. This iterative process for the
interfaces finishes in step (r), when the improved interfaces are set up.
Fig. 1 Project and work package milestones and events
From this timeline we can identify a set of milestones of the technical systems developed in the
scope of POSEIDON. The relevant events determining the points in which the system is tested are the
user workshops and the continuous pilots of integrated prototypes. Therefore, we have the following
short list of events that are directly associated to the system evaluation in WP6.
Table 1 Important events related to evaluation in POSEIDON

Event | Description | Time in project | Planned date
1st User Group Workshop | Marks the end of the requirements gathering phase - requirements refinement | M3 | January 2014
2nd User Group Workshop | Indicates successful implementation of first set of requirements | M10 | September 2014
First Pilot | Testing of integrated prototype over a longer time | M20-21 | from July 2015
Second Pilot | Testing of second integrated prototype over a longer time | M30-31 | from May 2016
Final Workshop | Final check of requirements and post-project planning | M36 | October 2016
From those dates the first user group workshop has a special role, as it marks the end of the
requirement gathering process, so adherence to requirements cannot yet be validated.
Instead the primary purpose is to finalize the list of requirements and elucidate new ones according
to the feedback gathered from the users.
The next three events are the most important ones in the scope of the process described in this
deliverable. During those events the systems are tested by the users and the evaluators can
determine how well the requirements set for this specific stage have been fulfilled. Afterwards, the
set of requirements to be fulfilled for the next stage can be adapted and refined according to the
results of the testing.
The last event is primarily intended to prepare the potential market launch of the POSEIDON system
and can also act as a final check of adherence for all requirements, in order to verify that the full
system is running with all intended functionality and at the intended level of stability.
The process of evaluation closely follows the requirements gathering process, as specified in WP2.
The next section presents some best practice of system evaluation in general and the necessary
aspects of requirements engineering specifically.
3.2 Best practice of system evaluation
A large body of literature has been written on testing systems for conformity, adherence, robustness
and feature-completeness. The goal is always to assure that a certain level of quality is reached that
has been defined early in the project [1]. POSEIDON is following an approach that is based on
defining and testing a set of requirements that are to be fulfilled at different stages of the project.
Thus, we are using an approach that is known in software development as Requirements Engineering.
Requirements engineering describes the process of formulating, documenting and maintaining
software requirements; the term also refers to the subfield of Software Engineering concerned with
this process [2].
Typically we can distinguish a set of seven different steps in this process, namely [3]:
1. Requirements inception or requirements elicitation - eliciting the initial requirements from
stakeholders
2. Requirements identification - identifying new requirements
3. Requirements analysis and negotiation - checking requirements and resolving stakeholder
conflicts
4. Requirements specification (Software Requirements Specification) - documenting the
requirements in a requirements document
5. System modeling - deriving models of the system, often using a notation such as the Unified
Modeling Language
6. Requirements validation - checking that the documented requirements and models are
consistent and meet stakeholder needs
7. Requirements management - managing changes to the requirements as the system is
developed and put into use
As we are not solely developing software in POSEIDON this process has to be adapted to cater also to
the requirements of the different hardware systems that will be developed. These adaptations are
discussed in the following section.
3.3 Measuring the impact of POSEIDON
In the application phase of POSEIDON, the consortium included a number of different considerations
into the initial proposal concerned with measuring the impact of POSEIDON. We would like to briefly
revisit these considerations and put them into context. The impact measurement is based on
analysing a set of expected outcomes with a specific measure of success. They are listed in Table 2.
Five different expected key outcomes are listed together with a proposed measure of success. In the
evaluation phase we will transfer the measures and outcomes into requirements that can be mapped
regarding the procedure outlined in the other chapters of this document. We will iteratively revisit
this table and track the progress throughout all project phases.
Table 2 How outcomes are measured in POSEIDON

Expected Outcome 1: Novel accessibility solutions for user groups at risk of exclusion.
Proposed Measure of Success:
- One novel service/application with many functions that is available for people with DS (primary user group) and other intellectual disabilities.
- Results of testing in the primary user group are positive, so development/production and marketing of the product will proceed.
- Interest organisations for DS, and also for other persons with intellectual disabilities, are positive about the product and will spread information about it (at conferences etc.) because they think it is useful.
The measurement of these outcomes can be done more precisely during the last trimester of the project as part of the preparations for bringing the product to market.

Expected Outcome 2: Enhanced quality of life for people at risk of exclusion, including people with disabilities, older people and people with low digital literacy and skills.
Proposed Measure of Success:
- More than 50% of representatives for the target group (including parents, carers, and teachers) find that our product makes persons with DS more independent and autonomous in their daily life.
- The impact of the product will also be observed in daily situations. Mastering of the technology developed will be measured by observations and interviews of representatives (parents, carers, teachers etc.) and interviews of people with DS.
- More than 50% of persons in the target group like to use the product.
These measurements will be done through the pilots [months 10 and 20] and the user group workshops [months 3, 10, 36], which will allow us to trace the variations in response in relation to the evolution of the system.

Expected Outcome 3: Strengthened possibilities of employment for non-highly specialised professionals.
Proposed Measure of Success:
- The more independent and autonomous people with DS and other intellectual disabilities are in daily life, the greater their chance of employment. Increased independence within the environment will:
  - be evidenced by an increasing response to technical triggers rather than 'being told what to do next';
  - result in relationships that depend less on instruction and more on engagement, to better facilitate mutual relationships.
These measurements will be achieved through a combination of information gathered by the kits on their usage and the feedback provided by the users in each pilot.

Expected Outcome 4: Improved competitiveness of the European ICT industry in the field of inclusive smart environments and services.
Proposed Measure of Success:
- The US is ahead of Europe with regard to ICT devices and programmes for people with intellectual disabilities (for example AbleLink Technologies). Our service/application will reduce the gap.
- Our service/application will be adaptable to different countries, cultures and languages. This will be tested in different European countries.
- A measure of acceptance is that relevant organizations for targeted beneficiaries find that the product is good and say so at conferences, meetings etc. to increase the use of the product.
The measurement of these will be performed throughout the process by contacting those who are interested in the project from any dimension. The findings will be compiled and summarized for the final report.

Expected Outcome 5: Wider availability and effectiveness of developers' tools for creating inclusive smart environments (targeted at SMEs, key mainstream industrialists, open-source developers, and other less technical developers).
Proposed Measure of Success:
- The aim of POSEIDON is to make some relevant inclusion services and a framework which should enable a wide range of developers to provide services for people with DS. The increase in inclusion services and interest will be measured by:
  - the number of developers participating in POSEIDON social media;
  - the number of companies providing POSEIDON services;
  - the number of POSEIDON apps/services provided in different countries.
- The methods and architecture developed for the product establish a best practice from which others can learn.
Some of these measurements can only be performed partially and at late stages of the project (after month 30), when the system is fully fledged and we can start transitioning to market deployment. Findings and evidence of these will be described in the final report.
3.4 POSEIDON requirements
The interface strategy is closely related to D2.1 - Report on requirements and to D2.3 - Report on
Design of HW, Interfaces, and Software. The general design principles for interfaces have been
presented in D2.3. Therefore, parts of the following section regarding the design principles are taken
from there, while section 3.2.2 regarding the requirements analysis presents the requirements
applicable for the interface strategy.
In order to assure compatibility between the process of requirements engineering outlined in the
previous section and the development in POSEIDON, the requirements have to be adapted
appropriately. We will briefly describe the adaptation process for each step.
1. The requirements inception is performed according to specifications in the DoW, the initial
requirements gathering phase and the results of the first user group workshop
2. The requirements identification is performed iteratively after each user workshop and the
pilot phases of the integrated system
3. The requirements analysis and negotiation occurs led by a core group of developers in strong
collaboration with the representatives of the user organizations
4. The specification of the requirements adds a number of labels to distinguish functional and
non-functional requirements, identify the target system iteration and the associated system
components
5. The combined software- and hardware-architecture is initially designed at a very high level
and will be detailed according to the requirements of the current prototype iteration
6. The requirements validation is the primary concern of this document and is detailed in the
next section
7. POSEIDON is using a shared Wiki system to manage the requirements and their different
iterations
4 Technical evaluation of POSEIDON
As previously mentioned this section will detail the evaluation process of the technical systems
developed in POSEIDON. It is separated into three distinct parts. At first we will detail how
requirements can be evaluated, including monitoring and risk assessment. The second part outlines
the organizational structure of the evaluation process and the last part gives a set of templates that
will be used to evaluate the requirements.
4.1 Evaluating requirements
In the scope of WP2 the requirements of the POSEIDON system are defined and refined throughout
the project. While some requirements can be easily quantified, non-functional requirements in
particular need additional information that helps determine how well they meet the criteria set for
them. In this section we will show how to determine the adherence to requirements and
based on that calculate a risk level that helps to specify how the requirements should be modified in
the later stages of the development and how severe the expected impact on the development
timeline will be.
4.1.1 Requirement adherence
The first important factor in the requirements evaluation is to determine how well they adhere to
the specified metric. Here we can distinguish two different types of requirements - quantitative
requirements and non-quantitative requirements. The first category can be easily measured, as a
discrete target value is given that must be reached to fulfil the requirement. In the following tables, we will use a four-
level grading system for the different components of the requirement evaluation. This grading will
simplify the notation of adherence in the later project stages. The grades used are A, B, C and F -
whereas A indicates the highest grade, e.g. full adherence and F typically indicates failure. Table 3
shows grading, adherence level and description for quantitative requirements.
Table 3 Quantitative requirements adherence levels, descriptions and associated score

Grade | Adherence Level | Description
A | Full | Achieving at least 100% of the specified value
B | With Limitations | Achieving between 80% and 100%
C | Low | Achieving between 50% and 80%
F | Failure | Achieving less than 50%
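The thresholds of Table 3 can be expressed as a small helper function. This is only an illustrative sketch (the function name and signature are our own, not part of the POSEIDON framework), and it assumes a metric where a higher achieved value is better:

```python
def adherence_grade(achieved: float, target: float) -> str:
    """Map a measured value against its quantitative target to a grade (Table 3)."""
    ratio = achieved / target
    if ratio >= 1.0:
        return "A"  # Full: achieving at least 100% of the specified value
    if ratio >= 0.8:
        return "B"  # With limitations: between 80% and 100%
    if ratio >= 0.5:
        return "C"  # Low: between 50% and 80%
    return "F"      # Failure: less than 50%
```

For a "lower is better" metric (e.g. a response-time budget), the ratio would need to be inverted before grading.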
A similar grading can be used for non-quantitative requirements and their adherence. In this case the
level is subjective by definition and should therefore be determined by multiple persons, e.g. using a
majority vote.
Table 4 Non-quantitative requirements adherence levels, descriptions and associated score

Grade | Adherence Level | Description
A | Full | Full adherence is achieved if the requirement is completely fulfilled in the given evaluation
B | With Limitations | If a requirement is only partially fulfilled, but the deviation is not critical, the level "with limitations" should be given
C | Low | If a requirement is partially fulfilled and the deviation is high, the adherence level is "low"
F | Failure | If a requirement is not fulfilled at all, the level "failure" has to be attributed
4.1.2 Risk estimation
An important step in estimating the potential impact of not fulfilling a requirement is performing a
risk assessment based on the level of adherence and some other factors. Risk management is a
whole field of study dedicated to identifying, assessing and prioritizing risks [4]. Apart from the risks
specified in the DoW, we are only using a minimal routine to estimate the risk or impact of not
meeting a certain requirement, based on the following factors:
Risk Score = Adherence Level Score * (Criticality Level + Estimated time to fix)
The adherence level score is derived from the adherence grades of Tables 3 and 4, shown in the
previous section. The criticality level can be defined according to the following table.
Table 5 Criticality of non-adherence to requirements

Grade | Criticality Level | Description
A | Not critical | Not fulfilling this specific requirement will not interfere with the overall functionality of the system - mostly suited for optional requirements
B | Potentially critical | If not fulfilled, this requirement has some impact on the system or user experience, but it is considered low
C | Very critical | If the requirement is not fulfilled, the system is expected to behave unexpectedly or not adhere to the minimum functionality required
F | Fatal | This breaks the system experience and prevents main functions from working
The last component is the estimated time required to fix the requirement in a way so it will be
adhering to the specified level. This is important to get an estimate about the resources that will be
required to fix the problem and how they can be mapped into the remaining project timeline and the
development queue until the next iteration is due. The time required to fix will be quantized
similarly to the previous factors, as shown in the following table. It should be noted that the
estimated time should include testing of the fixed requirement:
Table 6 Score associated to estimated time to fix a certain requirement not met

Grade | Estimated time to fix
A | < 1 hour
B | < 1 day
C | < 1 week
F | > 1 week
In order to calculate the risk score we have to associate the given grades with numeric values. We
use the simple association shown in the following table:
Table 7 Association between grades and numeric values

Grade | Numeric value
A | 0
B | 1
C | 3
F | 5
Now all components required to calculate the risk score are in place. For example, for a requirement
with a low adherence level that is not critical and will take less than one day to fix and test, the
resulting risk score is:
Risk Score = 3 * (0 + 1) = 3
A second example, for a requirement with a low adherence level that is very critical and takes a long
time to fix, is:
Risk Score = 3 * (3 + 5) = 24
The risk score is considerably higher than before, as the impact of not meeting this requirement on
the project development roadmap can potentially be very high, primarily due to the long time that is
required for a fix. The risk score for a fully met requirement thus is always 0 and there is obviously no
need to note criticality and time to fix.
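The computation can be sketched in a few lines of code. The grade-to-value mapping follows Table 7; the function name and argument names are our own illustration, not part of the POSEIDON tooling:

```python
# Numeric values associated with the grades (Table 7)
GRADE_VALUE = {"A": 0, "B": 1, "C": 3, "F": 5}

def risk_score(adherence: str, criticality: str, time_to_fix: str) -> int:
    """Risk Score = Adherence Level Score * (Criticality Level + Estimated time to fix)."""
    return GRADE_VALUE[adherence] * (GRADE_VALUE[criticality] + GRADE_VALUE[time_to_fix])

# The two worked examples from the text:
print(risk_score("C", "A", "B"))  # low adherence, not critical, < 1 day: 3 * (0 + 1) = 3
print(risk_score("C", "C", "F"))  # low adherence, very critical, > 1 week: 3 * (3 + 5) = 24
```

Since a fully met requirement has adherence grade A (value 0), its risk score is always 0 regardless of the other two factors, matching the rule stated above.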
Finally, we can group the risks into three distinct risk level groups based on their risk scores, as
shown in the following table:
Table 8 Association between risk levels and risk score

Risk level | Risk score
High | > 15
Medium | 6 - 15
Low | 1 - 5
None | 0
This risk level should drive the discussion about how to manage changes to requirements and the
impact of not meeting a specific requirement on the development schedule specified in the
development roadmap. This risk level should be noted for any requirement that is not achieving
full adherence, or for all requirements; in the latter case the risk level for fully met requirements is
automatically set to "None".
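The grouping of Table 8 can likewise be written as a small classification helper. Again, this is only a sketch of the rule stated in the table, with scores from 6 to 15 treated as medium:

```python
def risk_level(score: int) -> str:
    """Classify a risk score into the risk levels of Table 8."""
    if score == 0:
        return "None"    # fully met requirements always score 0
    if score > 15:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"         # scores 1 to 5
```

Applied to the two worked examples above, a score of 3 classifies as "Low" and a score of 24 as "High".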
4.2 Evaluation process
Figure 1 Requirement implementation and evaluation cycle
The evaluation process follows the cycle outlined in Figure 1. Five different phases can be
distinguished:
1. During the piloting & testing phase the adherence to the different requirements is tested
with regard to the specification. The duration depends on the setup of the pilot/workshop
2. In the analysis phase these results are used to calculate derived information, such as the risk
score. This phase should not last longer than 14 days, in order to guarantee a thorough
review, while not affecting the overall development too much.
3. The consolidation phase is a physical or virtual meeting, in which a consensus on the ranking
of the requirements has to be found. This meeting should also be used to determine if an
adaptation of requirements is needed
4. In the requirement revision phase the adaptations specified in the consolidation phase have
to be detailed and integrated into the development roadmap
5. The development phase will implement the system in order to fulfill the requirements
needed for the next iteration of the prototype
4.2.1 Organization
The organizational structure of the personnel responsible should be kept small, in order to minimize
overhead. We envision a system of three roles operating within the following organizational structure.
Figure 2 Organizational structure of technical evaluation
The different roles will have the following tasks:
- The Technical Evaluation Coordinator will lead the process, has to query the different stages
of the process, distribute tasks to the technical evaluation committee and lead the
consolidation meeting
- The technical evaluation committee members will execute the different tasks related to the
evaluation process, such as performing the technical evaluation, analyzing the results,
participating in the consolidation meeting and adapting the requirements
- The requirements advisory board is comprised of several of the social science specialists
within the consortium that perform the workshops and pilots; they can contribute to the
process during the consolidation meeting and analysis phase - giving input on potential
adaptations of requirements for the next iteration
4.2.2 Managing potential requirement revision
In POSEIDON we are using a Wiki-based system to track the requirements. The integrated versioning
allows us to keep track of the different revisions. In Figure 3, we can see a screenshot of the system. The
system is in the process of being set up - a process that will be completed in time before the first
pilot. The wiki follows the structure of requirements as specified in D2.3.
Figure 3 Screenshot of requirement tracking wiki page
The wiki system will be updated by all members of the Technical Evaluation committee, with the
Technical Evaluation Coordinator being responsible for periodically checking for adherence to the
specified standards.
In the future we will investigate different wiki add-ons or other systems that will allow us to create
the result template presented in the following section automatically, from a subset of requirements.
4.3 Result Template
The results of the requirement evaluation should be noted in a specific template that allows
recording the information required in the later stages of the process. The following table can be
expanded and printed out with the current list of requirements to be tested in the scope of the next
event. Two example requirements are filled in to show how the table works. They are not related to
any actual evaluation.
Table 9 Template of requirement evaluation sheet

Label | Requirement | Category | Quantitative | Qualitative | Adh. Level | Criticality | Est. time-to-fix
Fun5 | Should keep track of user's position when traveling outdoors | Functional | - | - | A | - | -
Fun6a | Should provide basic outdoors navigation services | Functional | - | - | B | F | C
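The planned automatic generation of this template from the requirement wiki could start from a simple record per requirement. The sketch below is purely illustrative (the class and field names are our own, not taken from the deliverable) of how one row of the evaluation sheet might be rendered:

```python
from dataclasses import dataclass

@dataclass
class RequirementResult:
    """One evaluated requirement, following the columns of Table 9."""
    label: str
    requirement: str
    category: str
    adherence: str            # grade A/B/C/F per Tables 3 and 4
    criticality: str = "-"    # only filled in when adherence is below "A"
    time_to_fix: str = "-"

    def as_row(self) -> str:
        """Render one line of the evaluation sheet."""
        return " | ".join([self.label, self.requirement, self.category,
                           self.adherence, self.criticality, self.time_to_fix])

# The second example requirement from Table 9:
row = RequirementResult("Fun6a", "Should provide basic outdoors navigation services",
                        "Functional", "B", "F", "C")
print(row.as_row())
```

A wiki export script could build such records from the requirement pages and write the rows into a printable sheet before each evaluation event.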
5 User experience evaluation of POSEIDON
As previously mentioned this section will detail the evaluation process of user experience and
usability factors of the POSEIDON systems. It is separated into three distinct parts. At first we will
detail how requirements can be evaluated, including monitoring. The process and organizational
aspects are similar to the ones presented in the previous chapter. Therefore we will refrain from
reiterating most of the information and focus on the novel aspects.
5.1 Evaluation of user experience requirements
The evaluation of user experience and usability is more difficult to express in terms of quantitative
measurements. There are three different levels that we have to consider, in increasing order of
abstraction:
1. Functionality
2. Usability
3. User experience
The functional aspects include all the required aspects for the system to work in the desired way.
This is covered by the technical requirement evaluation presented in the previous section. Usability
extends this scope by also considering aspects such as intuitiveness of use and predictability of the
chosen actions. Finally, user experience covers all aspects of the system including the look and feel of
POSEIDON. This category encompasses aspects, such as “joy of use”, reaction of users to the system,
trust of the users, and various other aspects. In terms of evaluation this often is a subjective matter.
5.2 Evaluation process
Figure 4 User experience testing cycle
The user experience evaluation process follows the cycle outlined in Figure 4. It is similar to the
process for the technical evaluation. Four different phases can be distinguished:
1. During the piloting & testing phase the adherence to the user experience requirements is
tested with regard to the specification. It happens at the same time as the technical evaluation
2. Analogous to the technical evaluation, the analysis phase uses the results to calculate derived
information, such as the risk score. This phase should not last longer than 14 days.
3. The consolidation phase is analogous to the technical evaluation
4. The requirements revision phase is also similar to the technical evaluation. However, a single
user experience factor may affect various requirements. Accordingly, more time should be
planned to thoroughly perform this task
5.2.1 Organization
The organizational structure of the personnel responsible should be kept small, in order to minimize
overhead. We envision three roles operating in the following organizational structure.
Figure 5 Organizational structure of the user experience evaluation
The different roles have the following tasks:
The User Experience Evaluation Coordinator leads the process, has to query the different
stages of the process, distributes tasks to the user experience evaluation committee, and leads the
consolidation meeting.
The user experience evaluation committee members execute the different tasks related
to the evaluation process, such as performing the user experience evaluation, analysing the
results, participating in the consolidation meeting, and adapting the requirements.
The requirements advisory board comprises several technical specialists within the
consortium who perform the workshops and pilots and can contribute to the process during
the consolidation meeting and the analysis phase, giving input on the technical feasibility of
system adaptations.
5.2.2 User experience requirements
In order to perform a risk assessment similar to the previously described requirement evaluation, it is
necessary to specify a set of requirements specifically associated to this factor and analyse them
similar to the evaluation of qualitative requirements that was introduced earlier. A practical
approach is to “quantize” the user experience by performing interviews based on Likert-scales or
transfer open interview questions to a Likert-scale scheme after the fact. The latter is required when
interviewing certain user groups that are not suited to fill out Likert questions due to lack of
experience or other factors.
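This coding step can be illustrated with a small sketch. The codebook, category names, and scores below are hypothetical examples chosen for illustration; they are not part of the POSEIDON specification, and in practice the evaluation committee would define them during analysis:

```python
# Sketch: mapping coded open-ended responses onto a 1-10 Likert-style scale.
# The codebook is a hypothetical example, not the project's actual coding scheme.
CODEBOOK = {
    "very negative": 1,
    "negative": 3,
    "neutral": 5,
    "positive": 8,
    "very positive": 10,
}

def quantify_responses(coded_responses):
    """Translate a list of coded open-ended answers into numeric scores."""
    return [CODEBOOK[code] for code in coded_responses]

# Example interview outcome after coding the open-ended answers:
responses = ["positive", "neutral", "very positive", "positive"]
scores = quantify_responses(responses)
mean_score = sum(scores) / len(scores)
print(scores, mean_score)  # [8, 5, 10, 8] 7.75
```

The mean score can then be compared against the threshold described below to derive an adherence grade.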
This transfer of open-ended questions to Likert-scale ratings makes it possible to quantify the results of the
questionnaire, in order to specify adherence levels as described in the previous sections. However,
this process has to be performed very carefully. The members of the user experience evaluation
committee will perform this transformation according to best practice. Examples can be found in the
write-ups by Hughes and Silver [5], or Jackson et al. [6].
The next step is to use this coding or quantification to grade the responses on user experience, similarly
to the process presented in the previous section. Depending on the specific question, we propose the
scheme shown in Table 10.
Table 10 User experience adherence levels, descriptions and associated score
Grade Adherence Level Description
A Full Matching or exceeding threshold
B With Limitations Scoring between 80% and 100% of threshold
C Low Scoring between 50% and 80% of threshold
F Failure Scoring less than 50% of threshold
The most important factor to consider is the threshold. It should be defined based on the
importance of a specific item. On a Likert scale from 1 to 10, a typical value could be 7. When analysing the
responses of the interviews, the result is either a quantified table for Likert-style questions or coded
values for open-ended questions. The coding should be chosen accordingly, so we can also assume a
scale of 1-10 and a threshold of 7.
Using this as the base for grading, the calculations for the different adherence grades with a threshold of
7 look like this:
Grade Score
A >= 7.00
B 5.60 to 7.00
C 3.50 to 5.60
F < 3.50
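The grade boundaries follow directly from the percentages in Table 10 (100%, 80%, and 50% of the threshold). A minimal sketch of this grading, assuming the 1-10 scale and the threshold of 7 used above:

```python
def adherence_grade(score, threshold):
    """Grade a quantified response against the adherence levels of Table 10."""
    if score >= threshold:            # A: matching or exceeding the threshold
        return "A"
    if score >= 0.8 * threshold:      # B: between 80% and 100% of the threshold
        return "B"
    if score >= 0.5 * threshold:      # C: between 50% and 80% of the threshold
        return "C"
    return "F"                        # F: below 50% of the threshold

# With a threshold of 7, the boundaries work out to 7.00, 5.60, and 3.50:
threshold = 7
for score in (7.5, 6.0, 4.0, 2.0):
    print(score, adherence_grade(score, threshold))
```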
The adherence grades enable us to use the risk assessment of the previous section to
specify risks and estimate their severity and impact on the project. The risk assessment is part of the
analysis phase and should be performed at this step.
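The risk score calculation itself is defined in the technical evaluation section. Purely for illustration, a sketch of how adherence grades and requirement importance might be combined; the penalty values, the importance scale, and the formula below are assumptions, not the formula defined in this deliverable:

```python
# Hypothetical illustration only: the actual risk score calculation is the one
# defined in the technical evaluation section. Penalty weights are assumed.
GRADE_PENALTY = {"A": 0.0, "B": 0.25, "C": 0.5, "F": 1.0}

def risk_score(grade, importance):
    """Combine an adherence grade with a requirement's importance (1-5).

    Higher result = higher risk of the unfulfilled requirement harming the project.
    """
    return importance * GRADE_PENALTY[grade]

# Hypothetical requirement IDs with (grade, importance) pairs:
requirements = {"UX-01": ("B", 5), "UX-02": ("A", 3), "UX-03": ("F", 2)}
for req_id, (grade, importance) in sorted(requirements.items()):
    print(req_id, risk_score(grade, importance))
```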
6 Conclusion
On the previous pages we have introduced the process and protocols for a technical evaluation
of the systems that will be developed in the scope of POSEIDON. We introduced the rationale for the
technical evaluation by restating the POSEIDON timeline and the measures of impact and success
that were introduced during the application phase.
An overview of relevant best practice in system evaluation was given that relates to the requirement-driven
design of the POSEIDON development process. Based on this, it was possible to introduce a
method to quantify the technical capabilities of a system given a set of requirements. This
assessment is based on two factors: the adherence to the specified requirements and the risk
associated with not fulfilling a requirement, which can be determined using a risk score calculation.
We outlined the process and the persons required to perform this evaluation and briefly introduced the
tools necessary to track the requirements.
Additionally, the document also discusses critical differences between technical evaluation and user
experience evaluation. This includes differences in process, but also evaluation and quantification of
measurements.
7 References
[1] B. Beizer, Software system testing and quality assurance. Van Nostrand Reinhold Co., 1984.
[2] B. Nuseibeh and S. Easterbrook, “Requirements engineering: a roadmap,” in Proceedings of the Conference on the Future of Software Engineering, 2000, pp. 35–46.
[3] I. Sommerville and P. Sawyer, Requirements engineering: a good practice guide. John Wiley & Sons, Inc., 1997.
[4] K. Dowd, Beyond value at risk: the new science of risk management, vol. 3. Wiley Chichester, 1998.
[5] G. Hughes and C. Silver, “Quantitative analysis strategies for analysing open-ended survey questions in ATLAS.TI,” 2011. [Online]. Available: http://www.surrey.ac.uk/sociology/research/researchcentres/caqdas/support/analysingsurvey/quantitative_analysis_strategies_for_analysing_openended_survey_questions_in_atlasti.htm.
[6] K. M. Jackson and W. M. K. Trochim, “Concept mapping as an alternative approach for the analysis of open-ended survey responses,” Organ. Res. Methods, vol. 5, no. 4, pp. 307–336, 2002.