INTRODUCTION TO EVALUATION MODULE
SEPTEMBER 2019
Table of Contents
SECTION ONE
Defining Evaluation
Evaluation and monitoring
Evaluation and research
Evaluation approaches
Evaluation activities and the evaluator
What is the purpose of programme evaluations?
SECTION TWO
Types of evaluation
Evaluation Criteria
Evaluation Standards, guidelines and principles
Sustainable Development Goals
SECTION THREE
Evaluation Designs
Table of figures:
Figure 1: Relationship between monitoring and evaluation. Adapted from Wildschut (2015)
Figure 2: Programme life cycle
Figure 3: Utilization focused evaluation steps. Adapted from Ramírez and Brodhead (2010:1)
Figure 4: RBME and the logic model
Figure 5: The various uses of summative and formative evaluations; adapted from the PELL Institute Evaluation Tool Kit
Figure 6: CDC framework for programme evaluation in public health
Figure 7: AEG utility standards
Figure 8: AEG feasibility standards
Figure 9: AEG propriety standards
Figure 10: AEG accuracy standards
Figure 11: SDG 6 targets and indicators
Figure 12: The classic experimental design. Adapted from Mouton (2016: 17)
SECTION ONE
Defining Evaluation
Introduction: We will refer to slides in the accompanying PowerPoint presentation in this section and in
sections two and three that follow. We will also refer to the case study of the Youth in Action for Health
Programme from Shangano Organisation. In this section, we will look at some of the definitions of evaluation.
We will also look at the relationship between evaluation and monitoring and the relationship between
evaluation and research. Following this, we will explore some of the approaches that have influenced the
discipline and practice of evaluation over time. Then we will zoom in on the value and purpose of evaluation.
This will be followed by a quick look at some of the common activities that an evaluator undertakes during an
evaluation process and the possible roles that the evaluator plays and inevitably, the competencies the
evaluator may need to be able to play those roles. At the end of this section, our expectation is that you
should be able to do the following:
1. Define evaluation;
2. Differentiate evaluation from monitoring and research but also be able to see where they intersect;
3. State some of the evaluation approaches and their key features;
4. Explain the different purposes of evaluation;
5. Identify common evaluation activities, evaluator roles and the competencies you need to play those
roles;
Evaluation is defined as:
Evaluation is the systematic assessment of the operation and/or the outcomes of a program or policy, compared to a set of explicit or implicit standards as a means of contributing to the improvement of the program or policy. (Weiss, 1975)
‘Evaluation is the systematic and objective assessment of an ongoing or completed project, program, or policy, including its design, implementation, and results. The aim is to determine the relevance and fulfillment of objectives, development efficiency, effectiveness, impact, and sustainability’ (OECD, 2002:21).
Along the same lines as the definitions above, evaluation is defined by the Business Dictionary as a detailed
assessment of the outcome of a program. This assessment is carried out against established measures or
expected results to determine if it achieved its objectives. From these definitions, the following can be noted
about evaluation:
1. It follows and informs an intervention – project, programme or policy;
3
2. It seeks to establish whether outcomes (changes that we expect to see among beneficiaries because of the intervention) have occurred, why and how they have occurred, and in what context and under what conditions;
3. It requires specific competencies (skills, knowledge and attitudes) from those that conduct
evaluations;
4. It is guided by a set of criteria or standards (discussed in sections that follow);
5. It involves critical thinking and use of information before reaching a judgement about the
intervention and
6. It seeks to improve programmes, projects and policies and inform decision-making.
Evaluation and monitoring
UNAIDS defines monitoring as the routine tracking and reporting of priority information about a program/project, its inputs and intended outputs, outcomes and impacts. This definition is in line with the definition offered by the OECD, which describes monitoring as the ‘continuous and systematic collection of data on specified indicators. The aim is to provide management and the main stakeholders of an ongoing intervention with indications of the extent of progress and achievement of objectives and progress in the use of allocated funds’ (OECD, 2002:27).
Looking at these definitions of monitoring and those of evaluation discussed above, it is evident that one of the key differences is that evaluation is carried out at sporadic points in the life cycle of the programme while monitoring is ongoing (Mouton, 2016; Kellogg Foundation, ud; Adhikari, 2017). Secondly, while evaluation is concerned with outputs, outcomes and impact, monitoring focuses on activities (what programme team members and implementing partners do), inputs (what they put in, e.g. human and financial resources) and outputs (the direct results of the activities carried out by programme staff – these can be tangible or intangible, e.g. classrooms constructed, boreholes drilled or awareness campaigns
conducted). Referring to our case study of the YAH programme at Shangano Organisation, activities would
include training peer educators and setting up WhatsApp groups. Outputs would include resource centers
that have been established. One key outcome would be that students have increased uptake or use of
available health services. At impact level, the change we would expect is a reduction in new STIs at the 6
universities.
Table 1 below, adapted from Surbhi (2017) gives a summary of the differences between monitoring and
evaluation:
Table 1: Differences between monitoring and evaluation.
Meaning – Monitoring: a routine process that examines the activities and progress of the project and identifies bottlenecks during the process. Evaluation: a sporadic activity used to draw conclusions regarding the relevance and effectiveness of the project or program.
Related to – Monitoring: observation. Evaluation: judgement.
Occurs at – Monitoring: operational level. Evaluation: business level.
Process – Monitoring: short term. Evaluation: long term.
Focuses on – Monitoring: improving efficiency. Evaluation: improving effectiveness.
Conducted by – Monitoring: internal party. Evaluation: internal or external party.
Adapted from Surbhi, 2017. https://keydifferences.com/difference-between-monitoring-and-evaluation.html
Now if we turn back to our focus programme which is YAH, what do you think would be the key differences
between the monitoring and evaluation activities? You have probably considered a range of possibilities that
include the following:
1. The YAH clarificatory/design evaluation was conducted at the beginning, when they were conceptualizing the programme. They are likely to have carried out feasibility assessments and/or needs assessments to get an understanding of the extent of the problem of access to SRH information and services and their uptake at universities. They are highly likely to have used tools such as the problem/solution analysis tree (more details of this tool available at: https://www.odi.org/publications/5258-problem-tree-analysis)
2. Shangano intends to have a process evaluation at the end of the second year of the programme,
which is at the mid-term of the programme. Their aim will be to see if they have been implementing
this programme in the universities according to plan and also look at how to improve their
implementation.
3. They will have an outcome evaluation at the end of the programme to see whether they have
achieved their objectives, one of which is to improve the life skills of university students. Three years
after the programme ends they will have an impact study to check whether the programme has had
long-term effects in the universities.
4. Shangano will continuously collect data, monitoring data to check progress on outputs towards the
set targets – this data will be collected and analyzed monthly or quarterly. For instance, data on
participation on web-based platforms and use of services such as HIV Counseling and Testing will be
collected monthly while data on leadership engagement will be collected quarterly.
If you have come to the conclusion that this emphasizes that evaluation is periodic while monitoring is
continuous, then you are right! The YAH team will continuously monitor and only evaluate at specific points
in time.
The relationship between monitoring and evaluation
Before you evaluate, you must monitor (McKenna, 2018). This already indicates that there is a relationship
between the two. Evaluation leads to a judgment of merit or worth and some of the key data used to support
the inferences made are gathered during monitoring. Schwandt (2016) makes a similar observation and
notes that evaluation makes use of monitoring data. Monitoring and evaluation complement and reinforce
each other (UNESCO, 2016). As such, Shangano organization will not be able to have an impact study if they
have not collected baseline and monitoring data on the YAH programme. The relationship between the two is
shown in figure 1 below:
Figure 1: Relationship between monitoring and evaluation. Adapted from Wildschut (2015)
Much of our discussion in this module will elaborate on evaluations of programmes and projects. To access more information on evaluations of policies, you can visit these websites:
African monitoring and evaluation systems: exploratory case studies https://www.dpme.gov.za/publications/Reports%20and%20Other%20Information%20Products/Case%20Studies.pdf
The World Bank: Building Better Policies : The Nuts and Bolts of Monitoring and Evaluation Systems: http://www.managingforimpact.org/resource/world-bank-building-better-policies-nuts-and-bolts-monitoring-and-evaluation-systems
Monitoring and evaluation of policy influence and advocacy: http://www.managingforimpact.org/resource/monitoring-and-evaluation-policy-influence-and-advocacy
Evaluation guidelines: http://www.managingforimpact.org/resource/evaluation-guidelines
The World Bank: Handbook on Impact Evaluation: Quantitative Methods and Practices: http://www.managingforimpact.org/sites/default/files/resource/world_bank_impact_evaluation_handbook.pdf
How Feedback Loops Can Improve Aid (and Maybe Governance): http://www.managingforimpact.org/sites/default/files/resource/center_for_global_development_feedback_loops_whittlefeedbackessay_0.pdf
Influential Evaluations: Evaluations that Improved Performance and Impacts of Development Programs
http://www.managingforimpact.org/sites/default/files/resource/influential_evaluations_ecd.pdf
M&E and the programme life cycle
In the Monitoring section of this module we discussed the programme life cycle. For this module we adopted the programme life cycle used by UNODC as shown in figure 2 below:
Figure 2: Programme life cycle
Monitoring and evaluation is increasingly being viewed as a programme management tool. As shown in the
programme life cycle above, M&E is not a stand-alone activity at the end of the programme or only at the
beginning of the programme. M&E is an integral part of the programme. To drive the point home, let’s look
at the YAH programme at Shangano organisation:
1. Stage 1 – Planning: M&E was part of the planning stage of their programme life cycle. As shown in the figure above, during planning, baseline studies, needs assessments, stakeholder analysis, the M&E plan and the project design and logframe are carried out. This is all part of the design/clarificatory evaluation
that Shangano undertook. This helped them to establish the need, understand the problem, and unearth the assumptions they had about the SRH issues at campuses, and about how and why the YAH programme components of information dissemination and service provision would bring about change to address those issues.
2. Stage 2 – Implementation: during this stage, continuous monitoring takes place and a mid-term evaluation is conducted. Monitoring is guided by the M&E plan developed in the planning stage (an M&E plan template can be accessed at http://www.tools4dev.org/resources/monitoring-evaluation-plan-template/). For the YAH programme, the mid-term evaluation will be conducted at the end of 2020, which is the second year of the programme. This evaluation will inform implementation as Shangano makes necessary adjustments based on evaluation findings. Monitoring data collected from the campuses and service providers will also inform implementation.
3. Stage 3 – Evaluation: at this stage, the outcome evaluation will be conducted. For YAH, the aim could be to get an understanding of whether or not the objectives have been met. For instance, do universities now provide SRH services to students? Is there increased uptake of such services? And have any policies and ordinances been changed? As can be seen in the figure above, findings will be used to inform what Shangano will do next. They may decide to upscale – e.g. increase the number of universities in the programme. They may decide to replicate – introduce the YAH programme at FETs or polytechnics in Zimbabwe, such as Harare Poly, or have backward linkages into communities close to the universities, for instance rural communities close to Lupane State University. They may also decide to scale back. For instance, they may choose to refocus their efforts on just 3 of the 6 universities where they found the need to be highest and the institutional arrangements supportive of the programme.
Evaluation and research
We have defined evaluation in the preceding section and it is clear that evaluation precedes, informs or follows an intervention. Research, on the other hand, is defined as ‘the collection and evaluation of information about a particular subject. The overarching purpose of research is to answer questions and generate new knowledge’ (Nordquist, 2019:1).
There are 4 views on the relationship between research and evaluation; a) research and evaluation as a
dichotomy, b) evaluation and research are mutually independent, c) evaluation as a subset of research and d)
research as a subset of evaluation.
View 1: Research and evaluation as a dichotomy:
According to Podems (2014), just because someone is a good researcher does not necessarily mean that they will be a good evaluator. The differences between evaluation and research are listed below and also reflected on slide number 6 in the PowerPoint presentation. These are also shared by Patton (2014). As shown on slide number 6, the intersection between the two is strongest at methods and analysis. During design, sampling and/or selection, data collection and analysis, evaluation teams use quantitative and qualitative research methods and analysis. The differences between evaluation and research are evident in the following:
Purpose: research seeks to prove and evaluation seeks to not only prove but improve. Research generates
new scientific knowledge while evaluation produces information used for decision making e.g. on scaling up,
replicating or terminating an intervention.
Focus: research is researcher focused while evaluation is stakeholder focused (e.g. to meet the information
needs of different stakeholders such as management, funders, implementing partners and community
members benefiting from or affected by the evaluation). As such, research questions come from scholars
within a specific field while evaluation questions originate from stakeholders.
Research tests theory and produces generalizable findings. Evaluation seeks to answer key evaluation
questions that are usually centered on whether or not an intervention has been effective. Generalization of
findings is not as important to evaluation. An understanding of the causes of phenomena is seen as more
important (Levine-Rozalis, 2003). While evaluation focuses on a program, research focuses on a population
(Small, 2012).
Dissemination of findings: researchers publish their research results and evaluators report their findings to
the evaluation stakeholders.
Test of value: the ultimate test of research is its contribution to knowledge. On the other hand, for
evaluation, this is found in the extent to which it improves effectiveness.
View 2: Evaluation and research are mutually independent
Based on the understanding that a person can conduct both research and evaluation or neither of the two,
this view sees evaluation and research as two independent variables that are not mutually exclusive (Rogers,
2014). This view maintains the following:
Research is empirical while evaluation is value-based and evaluators make evaluative conclusions and pass a
judgment of merit or worth on an intervention. Research does not make any judgments. Rather it gives
factual descriptions e.g. census data. Evaluation without a research aspect does not use a systematic process
of data collection to arrive at judgments about the intervention. An overlap between the two occurs when
the evaluators systematically collect and analyze data for them to arrive at their judgment or conclusion. In
this regard, this is similar to the first view discussed above.
View 3: Evaluation as a subset of research
“Doing research does not necessarily require doing evaluation. However, doing evaluation always requires
doing research” (Endias, 1998). This view regards research as a learning process (observe and learn) while
evaluation is considered as a judgmental process (assess and make a decision).
View 4: Research as a subset of evaluation
Under this view, research is seen as one of the activities undertaken during an evaluation process. Other
activities include planning the evaluation, managing the evaluation and promoting use of evaluation findings.
Evaluation approaches
A number of approaches have influenced the practice and discipline of evaluation over the years. These include responsive evaluation, utilization focused evaluation and developmental evaluation. Slide number 12 in the PowerPoint presentation provides a summary of these approaches. In addition to these, we also discuss the Results Based Monitoring and Evaluation approach on slides 23 to 27.
Utilization focused evaluation: Michael Patton is known as the father of utilization-focused evaluation (Mouton, 2015). Patton identified that the ‘utility’ of evaluation findings was a glaring gap when it came to judging evaluation proposals. He saw that measurability, generalizability and validity, for instance, were among the main criteria used. As such, evaluators could be divorced from what the commissioners did with the findings of the evaluation. According to Patton (2008), the value of an evaluation should be its utility and whether the organization actually uses the findings and recommendations, so attention has to be given to intended use by intended users, constantly (Ramirez and Brodhead, 2010). This approach gives evaluators the responsibility to ensure that the design and process of the evaluation take into account evaluation use. It has
12 steps as shown in the figure below:
Figure 3: Utilization focused evaluation steps. Adapted from Ramírez and Brodhead (2010:1)
Assuming that you were the leader of the team appointed to conduct an outcome evaluation of the YAH
programme at Shangano, what are some of the key considerations you would make if you were being
informed by the Utilization Focused Evaluation approach? You got it right, ensuring that you involve or
engage stakeholders or evaluation users right from the beginning of the evaluation process is one of the ways
you can ensure evaluation use.
Responsive evaluation: Robert Stake introduced responsive evaluation in 1967. It seeks to personalize and
humanize the evaluation process by incorporating and responding to the stakeholders’ opinions and concerns
throughout the design and implementation of the evaluation process without sacrificing the quest for quality
in the programme or intervention to be evaluated (Mouton, 2015). According to Stake, stakeholders other
than the evaluator are best placed to judge the quality of the evaluand (Robinson, 2002). Responsive evaluation is not a particular model or methodology for conducting an evaluation; rather, as an approach or philosophy, it upholds the importance of the evaluator getting close to the evaluation stakeholders and being responsive to their concerns and issues.
There are different stakeholders and constituencies affected by every evaluation. Stake posits that it is the responsibility of the evaluator to ensure that the evaluation reflects their expectations and their different values and value positions (Mouton, 2015). And yet, consensus is not the aim. The evaluator is viewed as
being in the position to show these values, value positions and expectations without pushing for consensus
(Stake, 2001). Stake was not ‘comfortable’ with the idea that evaluation should stand for something. He
believed that the evaluator’s role is to show the different stakeholder perspectives and provide a description
(Robinson, 2002).
If you were asked to apply the responsive evaluation thinking in the same evaluation of YAH, you would
probably be interested in ensuring that the perspectives of different stakeholders are considered in the
evaluation. You are on the right track; these stakeholders include students, health facility management at
universities, the university leadership as well as other implementing partners (such as the National AIDS
Council) that work with Shangano at the university.
Empowerment/participatory evaluation: In 1993, David Fetterman introduced the empowerment evaluation
approach as a response to the value-free positivist models (Miller and Campbell, 2006). The approach draws
its origins from empowerment theory, community psychology, and action anthropology (Sheriff and Porter,
ud). At the heart of the approach is the aim “to help people help themselves” (Fetterman, 1996:5). Besides
seeking to improve policies and programmes, the approach is intentional about providing capacity building
for communities to have skills to monitor and evaluate their own performance and accomplish their goals
(Better Evaluation, ud). Potter (1999) categorizes the approach in the realm of critical-emancipatory
approaches to programme evaluation. Empowerment evaluation challenges the status quo.
Fetterman identifies five "facets" of empowerment evaluation (Mouton, 2015, Sheriff and Potter, ud):
• Training: training participants to conduct their own evaluations;
• Facilitation role: the evaluator is a facilitator who coaches rather than judges;
• Advocacy role: the evaluator advocates for programme team members to decide on the nature and purpose of the evaluation;
• Illumination: the evaluation process gives participants new insights into and understandings of their programme and their roles;
• Liberation: programme personnel have improved skills to redefine their roles and objectives and, in that progression, improve their own lives.
So, if you were to use the empowerment evaluation approach in the evaluation of the YAH programme, what is one of the key things you would do? You would probably train peer educators to collect and analyze data. You may also do some capacity building for implementing partners that provide the services during campaigns, if needed.
Theory-based approaches to evaluation: Theory based evaluation is not a specific method used during an
evaluation. Rather, it is an approach to evaluation and a way that an evaluator can use to structure and
undertake the evaluation task at hand (Treasury Board of Canada Secretariat, 2010). Theory based
approaches to evaluation include ‘realist evaluation’ put forward by Pawson and Tilley (1997). They posit that
whether interventions work or not depends on the ‘underlying mechanisms at play in a specific context’.
Simply put, for them an outcome is a result of an interaction between the context and a mechanism that
causes change. As such, we should be asking, ‘what works for whom, under what circumstances?’ (Pawson
and Tilley, 1997).
Theory-based approaches to evaluation also articulate the Theory of Change for the programme. The Theory
of Change is used to draw conclusions about whether and how a programme contributed to the expected
effects. In that regard, they open the black box that dominated early evaluations and about which Weiss (1997) was highly concerned – a phenomenon where we can see results but are not sure how the change occurred or how it occurred in a specific context. Slides 17, 18, 19 and 20 in the PowerPoint provide brief
summaries of the Theory of Change. In the Shangano case study, we mentioned their theory of change and
some of the outputs and outcomes they identified using the clarificatory/design evaluation. It is noteworthy
that by unpacking their theory of change it was possible to unearth their assumptions about how and why
change would occur. As an evaluator appointed to evaluate the YAH programme, you will have to get an
understanding of their Theory of Change before you can conduct an evaluation.
More details on the Programme Theory (Theory of Change and Logic Models) can be found at
https://www.canada.ca/en/treasury-board-secretariat/services/audit-evaluation/centre-excellence-
evaluation/theory-based-approaches-evaluation-concepts-practices.html. You can also visit this site
https://www.theoryofchange.org/.
Results-based monitoring and evaluation: Results Based Monitoring and Evaluation (RBME) is recognized as an essential management tool for the public sector (Waidyaratma, 2012; Kusek and Rist, 2004). The application of RBME is described as a continuous process of improvement (Farrell, 2009). RBME focuses on achieving results, implementing performance measurement and using feedback to learn and change. A number of concerns have pushed government departments and
ministries as well as organizations to adopt the RBME model. These include:
a. Having unclear goals;
b. Focusing on intervention activities and not results;
c. Inability to use feedback or data for decision making and corrections;
d. Inability to justify the need for resources to relevant stakeholders.
When applying RBME, you describe results in a sequential hierarchy. This starts by articulating specific short-
term results. These are followed by the long-term results that are broader by nature and will occur after the
short-term results are accomplished. You then design an M&E process that will be used to assess whether or
not results have been achieved. Resources are then allocated on the basis of activities that will need to be
undertaken to achieve the results. The M&E processes will also guide how results will be communicated to
stakeholders. Kusek and Rist (2004) developed an informative handbook on the steps to follow when building an RBME system. This can be found at https://www.oecd.org/dac/peer-reviews/World%20bank%202004%2010_Steps_to_a_Results_Based_ME_System.pdf. Farrell (2009) also provides another good source of information on RBME. His handbook can be accessed at http://oasis.col.org/bitstream/handle/11599/110/MEHandbook.pdf?sequence=1&isAllowed=y
Figure 4: RBME and the logic model.
The figure above illustrates how the logic model can be used in the RBME process. Traditional monitoring and evaluation would focus on process and implementation. As such, the focus would end with inputs, activities and outputs. But RBME is interested in the ‘so what’ question. It looks at the outcomes and impacts that should result from the outputs – that is where the results are (the intermediate effects of the programme and the long-term changes that will be observed in the targeted population, intended and unintended).
Similarities between evaluation approaches
Figure 3 also serves to illustrate that there are some similarities between the different evaluation
approaches. It has to be noted that at the heart of most of the approaches is the need for M&E to help evaluators and organisations better understand change, how it occurs and why it occurs. These
similarities between the approaches include:
1. Emphasis on unpacking why and how change occurs;
2. Linked to point 1 above, emphasis on ensuring that we understand the logical connection between
the various results levels of the intervention;
3. Emphasis on having a better understanding of intervention effects and impact;
4. Emphasis on stakeholder engagement;
5. Emphasis on understanding the context of the intervention, as interventions do not occur in a
vacuum.
Evaluation activities and the evaluator
On slide 29 we presented the possible and most common stages of an evaluation process. In reality, these
stages are not always in logical sequence. For example, while promoting use of results is at stage 9, this
occurs at most stages of the evaluation process. For instance, at the planning stage and at the initial
implementation stage as well as at the data collection and analysis stage, the evaluator will ensure that he or
she engages relevant stakeholders. Let us use the YAH programme as an example. To conduct an outcome
evaluation successfully, you would have to engage stakeholders such as students, university leadership,
implementing partners and the Shangano implementing team as well as their M&E officer at all these stages.
Engaging them in understanding and defining objectives and articulating evaluation questions is an important factor for evaluation use. In recent years, most organizations have developed rosters of M&E specialists. As such, the evaluation activities may not necessarily start with stages 1 and 2 as shown in the slide but with stage 3, as the evaluators would already be on the organisation’s roster and would receive evaluation or evaluation-related assignments as the need arises. Please note, the CDC framework for
evaluating public health interventions is a very good framework of reference especially for public health
programs (although it can be applied across sectors). More details on the framework are available at
https://www.cdc.gov/eval/framework/index.htm.
Some competencies are needed through every stage or most stages of the evaluation, for example, cultural
competency, interpersonal skills and communication skills and ethical practice. For the YAH programme, you
will need to be culturally competent when talking to students, when talking to the health facility staff and
when talking to university leadership, as they are likely to have different perspectives and
realities. It is also important to note that the evaluator’s roles will also be affected by whether he or she is an
internal or external evaluator. For example, the former is likely to start their activities at stage 4, which is
‘Planning evaluation’.
No single evaluator comes with all the skills that are required for the evaluation. A full evaluation is different
from offering technical support for a team that is developing a Theory of Change or an M&E plan. These
types of assignments can be carried out by one individual. A full evaluation, for instance an outcome
evaluation or an impact evaluation in reality would require a diverse team of evaluators who bring their
unique set of skills to the team. One team member may be very strong in qualitative data analysis, another in
quantitative data analysis, another may have a stronger understanding of how to conduct Value for Money
studies and another may have subject matter expertise in the field e.g. market development and micro-
finance or public health. As a result, most competency frameworks that have been developed by Voluntary
Organisations for Professional Evaluation (VOPEs), e.g. ANZEA (2013), AES (2011), are quite clear on this point. AES regards its competency framework as a ‘menu rather than a checklist’ (AES 2013:8) because ‘evaluators have different strengths, knowledge, skills and experience’. The competencies are therefore an instrument for
for ‘understanding and managing strengths and gaps in a constructive way’ (AES 2013:8). The same view is
taken by ANZEA and UKES.
It is NOT expected that an individual evaluator or an evaluation team would possess ALL of the proposed
competencies. Rather, evaluators will develop and build on their areas of strength, and address any gaps
through professional development and/or collaborating with others. (ANZEA 2011: 6). The UKES
acknowledges that the responsibility to ensure a quality evaluation does not solely lie with the individual
evaluator. Besides the organisational environment and culture towards evaluation, teams are often involved in the conduct of an evaluation, in which evaluators have different strengths and different levels of experience
(UKES 2012:3).
What is the purpose of programme evaluations?
Evaluations are conducted with a purpose in mind. The evaluator and stakeholders should always be clear on its intended purpose, i.e. whether the evaluation is meant for accountability or decision making. That means that the evaluator has to ensure that the intended audience for the evaluation has been identified. As shown in the
slide, the purpose of an evaluation is often linked to the type of evaluation and inevitably, the timing or stage
at which the evaluation occurs. The evaluation types are discussed in the section that follows. In addition, the
purpose of the evaluation also determines the team composition (whether it is going to be conducted by an
external or internal team), the resources, time and budget that needs to be allocated and the people
responsible for setting the Terms of Reference (TOR).
Formative functions: (before or during implementation) an evaluation can inform design and improve implementation. For instance, during a clarificatory evaluation the programme theory will be articulated, assumptions unpacked and the logic of the proposed intervention examined.
Summative functions: (at the end of the programme or close to the end of the project) evaluation findings can inform decisions about upscaling, downsizing or replication, thereby reducing risk.
Strategic functions: evaluation results can help the organization to prioritize its investments; they can facilitate learning and support risk management.
Learning functions: evaluation studies can assist in gaining more insight about human behaviour and how
social change occurs. They can bring understanding on what sort of programmes produce what impact in
what contexts and under what circumstances.
The Terms of Reference and inception reports: Before engaging in an evaluation, the TOR is non-negotiable.
It is a strategy-level document that defines the tasks and duties required of the evaluator or evaluators. It
also highlights the evaluation purpose, scope, objectives and audience as well as expected deliverables and
responsibilities of both the evaluator(s) and the commissioners of the evaluation. Assuming that you have
been engaged by Shangano to conduct a process evaluation of YAH, you would want to ensure that the TOR
has all the things that we have just listed. In addition, you will ensure that the TOR specifies the expected timelines, the reporting lines as well as the initially proposed evaluation design (evaluation designs are discussed in later sections) and questions. Although some of the key items remain the same, TORs vary from
evaluation to evaluation depending on the organisation, field, sector, scope as well as the budget available (for instance, Shangano may decide to hire only a consultant to lead the evaluation, while all data collection at campuses is conducted by trained youth volunteers and provincial coordinators due to the financial constraints of paying a full evaluation team). The Independent Evaluation Group (IEG) provides insightful
guidance on how to develop a TOR. This can be accessed at
https://siteresources.worldbank.org/EXTEVACAPDEV/Resources/ecd_writing_TORs.pdf.
An inception report will follow the TOR. The inception report is prepared by the evaluator. In the scenario where you are evaluating the YAH programme, the inception report is essential. It helps you to get a clear understanding and agreement between yourself and Shangano in terms of what you will deliver. For instance, it details the evaluation questions, the evaluation design, the data collection methods and tools as well as the sampling and selection strategy that you will employ, plus how data will be analysed.
Summary of section one:
In this section, we have defined evaluation. Our point of departure was that program evaluation follows and informs an intervention. Evaluation has a number of purposes that include accountability, supporting decision making and strategic investments. We have looked at how evaluation differs from and relates to research from 4 perspectives – research and evaluation as a dichotomy, evaluation and research as two mutually independent exercises, evaluation as a subset of research, and research as a subset of evaluation. We also
explored the common activities during an evaluation and how the evaluator will respond with specific roles
that require certain competencies. Remember, our emphasis was that no single evaluator comes with all the
competencies required for evaluation work – competencies being the skills, knowledge and attitudes. Among
the evaluation approaches that have influenced the discipline over the years, we looked at utilization focused evaluation (emphasis on evaluation use), realist evaluation (what works for whom in what circumstances) and theory-based evaluation (emphasis on unpacking the theory of the programme – theory of change and theory of action). We also looked at RBME as a management tool for the public sector and identified its emphasis on moving beyond inputs, activities and outputs to consider results (outcomes and impacts).
Section one: quick knowledge check - individual exercise
Before we move to section two, complete exercise one to see what you have learnt and double-check areas where you are still not sure. All the questions are multiple choice and the correct answers will be provided for you to check.
Questions and responses:
1. Evaluation occurs continuously throughout the life-cycle of the program. A. True / B. False
2. An evaluation will never need monitoring data. Evaluators will have sufficient data during their study to make a judgement of merit or worth. A. True / B. False
3. This evaluation approach believes that stakeholders of the evaluation are best placed, more than the evaluator, to judge the quality of the programme. A. Realist evaluation / B. Theory based evaluation / C. Utilization focused evaluation / D. Responsive evaluation
4. These competencies are required of the evaluator in most stages of the evaluation process. A. Cultural competency / B. Interpersonal skills / C. Ethical practice / D. A and B / E. A, B and C
5. RBME is mostly concerned with the outputs produced from programme activities. A. True / B. False
SECTION TWO
Introduction: Having defined evaluation, identified the possible roles and competencies for evaluation
activities and identified the purpose of evaluation as well as the evaluation approaches, we now turn to
evaluation types. This discussion is then followed by evaluation criteria and evaluation standards. You will see
that the evaluation types and the criteria cannot be divorced. We will end off the section by looking at
evaluation and the Sustainable Development Goals. By the end of the section, we expect that you will be able
to do the following:
1. Identify different evaluation types, their purpose and timing in the project life-cycle;
2. Describe the evaluation criteria and the evaluations they apply to;
3. Identify African Evaluation Guidelines;
4. Explain the role of evaluation in the review of national, regional and global progress towards the SDGs.
Types of evaluation
Evaluations have to meet the needs of the commissioner or the one who requests the evaluation. There are a
number of evaluations that can be conducted and these fall into two categories of summative and formative
evaluations (The Pell Institute, ud). Summative evaluations are conducted at the end of the project or
programme to determine the value, merit or worth of the intervention against some criteria or standards
(Scriven, 1967). On the other hand, formative evaluations are carried out at the beginning of the program or
during implementation to inform design (Guyot, 1978). Figure 5 below illustrates how formative and
summative evaluations can be used.
Figure 5: The various uses of summative and formative evaluations; adapted from the PELL Institute Evaluation Tool Kit.
Table 2 below presents a summary of evaluation types, the purpose of the evaluation, the ideal timing within the program lifecycle and the value that particular evaluation presents to the commissioning organization. Let’s look at one example, clarificatory evaluation. This type of evaluation is aimed at clarifying the logic of the intervention (Owen and Rogers, 1999). The key questions would include whether the planned activities will lead to the expected outputs and whether these would be sufficient to lead to the expected outcomes. Clarificatory
evaluations are conducted at the beginning of the programme and during conceptualization and design.
Thus, they are formative by nature. In terms of value to the organization, programme teams and the
management can get a sense of whether they are ready to implement the programme and they can see
whether or not the planned programme is feasible (given their context, resources, experience etc.). Examples
of clarificatory evaluation include feasibility studies, needs assessments, and using the theory of change
framework to get clarity on the logic of the intervention – from activities through to impacts.
Table 2: Evaluation types, timing, purpose and value. Adapted from Mouton, 2015.
1. Type of evaluation: Design/clarificatory evaluation
Purpose: To clarify program logic (goals, objectives, outputs and outcomes) in order to establish feasibility of implementation
Timing: During conceptualization and design of a program
Value to the organisation: Ensures clarity and feasibility of the program and whether the program is ready for implementation

2. Type of evaluation: Process or implementation evaluation
Purpose: To establish whether a program is being implemented properly and whether the target group receives the intervention
Timing: Concurrent with roll-out of the program
Value to the organisation: Provides timely and constant feedback on program roll-out and enables quick adjustments to programs

3. Type of evaluation: Mid-term review(s) (of long-term programs)
Purpose: To establish whether short-term outputs and outcomes are being achieved in order to advise project management on implementation. It can also help the team to look at contextual changes and how the programme/project is adapting.
Timing: Half-way through actual (not planned) implementation
Value to the organisation: Ensures more systematic reviews of implementation and ‘first’ achievements

4. Type of evaluation: Diagnostic evaluation (where there is a perceived risk of no or poor implementation)
Purpose: To establish whether a program is still on course; to identify problems/weaknesses in implementation and to advise donors whether to continue or discontinue funding
Timing: Evaluation commissioned during program implementation for trouble-shooting purposes
Value to the organisation: Ensures an independent assessment of potential problems in program implementation with recommendations for change

5. Type of evaluation: Outcomes evaluations and impact assessments
Purpose: To establish whether the expected outcomes of a program have been achieved and what the overall impact of the program has been
Timing: Commences with a baseline study + continuous monitoring + structured post-implementation measures
Value to the organisation: Provides rigorous and credible assessment of immediate and possible short-term impact of a program and whether money has been well spent (ROI/Value for Money studies)

6. Type of evaluation: Cost-benefit analysis
Purpose: To assess the benefit (value) of a program to the target group against the cost of implementation
Timing: Similar to impact assessment
Value to the organisation: Calculation in some standardized units of both the costs and the benefits accruing from the program
Evaluation Criteria
A criterion is defined as “a standard on which a judgment or decision may be based” (Merriam-Webster). Evaluation criteria are described as:
Evaluation criteria for a project are like assessment criteria for student work. Before we can fairly assess a piece of student work, we need to clearly identify what we are looking for, and on what basis our assessment will be made. https://education.nsw.gov.au/teaching-and-learning/professional-learning/evaluation-resource-hub/evaluation-design-and-planning/setting-the-scope-of-an-evaluation/evaluation-criteria
Such criteria for evaluation as described above have been provided by the OECD DAC (the criteria are currently under review), as shown on slide number 33. It is noteworthy that sometimes, depending on the evaluation type and purpose, one or more criteria will apply:
In addition to using effectiveness as a criterion, an evaluation might employ economic criteria (efficiency in terms of costs and benefits), equity and equality criteria (who benefits, who doesn’t), as well as criteria related to sustainability, cultural and contextual relevance and appropriateness, and sometimes other criteria negotiated with stakeholders (IIED, 2016: 2).
It is important to remember that not all criteria will apply to each and every type of evaluation. Rather, it will
depend on the evaluation type and evaluation purpose. In other words, the criteria are a guide for evaluators
and not a prescription that all should be applied to each evaluation. Table 3 below demonstrates how
different criteria will apply to different evaluations, although the criteria are at times complementary:
Table 3: Evaluation types, purpose and timing. Adapted from Mouton (2015)
1. Type of evaluation: Design/clarificatory evaluation
Purpose: To clarify programme logic (goals, objectives, outputs and outcomes) in order to establish feasibility of implementation. E.g. for the YAH programme, will the activities planned, such as training of peer educators and facilitating web-based discussions, give Shangano the outputs they need to lead to improved life skills of students?
Timing: During conceptualization and design of a program
Value to the organisation: Ensures clarity and feasibility of the program and whether the program is ready for implementation

2. Type of evaluation: Process or implementation evaluation
Purpose: To establish whether a programme is being implemented properly and whether the target group receives the intervention. E.g. for the YAH programme, they would want to find out whether all the intended trainings have been conducted with peer educators as per plan and whether the planned health campaigns have been conducted and the expected services delivered. They would also want to know whether the information has been reaching their key target groups, e.g. female students in their first year or students at campuses that have not had equal access to services.
Timing: Concurrent with roll-out of the program
Value to the organisation: Provides timely and constant feedback on program roll-out and enables quick adjustments to programs

3. Type of evaluation: Mid-term review(s) (of long-term programs)
Purpose: To establish whether short-term outputs and outcomes are being achieved in order to advise project management on implementation. It can also help the team to look at contextual changes and how the programme/project is adapting. E.g. for the YAH programme, they may be interested to find out whether there are any changes occurring within the institutions that affect the implementation of the programme and consequently the outputs and outcomes. With that information, they can adapt implementation.
Timing: Half-way through actual (not planned) implementation
Value to the organisation: Ensures more systematic reviews of implementation and ‘first’ achievements

4. Type of evaluation: Diagnostic evaluation (where there is a perceived risk of no or poor implementation)
Purpose: To establish whether a programme is still on course; to identify problems/weaknesses in implementation and to advise donors whether to continue or discontinue funding. E.g. for the YAH programme, there may be concerns about buy-in from universities and Shangano may want to come up with strategies to better engage the universities’ leadership.
Timing: Evaluation commissioned during program implementation for trouble-shooting purposes
Value to the organisation: Ensures an independent assessment of potential problems in program implementation with recommendations for change

5. Type of evaluation: Outcomes evaluations and impact assessments
Purpose: To establish whether the expected outcomes of a program have been achieved and what the overall impact of the program has been. E.g. for the YAH programme, Shangano would want to know whether or not students have increased access to current and accurate SRH information and services and whether students increasingly use the services available. In addition, they would want to know whether the number of new STI infections has been reduced at the 6 universities.
Timing: Commences with a baseline study + continuous monitoring + structured post-implementation measures
Value to the organisation: Provides rigorous and credible assessment of immediate and possible short-term impact of a program and whether money has been well spent (ROI/Value for Money studies)

6. Type of evaluation: Cost-benefit analysis
Purpose: To assess the benefit (value) of a program to the target group against the cost of implementation. E.g. for the YAH programme, Shangano may want to know whether the resources (financial and human) that have been used to achieve the results could have been put to optimal use and whether there could have been better ways of doing things (a simple worked illustration follows this table).
Timing: Similar to impact assessment
Value to the organisation: Calculation in some standardized units of both the costs and the benefits accruing from the program
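To make the cost-benefit idea concrete, here is a minimal worked illustration using purely hypothetical figures that are not drawn from the YAH case study. One common summary measure is the benefit-cost ratio (BCR), which divides the total monetised benefits of a programme by its total costs; a ratio above 1 suggests that the benefits outweigh the costs.

\[
\text{BCR} = \frac{\text{total monetised benefits}}{\text{total costs}} = \frac{\$300\,000}{\$200\,000} = 1.5
\]

In this hypothetical case, every dollar spent returns an estimated $1.50 in benefits. In practice, the difficult part is attaching credible monetary values to benefits such as infections averted or improved life skills, which is why cost-benefit analyses are usually read alongside the outcome and impact evidence described above.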
Different criteria exist for Evaluations of Humanitarian Action. These include connectedness, coherence, coverage and impact. It has to be noted that evaluations in humanitarian response contexts differ from evaluations of conventional development programs in terms of timing, purpose, available data and the reality that humanitarian responses occur in highly dynamic contexts where needs and priorities
shift constantly. More details on these criteria can be accessed at
https://www.odi.org/sites/odi.org.uk/files/odi-assets/publications-opinion-files/2382.pdf. Better Evaluation
recently added a new page with resources on Evaluation of Humanitarian Response. This can be accessed at
https://www.betterevaluation.org/en/blog/evaluation-humanitarian-action-new-page.
Evaluation Standards, guidelines and principles
There are guidelines, standards and principles that guide evaluation practice. These include the Programme Evaluation Standards and the African Evaluation Guidelines. In this section we focus on the African Evaluation Guidelines (AEG). They were developed by AFREA through a consultative process that spanned 5 years, from 1998 to 2002 (AFREA, https://www.AFREA/African_Evaluation_Guidelines.pdf). As noted by AFREA, these
guidelines are based on the Programme Evaluation Standards (PES) that are used by the American Evaluation
Association. The PES was developed by the American Joint Committee on Standards for Educational
Evaluation (AJCSEE). The standards have been promoted by many VOPEs (UNICEF
https://www.unicef.org/evaluation/files/Evaluation_standards.pdf). They are also promoted by agencies such
as the Centers for Disease Control and Prevention (CDC).
What is the purpose of the guidelines?
AFREA refer to these guidelines as a ‘checklist’ to assist evaluators in ‘planning evaluations, negotiating clear
contracts, reviewing progress and ensuring adequate completion of an evaluation’. It appears that quality
evaluations that are useful, that inform decision making and that are cognizant of stakeholders are at the
heart of the guidelines. A look at the guidelines also reveals that accountability of the evaluation is important.
Recently, the standard ‘Evaluation Accountability’ was added to the four standards by the CDC as a stand-
alone standard (MacDonald, ud). The CDC framework puts the evaluation standards at the core of the
evaluation process, from Step 1 which is stakeholder engagement to step 6 which is ensuring use and sharing
of lessons learned, as shown below in figure 6:
Figure 6: CDC framework for programme evaluation in public health
The standards and the statements: AEG content
Each of the 4 standards in the AEG is supported by statements that further describe the standard. These are detailed in figures 7-10 below. You will see that some of the statements have (modified) in brackets. That means that they have been modified from the original statement in the PES. A detailed description of the process can be accessed at https://afrea.org/the-african-evaluation-guidelines/
Figure 7: AEG utility standards.
Figure 8: AEG feasibility standards
Evaluation standard 1: Utility
Description: The utility guidelines are intended to ensure that an evaluation will serve the information needs of intended users and be owned by stakeholders.
U1 Stakeholder identification (modified): Persons and organisations involved in or affected by the evaluation (with special attention to beneficiaries at community level) should be identified and included in the evaluation process, so that their needs can be addressed and so that the evaluation findings are utilisable and owned by stakeholders, to the extent this is useful, feasible and allowed.
U2 Evaluator credibility: The persons conducting the evaluation should be both trustworthy and competent to perform the evaluation, so that the evaluation findings achieve maximum credibility and acceptance.
U3 Information scope and selection: Information collected should be broadly selected to address pertinent questions about the programme and be responsive to the needs and interests of clients and other specified stakeholders.
U4 Values identification (modified): The perspectives, procedures, and rationale used to interpret the findings should be carefully described, so that the bases for value judgments are clear. The possibility of allowing multiple interpretations of findings should be transparently preserved, provided that these interpretations respond to stakeholders’ concerns and needs for utilisation purposes.
U5 Report clarity: Evaluation reports should clearly describe the programme being evaluated, including its context, and the purposes, procedures and findings of the evaluation, so that essential information is provided and easily understood.
U6 Report timeliness and dissemination (modified): Significant interim findings and evaluation reports should be disseminated to intended users, so that they can be used in a reasonably timely fashion, to the extent that this is useful, feasible and allowed. Comments and feedback of intended users on interim findings should be taken into consideration prior to the production of the final report.
U7 Evaluation impact: Evaluations should be planned, conducted and reported in ways that encourage follow through by stakeholders, so that the likelihood that the evaluation will be used is increased.
Figure 8: AEG feasibility standards
Evaluation standard 2: Feasibility
Description: The feasibility guidelines are intended to ensure that an evaluation will be realistic, prudent, diplomatic, and frugal.
F1 Practical procedures: The evaluation procedures should be practical to keep disruption to a minimum while needed information is obtained.
F2 Political viability (modified): The evaluation should be planned and conducted with anticipation of the different positions of various interest groups, so that their cooperation may be obtained, and so that possible attempts by any of these groups to curtail evaluation operations or to prejudice or misapply the results can be averted or counteracted to the extent that this is feasible in the given institutional and national situation.
F3 Cost effectiveness (modified): The evaluation should be efficient and produce information of sufficient value, so that the resources expended can be justified. It should keep within its budget and account for its own expenditures.
Figure 9: AEG propriety standards.
Evaluation standard 3: Propriety
Description: The propriety guidelines are intended to ensure that an evaluation will be conducted legally, ethically and with due regard for welfare of those involved in the evaluation, as well as those affected by its results.
P1 Service orientation: Evaluation should be designed to assist organisations to address and effectively serve the needs of the full range of targeted participants.
P2 Formal agreements (modified): Obligations of the formal parties to an evaluation (what is to be done, how, by whom, when) should be agreed to through dialogue and in writing, to the extent that this is feasible and appropriate, in order for these parties to have a common understanding of all the conditions of the agreement and hence are in a position to formally renegotiate it if necessary. Specific attention should be paid to informal and implicit aspects of expectations of all parties to the contract.
P3 Rights of human participants (modified): Evaluation should be designed and conducted to respect and protect the rights and welfare of human subjects and the communities of which they are members. The confidentiality of personal information collected from various sources must be strictly protected.
P4 Human interaction (modified): Evaluators should respect human dignity and worth in their interactions with other persons associated with an evaluation, so that participants are not threatened or harmed or their cultural or religious values compromised.
P5 Complete and fair assessment: The evaluation should be complete and fair in its examination and recording of strengths and weaknesses of the programme being evaluated, so that strengths can be built upon and problem areas addressed.
P6 Disclosure of findings (modified): The formal parties to an evaluation should ensure that the full set of evaluation findings along with pertinent limitations are made accessible to the persons affected by the evaluation, and any others with expressed legal rights to receive the results as far as possible. The evaluation team and the evaluating institution will determine what is deemed possible, to ensure that the needs for confidentiality of national or governmental entities and of the contracting agents are respected, and that the evaluators are not exposed to potential harm.
P7 Conflict of interest: Conflict of interest should be dealt with openly and honestly, so that it does not compromise the evaluation.
P8 Fiscal responsibility: The evaluator’s allocation and expenditure of resources should reflect sound accountability
Figure 10: AEG accuracy standards
Evaluation standard 4: Accuracy
Description: The accuracy guidelines are intended to ensure that an evaluation will reveal and convey technically adequate information about the features that determine the worth or merit of the programme being evaluated.
A1 Programme documentation (modified): The programme being evaluated should be described clearly and accurately, so that the programme is clearly identified, with attention paid to personal and verbal communications as well as written records.
A2 Context analysis (modified): The context in which the programme exists should be examined in enough detail, including political, social, cultural and environmental aspects, so that its likely influences on the programme can be identified and assessed.
A3 Described purposes and procedures: The purposes and procedures of the evaluation should be monitored and described in enough detail, so that they can be identified and assessed.
A4 Defensible Information sources (modified): The sources of information used in a programme evaluation should be described in enough detail, so that the adequacy of the information can be assessed, without compromising any necessary anonymity or cultural or individual sensitivities of respondents.
A5 Valid information (modified): The information gathering procedures should be chosen or developed and then implemented so that they will assure that the interpretation arrived at is valid for the intended use. Information that is likely to be susceptible to biased reporting should be checked using a range of methods and from a variety of sources.
A6 Reliable information: The information gathering procedures should be chosen or developed and then implemented so that they will assure that the information obtained is sufficiently reliable for the intended use.
A7 Systematic information: The information collected, processed and reported in an evaluation should be systematically reviewed and any errors found should be corrected.
A8 Analysis of quantitative information: Quantitative information in an evaluation should be appropriately and systematically analysed so that evaluation questions are effectively answered.
A9 Analysis of qualitative information: Qualitative information in an evaluation should be appropriately and systematically analysed so that evaluation questions are effectively answered.
A10 Justified conclusions: The conclusions reached in an evaluation should be explicitly justified, so that stakeholders can assess them.
A11 Impartial reporting: Reporting procedures should guard against distortion caused by personal feelings and biases of any party to the evaluation, so that evaluation reports fairly reflect the evaluation findings.
A12 Meta-evaluation: The evaluation itself should be evaluated in a formative and summative manner against these and other pertinent guidelines, so that its conduct is appropriately guided and, on completion, stakeholders can closely examine its strengths and weakness.
Sustainable Development Goals
In 2015, the United Nations adopted the 2030 Agenda for Sustainable Development and its 17 Sustainable Development Goals (SDGs). The SDGs build on the successes of the Millennium Development Goals (MDGs). Also known as the Global Goals, the 17 SDGs are “a universal call to action to end poverty, protect the planet and ensure that all people enjoy peace and prosperity” (UNDP).
SDGs indicators: Each of the 17 ambitious SDGs has its own set of targets – 169 in total – and each target has one or more indicators (around 230 indicators in all). You will notice that specific programs, projects and policies contribute to specific SDGs. And the SDGs themselves are not mutually exclusive – progress in one is essential for progress to be realized in another. The outcomes and impacts in your programme logframe or logic model can tell you which SDGs you are contributing to. For example, an organization that implements programs and projects in the shelter or housing sector is likely to be contributing to SDG number 11 (Sustainable cities and communities) and, through work in Water, Sanitation and Hygiene (WASH), to SDG number 6 (Clean water and sanitation). Below is an example of SDG number 6 with its targets and indicators. Details of all the SDGs and their targets and indicators can be accessed at https://unstats.un.org/sdgs/metadata/.
Figure 11: SDG 6 targets and indicators.
Goal 6: Ensure availability and sustainable management of water and sanitation for all
Target 6.1: By 2030, achieve universal and equitable access to safe and affordable drinking water for all
Indicator 6.1.1: Proportion of population using safely managed drinking water services
Target 6.2: By 2030, achieve access to adequate and equitable sanitation and hygiene for all and end open defecation, paying special attention to the needs of women and girls and those in vulnerable situations
Indicator 6.2.1: Proportion of population using safely managed sanitation services, including a hand-washing facility with soap and water
Target 6.3: By 2030, improve water quality by reducing pollution, eliminating dumping and minimizing release of hazardous chemicals and materials, halving the proportion of untreated wastewater and substantially increasing recycling and safe reuse globally
Indicator 6.3.1: Proportion of wastewater safely treated
Indicator 6.3.2: Proportion of bodies of water with good ambient water quality
Target 6.4: By 2030, substantially increase water-use efficiency across all sectors and ensure sustainable withdrawals and supply of freshwater to address water scarcity and substantially reduce the number of people suffering from water scarcity
Indicator 6.4.1: Change in water-use efficiency over time
Indicator 6.4.2: Level of water stress: freshwater withdrawal as a proportion of available freshwater resources
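As an illustration only, the sketch below (in Python, which is not part of this module) shows how a proportion-type indicator such as 6.1.1 might be computed from household survey data. The household records and survey weights are invented; real SDG reporting relies on nationally representative surveys and the official metadata at the link above.

```python
# Hypothetical sketch: computing a proportion-type indicator (e.g. SDG 6.1.1)
# from invented household survey records. Not real data.
households = [
    # (survey weight, household size, uses safely managed drinking water?)
    (1.2, 4, True),
    (0.8, 6, False),
    (1.0, 3, True),
    (1.5, 5, True),
    (0.9, 2, False),
]

# Weighted population using the service, divided by the weighted total population.
covered = sum(weight * size for weight, size, safe in households if safe)
total = sum(weight * size for weight, size, _ in households)
indicator_6_1_1 = 100 * covered / total

print(f"Proportion using safely managed drinking water: {indicator_6_1_1:.1f}%")
```

The same pattern applies to most proportion-based SDG indicators: define the numerator and denominator precisely, then aggregate using the survey weights.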
SDGs and M&E: Programs, projects and policies contribute to the attainment of SDGs at the national and global level.
Evaluation will be essential for the review of systems for the SDGs (International Institute for Environment
and Development). The 2030 Agenda committed to systematic review and engagement over the next 15
years to measure progress. This will be accomplished through high quality evaluations. As argued by D'Errico
et al (2016), such evaluation goes beyond ‘conducting a survey’. Rather, it involves critical thinking, asking the
right questions and analyzing claims before arriving at a judgment of value, worth or merit. It is not enough
to monitor and measure. Evaluation of the SDGs is key to show not just how but why progress has been made
nationally, regionally and globally. It can also inform improvements on future national, regional and global
initiatives (IIED, 2016).
Summary of section two
In this section, we built upon the foundation that was laid in section one. We looked at the different evaluation
types and came to the understanding that evaluations are either formative or summative by nature.
Formative evaluations are conducted at the beginning or conceptualization stages of the programme e.g.
clarificatory evaluations which can be needs assessments or unpacking programme theory. Summative
evaluations are carried out at the end or towards the end of the programme, for instance outcome
evaluations and impact evaluations. We also looked at evaluation criteria such as effectiveness (have we met the objectives of the programme?) and impact (what are the long-term effects, intended and unintended, of the programme?). Guided by the OECD DAC criteria for evaluations, our point of emphasis was that the criteria are not prescriptive and that the criteria used will depend on the evaluation type and purpose. We also
looked at the African Evaluation Guidelines developed by AFREA. Remember that the guidelines were
adapted from the Program Evaluation Standards and AFREA kept the 4 standards of utility, feasibility,
propriety and accuracy. We ended section two by briefly looking at the SDGs and evaluation for SDGs and
noted that quality evaluation is critical for the review of the progress being made by the SDGs.
Section two: quick knowledge check - individual exercise
Before we move to section three, complete exercise two to see what you have learnt and double-check areas where you are still not sure. All the questions are multiple choice and the correct answers will be provided for you to check.
Questions and responses
1. Clarificatory evaluation is usually conducted at the end of the programme to see whether or not programme objectives have been achieved.
A. True
B. False
2. Which evaluation type is conducted halfway through the programme implementation?
A. Diagnostic evaluation
B. Clarificatory evaluation
C. Impact evaluation
D. Midterm review
E. None of the above
3. Which OECD DAC evaluation criterion is concerned with the question: ‘the rate at which the inputs are converted to outputs – have we made optimal use of resources to achieve results?’
A. Efficiency
B. Effectiveness
C. Relevance
D. Value for money
4. Which evaluation standard is concerned with whether or not the evaluation meets the information needs of the evaluation stakeholders?
A. Propriety
B. Accuracy
C. Utility
D. All of the above
5. Which of these statements is not true?
a. There are 17 Sustainable Development Goals
b. There are 200 targets for the SDGs
c. You are able to see which SDGs your programme is contributing towards nationally by looking at the impacts or outcomes in your logframe or logic model
d. Evaluation will help us to understand why and how progress has been made towards the 2030 agenda
A. a and c
B. a, c and d
C. a and b
D. b only
SECTION THREE
Introduction: In this section we turn to evaluation designs. We will discuss three key types of evaluation design: the classic experimental design, the quasi-experimental design and the non-experimental design. We will touch on some of the considerations that you make when selecting a design, which include the evaluation purpose, research considerations such as external and internal validity, and the resources and time available. At the end of this section, we expect that you will be able to do the following:
1. Identify the key evaluation designs;
2. Identify the main considerations you make when selecting an evaluation design;
3. Identify some of the key strengths and weaknesses of the different designs.
Evaluation Designs
Evaluation design is a critical part of the evaluation planning process. It is important to remember that the
evaluation design that you adopt has to fit the purpose and objective of the evaluation. As shown in slide 45
in the accompanying power point presentation, there are a number of considerations to be made when
selecting the evaluation design. The evaluation design will determine the methodology that will be used for data collection and, naturally, for the analysis of the data. The decision-framework for selecting an evaluation design
presented by Mouton (2014) on slide number 46 shows how the design is tied to the evaluation purpose
(which could be improvement or judgement oriented). This translates into the evaluation type. As discussed
in section two, evaluation types include clarificatory evaluation, outcome evaluation or impact evaluations. It
is a good idea to have someone with expertise to assist you in the process of selecting the evaluation design.
There are three main evaluation designs:
1. The experimental design;
2. The quasi-experimental design and
3. The non-experimental design
Table 4 below provides a summary of the 3 key evaluation designs for an outcome evaluation; their strengths
and challenges:
Table 4: Summary of evaluation designs. Adapted from go2itech (ud:4) https://www.go2itech.org/wp-content/uploads/2017/07/Evaluation-Design-and-Methods.pdf
Design type: Experimental
Description: Compares intervention with non-intervention; uses controls that are randomly assigned.
Examples: Randomized controlled trial (RCT); a pre-post design with a randomized control group is one example of an RCT.
Strengths: Can infer causality with the highest degree of confidence.
Challenges: Most resource-intensive; requires ensuring minimal extraneous factors; sometimes challenging to generalize to the “real world”.

Design type: Quasi-experimental
Description: Compares intervention with non-intervention; uses controls or comparison groups that are not randomly assigned.
Examples: Pre-post design with a non-randomized comparison group.
Strengths: Can be used when you are unable to randomize a control group, but you will still be able to compare across groups and/or across time points.
Challenges: Differences between comparison groups may confound; group selection is critical; moderate confidence in inferring causality.

Design type: Non-experimental
Description: Does not use a comparison or control group.
Examples: Case control (post-intervention only) – retrospectively compares data between intervention and non-intervention groups; pre-post with no control – data from one group are compared before and after the training intervention.
Strengths: Simple design, used when baseline data and/or comparison groups are not available and for descriptive studies; may require the least resources to conduct the evaluation.
Challenges: Minimal ability to infer causality.
The experimental design:
The classic experimental design has two features, as discussed below and shown in figure 12 (Mouton, 2016; The Provincial Center of Excellence for Child and Youth Mental Health; Badiei, 2012; AmeriCorps, ud):
1. Random assignment of people into treatment and non-treatment groups. This means that people in a given context are randomly assigned to the intervention group (the treatment group), where they receive the intervention. In the case of our example of the YAH programme at Shangano, students from the 6 universities will be the intervention group as they will be exposed to or will participate in the YAH programme. Alternatively, they are assigned to the control group (the non-treatment group), where they are not exposed to the programme or intervention. Students from universities that do not get the YAH programme (about 4 universities in Zimbabwe) will be selected to be part of the control group;
2. Both groups can be given pre (baseline) and post (outcome) tests to determine their status before and after the intervention. In that way, the evaluators can attribute effects or changes that they observe in the ‘treatment group’ to their programme. A simple analysis sketch for such a design is given after the figure below.
Figure 12: the classic experimental design. Adapted from Mouton (2016: 17)
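To make the logic of random assignment plus pre and post measurement concrete, here is a minimal, hypothetical sketch in Python (code is not part of the original module). The outcome measure, scores and group sizes are invented; in practice you would use a proper statistical test and a much larger sample.

```python
# Hypothetical sketch of the analysis implied by a randomized pre-post design.
# All scores are invented for illustration; this is not YAH programme data.
from statistics import mean

treatment_pre = [52, 48, 55, 60, 47, 58]    # intervention group at baseline
treatment_post = [71, 66, 78, 80, 69, 75]   # intervention group at endline
control_pre = [50, 53, 49, 57, 51, 55]      # control group at baseline
control_post = [56, 58, 54, 61, 55, 60]     # control group at endline

# Change observed in each group between baseline and endline.
treatment_change = mean(treatment_post) - mean(treatment_pre)
control_change = mean(control_post) - mean(control_pre)

# Because assignment was random, the control group's change estimates what
# would have happened without the programme; the gap between the two changes
# is the effect attributed to the programme.
programme_effect = treatment_change - control_change
print(f"Treatment change: {treatment_change:.1f}")
print(f"Control change: {control_change:.1f}")
print(f"Estimated programme effect: {programme_effect:.1f}")
```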
Experimental designs have been criticized for the following:
1. They encourage the ‘black box’ problem in evaluations. This means that they can attribute change to the programme based on the changes observed in the assigned groups, but they cannot tell us how and why change occurred. Referring to the YAH programme, we might be able to claim at the end of the programme that YAH has resulted in improved health among students, but we would not be able to tell how and why this change has occurred. We would also not be able to tell which combination of components in the programme worked best together to produce the results that are now seen;
2. In real life, it is not easy to keep those in the control group from accessing the intervention e.g.
community members from a community without a new clinic may access services from the clinic in
another community. Students from Zimbabwe Open University could still access information from
the University of Zimbabwe, for instance if we go back to our example of the YAH programme.
Evaluation approaches such as the Realistic Evaluation Approach (discussed under the Evaluation
Approaches section) by Pawson and Tilley were a response to this weakness;
3. They do not take into account that some other programmes or projects in the context may be causing the effect observed among the ‘treatment’ group. For example, a programme in a village may aim at teaching community members good hygiene practices in their homes. At the same time, community members may be receiving information from the clinic on good hygiene practice as well as through media campaigns on the local radio station. In the YAH programme example, it could be that students at the universities are also accessing information on social media platforms such as Twitter and Facebook. Furthermore, they could be accessing information from other service providers when they go home on semester breaks.
Quasi-experimental designs
Quasi-experimental research is similar to experimental research in that there is manipulation of an independent variable. It differs from experimental research because either there is no control group, no random selection, no random assignment, and/or no active manipulation (Abraham and McDonald, 2011).
Some of the leading proponents of quasi-experimental designs include Carol Weiss, Thomas Cook and Peter
Rossi (Mouton, 2016). In this design, two groups that have similar features are selected. However, random
selection is not used to assign people or communities to the groups. Both groups are then measured before
and after the one group has received the intervention. For example, 10 schools in a district may be selected
to receive a Heads Of Department training intervention and 10 other schools in the same district with similar
features would not receive the same intervention but be measured before and after the intervention
(Measure Evaluation, ud). Bringing in our YAH programme example, universities that have similar
characteristics with those receiving the YAH programme would be considered as comparison groups. These
could be universities that are also public universities and have a similar mix of programmes and faculties as
those in the intervention group. The baseline study would be conducted for all the institutions. Then they
would be measured at the end of 2022 with the planned outcome evaluation. Below are some of the key
features of quasi-experimental designs (a short illustration of propensity score matching follows the list):
1. Comparison groups are built by matching: statistical matching (for example, propensity score matching) is the stronger approach, while judgmental matching is weaker;
2. It can be used after the target group for the programme has been selected;
3. It is not free from issues of sample selection bias.
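As a rough illustration of propensity score matching (one of the matching techniques listed above), the hypothetical sketch below estimates each university's probability of being in the programme from observed characteristics and then pairs each treated university with the most similar untreated one. The covariates, outcomes and use of scikit-learn are assumptions made for the example only.

```python
# Hypothetical sketch of propensity score matching. All data are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Covariates thought to drive selection into the programme
# (e.g. enrolment in thousands, share of health-related faculties).
X = np.array([
    [12.0, 0.30], [9.5, 0.25], [15.0, 0.40], [11.0, 0.28],   # treated
    [8.0, 0.10], [14.5, 0.35], [10.0, 0.22], [13.0, 0.33],   # untreated
])
treated = np.array([1, 1, 1, 1, 0, 0, 0, 0])
outcome = np.array([72, 65, 80, 70, 50, 66, 58, 68])  # endline scores

# 1. Estimate each unit's propensity score: P(treated | covariates).
model = LogisticRegression().fit(X, treated)
pscore = model.predict_proba(X)[:, 1]

# 2. Pair each treated unit with the untreated unit whose score is closest.
treated_idx = np.where(treated == 1)[0]
control_idx = np.where(treated == 0)[0]
pair_gaps = []
for i in treated_idx:
    j = control_idx[np.argmin(np.abs(pscore[control_idx] - pscore[i]))]
    pair_gaps.append(outcome[i] - outcome[j])

# 3. The average gap across matched pairs estimates the effect on the treated,
#    assuming the covariates capture everything that drives selection.
print(f"Estimated effect on the treated: {np.mean(pair_gaps):.1f}")
```

Note that the validity of this estimate rests entirely on the assumption that there is no unmeasured confounding, which is why the list above stresses that quasi-experimental designs are not free from selection bias.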
Non-experimental evaluation designs
When using a non-experimental design, only people who are participating in the program or have received
the intervention get the pre- and post-test (Measure Evaluation, ud). This means that in terms of the YAH
programme, Shangano would only be concerned with baseline and post-implementation measurements of the 6 universities that are participating in the programme. In terms of scientific rigor, this design is deemed the weakest: it does not provide information about what may have occurred in the absence of the intervention. Examples of non-experimental evaluation designs include a) correlation studies, b) mixed methods, c) case studies and d) surveys. A brief sketch of a pre-post (no control) analysis follows below.
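For completeness, here is a minimal, hypothetical sketch of the pre-post (no control group) analysis described above, again in Python with invented scores; the closing comment repeats the design's key limitation.

```python
# Hypothetical sketch of a pre-post analysis with no control group.
# Scores are invented; this is not YAH programme data.
from statistics import mean, stdev
from scipy import stats  # paired t-test

baseline = [54, 49, 61, 47, 58, 52]   # participants before the programme
endline = [68, 60, 77, 59, 70, 66]    # the same participants afterwards

change = [post - pre for pre, post in zip(baseline, endline)]
t_stat, p_value = stats.ttest_rel(endline, baseline)

print(f"Mean change: {mean(change):.1f} (sd {stdev(change):.1f})")
print(f"Paired t-test: t = {t_stat:.2f}, p = {p_value:.3f}")
# Even a large, 'significant' change cannot be attributed to the programme
# alone, because there is no comparison group showing what would have
# happened without it.
```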
Summary of section three
In this section we looked at the three key evaluation designs; experimental, quasi-experimental and
non-experimental designs. We saw that the experimental design and the quasi-experimental design differ when it comes to assignment into treatment and non-treatment groups. The former assigns randomly and carries out pre and post tests with both groups (the intervention group and the control group). The latter does not assign to groups randomly; rather, the ‘comparison’ group is made up of a community or population with similar traits that does not participate in the programme.
Non-experimental designs conduct pre and post measures only for the group that has participated in the
intervention.
Section three: quick knowledge check - individual exercise
Well done on completing sections one, two and three. Here is another quick exercise for you to complete to see what you have learnt and double-check areas that you need to revisit. Just as in sections one and two, the questions are multiple choice and the correct answers will be provided for you to check.
Questions and responses
1. This evaluation design has been criticized because it doesn’t help us to explain how change happens.
A. Quasi-experimental design
B. Non-experimental design
C. Experimental design
2. In this design, there is group assignment but the assignment is not random. The group that does not get the intervention is called the comparison group.
A. Quasi-experimental design
B. Non-experimental design
C. Experimental design
3. Case studies, mixed methods and surveys are examples of which evaluation design?
A. Quasi-experimental design
B. Non-experimental design
C. Experimental design
4. The Randomized Controlled Trial is an example of a classic experimental evaluation design.
A. True
B. False
References
African Evaluation Association (ud). The African Evaluation Guidelines. Available at https://afrea.org/the-african-evaluation-guidelines/ Accessed 4 July 2019
Better Evaluation. Empowerment Evaluation. Available at https://www.betterevaluation.org/en/plan/approach/empowerment_evaluation Accessed 17 June 2019
Business Dictionary (ud). Available at http://www.businessdictionary.com/definition/program-evaluation.html. Accessed 20 May 2019.
International Training and Education Center for Health (ud). Choose Evaluation Designs and Methods. Available at https://www.go2itech.org/wp-content/uploads/2017/07/Evaluation-Design-and-Methods.pdf Accessed 3 July 2019
Guyot, W.M. (1978). Summative and Formative Evaluation. The Journal of Business Education. 54(3):127-129.
International Institute for Environment and Development (2016). Effective Evaluation for the Sustainable Development Goals. Available at https://www.iied.org/effective-evaluation-for-sustainable-development-goals
MacDonald, G (ud), Framework for Program Evaluation in Public Health: A Checklist of Steps and Standards. Available at https://wmich.edu/sites/default/files/attachments/u350/2014/CDC_Eval_Framework_Checklist.pdf Accessed 4 June 2019
Merriam-Webster Dictionary. Available at https://www.merriam-webster.com/dictionary/criterion Accessed 3 June 2019
Mouton, J. (2014). Programme Evaluation Designs and Methods. SunMedia. Stellenbosch.
NSW Government (ud). Evaluation Criteria. Available at https://education.nsw.gov.au/teaching-and-learning/professional-learning/evaluation-resource-hub/evaluation-design-and-planning/setting-the-scope-of-an-evaluation/evaluation-criteria Accessed 3 June 2019
Podems, D. (2014). Evaluator competencies and professionalising the field: where are we now? “The Canadian Journal of Programme Evaluation” Vol.28. No.3. Pp. 127-136. Canadian Evaluation Society.
Quora (2018). What is the Difference Between Monitoring and Evaluation? Available at https://www.quora.com/What-is-the-difference-between-monitoring-and-evaluation Accessed 13 June 2019
Ramírez, R. and Brodhead, D. (2010). Utilization Focused Evaluation: A Primer for Evaluators. Available at https://evaluationinpractice.files.wordpress.com/2013/04/ufeenglishprimer.pdf Accessed 2 July 2019
Robinson, J (2002). Responsive Evaluation: Responsible Evaluation: What is Evaluation Research for? Putting Derrida to Stake. Available at http://www.leeds.ac.uk/educol/documents/00002586.htm Accessed 8 July 2019
Sheriff, B and Potter, S (ud). An Introduction to Empowerment Evaluation: Teaching Materials. Available at https://pdfs.semanticscholar.org/090e/7cbd7895c72d2f0c19901603f9351b384af1.pdf Accessed 15 July 2019
Surbhi (2017). Differences Between Monitoring and Evaluation. Available at
https://keydifferences.com/difference-between-monitoring-and-evaluation.html Accessed 12 June 2019
The PELL Institute (ud). Evaluation Tool Kit. Available at
http://toolkit.pellinstitute.org/evaluation-101/evaluation-approaches-types/ Accessed on 5 July 2019
UNESCO (2016). Designing Effective Monitoring and Evaluation of Education Systems for 2030: A Global Synthesis of Policies and Practices. Available at http://www.unesco.org/new/fileadmin/MULTIMEDIA/HQ/ED/pdf/me-report.pdf Accessed 3 July 2019
UNICEF (ud) Evaluation Standards. Available at https://www.unicef.org/evaluation/files/Evaluation_standards.pdf Accessed 10 June 2019
Waidya, M. (2012). Results-Based Monitoring and Evaluation System (RBME): A Tool for Public Sector Management. Available at https://www.slideshare.net/madhawa66/results-based-monitoring-and-evaluation Accessed on 5 September 2019