INTRODUCTION TO EVALUATION MODULE
SEPTEMBER 2019
Table of Contents
SECTION ONE
Defining Evaluation
Evaluation and monitoring
Evaluation and research
Evaluation approaches
Evaluation activities and the evaluator
What is the purpose of programme evaluations?
SECTION TWO
Types of evaluation
Evaluation Criteria
Evaluation Standards, guidelines and principles
Sustainable Development Goals
SECTION THREE
Evaluation Designs
Table of figures:
Figure 1: Relationship between monitoring and evaluation. Adapted from Wildschut (2015)
Figure 2: Programme life cycle
Figure 3: Utilization focused evaluation steps. Adapted from Ramírez and Brodhead (2010:1)
Figure 4: RBME and the logic model
Figure 5: The various uses of summative and formative evaluations; adapted from the PELL Institute Evaluation Tool Kit
Figure 6: CDC framework for programme evaluation in public health
Figure 7: AEG utility standards
Figure 8: AEG feasibility standards
Figure 9: AEG propriety standards
Figure 10: AEG accuracy standards
Figure 11: SDG 6 targets and indicators
Figure 12: The classic experimental design. Adapted from Mouton (2016: 17)
SECTION ONE
Defining Evaluation
Introduction: We will refer to slides in the accompanying PowerPoint presentation in this section and in
sections two and three that follow. We will also refer to the case study of the Youth in Action for Health
Programme from Shangano Organisation. In this section, we will look at some of the definitions of evaluation.
We will also look at the relationship between evaluation and monitoring and the relationship between
evaluation and research. Following this, we will explore some of the approaches that have influenced the
discipline and practice of evaluation over time. Then we will zoom in on the value and purpose of evaluation.
This will be followed by a quick look at some of the common activities that an evaluator undertakes during an
evaluation process and the possible roles that the evaluator plays and inevitably, the competencies the
evaluator may need to be able to play those roles. At the end of this section, our expectation is that you
should be able to do the following:
1. Define evaluation;
2. Differentiate evaluation from monitoring and research but also be able to see where they intersect;
3. State some of the evaluation approaches and their key features;
4. Explain the different purposes of evaluation;
5. Identify common evaluation activities, evaluator roles and the competencies you need to play those
roles;
Evaluation is defined as:
Evaluation is the systematic assessment of the operation and/or the outcomes of a program or policy, compared to a set of explicit or implicit standards as a means of contributing to the improvement of the program or policy. (Weiss, 1975)
‘Evaluation is the systematic and objective assessment of an ongoing or completed project, program, or policy, including its design, implementation, and results. The aim is to determine the relevance and fulfillment of objectives, development efficiency, effectiveness, impact, and sustainability’ (OECD, 2002:21).
Along the same lines as the definitions above, evaluation is defined by the Business Dictionary as a detailed
assessment of the outcome of a program. This assessment is carried out against established measures or
expected results to determine if it achieved its objectives. From these definitions, the following can be noted
about evaluation:
1. It follows and informs an intervention – project, programme or policy;
3
2. It seeks to establish whether outcomes (changes that we expect to see among beneficiaries because of the intervention) have occurred, why and how they have occurred, and in what context and under what conditions;
3. It requires specific competencies (skills, knowledge and attitudes) from those that conduct
evaluations;
4. It is guided by a set of criteria or standards (discussed in sections that follow);
5. It involves critical thinking and use of information before reaching a judgement about the
intervention and
6. It seeks to improve programmes, projects and policies and inform decision-making.
Evaluation and monitoring
UNAIDS defines monitoring as the routine tracking and reporting of priority information about a program/project, its inputs and intended outputs, outcomes and impacts. This definition is in line with the definition offered by the OECD, which describes monitoring as the ‘continuous and systematic collection of data on specified indicators. The aim is to provide management and the main stakeholders of an ongoing intervention with indications of the extent of progress and achievement of objectives and progress in the use of allocated funds’ (OECD, 2002:27).
Looking at these definitions of monitoring and those of evaluation discussed above, it is evident that one of the key differences is that evaluation is carried out at sporadic points in the life cycle of the programme while monitoring is ongoing (Mouton, 2016; Kellogg Foundation, ud; Adhikari, 2017). Secondly, while evaluation is concerned with outputs, outcomes and impact, monitoring focuses on activities (what programme team members and implementing partners do), inputs (what they put in, e.g. human and financial resources) and outputs (the direct results of the activities carried out by programme staff – these can be tangible or intangible, e.g. classrooms constructed, boreholes drilled or awareness campaigns
conducted). Referring to our case study of the YAH programme at Shangano Organisation, activities would
include training peer educators and setting up WhatsApp groups. Outputs would include resource centers
that have been established. One key outcome would be that students have increased uptake or use of
available health services. At impact level, the change we would expect is a reduction in new STIs at the 6
universities.
Table 1 below, adapted from Surbhi (2017) gives a summary of the differences between monitoring and
evaluation:
Table 1: Differences between monitoring and evaluation.
Meaning – Monitoring: a routine process that examines the activities and progress of the project and identifies bottlenecks during the process. Evaluation: a sporadic activity used to draw conclusions regarding the relevance and effectiveness of the project or program.
Related to – Monitoring: observation. Evaluation: judgement.
Occurs at – Monitoring: operational level. Evaluation: business level.
Process – Monitoring: short term. Evaluation: long term.
Focuses on – Monitoring: improving efficiency. Evaluation: improving effectiveness.
Conducted by – Monitoring: internal party. Evaluation: internal or external party.
Adapted from Surbhi, 2017. https://keydifferences.com/difference-between-monitoring-and-evaluation.html
Now if we turn back to our focus programme which is YAH, what do you think would be the key differences
between the monitoring and evaluation activities? You have probably considered a range of possibilities that
include the following:
1. The YAH clarificatory/design evaluation was conducted at the beginning, when they were conceptualizing the programme. They are likely to have carried out feasibility assessments and/or needs assessments to get an understanding of the extent of the problem of access to SRH information and services and their uptake at universities. They are highly likely to have used tools such as the problem/solution analysis tree (more details of this tool available at: https://www.odi.org/publications/5258-problem-tree-analysis)
2. Shangano intends to have a process evaluation at the end of the second year of the programme,
which is at the mid-term of the programme. Their aim will be to see if they have been implementing
this programme in the universities according to plan and also look at how to improve their
implementation.
3. They will have an outcome evaluation at the end of the programme to see whether they have
achieved their objectives, one of which is to improve the life skills of university students. Three years
after the programme ends they will have an impact study to check whether the programme has had
long-term effects in the universities.
4. Shangano will continuously collect data, monitoring data to check progress on outputs towards the
set targets – this data will be collected and analyzed monthly or quarterly. For instance, data on
participation on web-based platforms and use of services such as HIV Counseling and Testing will be
collected monthly while data on leadership engagement will be collected quarterly.
If you have come to the conclusion that this emphasizes that evaluation is periodic while monitoring is
continuous, then you are right! The YAH team will continuously monitor and only evaluate at specific points
in time.
The relationship between monitoring and evaluation
Before you evaluate, you must monitor (McKenna, 2018). This already indicates that there is a relationship
between the two. Evaluation leads to a judgment of merit or worth and some of the key data used to support
the inferences made are gathered during monitoring. Schwandt (2016) makes a similar observation and
notes that evaluation makes use of monitoring data. Monitoring and evaluation complement and reinforce
each other (UNESCO, 2016). As such, Shangano organization will not be able to have an impact study if they
have not collected baseline and monitoring data on the YAH programme. The relationship between the two is
shown in figure 1 below:
Figure 1: Relationship between monitoring and evaluation. Adapted from Wildschut (2015)
Much of our discussion in this module will elaborate on evaluations of programmes and projects. To access more information on evaluations of policies, you can visit these websites:
African monitoring and evaluation systems: exploratory case studies https://www.dpme.gov.za/publications/Reports%20and%20Other%20Information%20Products/Case%20Studies.pdf
The World Bank: Building Better Policies : The Nuts and Bolts of Monitoring and Evaluation Systems: http://www.managingforimpact.org/resource/world-bank-building-better-policies-nuts-and-bolts-monitoring-and-evaluation-systems
Monitoring and evaluation of policy influence and advocacy: http://www.managingforimpact.org/resource/monitoring-and-evaluation-policy-influence-and-advocacy
Evaluation guidelines: http://www.managingforimpact.org/resource/evaluation-guidelines
The World Bank: Handbook on Impact Evaluation: Quantitative Methods and Practices: http://www.managingforimpact.org/sites/default/files/resource/world_bank_impact_evaluation_handbook.pdf
How Feedback Loops Can Improve Aid (and Maybe Governance): http://www.managingforimpact.org/sites/default/files/resource/center_for_global_development_feedback_loops_whittlefeedbackessay_0.pdf
Influential Evaluations: Evaluations that Improved Performance and Impacts of Development Programs
http://www.managingforimpact.org/sites/default/files/resource/influential_evaluations_ecd.pdf
M&E and the programme life cycle
In the Monitoring section of this module we discussed the programme life cycle. For this module we adopted the programme life cycle used by UNODC as shown in figure 2 below:
Figure 2: Programme life cycle
Monitoring and evaluation is increasingly being viewed as a programme management tool. As shown in the
programme life cycle above, M&E is not a stand-alone activity at the end of the programme or only at the
beginning of the programme. M&E is an integral part of the programme. To drive the point home, let’s look
at the YAH programme at Shangano organisation:
1. Stage 1 – Planning: M&E was part of the planning stage of their programme life cycle. As shown in the figure above, during planning, baseline studies, needs assessments, stakeholder analysis, the M&E plan and the project design and logframe are carried out. This is all part of the design/clarificatory evaluation
that Shangano undertook. This helped them to establish the need, understand the problem, and unearth the assumptions they had about the SRH issues at campuses, and about how and why the YAH programme components of information dissemination and service provision would bring about change to address those issues.
2. Stage 2 – Implementation: during this stage, continuous monitoring takes place and a mid-term evaluation is conducted. Monitoring is guided by the M&E plan developed in the planning stage (an M&E plan template can be accessed at http://www.tools4dev.org/resources/monitoring-evaluation-plan-template/). For the YAH programme, the mid-term evaluation will be conducted at the end of 2020, which is the second year of the programme. This evaluation will inform implementation as Shangano makes necessary adjustments based on evaluation findings. Monitoring data collected from the campuses and service providers will also inform implementation.
3. Stage 3 – Evaluation: at this stage, the outcome evaluation will be conducted. For YAH, the aim could be to get an understanding of whether or not the objectives have been met. For instance, do universities now provide SRH services to students? Is there increased uptake of such services? And have any policies and ordinances been changed? As can be seen in the figure above, findings will be used to inform what Shangano will do next. They may decide to upscale – e.g. increase the number of universities in the programme. They may decide to replicate – introduce the YAH programme at FETs or polytechnics in Zimbabwe, such as Harare Poly, or have backward linkages into communities close to the universities, for instance rural communities close to Lupane State University. They may also decide to scale back. For instance, they may choose to refocus their efforts on just 3 of the 6 universities where they found the need to be highest and the institutional arrangements supportive of the programme.
Evaluation and research
We have defined evaluation in the preceding section and it is clear that evaluation precedes, informs or follows an intervention. Research, on the other hand, is defined as ‘the collection and evaluation of information about a particular subject. The overarching purpose of research is to answer questions and generate new knowledge’ (Nordquist, 2019:1).
There are 4 views on the relationship between research and evaluation; a) research and evaluation as a
dichotomy, b) evaluation and research are mutually independent, c) evaluation as a subset of research and d)
research as a subset of evaluation.
View 1: Research and evaluation as a dichotomy:
According to Podems (2014), just because someone is a good researcher does not necessarily mean that they will be a good evaluator. The differences between evaluation and research are listed below and also reflected on slide number 6 in the PowerPoint presentation. These are also shared by Patton (2014). As shown on slide number 6, the intersection between the two is strongest at methods and analysis. During design, sampling and/or selection, data collection and analysis, evaluation teams use quantitative and qualitative research methods and analysis. The differences between evaluation and research are evident in the following:
Purpose: research seeks to prove and evaluation seeks to not only prove but improve. Research generates
new scientific knowledge while evaluation produces information used for decision making e.g. on scaling up,
replicating or terminating an intervention.
Focus: research is researcher focused while evaluation is stakeholder focused (e.g. to meet the information
needs of different stakeholders such as management, funders, implementing partners and community
members benefiting from or affected by the evaluation). As such, research questions come from scholars
within a specific field while evaluation questions originate from stakeholders.
Research tests theory and produces generalizable findings. Evaluation seeks to answer key evaluation
questions that are usually centered on whether or not an intervention has been effective. Generalization of
findings is not as important to evaluation. An understanding of the causes of phenomena is seen as more
important (Levine-Rozalis, 2003). While evaluation focuses on a program, research focuses on a population
(Small, 2012).
Dissemination of findings: researchers publish their research results and evaluators report their findings to
the evaluation stakeholders.
Test of value: the ultimate test of research is its contribution to knowledge. On the other hand, for
evaluation, this is found in the extent to which it improves effectiveness.
View 2: Evaluation and research are mutually independent
Based on the understanding that a person can conduct both research and evaluation or neither of the two,
this view sees evaluation and research as two independent variables that are not mutually exclusive (Rogers,
2014). This view maintains the following:
Research is empirical while evaluation is value-based and evaluators make evaluative conclusions and pass a
judgment of merit or worth on an intervention. Research does not make any judgments. Rather it gives
factual descriptions e.g. census data. Evaluation without a research aspect does not use a systematic process
of data collection to arrive at judgments about the intervention. An overlap between the two occurs when
the evaluators systematically collect and analyze data for them to arrive at their judgment or conclusion. In
this regard, this is similar to the first view discussed above.
View 3: Evaluation as a subset of research
“Doing research does not necessarily require doing evaluation. However, doing evaluation always requires
doing research” (Endias, 1998). This view regards research as a learning process (observe and learn) while
evaluation is considered as a judgmental process (assess and make a decision).
View 4: Research as a subset of evaluation
Under this view, research is seen as one of the activities undertaken during an evaluation process. Other
activities include planning the evaluation, managing the evaluation and promoting use of evaluation findings.
Evaluation approaches
A number of approaches have influenced the practice and discipline of evaluation over the years. These include responsive evaluation, utilization focused evaluation and developmental evaluation. Slide number 12 in the PowerPoint presentation provides a summary of these approaches. In addition to these, we also discuss the Results Based Monitoring and Evaluation approach on slides 23 to 27.
Utilization focused evaluation: Michael Patton is known as the father of utilization-focused evaluation (Mouton, 2015). Patton identified that the ‘utility’ of evaluation findings was a glaring gap when it came to judging evaluation proposals. He saw that measurability, generalizability and validity, for instance, were among the main criteria used. As such, evaluators could be divorced from what the commissioners did with the findings of the evaluation. According to Patton (2008), the value of an evaluation should be its utility and whether the organization actually uses the findings and recommendations, so attention has to be given to intended use by intended users, constantly (Ramirez and Brodhead, 2010). This approach gives evaluators the responsibility to ensure that the design and process of the evaluation take into account evaluation use. It has
12 steps as shown in the figure below:
Figure 3: Utilization focused evaluation steps. Adapted from Ramírez and Brodhead (2010:1)
Assuming that you were the leader of the team appointed to conduct an outcome evaluation of the YAH
programme at Shangano, what are some of the key considerations you would make if you were being
informed by the Utilization Focused Evaluation approach? You got it right, ensuring that you involve or
engage stakeholders or evaluation users right from the beginning of the evaluation process is one of the ways
you can ensure evaluation use.
Responsive evaluation: Robert Stake introduced responsive evaluation in 1967. It seeks to personalize and
humanize the evaluation process by incorporating and responding to the stakeholders’ opinions and concerns
throughout the design and implementation of the evaluation process without sacrificing the quest for quality
in the programme or intervention to be evaluated (Mouton, 2015). According to Stake, stakeholders other
than the evaluator are best placed to judge the quality of the evaluand (Robinson, 2002). Responsive evaluation is not a particular model or methodology for conducting an evaluation; rather, as an approach or philosophy, it upholds the importance of the evaluator getting close to the evaluation stakeholders and being responsive to their concerns and issues.
There are different stakeholders and constituencies affected by every evaluation. Stake posits that it is the responsibility of the evaluator to ensure that the evaluation reflects their expectations and their different values and value positions (Mouton, 2015). And yet, consensus is not the aim. The evaluator is viewed as
being in the position to show these values, value positions and expectations without pushing for consensus
(Stake, 2001). Stake was not ‘comfortable’ with the idea that evaluation should stand for something. He
believed that the evaluator’s role is to show the different stakeholder perspectives and provide a description
(Robinson, 2002).
If you were asked to apply the responsive evaluation thinking in the same evaluation of YAH, you would
probably be interested in ensuring that the perspectives of different stakeholders are considered in the
evaluation. You are on the right track; these stakeholders include students, health facility management at
universities, the university leadership as well as other implementing partners (such as the National AIDS
Council) that work with Shangano at the university.
Empowerment/participatory evaluation: In 1993, David Fetterman introduced the empowerment evaluation
approach as a response to the value-free positivist models (Miller and Campbell, 2006). The approach draws
its origins from empowerment theory, community psychology, and action anthropology (Sheriff and Porter,
ud). At the heart of the approach is the aim “to help people help themselves” (Fetterman, 1996:5). Besides
seeking to improve policies and programmes, the approach is intentional about providing capacity building
for communities to have skills to monitor and evaluate their own performance and accomplish their goals
(Better Evaluation, ud). Potter (1999) categorizes the approach in the realm of critical-emancipatory
approaches to programme evaluation. Empowerment evaluation challenges the status quo.
Fetterman identifies five "facets" of empowerment evaluation (Mouton, 2015, Sheriff and Potter, ud):
• Training: training participants to conduct their own evaluations;
• Facilitation role: the evaluator is a facilitator who coaches rather than judges;
• Advocacy role: the evaluator advocates for programme team members to decide on the nature and purpose of the evaluation;
• Illumination: the evaluation process gives participants new insights into and understandings of their programme and their roles;
• Liberation: programme personnel have improved skills to redefine their roles and objectives and, in that progression, improve their own lives.
So, if you were to use the empowerment evaluation approach in the evaluation of the YAH programme, what is one of the key things you would do? You would probably train peer educators to collect and analyze data. You may also do some capacity building for implementing partners that provide the services during campaigns, if needed.
Theory-based approaches to evaluation: Theory based evaluation is not a specific method used during an
evaluation. Rather, it is an approach to evaluation and a way that an evaluator can use to structure and
undertake the evaluation task at hand (Treasury Board of Canada Secretariat, 2010). Theory based
approaches to evaluation include ‘realist evaluation’ put forward by Pawson and Tilley (1997). They posit that
whether interventions work or not depends on the ‘underlying mechanisms at play in a specific context’.
Simply put, for them an outcome is a result of an interaction between the context and a mechanism that
causes change. As such, we should be asking, ‘what works for whom, under what circumstances?’ (Pawson
and Tilley, 1997).
Theory-based approaches to evaluation also articulate the Theory of Change for the programme. The Theory
of Change is used to draw conclusions about whether and how a programme contributed to the expected
effects. In that regard, they open the black box that dominated early evaluations and about which Weiss (1997) was highly concerned – a phenomenon where we can see results but are not sure how the change occurred or how it occurred in a specific context. Slides 17, 18, 19 and 20 in the PowerPoint provide brief
summaries of the Theory of Change. In the Shangano case study, we mentioned their theory of change and
some of the outputs and outcomes they identified using the clarificatory/design evaluation. It is noteworthy
that by unpacking their theory of change it was possible to unearth their assumptions about how and why
change would occur. As an evaluator appointed to evaluate the YAH programme, you will have to get an
understanding of their Theory of Change before you can conduct an evaluation.
More details on the Programme Theory (Theory of Change and Logic Models) can be found at
https://www.canada.ca/en/treasury-board-secretariat/services/audit-evaluation/centre-excellence-
evaluation/theory-based-approaches-evaluation-concepts-practices.html. You can also visit this site
https://www.theoryofchange.org/.
Results-based monitoring and evaluation: Results Based Monitoring and Evaluation (RBME) is recognized as an essential management tool for the public sector (Waidyaratma, 2012; Kusek and Rist, 2004). The application of RBME is described as a continuous process of improvement (Farrell, 2009). RBME focuses on achieving results, implementing performance measurement and using feedback to learn and change. A number of concerns have pushed government departments and
ministries as well as organizations to adopt the RBME model. These include:
a. Having unclear goals;
b. Focusing on intervention activities and not results;
c. Inability to use feedback or data for decision making and corrections;
d. Inability to justify the need for resources to relevant stakeholders.
When applying RBME, you describe results in a sequential hierarchy. This starts by articulating specific short-
term results. These are followed by the long-term results that are broader by nature and will occur after the
short-term results are accomplished. You then design an M&E process that will be used to assess whether or
not results have been achieved. Resources are then allocated on the basis of activities that will need to be
undertaken to achieve the results. The M&E processes will also guide how results will be communicated to
stakeholders. Kusek and Rist (2004) developed an informative handbook on the steps to follow when building an RBME system. This can be found at https://www.oecd.org/dac/peer-reviews/World%20bank%202004%2010_Steps_to_a_Results_Based_ME_System.pdf. Farrell (2009) also provides another good source of information on RBME. His handbook can be accessed at http://oasis.col.org/bitstream/handle/11599/110/MEHandbook.pdf?sequence=1&isAllowed=y
Figure 4: RBME and the logic model.
The figure above illustrates how the logic model can be used in the RBME process. Traditional monitoring and evaluation would focus on process and implementation. As such, the focus would end with inputs, activities and outputs. But RBME is interested in the ‘so what’ question. It looks at the outcomes and impacts that should result from the outputs – that is where the results are (the intermediate effects of the programme and the long-term changes that will be observed in the targeted population, intended and unintended).
Similarities between evaluation approaches
Figure 3 also serves to illustrate that there are some similarities between the different evaluation
approaches. It has to be noted that at the heart of most of the approaches is the need for M&E to help evaluators and organisations better understand change, how it occurs and why it occurs. These
similarities between the approaches include:
1. Emphasis on unpacking why and how change occurs;
2. Linked to point 1 above, emphasis on ensuring that we understand the logical connection between
the various results levels of the intervention;
3. Emphasis on having a better understanding of intervention effects and impact;
4. Emphasis on stakeholder engagement;
5. Emphasis on understanding the context of the intervention, as interventions do not occur in a
vacuum.
Evaluation activities and the evaluator
On slide 29 we presented the possible and most common stages of an evaluation process. In reality, these
stages are not always in logical sequence. For example, while promoting use of results is at stage 9, this
occurs at most stages of the evaluation process. For instance, at the planning stage and at the initial
implementation stage as well as at the data collection and analysis stage, the evaluator will ensure that he or
she engages relevant stakeholders. Let us use the YAH programme as an example. To conduct an outcome
evaluation successfully, you would have to engage stakeholders such as students, university leadership,
implementing partners and the Shangano implementing team as well as their M&E officer at all these stages.
Engaging them in understanding and defining objectives and articulating evaluation questions is an important factor for evaluation use. In recent years, most organizations have developed rosters of M&E specialists. As such, the evaluation activities may not necessarily start with stages 1 and 2 as shown in the slide but with stage 3, as the evaluators would already be on the organisation’s roster and would receive evaluation or evaluation-related assignments as the need arises. Please note, the CDC framework for
evaluating public health interventions is a very good framework of reference especially for public health
programs (although it can be applied across sectors). More details on the framework are available at
https://www.cdc.gov/eval/framework/index.htm.
Some competencies are needed through every stage or most stages of the evaluation, for example, cultural
competency, interpersonal skills and communication skills and ethical practice. For the YAH programme, you
will need to be culturally competent when talking to students, when talking to the health facility staff and
when talking to university leadership, as they are likely to have different perspectives and
realities. It is also important to note that the evaluator’s roles will also be affected by whether he or she is an
internal or external evaluator. For example, the former is likely to start their activities at stage 4, which is
‘Planning evaluation’.
No single evaluator comes with all the skills that are required for the evaluation. A full evaluation is different
from offering technical support for a team that is developing a Theory of Change or an M&E plan. These
types of assignments can be carried out by one individual. A full evaluation, for instance an outcome
evaluation or an impact evaluation in reality would require a diverse team of evaluators who bring their
unique set of skills to the team. One team member may be very strong in qualitative data analysis, another in
quantitative data analysis, another may have a stronger understanding of how to conduct Value for Money
studies and another may have subject matter expertise in the field e.g. market development and micro-
finance or public health. As a result, most competency frameworks that have been developed by Voluntary
Organisations for Professional Evaluation (VOPEs), e.g. ANZEA (2013), AES (2011), are quite clear on this point. AES regards its competency framework as a ‘menu rather than a checklist’ (AES 2013:8) because ‘evaluators have different strengths, knowledge, skills and experience’. The competencies are therefore an instrument for
for ‘understanding and managing strengths and gaps in a constructive way’ (AES 2013:8). The same view is
taken by ANZEA and UKES.
It is NOT expected that an individual evaluator or an evaluation team would possess ALL of the proposed
competencies. Rather, evaluators will develop and build on their areas of strength, and address any gaps
through professional development and/or collaborating with others. (ANZEA 2011: 6). The UKES
acknowledges that the responsibility to ensure a quality evaluation does not solely lie with the individual
evaluator. Besides the organisational environment and culture towards evaluation, teams are often involved in the conduct of an evaluation, in which evaluators have different strengths and different levels of experience
(UKES 2012:3).
What is the purpose of programme evaluations?
Evaluations are conducted with a purpose in mind. The evaluator and stakeholders should always be clear on its intended purpose, i.e. whether the evaluation is meant for accountability or decision making. That means that the evaluator has to ensure that the intended audience for the evaluation has been identified. As shown in the
slide, the purpose of an evaluation is often linked to the type of evaluation and inevitably, the timing or stage
at which the evaluation occurs. The evaluation types are discussed in the section that follows. In addition, the
purpose of the evaluation also determines the team composition (whether it is going to be conducted by an
external or internal team), the resources, time and budget that needs to be allocated and the people
responsible for setting the Terms of Reference (TOR).
Formative functions: (before or during implementation) an evaluation can inform design and improve implementation. For instance, during a clarificatory evaluation the programme theory will be articulated, assumptions unpacked and the logic of the proposed intervention examined.
Summative functions: (at the end of the programme or close to the end of the project) evaluation findings can inform decisions about upscaling, downsizing or replication, thereby reducing risk.
Strategic functions: evaluation results can help the organization to prioritize its investments; they can facilitate learning and support risk management.
Learning functions: evaluation studies can assist in gaining more insight about human behaviour and how
social change occurs. They can bring understanding on what sort of programmes produce what impact in
what contexts and under what circumstances.
The Terms of Reference and inception reports: Before engaging in an evaluation, the TOR is non-negotiable.
It is a strategy-level document that defines the tasks and duties required of the evaluator or evaluators. It
also highlights the evaluation purpose, scope, objectives and audience as well as expected deliverables and
responsibilities of both the evaluator(s) and the commissioners of the evaluation. Assuming that you have
been engaged by Shangano to conduct a process evaluation of YAH, you would want to ensure that the TOR
has all the things that we have just listed. In addition, you will ensure that the TOR specifies the expected timelines, the reporting lines as well as the initially proposed evaluation design (evaluation designs are discussed in later sections) and questions. Although some of the key items remain the same, TORs vary from
evaluation to evaluation depending on the organisation, field, sector, scope as well as the budget available (for instance, Shangano may decide to hire only a consultant to lead the evaluation, while all data collection at campuses is conducted by trained youth volunteers and provincial coordinators due to the financial constraints of paying a full evaluation team). The Independent Evaluation Group (IEG) provides insightful
guidance on how to develop a TOR. This can be accessed at
https://siteresources.worldbank.org/EXTEVACAPDEV/Resources/ecd_writing_TORs.pdf.
An inception report will follow the TOR. The inception report is prepared by the evaluator. In the scenario where you are evaluating the YAH programme, the inception report is essential. It helps you to get a clear understanding and agreement between yourself and Shangano in terms of what you will deliver. For instance, it details the evaluation questions, the evaluation design, the data collection methods and tools as well as the sampling and selection strategy that you will employ, plus how data will be analysed.
Summary of section one:
In this section, we have defined evaluation. Our point of departure was that program evaluation follows and informs an intervention. Evaluation has a number of purposes that include accountability, supporting decision making and strategic investments. We have looked at how evaluation differs from and relates to research from 4 perspectives – research and evaluation as a dichotomy, evaluation and research as two mutually independent exercises, evaluation as a subset of research, and research as a subset of evaluation. We also
explored the common activities during an evaluation and how the evaluator will respond with specific roles
that require certain competencies. Remember, our emphasis was that no single evaluator comes with all the
competencies required for evaluation work – competencies being the skills, knowledge and attitudes. Among
the evaluation approaches that have influenced the discipline over the years, we looked at utilization focused evaluation (emphasis on evaluation use), realist evaluation (what works for whom in what circumstances) and theory-based evaluation (emphasis on unpacking the theory of the programme – theory of change and theory of action). We also looked at RBME as a management tool for the public sector and identified its emphasis on moving beyond inputs, activities and outputs to consider results (outcomes and impacts).
Section one: quick knowledge check - individual exercise
Before we move to section two, complete exercise one to see what you have learnt and double-check areas where you are still not sure. All the questions are multiple choice and the correct answers will be provided for you to check.
Questions and responses:
1. Evaluation occurs continuously throughout the life-cycle of the program. A. True / B. False
2. An evaluation will never need monitoring data. Evaluators will have sufficient data during their study to make a judgement of merit or worth. A. True / B. False
3. This evaluation approach believes that stakeholders of the evaluation are best placed, more than the evaluator, to judge the quality of the programme. A. Realist evaluation / B. Theory based evaluation / C. Utilization focused evaluation / D. Responsive evaluation
4. These competencies are required of the evaluator in most stages of the evaluation process. A. Cultural competency / B. Interpersonal skills / C. Ethical practice / D. A and B / E. A, B and C
5. RBME is mostly concerned with the outputs produced from programme activities. A. True / B. False
SECTION TWO
Introduction: Having defined evaluation, identified the possible roles and competencies for evaluation
activities and identified the purpose of evaluation as well as the evaluation approaches, we now turn to
evaluation types. This discussion is then followed by evaluation criteria and evaluation standards. You will see
that the evaluation types and the criteria cannot be divorced. We will end off the section by looking at
evaluation and the Sustainable Development Goals. By the end of the section, we expect that you will be able
to do the following:
1. Identify different evaluation types, their purpose and timing in the project life-cycle;
2. Describe the evaluation criteria and the evaluations they apply to;
3. Identify African Evaluation Guidelines;
4. Explain the role of evaluation in the review of national, regional and global progress towards the SDGs.
Types of evaluation
Evaluations have to meet the needs of the commissioner or the one who requests the evaluation. There are a
number of evaluations that can be conducted and these fall into two categories of summative and formative
evaluations (The Pell Institute, ud). Summative evaluations are conducted at the end of the project or
programme to determine the value, merit or worth of the intervention against some criteria or standards
(Scriven, 1967). On the other hand, formative evaluations are carried out at the beginning of the program or
during implementation to inform design (Guyot, 1978). Figure 5 below illustrates how formative and
summative evaluations can be used.
Figure 5: The various uses of summative and formative evaluations; adapted from the PELL Institute Evaluation Tool Kit.
Table 2 below presents a summary of evaluation types, the purpose of the evaluation, the ideal timing within the program lifecycle and the value that particular evaluation presents to the commissioning organization. Let’s look at one example, clarificatory evaluation. This type of evaluation is aimed at clarifying the logic of the intervention (Owen and Rogers, 1999). The key questions would include whether the planned activities will lead to the expected outputs and whether these would be sufficient to lead to the expected outcomes. Clarificatory
evaluations are conducted at the beginning of the programme and during conceptualization and design.
Thus, they are formative by nature. In terms of value to the organization, programme teams and the
management can get a sense of whether they are ready to implement the programme and they can see
whether or not the planned programme is feasible (given their context, resources, experience etc.). Examples
of clarificatory evaluation include feasibility studies, needs assessments, and using the theory of change
framework to get clarity on the logic of the intervention – from activities through to impacts.
Table 2: Evaluation types, timing, purpose and value. Adapted from Mouton, 2015.
1. Type of evaluation: Design/clarificatory evaluation
Purpose: To clarify program logic (goals, objectives, outputs and outcomes) in order to establish feasibility of implementation
Timing: During conceptualization and design of a program
Value to the organisation: Ensures clarity and feasibility of the program and whether the program is ready for implementation

2. Type of evaluation: Process or implementation evaluation
Purpose: To establish whether a program is being implemented properly and whether the target group receives the intervention
Timing: Concurrent with roll-out of the program
Value to the organisation: Provides timely and constant feedback on program roll-out and enables quick adjustments to programs

3. Type of evaluation: Mid-term review(s) (of long-term programs)
Purpose: To establish whether short-term outputs and outcomes are being achieved in order to advise project management on implementation. It can also help the team to look at contextual changes and how the programme/project is adapting.
Timing: Half-way through actual (not planned) implementation
Value to the organisation: Ensures more systematic reviews of implementation and ‘first’ achievements

4. Type of evaluation: Diagnostic evaluation (where there is a perceived risk of no or poor implementation)
Purpose: To establish whether a program is still on course; to identify problems/weaknesses in implementation and to advise donors whether to continue or discontinue funding
Timing: Evaluation commissioned during program implementation for trouble-shooting purposes
Value to the organisation: Ensures an independent assessment of potential problems in program implementation with recommendations for change

5. Type of evaluation: Outcomes evaluations and impact assessments
Purpose: To establish whether the expected outcomes of a program have been achieved and what the overall impact of the program has been
Timing: Commences with a baseline study + continuous monitoring + structured post-implementation measures
Value to the organisation: Provides rigorous and credible assessment of immediate and possible short-term impact of a program and whether money has been well spent (ROI/Value for Money studies)

6. Type of evaluation: Cost-benefit analysis
Purpose: To assess the benefit (value) of a program to the target group against the cost of implementation
Timing: Similar to impact assessment
Value to the organisation: Calculation in some standardized units of both the costs and the benefits accruing from the program
Evaluation Criteria
A criterion is defined as “a standard on which a judgment or decision may be based” (Merriam-Webster). Evaluation criteria are described as:
Evaluation criteria for a project are like assessment criteria for student work. Before we can fairly assess a piece of student work, we need to clearly identify what we are looking for, and on what basis our assessment will be made. https://education.nsw.gov.au/teaching-and-learning/professional-learning/evaluation-resource-hub/evaluation-design-and-planning/setting-the-scope-of-an-evaluation/evaluation-criteria
Such criteria for evaluation as described above have been provided by the OECD DAC (the criteria are currently under review), as shown on slide number 33. It is noteworthy that sometimes, depending on the evaluation type and purpose, one or more criteria will apply:
In addition to using effectiveness as a criterion, an evaluation might employ economic criteria (efficiency in terms of costs and benefits), equity and equality criteria (who benefits, who doesn’t), as well as criteria related to sustainability, cultural and contextual relevance and appropriateness, and sometimes other criteria negotiated with stakeholders (IIED, 2016: 2).
It is important to remember that not all criteria will apply to each and every type of evaluation. Rather, it will
depend on the evaluation type and evaluation purpose. In other words, the criteria are a guide for evaluators
and not a prescription that all should be applied to each evaluation. Table 3 below demonstrates how
different criteria will apply to different evaluations, although the criteria are at times complementary:
Table 3: Evaluation types, purpose and timing. Adapted from Mouton (2015)
1. Type of evaluation: Design/clarificatory evaluation
Purpose: To clarify programme logic (goals, objectives, outputs and outcomes) in order to establish feasibility of implementation. E.g. for the YAH programme, will the activities planned, such as training of peer educators and facilitating web-based discussions, give Shangano the outputs they need to lead to improved life skills of students?
Timing: During conceptualization and design of a program
Value to the organisation: Ensures clarity and feasibility of the program and whether the program is ready for implementation

2. Type of evaluation: Process or implementation evaluation
Purpose: To establish whether a programme is being implemented properly and whether the target group receives the intervention. E.g. for the YAH programme, they would want to find out whether all the intended trainings have been conducted with peer educators as per plan and whether the planned health campaigns have been conducted and the expected services delivered. They would also want to know whether the information has been reaching their key target groups, e.g. female students in their first year or students at campuses that have not had equal access to services.
Timing: Concurrent with roll-out of the program
Value to the organisation: Provides timely and constant feedback on program roll-out and enables quick adjustments to programs

3. Type of evaluation: Mid-term review(s) (of long-term programs)
Purpose: To establish whether short-term outputs and outcomes are being achieved in order to advise project management on implementation. It can also help the team to look at contextual changes and how the programme/project is adapting. E.g. for the YAH programme, they may be interested to find out whether there are any changes occurring within the institutions that affect the implementation of the programme and consequently the outputs and outcomes. With that information, they can adapt implementation.
Timing: Half-way through actual (not planned) implementation
Value to the organisation: Ensures more systematic reviews of implementation and ‘first’ achievements

4. Type of evaluation: Diagnostic evaluation (where there is a perceived risk of no or poor implementation)
Purpose: To establish whether a programme is still on course; to identify problems/weaknesses in implementation and to advise donors whether to continue or discontinue funding. E.g. for the YAH programme, there may be concerns about buy-in from universities and Shangano may want to come up with strategies to better engage the universities’ leadership.
Timing: Evaluation commissioned during program implementation for trouble-shooting purposes
Value to the organisation: Ensures an independent assessment of potential problems in program implementation with recommendations for change

5. Type of evaluation: Outcomes evaluations and impact assessments
Purpose: To establish whether the expected outcomes of a program have been achieved and what the overall impact of the program has been. E.g. for the YAH programme, Shangano would want to know whether or not students have increased access to current and accurate SRH information and services and whether students increasingly use the services available. In addition, they would want to know whether the number of new STI infections has been reduced at the 6 universities.
Timing: Commences with a baseline study + continuous monitoring + structured post-implementation measures
Value to the organisation: Provides rigorous and credible assessment of immediate and possible short-term impact of a program and whether money has been well spent (ROI/Value for Money studies)

6. Type of evaluation: Cost-benefit analysis
Purpose: To assess the benefit (value) of a program to the target group against the cost of implementation. E.g. for the YAH programme, Shangano may want to know whether the resources (financial and human) that have been used to achieve the results could have been put to optimal use and whether there could have been better ways of doing things (a simple worked illustration follows this table).
Timing: Similar to impact assessment
Value to the organisation: Calculation in some standardized units of both the costs and the benefits accruing from the program
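To make the cost-benefit idea concrete, here is a minimal worked illustration using purely hypothetical figures that are not drawn from the YAH case study. One common summary measure is the benefit-cost ratio (BCR), which divides the total monetised benefits of a programme by its total costs; a ratio above 1 suggests that the benefits outweigh the costs.

\[
\text{BCR} = \frac{\text{total monetised benefits}}{\text{total costs}} = \frac{\$300\,000}{\$200\,000} = 1.5
\]

In this hypothetical case, every dollar spent returns an estimated $1.50 in benefits. In practice, the difficult part is attaching credible monetary values to benefits such as infections averted or improved life skills, which is why cost-benefit analyses are usually read alongside the outcome and impact evidence described above.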
Different criteria exist for Evaluations of Humanitarian Action. These include connectedness, coherence, coverage and impact. It has to be noted that evaluations in humanitarian response contexts differ from evaluations of conventional development programs in terms of timing, purpose, available data and the reality that humanitarian responses occur in highly dynamic contexts where needs and priorities
shift constantly. More details on these criteria can be accessed at
https://www.odi.org/sites/odi.org.uk/files/odi-assets/publications-opinion-files/2382.pdf. Better Evaluation
recently added a new page with resources on Evaluation of Humanitarian Response. This can be accessed at
https://www.betterevaluation.org/en/blog/evaluation-humanitarian-action-new-page.
Evaluation Standards, guidelines and principles
There are guidelines, standards and principles that guide evaluation practice. These include the Programme Evaluation Standards and the African Evaluation Guidelines. In this section we focus on the African Evaluation Guidelines (AEG). They were developed by AFREA through a consultative process that spanned 5 years, from 1998 to 2002 (AFREA, https://www.AFREA/African_Evaluation_Guidelines.pdf). As noted by AFREA, these
guidelines are based on the Programme Evaluation Standards (PES) that are used by the American Evaluation
Association. The PES was developed by the American Joint Committee on Standards for Educational
Evaluation (AJCSEE). The standards have been promoted by many VOPEs (UNICEF
https://www.unicef.org/evaluation/files/Evaluation_standards.pdf). They are also promoted by agencies such
as the Centers for Disease Control and Prevention (CDC).
What is the purpose of the guidelines?
AFREA refer to these guidelines as a ‘checklist’ to assist evaluators in ‘planning evaluations, negotiating clear
contracts, reviewing progress and ensuring adequate completion of an evaluation’. It appears that quality
evaluations that are useful, that inform decision making and that are cognizant of stakeholders are at the
heart of the guidelines. A look at the guidelines also reveals that accountability of the evaluation is important.
Recently, the standard ‘Evaluation Accountability’ was added to the four standards by the CDC as a stand-
alone standard (MacDonald, ud). The CDC framework puts the evaluation standards at the core of the
evaluation process, from Step 1 which is stakeholder engagement to step 6 which is ensuring use and sharing
of lessons learned, as shown below in figure 6:
Figure 6: CDC framework for programme evaluation in public health
The standards and the statements: AEG content
Each of the 4 standards in the AEG is supported by statements that further describe the standard. These are detailed in figures 7-10 below. You will see that some of the statements have (modified) in brackets. That means that they have been modified from the original statement in the PES. A detailed description of the process can be accessed at https://afrea.org/the-african-evaluation-guidelines/
Figure 7: AEG utility standards.
Figure 8: AEG feasibility standards
Evaluation standard 1: Utility
Description: The utility guidelines are intended to ensure that an evaluation will serve the information needs of intended users and be owned by stakeholders.
U1 Stakeholder identification (modified): Persons and organisations involved in or affected by the evaluation (with special attention to beneficiaries at community level) should be identified and included in the evaluation process, so that their needs can be addressed and so that the evaluation findings are utilisable and owned by stakeholders, to the extent this is useful, feasible and allowed.
U2 Evaluator credibility: The persons conducting the evaluation should be both trustworthy and competent to perform the evaluation, so that the evaluation findings achieve maximum credibility and acceptance.
U3 Information scope and selection: Information collected should be broadly selected to address pertinent questions about the programme and be responsive to the needs and interests of clients and other specified stakeholders.
U4 Values identification (modified): The perspectives, procedures, and rationale used to interpret the findings should be carefully described, so that the bases for value judgments are clear. The possibility of allowing multiple interpretations of findings should be transparently preserved, provided that these interpretations respond to stakeholders’ concerns and needs for utilisation purposes.
U5 Report clarity: Evaluation reports should clearly describe the programme being evaluated, including its context, and the purposes, procedures and findings of the evaluation, so that essential information is provided and easily understood.
U6 Report timeliness and dissemination (modified): Significant interim findings and evaluation reports should be disseminated to intended users, so that they can be used in a reasonably timely fashion, to the extent that this is useful, feasible and allowed. Comments and feedback of intended users on interim findings should be taken into consideration prior to the production of the final report.
U7 Evaluation impact: Evaluations should be planned, conducted and reported in ways that encourage follow through by stakeholders, so that the likelihood that the evaluation will be used is increased.
Figure 8: AEG feasibility standards
Evaluation standard 2: Feasibility
Description: The feasibility guidelines are intended to ensure that an evaluation will be realistic, prudent, diplomatic, and frugal.
F1 Practical procedures: The evaluation procedures should be practical to keep disruption to a minimum while needed information is obtained.
F2 Political viability (modified): The evaluation should be planned and conducted with anticipation of the different positions of various interest groups, so that their cooperation may be obtained, and so that possible attempts by any of these groups to curtail evaluation operations or to prejudice or misapply the results can be averted or counteracted to the extent that this is feasible in the given institutional and national situation.
F3 Cost effectiveness (modified): The evaluation should be efficient and produce information of sufficient value, so that the resources expended can be justified. It should keep within its budget and account for its own expenditures.
Figure 9: AEG propriety standards.
Evaluation standard 3: Propriety
Description: The propriety guidelines are intended to ensure that an evaluation will be conducted legally, ethically and with due regard for welfare of those involved in the evaluation, as well as those affected by its results.
P1 Service orientation: Evaluation should be designed to assist organisations to address and effectively serve the needs of the full range of targeted participants.
P2 Formal agreements (modified): Obligations of the formal parties to an evaluation (what is to be done, how, by whom, when) should be agreed to through dialogue and in writing, to the extent that this is feasible and appropriate, in order for these parties to have a common understanding of all the conditions of the agreement and hence are in a position to formally renegotiate it if necessary. Specific attention should be paid to informal and implicit aspects of expectations of all parties to the contract.
P3 Rights of human participants (modified): Evaluation should be designed and conducted to respect and protect the rights and welfare of human subjects and the communities of which they are members. The confidentiality of personal information collected from various sources must be strictly protected.
P4 Human interaction (modified): Evaluators should respect human dignity and worth in their interactions with other persons associated with an evaluation, so that participants are not threatened or harmed or their cultural or religious values compromised.
P5 Complete and fair assessment: The evaluation should be complete and fair in its examination and recording of strengths and weaknesses of the programme being evaluated, so that strengths can be built upon and problem areas addressed.
P6 Disclosure of findings (modified): The formal parties to an evaluation should ensure that the full set of evaluation findings along with pertinent limitations are made accessible to the persons affected by the evaluation, and any others with expressed legal rights to receive the results as far as possible. The evaluation team and the evaluating institution will determine what is deemed possible, to ensure that the needs for confidentiality of national or governmental entities and of the contracting agents are respected, and that the evaluators are not exposed to potential harm.
P7 Conflict of interest: Conflict of interest should be dealt with openly and honestly, so that it does not compromise the evaluation.
P8 Fiscal responsibility: The evaluator’s allocation and expenditure of resources should reflect sound accountability
Figure 10: AEG accuracy standards
Evaluation standard 4: Accuracy
Description: The accuracy guidelines are intended to ensure that an evaluation will reveal and convey technically adequate information about the features that determine the worth or merit of the programme being evaluated.
A1 Programme documentation (modified): The programme being evaluated should be described clearly and accurately, so that the programme is clearly identified, with attention paid to personal and verbal communications as well as written records.
A2 Context analysis (modified): The context in which the programme exists should be examined in enough detail, including political, social, cultural and environmental aspects, so that its likely influences on the programme can be identified and assessed.
A3 Described purposes and procedures: The purposes and procedures of the evaluation should be monitored and described in enough detail, so that they can be identified and assessed.
A4 Defensible Information sources (modified): The sources of information used in a programme evaluation should be described in enough detail, so that the adequacy of the information can be assessed, without compromising any necessary anonymity or cultural or individual sensitivities of respondents.
A5 Valid information (modified): The information gathering procedures should be chosen or developed and then implemented so that they will assure that the interpretation arrived at is valid for the intended use. Information that is likely to be susceptible to biased reporting should be checked using a range of methods and from a variety of sources.
A6 Reliable information: The information gathering procedures should be chosen or developed and then implemented so that they will assure that the information obtained is sufficiently reliable for the intended use.
A7 Systematic information: The information collected, processed and reported in an evaluation should be systematically reviewed and any errors found should be corrected.
A8 Analysis of quantitative information: Quantitative information in an evaluation should be appropriately and systematically analysed so that evaluation questions are effectively answered.
A9 Analysis of qualitative information: Qualitative information in an evaluation should be appropriately and systematically analysed so that evaluation questions are effectively answered.
A10 Justified conclusions: The conclusions reached in an evaluation should be explicitly justified, so that stakeholders can assess them.
A11 Impartial reporting: Reporting procedures should guard against distortion caused by personal feelings and biases of any party to the evaluation, so that evaluation reports fairly reflect the evaluation findings.
A12 Meta-evaluation: The evaluation itself should be evaluated in a formative and summative manner against these and other pertinent guidelines, so that its conduct is appropriately guided and, on completion, stakeholders can closely examine its strengths and weakness.
Sustainable Development Goals
In 2015, the United Nations adopted the 2030 Agenda for Sustainable Development and its 17 Sustainable Development Goals (SDGs). The SDGs build on the successes of the Millennium Development Goals (MDGs). Also known as the Global Goals, the 17 SDGs are “a universal call to action to end poverty, protect the planet and ensure that all people enjoy peace and prosperity” (UNDP).
SDGs indicators: Each of the 17 ambitious SDGs has its own set of targets – 169 in total – and each target has one or more indicators (around 230 indicators in all). You will notice that specific programs, projects and policies contribute to specific SDGs. And the SDGs themselves are not mutually exclusive – progress in one is essential for progress to be realized in another. The outcomes and impacts in your programme logframe or logic model can tell you which SDGs you are contributing to. For example, an organization that implements programs and projects in the shelter or housing sector is likely to be contributing to SDG number 11 (Sustainable cities and communities) and, through work in Water, Sanitation and Hygiene (WASH), to SDG number 6 (Clean water and sanitation). Below is an example of SDG number 6 with its targets and indicators. Details of all the SDGs and their targets and indicators can be accessed at https://unstats.un.org/sdgs/metadata/.
Figure 11: SDG 6 targets and indicators.
Goal 6: Ensure availability and sustainable management of water and sanitation for all
Target 6.1: By 2030, achieve universal and equitable access to safe and affordable drinking water for all
Indicator 6.1.1: Proportion of population using safely managed drinking water services
Target 6.2: By 2030, achieve access to adequate and equitable sanitation and hygiene for all and end open defecation, paying special attention to the needs of women and girls and those in vulnerable situations
Indicator 6.2.1: Proportion of population using safely managed sanitation services, including a hand-washing facility with soap and water
Target 6.3: By 2030, improve water quality by reducing pollution, eliminating dumping and minimizing release of hazardous chemicals and materials, halving the proportion of untreated wastewater and substantially increasing recycling and safe reuse globally
Indicator 6.3.1: Proportion of wastewater safely treated
Indicator 6.3.2: Proportion of bodies of water with good ambient water quality
Target 6.4: By 2030, substantially increase water-use efficiency across all sectors and ensure sustainable withdrawals and supply of freshwater to address water scarcity and substantially reduce the number of people suffering from water scarcity
Indicator 6.4.1: Change in water-use efficiency over time
Indicator 6.4.2: Level of water stress: freshwater withdrawal as a proportion of available freshwater resources
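As an illustration only, the sketch below (in Python, which is not part of this module) shows how a proportion-type indicator such as 6.1.1 might be computed from household survey data. The household records and survey weights are invented; real SDG reporting relies on nationally representative surveys and the official metadata at the link above.

```python
# Hypothetical sketch: computing a proportion-type indicator (e.g. SDG 6.1.1)
# from invented household survey records. Not real data.
households = [
    # (survey weight, household size, uses safely managed drinking water?)
    (1.2, 4, True),
    (0.8, 6, False),
    (1.0, 3, True),
    (1.5, 5, True),
    (0.9, 2, False),
]

# Weighted population using the service, divided by the weighted total population.
covered = sum(weight * size for weight, size, safe in households if safe)
total = sum(weight * size for weight, size, _ in households)
indicator_6_1_1 = 100 * covered / total

print(f"Proportion using safely managed drinking water: {indicator_6_1_1:.1f}%")
```

The same pattern applies to most proportion-based SDG indicators: define the numerator and denominator precisely, then aggregate using the survey weights.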
SDGs and M&E: Programs, projects and policies contribute to the attainment of SDGs at the national and global level.
Evaluation will be essential for the review of systems for the SDGs (International Institute for Environment
and Development). The 2030 Agenda committed to systematic review and engagement over the next 15
years to measure progress. This will be accomplished through high quality evaluations. As argued by D'Errico
et al (2016), such evaluation goes beyond ‘conducting a survey’. Rather, it involves critical thinking, asking the
right questions and analyzing claims before arriving at a judgment of value, worth or merit. It is not enough
to monitor and measure. Evaluation of the SDGs is key to show not just how but why progress has been made
nationally, regionally and globally. It can also inform improvements on future national, regional and global
initiatives (IIED, 2016).
Summary of section two
In this section, we built upon the foundation that was laid in section one. We looked at the different evaluation
types and came to the understanding that evaluations are either formative or summative by nature.
Formative evaluations are conducted at the beginning or conceptualization stages of the programme e.g.
clarificatory evaluations which can be needs assessments or unpacking programme theory. Summative
evaluations are carried out at the end or towards the end of the programme, for instance outcome
evaluations and impact evaluations. We also looked at evaluation criteria such as effectiveness (have we met the objectives of the programme?) and impact (what are the long-term effects, intended and unintended, of the programme?). Guided by the OECD DAC criteria for evaluations, our point of emphasis was that the criteria are not prescriptive and that the criteria used will depend on the evaluation type and purpose. We also
looked at the African Evaluation Guidelines developed by AFREA. Remember that the guidelines were
adapted from the Program Evaluation Standards and AFREA kept the 4 standards of utility, feasibility,
propriety and accuracy. We ended section two by briefly looking at the SDGs and evaluation for SDGs and
noted that quality evaluation is critical for the review of the progress being made by the SDGs.
Section two: quick knowledge check - individual exercise
Before we move to section three, complete exercise two to see what you have learnt and double-check areas where you are still not sure. All the questions are multiple choice and the correct answers will be provided for you to check.
Questions and responses
1. Clarificatory evaluation is usually conducted at the end of the programme to see whether or not programme objectives have been achieved.
A. True
B. False
2. Which evaluation type is conducted halfway through the programme implementation?
A. Diagnostic evaluation
B. Clarificatory evaluation
C. Impact evaluation
D. Midterm review
E. None of the above
3. Which OECD DAC evaluation criterion is concerned with the question: ‘the rate at which the inputs are converted to outputs – have we made optimal use of resources to achieve results?’
A. Efficiency
B. Effectiveness
C. Relevance
D. Value for money
4. Which evaluation standard is concerned with whether or not the evaluation meets the information needs of the evaluation stakeholders?
A. Propriety
B. Accuracy
C. Utility
D. All of the above
5. Which of these statements is not true?
a. There are 17 Sustainable Development Goals
b. There are 200 targets for the SDGs
c. You are able to see which SDGs your programme is contributing towards nationally by looking at the impacts or outcomes in your logframe or logic model
d. Evaluation will help us to understand why and how progress has been made towards the 2030 agenda
A. a and c
B. a, c and d
C. a and b
D. b only
SECTION THREE
Introduction: In this section we turn to evaluation designs. We will discuss three key types of evaluation design: the classic experimental design, the quasi-experimental design and the non-experimental design. We will touch on some of the considerations that you make when selecting a design, which include the evaluation purpose, research considerations such as external and internal validity, and the resources and time available. At the end of this section, we expect that you will be able to do the following:
1. Identify the key evaluation designs;
2. Identify the main considerations you make when selecting an evaluation design;
3. Identify some of the key strengths and weaknesses of the different designs.
Evaluation Designs
Evaluation design is a critical part of the evaluation planning process. It is important to remember that the
evaluation design that you adopt has to fit the purpose and objective of the evaluation. As shown in slide 45
in the accompanying power point presentation, there are a number of considerations to be made when
selecting the evaluation design. The evaluation design will determine the methodology that will be used for data collection and, naturally, for the analysis of the data. The decision-framework for selecting an evaluation design
presented by Mouton (2014) on slide number 46 shows how the design is tied to the evaluation purpose
(which could be improvement or judgement oriented). This translates into the evaluation type. As discussed
in section two, evaluation types include clarificatory evaluation, outcome evaluation or impact evaluations. It
is a good idea to have someone with expertise to assist you in the process of selecting the evaluation design.
There are three main evaluation designs:
1. The experimental design;
2. The quasi-experimental design and
3. The non-experimental design
Table 4 below provides a summary of the 3 key evaluation designs for an outcome evaluation; their strengths
and challenges:
Table 4: Summary of evaluation designs. Adapted from go2itech (ud:4) https://www.go2itech.org/wp-content/uploads/2017/07/Evaluation-Design-and-Methods.pdf
Design type: Experimental
Description: Compares intervention with non-intervention; uses controls that are randomly assigned.
Examples: Randomized controlled trial (RCT); a pre-post design with a randomized control group is one example of an RCT.
Strengths: Can infer causality with the highest degree of confidence.
Challenges: Most resource-intensive; requires ensuring minimal extraneous factors; sometimes challenging to generalize to the “real world”.

Design type: Quasi-experimental
Description: Compares intervention with non-intervention; uses controls or comparison groups that are not randomly assigned.
Examples: Pre-post design with a non-randomized comparison group.
Strengths: Can be used when you are unable to randomize a control group, but you will still be able to compare across groups and/or across time points.
Challenges: Differences between comparison groups may confound; group selection is critical; moderate confidence in inferring causality.

Design type: Non-experimental
Description: Does not use a comparison or control group.
Examples: Case control (post-intervention only) – retrospectively compares data between intervention and non-intervention groups; pre-post with no control – data from one group are compared before and after the training intervention.
Strengths: Simple design, used when baseline data and/or comparison groups are not available and for descriptive studies; may require the least resources to conduct the evaluation.
Challenges: Minimal ability to infer causality.
The experimental design:
The classic experimental design has two features, as discussed below and shown in figure 12 (Mouton, 2016; The Provincial Center of Excellence for Child and Youth Mental Health; Badiei, 2012; AmeriCorps, ud):
1. Random assignment of people into treatment and non-treatment groups. This means that people in a given context are randomly assigned to the intervention group (the treatment group), where they receive the intervention. In the case of our example of the YAH programme at Shangano, students from the 6 universities will be the intervention group as they will be exposed to or will participate in the YAH programme. Alternatively, they are assigned to the control group (the non-treatment group), where they are not exposed to the programme or intervention. Students from universities that do not get the YAH programme (about 4 universities in Zimbabwe) will be selected to be part of the control group;
2. Both groups can be given pre (baseline) and post (outcome) tests to determine their status before and after the intervention. In that way, the evaluators can attribute effects or changes that they observe in the ‘treatment group’ to their programme. A simple analysis sketch for such a design is given after the figure below.
Figure 12: the classic experimental design. Adapted from Mouton (2016: 17)
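To make the logic of random assignment plus pre and post measurement concrete, here is a minimal, hypothetical sketch in Python (code is not part of the original module). The outcome measure, scores and group sizes are invented; in practice you would use a proper statistical test and a much larger sample.

```python
# Hypothetical sketch of the analysis implied by a randomized pre-post design.
# All scores are invented for illustration; this is not YAH programme data.
from statistics import mean

treatment_pre = [52, 48, 55, 60, 47, 58]    # intervention group at baseline
treatment_post = [71, 66, 78, 80, 69, 75]   # intervention group at endline
control_pre = [50, 53, 49, 57, 51, 55]      # control group at baseline
control_post = [56, 58, 54, 61, 55, 60]     # control group at endline

# Change observed in each group between baseline and endline.
treatment_change = mean(treatment_post) - mean(treatment_pre)
control_change = mean(control_post) - mean(control_pre)

# Because assignment was random, the control group's change estimates what
# would have happened without the programme; the gap between the two changes
# is the effect attributed to the programme.
programme_effect = treatment_change - control_change
print(f"Treatment change: {treatment_change:.1f}")
print(f"Control change: {control_change:.1f}")
print(f"Estimated programme effect: {programme_effect:.1f}")
```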
Experimental designs have been criticized for the following:
1. They encourage the ‘black box’ problem in evaluations. This means that they can attribute change to the programme based on the changes observed in the assigned groups, but they cannot tell us how and why change occurred. Referring to the YAH programme, we might be able to claim at the end of the programme that YAH has resulted in improved health among students, but we would not be able to tell how and why this change has occurred. We would also not be able to tell which combination of components in the programme worked best together to produce the results that are now seen;
2. In real life, it is not easy to keep those in the control group from accessing the intervention e.g.
community members from a community without a new clinic may access services from the clinic in
another community. Students from Zimbabwe Open University could still access information from
the University of Zimbabwe, for instance if we go back to our example of the YAH programme.
Evaluation approaches such as the Realistic Evaluation Approach (discussed under the Evaluation
Approaches section) by Pawson and Tilley were a response to this weakness;
3. They do not take into account that some other programmes or projects in the context may be causing the effect observed among the ‘treatment’ group. For example, a programme in a village may aim at teaching community members good hygiene practices in their homes. At the same time, community members may be receiving information from the clinic on good hygiene practice as well as through media campaigns on the local radio station. In the YAH programme example, it could be that students at the universities are also accessing information on social media platforms such as Twitter and Facebook. Furthermore, they could be accessing information from other service providers when they go home on semester breaks.
Quasi-experimental designs
Quasi-experimental research is similar to experimental research in that there is manipulation of an independent variable. It differs from experimental research because either there is no control group, no random selection, no random assignment, and/or no active manipulation (Abraham and McDonald, 2011).
Some of the leading proponents of quasi-experimental designs include Carol Weiss, Thomas Cook and Peter
Rossi (Mouton, 2016). In this design, two groups that have similar features are selected. However, random
selection is not used to assign people or communities to the groups. Both groups are then measured before
and after the one group has received the intervention. For example, 10 schools in a district may be selected
to receive a Heads Of Department training intervention and 10 other schools in the same district with similar
features would not receive the same intervention but be measured before and after the intervention
(Measure Evaluation, ud). Bringing in our YAH programme example, universities that have similar
characteristics with those receiving the YAH programme would be considered as comparison groups. These
could be universities that are also public universities and have a similar mix of programmes and faculties as
those in the intervention group. The baseline study would be conducted for all the institutions. Then they
would be measured at the end of 2022 with the planned outcome evaluation. Below are some of the key
features of quasi-experimental designs (a short illustration of propensity score matching follows the list):
1. Comparison groups are built by matching: statistical matching (for example, propensity score matching) is the stronger approach, while judgmental matching is weaker;
2. It can be used after the target group for the programme has been selected;
3. It is not free from issues of sample selection bias.
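As a rough illustration of propensity score matching (one of the matching techniques listed above), the hypothetical sketch below estimates each university's probability of being in the programme from observed characteristics and then pairs each treated university with the most similar untreated one. The covariates, outcomes and use of scikit-learn are assumptions made for the example only.

```python
# Hypothetical sketch of propensity score matching. All data are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Covariates thought to drive selection into the programme
# (e.g. enrolment in thousands, share of health-related faculties).
X = np.array([
    [12.0, 0.30], [9.5, 0.25], [15.0, 0.40], [11.0, 0.28],   # treated
    [8.0, 0.10], [14.5, 0.35], [10.0, 0.22], [13.0, 0.33],   # untreated
])
treated = np.array([1, 1, 1, 1, 0, 0, 0, 0])
outcome = np.array([72, 65, 80, 70, 50, 66, 58, 68])  # endline scores

# 1. Estimate each unit's propensity score: P(treated | covariates).
model = LogisticRegression().fit(X, treated)
pscore = model.predict_proba(X)[:, 1]

# 2. Pair each treated unit with the untreated unit whose score is closest.
treated_idx = np.where(treated == 1)[0]
control_idx = np.where(treated == 0)[0]
pair_gaps = []
for i in treated_idx:
    j = control_idx[np.argmin(np.abs(pscore[control_idx] - pscore[i]))]
    pair_gaps.append(outcome[i] - outcome[j])

# 3. The average gap across matched pairs estimates the effect on the treated,
#    assuming the covariates capture everything that drives selection.
print(f"Estimated effect on the treated: {np.mean(pair_gaps):.1f}")
```

Note that the validity of this estimate rests entirely on the assumption that there is no unmeasured confounding, which is why the list above stresses that quasi-experimental designs are not free from selection bias.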
Non-experimental evaluation designs
When using a non-experimental design, only people who are participating in the program or have received
the intervention get the pre- and post-test (Measure Evaluation, ud). This means that in terms of the YAH
programme, Shangano would only be concerned with baseline and post-implementation measurements of the 6 universities that are participating in the programme. In terms of scientific rigor, this design is deemed the weakest: it does not provide information about what may have occurred in the absence of the intervention. Examples of non-experimental evaluation designs include a) correlation studies, b) mixed methods, c) case studies and d) surveys. A brief sketch of a pre-post (no control) analysis follows below.
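For completeness, here is a minimal, hypothetical sketch of the pre-post (no control group) analysis described above, again in Python with invented scores; the closing comment repeats the design's key limitation.

```python
# Hypothetical sketch of a pre-post analysis with no control group.
# Scores are invented; this is not YAH programme data.
from statistics import mean, stdev
from scipy import stats  # paired t-test

baseline = [54, 49, 61, 47, 58, 52]   # participants before the programme
endline = [68, 60, 77, 59, 70, 66]    # the same participants afterwards

change = [post - pre for pre, post in zip(baseline, endline)]
t_stat, p_value = stats.ttest_rel(endline, baseline)

print(f"Mean change: {mean(change):.1f} (sd {stdev(change):.1f})")
print(f"Paired t-test: t = {t_stat:.2f}, p = {p_value:.3f}")
# Even a large, 'significant' change cannot be attributed to the programme
# alone, because there is no comparison group showing what would have
# happened without it.
```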
Summary of section three
In this section we looked at the three key evaluation designs; experimental, quasi-experimental and
non-experimental designs. We saw that the experimental design and the quasi-experimental design differ when it comes to assignment into treatment and non-treatment groups. The former assigns randomly and carries out pre and post tests with both groups (the intervention group and the control group). The latter does not assign to groups randomly; rather, the ‘comparison’ group is made up of a community or population with similar traits that does not participate in the programme.
Non-experimental designs conduct pre and post measures only for the group that has participated in the
intervention.
Section three: quick knowledge check - individual exercise
Well done on completing sections one, two and three. Here is another quick exercise for you to complete to see what you have learnt and double-check areas that you need to revisit. Just as in sections one and two, the questions are multiple choice and the correct answers will be provided for you to check.
Questions and responses
1. This evaluation design has been criticized because it doesn’t help us to explain how change happens.
A. Quasi-experimental design
B. Non-experimental design
C. Experimental design
2. In this design, there is group assignment but the assignment is not random. The group that does not get the intervention is called the comparison group.
A. Quasi-experimental design
B. Non-experimental design
C. Experimental design
3. Case studies, mixed methods and surveys are examples of which evaluation design?
A. Quasi-experimental design
B. Non-experimental design
C. Experimental design
4. The Randomized Controlled Trial is an example of a classic experimental evaluation design.
A. True
B. False
References
African Evaluation Association (ud). The African Evaluation Guidelines. Available at https://afrea.org/the-african-evaluation-guidelines/ Accessed 4 July 2019
Better Evaluation. Empowerment Evaluation. Available at https://www.betterevaluation.org/en/plan/approach/empowerment_evaluation Accessed 17 June 2019
Business Dictionary (ud). Available at http://www.businessdictionary.com/definition/program-evaluation.html. Accessed 20 May 2019.
International Training and Education Center for Health (ud). Choose Evaluation Designs and Methods. Available at https://www.go2itech.org/wp-content/uploads/2017/07/Evaluation-Design-and-Methods.pdf Accessed 3 July 2019
Guyot, W.M. (1978). Summative and Formative Evaluation. The Journal of Business Education. 54(3):127-129.
International Institute for Environment and Development (2016). Effective Evaluation for the Sustainable Development Goals. Available at https://www.iied.org/effective-evaluation-for-sustainable-development-goals
MacDonald, G (ud), Framework for Program Evaluation in Public Health: A Checklist of Steps and Standards. Available at https://wmich.edu/sites/default/files/attachments/u350/2014/CDC_Eval_Framework_Checklist.pdf Accessed 4 June 2019
Merriam-Webster Dictionary. Available at https://www.merriam-webster.com/dictionary/criterion Accessed 3 June 2019
Mouton, J. (2014). Programme Evaluation Designs and Methods. SunMedia. Stellenbosch.
NSW Government (ud). Evaluation Criteria. Available at https://education.nsw.gov.au/teaching-and-learning/professional-learning/evaluation-resource-hub/evaluation-design-and-planning/setting-the-scope-of-an-evaluation/evaluation-criteria Accessed 3 June 2019
Podems, D. (2014). Evaluator competencies and professionalising the field: where are we now? “The Canadian Journal of Programme Evaluation” Vol.28. No.3. Pp. 127-136. Canadian Evaluation Society.
Quora (2018). What is the Difference Between Monitoring and Evaluation? Available at https://www.quora.com/What-is-the-difference-between-monitoring-and-evaluation Accessed 13 June 2019
Ramírez, R. and Brodhead, D. (2010). Utilization Focused Evaluation: A Primer for Evaluators. Available at https://evaluationinpractice.files.wordpress.com/2013/04/ufeenglishprimer.pdf Accessed 2 July 2019
Robinson, J (2002). Responsive Evaluation: Responsible Evaluation: What is Evaluation Research for? Putting Derrida to Stake. Available at http://www.leeds.ac.uk/educol/documents/00002586.htm Accessed 8 July 2019
Sheriff, B and Potter, S (ud). An Introduction to Empowerment Evaluation: Teaching Materials. Available at https://pdfs.semanticscholar.org/090e/7cbd7895c72d2f0c19901603f9351b384af1.pdf Accessed 15 July 2019
Surbhi (2017). Differences Between Monitoring and Evaluation. Available at
https://keydifferences.com/difference-between-monitoring-and-evaluation.html Accessed 12 June 2019
The PELL Institute (ud). Evaluation Tool Kit. Available at
http://toolkit.pellinstitute.org/evaluation-101/evaluation-approaches-types/ Accessed on 5 July 2019
UNESCO (2016). Designing Effective Monitoring and Evaluation of Education Systems for 2030: A Global Synthesis of Policies and Practices. Available at http://www.unesco.org/new/fileadmin/MULTIMEDIA/HQ/ED/pdf/me-report.pdf Accessed 3 July 2019
UNICEF (ud) Evaluation Standards. Available at https://www.unicef.org/evaluation/files/Evaluation_standards.pdf Accessed 10 June 2019
Waidya, M. (2012). Results-Based Monitoring and Evaluation System (RBME): A Tool for Public Sector Management. Available at https://www.slideshare.net/madhawa66/results-based-monitoring-and-evaluation Accessed on 5 September 2019