
Program Evaluation Models and Practices:

A Review of Research Literature

David Townsend and Pamela Adams

Faculty of Education, The University of Lethbridge

March, 2003


RESEARCH AND LITERATURE REVIEW ON PROGRAM EVALUATION

1. Quantitative and Qualitative Approaches to Research and Program Evaluation

Man is an animal in webs of significance he himself has spun…. the analysis of that is not an experimental science in search of law, but an interpretive one in search of meaning.

Clifford Geertz, 1973.

Essays written by the left hand need to be read with as much rigor as those written with the right hand.

Elliot Eisner, 1990.

The schism that exists between social research methods favoring either a

qualitative or quantitative approach to program review is not new. On the one hand,

qualitative researchers criticize strictly quantitative program evaluation models for

drawing conclusions that are often pragmatically irrelevant (Reichardt and Rallis, 1994;

Woods, 1986); for employing methods that are overly mechanistic, impersonal, and

socially insensitive (Maturana, 1991; Scott and Usher, 2002); for compartmentalizing,

and thereby minimizing, the complex multidimensional nature of human experience

(Moerman, 1974; Silverman, 2000; Yutang, 1937); for encouraging research as an

isolationist and detached activity impervious to collaboration (Scott and Usher, 2002); for

tipping the scales of understanding excessively toward objective disenchantment (Bonß

and Hartmann, 1985; Weber, 1919); and for forwarding claims of objectivity that are

simply not fulfilled to the degree espoused in many quantitative studies (Flick, 2002).

On the other hand, qualitative program reviews are seen as quintessentially

unreliable forms of inquiry (Gardner, 1993). Some educational researchers suggest that

even the most rigorous qualitative study provides no assurance of linking research with

relevant practice (James, 1925; Kerlinger, 1977); that the degree to which qualitative

study variables are uncontrolled offers certainty that causation can rarely be proven (Ary,

Jacobs, and Razavieh, 1979); that methodologies such as narration and autobiography can

yield data that is unverifiable, deceptive, and narcissistic; that qualitative researchers

often inadvertently influence the generation of data and conclusions (Durkheim, 1982);

and that the Hawthorne effect, rather than authentic social reality, is responsible for many

events observed in these types of studies. Nonetheless, recognition of unique contexts,


intellectual diversity, and reasonable yet poetic thinking can contribute a strong

foundation, as well as gifts of insight, to even the most complex educational experiences.

In support of more holistic methods of inquiring into the nuances of educational practices

and programs, Lin Yutang (1937) suggests,

As a result of this past dehumanized logic, we have dehumanized truth. We have a

philosophy that has become a stranger to the experience of life itself, that has

almost half disclaimed any intention to teach us the meaning of life and the

wisdom of living; a philosophy that has lost that intimate feeling of life or

awareness of living which is the very essence of philosophy. (p. 422)

Scott and Usher (2002) contend that, “Knowledge….is always a matter of

knowing differently rather than cumulative increase, identity, or confirmation” (p. 19).

Similarly, Eisner (1986) suggests that knowledge is never a complete and undisputed

form of truth, and that understanding is often rich with idiosyncratic perspective. He

states that, “All methods and forms of representation are partial” (p. 15). That the

methods used in this study are both multi-theoretical and multi-methodological is at once a

strength that parallels constructivist beliefs about the nature of reality, and a weakness

that renders it susceptible to positivist criticisms of reliability and validity.

Denzin and Lincoln (2000) point out the difficulty in defining valid educational

research and program review. Should it be, as Eisner (1997) proposes, “….a new way of

thinking about the nature of knowledge and how it can be created” (p. 4)? Is it, as Reason

and Marshall (1987) suggest, “….a cooperative endeavor which enables a community of

people to make sense of and act effectively in their world” (p. 112)? Is it, as Glesne

(1999) argues, “….like learning to paint. Study the masters, learn techniques and

methods, practice them faithfully, and then adapt them to your own persuasions when

you know enough to describe the work of those who have influenced you and the ways in

which you are contributing new perspectives" (p. 3)? Or is it, as Glaser and Strauss

(1967) contend, a venture involving the “….ability to make imaginative use of personal

experiences…" (Becker, 1970, p. 22)?

Patton (2002) defines meaningful program research as, “….detailed descriptions

of situations, events, people, interaction, and observed behaviors; direct quotations from

people about their experiences, attitudes, beliefs, and thoughts; and excerpts or entire


passages from documents, correspondence, records, and case history” (p. 22). Wolcott

(1992) alliteratively describes effective educational research as the activities of

experiencing and attending to sensory data; enquiring with curiosity beyond mere

observation; and examining and reviewing materials prepared by self and others. His

diagram of the complex activities of qualitative educational research is included below.

Figure 1. Wolcott's Qualitative Strategies.

In North America, the past three decades have witnessed a dramatic increase in

the social and political inspection and critique of schools. Public examination of student

achievement has occurred on an unprecedented scale. While a plethora of fiscal, social,

ideological, and economic influences have provided impetus for this scrutiny, one well-

documented response to demands for educational accountability has been an escalating

interest in programs that link incentives or fiscal rewards to student achievement. Perhaps

it is the generalized belief that education cannot save itself that has led to such increases

in standardized and high-stakes testing, prescriptive curricula, and externally mandated

professional development of teachers. Yet, there is persuasive evidence to suggest that if

students, teachers, administrators, school boards, parents, and, indeed, the community as


a whole, are to be held accountable for children’s learning, programs to assess and

support educational reforms should be based on models that are internally empowering,

rather than externally interrogative. This notion of empowerment evaluation is a shift

away from the singular criterion of quantitative merit and worth toward a fundamentally

democratic process that seeks to foster self-determination, self-improvement, and

capacity building in a spirit of responsiveness (Fetterman, 2001).

Successful models of school reform, such as the Manitoba School Improvement

Program (MSIP) or the Improving the Quality of Education for All Project (IQEA) in the

United Kingdom, have carefully constructed cushioning networks of technical assistance

and site-to-site support to buoy schools and educators seeking to implement change.

These initiatives are also based, in part, on research indicating that the exclusive use

of standardized testing and resultant student achievement as the primary barometer of

school effectiveness is increasingly insufficient in providing the most useful information

to stakeholders in the educational community, including policy makers. Such initiatives

are seen to promote expansive improvements in all types of learning as the goal of

schools, signalling a shift away from the conventional use of accountability systems

“toward a more cooperative and transitional path of program review procedures”

(Schmoker, 2000, p. 62).

2. Definitions and Models of Program Evaluation

Program evaluation is a collection of methods, skills, and sensitivities necessary to determine whether a human service is needed and likely to be used, whether the service is sufficiently intensive to meet the unmet needs identified, whether the service is offered as planned, and whether the service actually does help people in need at a reasonable cost without unacceptable side effects. (Posavac and Carey, 1997, p. 2)

To the extent that educational initiatives are most commonly evaluated in order to

determine past effectiveness and to create future goals, “evaluation is an essential,

integral component of all innovative programs” (Somekh, 2001, p. 76). It is “the process

of making judgment about the merit, value, or worth of educational programs, projects,

materials, and techniques” (Borg and Gall, 1983, p. 733). A more extensive definition of

program evaluation might outline, “the sets of activities involved in collecting

information about the operations and effects of policies, programs, curricula, courses,


educational software, and other instructional materials” (Gredler, 1996, p. 13). That

program evaluation not be confused with other forms of inquiry or data collection which

are conducted for different purposes is of critical importance (Gredler, 1996). For

example, it is the use of evaluation as a strategy for program improvement rather than for

accountability, justification, and program continuity that has traditionally differentiated

formative from summative evaluation. The latter consists of activities “to obtain some

kind of terminal or over-all evaluation in order that some type of general conclusion can

be made” (Tyler, Gagne and Scriven, 1967, p. 86). While summative evaluation can serve

to justify additional funding, it may also generate modification or elimination of a

program or its individual components. Formative evaluation, however, takes place at a

more intermediate stage, “permit[ting] intelligent changes to be made….” (Tyler, et al.,

1967, p. 86) as the initiative evolves. Such action usually saves both time and money.

Conversely, programs conducted without an

evaluation component run the very real risk of wasted funding when “opportunities [are]

lost for policy makers to learn either from their successes or from what went wrong”

(Somekh, 2001, p. 76).

Hopkins (1989) suggests that evaluation in schools should be used for three types of

decisions: course improvement (instructional methods and materials); decisions about

individuals (pupil and teacher needs); and administrative regulation (rating schools,

systems, and teachers). Evaluation of schools, evaluation for school improvement, and

evaluation as school improvement characterize these three approaches. Regardless of the

nature of the evaluation, Sanders (2000) identifies the following as key tasks:

• deciding whether to evaluate.

• defining the evaluation problem.

• designing the evaluation.

• budgeting the evaluation.

• contracting for the evaluation.

• managing the evaluation.

• staffing the evaluation.

• developing evaluation policies.

• designing a program for evaluators.

• collecting the information.


• analyzing the information.

• reporting the evaluation.

Of course, the importance and applicability of each task will vary, depending on the

nature of the evaluation.

According to Sanders (2000), four categories comprise the essential

characteristics of sound and fair program evaluation. Utility standards ensure that an

evaluation serves the information needs of the intended users. More specifically, they

include: stakeholder identification, evaluator credibility, information scope and selection,

values identification, report clarity, report timeliness and dissemination, and evaluation

impact. Feasibility standards require that an evaluation be realistic, prudent, frugal, and

diplomatic. Practical procedures, political viability, and cost effectiveness comprise these

standards. Standards of propriety guarantee that any evaluation will be conducted

ethically, legally, and with due regard for the welfare of those both involved and affected

by the evaluation. These standards embrace service orientation, formal agreements, rights

of human subjects, human interactions, fair and complete assessment, disclosure of

findings, conflict of interest, and fiscal responsibility. Finally, accuracy standards

guarantee that the evaluation reveals and conveys information that is technically adequate

relative to determining the worth or merit of the program under review. They consist of

program documentation, context analysis, described procedures and purposes, defensible

sources of information, valid information, reliable information, systematic information,

analysis of quantitative information, analysis of qualitative information, justified

information, impartial reporting, and meta-evaluation.

An obvious, but often overlooked, characteristic of the evaluation process is the

degree to which results are relevant, functional, and useful. Clearly, an evaluation should

not be undertaken if no use will be made of the results. “Time and resources are too

valuable to waste in this manner” (Sanders, 2000, p. 52). However, provided the results

are produced in a format that is timely, tailored to suit the audience, and reported using

appropriate media, educators can use them to plan program improvements. This can lead

to a cyclical process in which, “changes in the program may be necessary, benchmarks

and results-based goals may need to be redefined, or action strategies may need to be

continued, replaced, or redesigned” (Hertling, 2000, p. 3).

As with other forms of research and development, the people conducting program


evaluations should be expected to acknowledge the personal and professional biases they

bring to evaluative processes. This awareness and recognition will frame theoretical

considerations as well as establish, “the advantages and limitations of what is chosen, as

opposed to what is disregarded…” (Rebien, 1997, p. 2). Evaluators must also be

explicitly aware of relative strengths and weaknesses of different evaluative approaches

(Shadish, Cook and Leviton, 1991), and take care not to "create their own establishment

and glamorize it as an elite" (Stenhouse, quoted in Hopkins, 1989, p. iii).

During the last five decades, several major models of program evaluation have

emerged. To compare these models is one way to understand the breadth and depth of the

subject. A study of alternate approaches might also be crucial to the scientific

advancement of evaluation. Moreover, such an appraisal can help evaluators assess and

consider frameworks which they may employ as they plan and conduct studies. It is

important to identify strengths and weaknesses of a variety of models in order to refine

specific relevant approaches, rather than “to enshrine any one of them….” (Stufflebeam

and Webster, as cited in Madaus, Scriven, and Stufflebeam, 1984, p. 24). The underlying

theoretical assumptions of each will provide a basis for comparison, as “…. models differ

from one another as the base assumptions vary" (House, as cited in Madaus, Scriven and

Stufflebeam, 1984, p. 24). In Table 1, Madaus et al. (1984) compare assorted models,

proponents, major audiences, understandings, methodologies, outcomes, and examples of

typical questions associated with formative program evaluation.

Table 1. A Taxonomy of Major Evaluation Models.

Systems Analysis. Proponents: Rivlin. Major audiences: economists, managers. Assumes consensus on: goals; known cause and effect; quantified variables. Methodology: PPBS; linear programming; planned variation; cost-benefit analysis. Outcome: efficiency. Typical questions: Are the expected effects achieved? Can the effects be achieved more economically?

Behavioral Objectives. Proponents: Tyler, Popham. Major audiences: managers, psychologists. Assumes consensus on: prespecified objectives; quantified outcome variables. Methodology: behavioral objectives; achievement tests. Outcome: productivity; accountability. Typical questions: Are the students achieving the objectives? Is the teacher producing?

Decision Making. Proponents: Stufflebeam, Alkin. Major audiences: decision makers, especially administrators. Assumes consensus on: general goals; criteria. Methodology: surveys, questionnaires, interviews; natural variation. Outcome: effectiveness; quality control. Typical questions: Is the program effective? What parts are effective?

Goal Free. Proponent: Scriven. Major audiences: consumers. Assumes consensus on: consequences; criteria. Methodology: bias control; logical analysis; modus operandi. Outcome: consumer choice; social utility. Typical question: What are all the effects?

Art Criticism. Proponents: Eisner, Kelly. Major audiences: connoisseurs, consumers. Assumes consensus on: critics, standards. Methodology: critical review. Outcome: improved standards. Typical question: Would a critic approve this program?

Accreditation. Proponent: North Central Association. Major audiences: teachers, public. Assumes consensus on: criteria, panel, procedures. Methodology: review by panel; self-study. Outcome: professional acceptance. Typical question: How would professionals rate this program?

Adversary. Proponents: Owens, Levine, Wolf. Major audiences: jury. Assumes consensus on: procedures and judges. Methodology: quasi-legal procedures. Outcome: resolution. Typical question: What are the arguments for and against the program?

Transaction. Proponents: Stake, Smith, MacDonald, Parlett-Hamilton. Major audiences: client, practitioners. Assumes consensus on: negotiations; activities. Methodology: case studies, interviews, observations. Outcome: understanding; diversity. Typical question: What does the program look like to different people?

All spelling and punctuation copied as original document displayed

All are dependent to some degree upon the philosophy of liberalism, and “partake

of the ideas of a competitive, individualistic, market society…. the most fundamental

idea is freedom of choice, for without choice, of what use is evaluation?” (House, as cited

in Madaus, Scriven and Stufflebeam, 1984, p. 49).

There was a flurry of evaluation protocol development in the late 1960s when a

number of academics produced several alternative theoretical approaches. This

renaissance in the field was fuelled, in part, by “the mounting responsibilities and

resources that society assigned to educators” (Stufflebeam and Webster, as cited in

Madaus, Scriven, and Stufflebeam, 1984, p. 23). Table 2 presents Scriven’s (1993)

review of program evaluation approaches of this period.


Table 2. Scriven’s Past Conceptions of Evaluation.

Strong Decision Support View. Developer: Tyler; CIPP with Stufflebeam, Guba. When: 1971 (CIPP). Purpose: process of rational program management. Strong/direct conclusions reached: yes.

Weak Decision Support View. Developer: Alkin, Provus. When: 1972. Purpose: information is gathered in service of a decision maker. Strong/direct conclusions reached: no.

Relativistic View. Developer: Rossi and Freeman. When: 1971, 1989. Purpose: uses only the client’s values as framework, without judgment. Strong/direct conclusions reached: no.

Rich Description Approach / Social Process School. Developer: Stake, Cronbach. When: 1980. Purpose: ethnographic enterprise, even without the client’s values. Strong/direct conclusions reached: rejects summative evaluation.

Constructivist, or Fourth Generation, Approach. Developer: Guba and Lincoln, with many supporters in the UK and US. When: 1981. Purpose: purports that all evaluation results from construction by individuals and negotiations by groups. Strong/direct conclusions reached: no; rejects all claims.

An extensive analysis of the work of several other authors engaged in evaluation from

1960 through the 1980s is presented in Appendix A.

3. Empowerment Models: An Alternate Perspective

More recently, another approach to evaluation has slowly gained favor with

educators. It has been developed, in part, in response to a concern that increased

“politicization of evaluation…tight time lines, restricted budgets, and an over-emphasis

on cost-effectiveness …. often distort currently accepted evaluation procedures….”

(Hopkins, 1989, p. 9). For Hopkins, evaluation should be viewed as an illuminative,

rather than recommendatory exercise, as a guide for improvement, rather than evidence

for judgement. He promotes the concept of empowerment evaluation as an iterative

process of value assessments and resultant plans for program improvement in which

participants are helped to conduct their own evaluation. Hopkins describes empowerment

evaluation as a collaborative activity that employs both qualitative and quantitative

methodologies. Teams of educators, with the assistance of trained evaluators, learn to

assess progress towards goals, and re-shape the goals according to theoretical

foundations, resulting in a type of self-determination that Earl (2000) refers to as

“agency” (p. 60). Hopkins also contends that program effectiveness is not achieved


merely by conforming to externally imposed models of evaluation, but through

participants acquiring skills and strategies, understanding, and reflection in a process of

collaborative refinement and judgment. Fetterman (2001) suggests that empowerment

evaluation can foster self-determination, generate illumination, and actualize liberation. It

involves a fundamentally democratic process that promotes, among other things, self-

improvement and capacity-building. Improvements in the quality of education occur,

“not through legislative fiat, but through the exercise and development of professional

judgement of those within the system” (Hopkins, 1989, p. 194). A central premise of this

method is that efficacy of schools is not contingent on external forces, but on their

properties as social systems.

Empowerment evaluation methods question the overly-judgmental nature of

traditional evaluation, and seek to moderate the importance of external critique. They are

based on the development of pedagogy as opposed to methodology, the exploration of

new kinds of educational research, and the integration of evaluation and development

(Hopkins, 1989). Fetterman (2001) contends that, “merit and worth are not static values”

(p. 3), and that any event must be understood in context from multiple worldviews.

“Populations shift, goals shift, knowledge about program practices and their values

change, and external forces are highly unstable….” (ibid). Empowerment evaluation is a

method that accommodates these shifts by internalizing self-evaluation processes and

practice. Strongly influenced by action research, it is a dynamic and responsive approach

that emphasizes inclusion rather than exclusion.

Additionally, empowerment evaluation acknowledges the constructivist belief that

people can discover knowledge and solutions based on their own experiences. Thus, the

assessment of value and worth of a program becomes a continuous, rather than a

terminal, process. While findings remain grounded in collected data, program

stakeholders are able to establish their own goals, processes, outcomes, and impacts.

External evaluators are able to provide training, coaching, and assistance in an

atmosphere of honesty, trust, support, criticism, and self-criticism. Neither a servant, nor

a judge, nor a slave, the external evaluator, seen as a “critical friend” (Earl, 2000, p. 59),

can help keep the effort credible, useful, directed, and rigorous, contributing positively to

the formation of "a dynamic community of transformative learning" (Fetterman, 2001, p.

6).


Empowerment evaluation proceeds through three steps. The first establishes the

mission or vision of the program. That is, the participants state the results they would like

to see, based on the projected outcome of the implemented program, and then map

through the process in reverse design. The second step involves taking stock of,

identifying, and prioritizing the most significant program activities. Staff members rate

present program effectiveness using a nominal instrument and the ensuing discussion

determines the current program status. Charting a course for the future is the third step.

The group outlines goals and strategies to achieve their dream with an explicit emphasis

on improvement. External evaluators assist participants in identifying types of evidence

required to document progress toward the goal. In the presence of a strong, shared

commitment on the part of the participants, deception is inappropriate and unnecessary;

the group itself is a useful, powerful check (Fetterman, 2001). For empowerment

evaluation to be effective and credible, participants must enjoy the latitude to take risks

and simultaneously assume responsibility. A safe atmosphere, in which it is possible to

share success and failure, is as essential as a sense of caring and community.
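The "taking stock" step described above lends itself to a simple tabulation. The following minimal sketch (in Python) is purely illustrative: the class and function names (Activity, take_stock) and the example activities and ratings are hypothetical and are not drawn from Fetterman or from any published instrument. It shows one way a facilitation team might average staff self-ratings for each prioritized activity and order the results so that the ensuing discussion starts where improvement goals and evidence of progress are most needed.

from dataclasses import dataclass
from statistics import mean

@dataclass
class Activity:
    # One program activity identified and prioritized during "taking stock".
    name: str
    ratings: list  # nominal self-ratings (e.g., 1 to 10) gathered from staff

    def average(self):
        # Average of the staff self-ratings for this activity.
        return mean(self.ratings)

def take_stock(activities):
    # Order activities from lowest to highest average self-rating, so the
    # group discussion can begin with the areas judged weakest by staff.
    return sorted(activities, key=lambda a: a.average())

if __name__ == "__main__":
    program = [
        Activity("Collaborative planning time", [6, 7, 5, 8]),
        Activity("Student engagement strategies", [4, 5, 3, 6]),
        Activity("Parent and community involvement", [5, 4, 6, 5]),
    ]
    for activity in take_stock(program):
        print(f"{activity.name}: average self-rating {activity.average():.1f}")

In keeping with the empowerment model, such a tabulation would only frame the conversation; the current program status is determined by the discussion itself, not by the numbers.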

Other influential authors (Posavac and Carey, 1997) have adopted a model of

program evaluation that honors many of the principles of empowerment evaluation. Their

improvement-focused model, they contend, best meets the criteria necessary for effective

evaluation. That is, the needs of stakeholders are served; valid information is provided;

and alternate viewpoints are acknowledged. As Posavac and Carey note, “to carry this off

without threatening the staff is the greatest challenge of program evaluation” (p. 27).

4. Models of School Improvement with Empowerment Assumptions

Within the span of 11 months, between the summers of 1994 and 1995, the

Manitoba provincial government presented its agenda for school reform in three key

policy documents. Six interrelated priority areas for government action appeared to be

aimed at renewing and revitalizing the system, and sought a more rigorous and relevant

curriculum underpinned by the premise of raising educational standards and achievement

of all students. Essential learnings, educational standards and evaluation, school

effectiveness, parental and community involvement, distance education and technology,

and teacher education were all declared priorities for government action. These

documents provided impetus for the rapid introduction of policies and strategies aimed at


improving the quality of Manitoba public schools, such as the adoption of provincially

mandated curricula accompanied by subject and grade level outcomes, and province-wide

testing at grades three, six, nine, and twelve (Harris and Young, 1999). The provincial

government denied that such initiatives constituted an explicit attack on the failures of

teachers and schools. Yet, such a comprehensive attempt at system change, seemingly

driven by a political agenda and accompanied by a reduction of funding to education,

provoked considerable resistance from the Manitoba Teachers’ Society, as well as

extensive public controversy and debate (Harris and Young, 1999).

The Manitoba School Improvement Project actually came into being before this

contentious governmental reform. Originating in an independent charitable foundation, it

drew on the professional praxis of teachers, rather than academics, to focus exclusively

on school reform at the secondary school level. In existence since 1991, the program was

born as a result of the vision and support of the Walter and Duncan Gordon Foundation, a

Canadian philanthropic group interested in enhancing educational opportunities for

students at risk. The educational community in Manitoba welcomed and supported this

involvement. Accordingly, the Foundation elected to support secondary projects that

were designed by individual schools in urban centres, with later initiatives expanded to

rural and northern settings. Thus began one of Canada’s major initiatives to empower

teachers as catalysts for change.

MSIP has provided multi-year funding to more than 30 schools in 13 school

divisions in Manitoba. In 1998, an external evaluation of the initiative was commissioned

to examine achievement of project goals, increased student learning, increased student

engagement, and successful school improvement. The report concluded that, while not

every school showed high levels of improvement, a majority of schools in the project had

been successful in these four areas. In fact, improved academic performance, increased

student enrolment, reduced disciplinary problems, improved attendance, increased family

and community involvement, and increased student graduation were also noted as

unanticipated positive results (Harris and Young, 1999). Michael Fullan, a leader of the

external evaluation team, remarked that, “at the secondary level, I know of no other

strategy which has taken 20 or more schools and shown the level of success in this short

amount of time….[it shows] secondary schools can move more quickly, even more

quickly than we thought possible, and in a cost efficient way" (Fullan, quoted in Harris


and Young, 1999, p. 110).

Another well-established school improvement project, similar to the MSIP and

well-known within the international research community, is the Improving the Quality of

Education for All Project (IQEA) in the United Kingdom. Like the MSIP, it embodies an

alternate model that is not politically driven, and that empowers schools to operate under

a developmental, rather than an accountability, framework (Harris and Young, 1999).

Both focus on teacher professional development and both support efforts to improve

schools by developing a critical, but supportive, culture.

In the last decade, schools and teachers in both Canada and the United Kingdom

have come under increased public scrutiny and political pressure. Society’s expectations

of schools and reforms inevitably exceed the knowledge and capacity of educators to

meet these demands, particularly when expectations are ambiguous or contradictory and

when additional resources to fund system-wide reforms are sparse (Harris and Young,

1999). As a result of the 1990s movement toward higher standards of student

achievement and teacher performance, the British government legislated school-based

management, a national curriculum, national targets, national tests, and a standard model

of inspection. During this period of continuing financial stringency, top-down

restructuring was the change mechanism for increasing teacher accountability. In this

context, the IQEA project has been a unique school improvement initiative in Britain,

launched during a time of unprecedented regulation, standardization, and politicized

school reform. Since its inception, it has operated in over fifty schools across England

and Wales; schools in Iceland, South Africa, and Puerto Rico have since been

incorporated into the program. Faculty from the universities of Cambridge and

Nottingham lead the project, and represent academic support and vision. This model is

based on the inextricable relationship between professional growth and school

development, and the assumption that schools are more likely to provide enhanced

outcomes for all students when they adopt ways of working that are consistent with both

the aspirations of the school community and the demands of external change.

Unlike the MSIP, however, the IQEA is self-funding: it is financially dependent

on schools joining the project. Participating institutions pay an initial amount for a

four-term (sixteen-month) project, with the possibility of extension at a reduced cost. For

many schools, these payments represent a large part, if not all, of their professional


development fund. Entry is limited; consequently, schools are required to agree to a prior

set of conditions before joining the project. They must gain the support of 80% of the

staff, they must commit their professional development time to IQEA over the four terms,

and a cadre must be formed which will be responsible for leading the school change.

Finally, schools commit themselves to undergo a process of internal and external

evaluation. For their part, the universities design a program of staff development

activities, and provide a liaison advisor for each school whose responsibilities include

networking, training, support, consultancy, feedback, advice, and pressure (Harris and

Young, 1999).

Teachers in any school considering joining the project are expected to share the

philosophy and values of the IQEA Project. The following tenets are worthy of note:

• Schools do not improve unless teachers, individually and collectively,

develop. While teachers can often develop their practice on an individual

basis, if the whole school is to develop there need to be many staff

development opportunities for teachers to learn together.

• Successful schools seem to employ decision-making mechanisms that

encourage feelings of involvement from a number of stake-holder groups,

especially students.

• Schools that are successful at reform and improvement establish a clear

vision for themselves and regard leadership as a responsibility of many

staff, rather than a single set of responsibilities vested in a single

individual.

• Co-ordination of activities is an important way to keep people involved,

particularly when changes of policy are being introduced. Communication

within the school is a vital aspect of co-ordination, as is informal

interaction between teachers.

• Schools which recognize the importance of professional inquiry and

reflection find it easier to gain clarity and establish shared meaning around

identified development priorities, and are better able to monitor the extent

to which policies actually deliver intended outcomes for pupils.

• Through the process of planning for development, a successful school is

able to link its educational aspirations with identifiable priorities, to


sequence those priorities over time, and to maintain a focus on classroom

practice. (Harris and Young, 1999, pp. 4-5)

Several commonalities emerge when comparing the MSIP and the IQEA in terms

of stimulating potent and lasting change. Both employ an external monitoring agency.

Both focus on specific teaching and learning activities. They are committed to

professional interchange, collaboration, and networking. They espouse devolved

leadership and temporary systems. Finally, both support formative and summative

evaluation, and demonstrate that “some of the best evaluation occurs in response to

questions that teachers and other school personnel ask about their professional practice”

(Sanders, 2000, p. 3). That is, they establish inquiry and reflection as intrinsic to school

growth and improvement.

Among several other authors, Rapple (1994) and Barth (1990, 2001) contend that

these types of strategies are necessary to foster educational accountability grounded not

in passive and external models which encourage compliance and subservience, but in

active internally reflective models which build responsibility and capacity.

....there is sheer futility in attempting to regulate education by

economic laws. Accountability in education should not be facilely

linked to mechanical examination results, for there is a very distinct

danger that the pedagogical methods employed to attain those results will

themselves be mechanical and the education of children will be so

much the worse. (Rapple, 1994, p. 11)

5. The Alberta Initiative for School Improvement

In 1999, the Alberta Ministry of Learning in consultation with representatives of

school boards, teachers, superintendents, parents, and school business officials, helped

create the Alberta Initiative for School Improvement (AISI). As AISI projects were

designed, there was an expectation that schools would propose a balance of quantitative

and qualitative measures of success. The AISI Administrative Handbook (1999) indicated

that, “measures should not drive the project design” (p. 8). The use of Provincial

Achievement Test results was indicated, where appropriate, while reference was made to

a very broad range of other measures, too. In a province with Canada’s most extensive

achievement testing program, the even-handedness of the Ministry on this issue may have


been one of the critical factors in encouraging the "collaboration, system leadership, and

consensus building" (Booi, Hyman, Thomas, and Couture, 2000, p. 35) that characterized

successful proposal development in the early stages of the Initiative.

In less than three years, AISI has given rise to hundreds of system and school

projects. Many of them have chosen to re-focus educational vision and structure through

an empowerment-based action research model that facilitates and encourages teachers to

make internal assessments of strengths, and establish action plans based on collaborative

problem solving. Alberta Deputy Minister of Learning, Maria David-Evans (2000),

suggests that this process will enhance the “collective capacity” within schools as a result

of “greater sharing [and] pursuit of a common goal….” (p. 11). Cimbricz (2002) concurs,

noting that projects of the kind developed within AISI can increase the likelihood that

administrators and staffs will engage in positive goal setting that, in turn, can encourage a

common focus and purpose for all members of the school community.

In addition, Elliott’s (1991) model of action research as a method to undertake

educational change is one of many that has been seen to fit the purpose of most AISI

projects, to “….study an educational situation with a view to improving the quality of

action within it” (p. 69). Moreover, the balanced perspective inherent in AISI processes

renders documents such as Long Term Strategic Plans, Individualized Student Education

Plans, and Individual Teacher Growth Plans potentially more valuable as they are seen to

dovetail with the daily decisions made by staff about the learning activities of students.

For Rogers (2000), this perspective increases the likelihood of collaboration and

cooperation among teachers and principals, and across schools.

In systems all over the world, many schools are exploring newly emerging forms

of assessment designed to demonstrate and celebrate students’ knowledge and skills, and

the effectiveness of schools and teachers. Several factors contribute to this reform in

student and program evaluation: the changing nature of educational goals, the

relationship between assessment and pedagogy, and the clear limitations of current

methods of judging performance (Marzano, Pickering, and McTighe, 1993). Demands for

external accountability, advances in the technology and science of assessment, the advent

of large scale testing in schools, and calls for educational reform are other significant

factors influencing the changing face of student learning and program assessment

(Brandt, 2000). In addition, academic and non-academic competencies necessary for the


modern workplace, such as creative thinking, decision making, problem solving,

metacognition, life-long pursuit of knowledge, collaboration, and self-management are

making necessary the re-creation of notions of efficiency and effectiveness in reform

initiatives.

Up to the present, evaluations of many corporate and educational organizations

have been based on ideologies derived from seventeenth century Newtonian physics

(Costa and Kallick, 1995). That worldview venerated mechanics, leverage, hierarchies,

and rigid organization. Today there exists more of a collective realization that ours is a

world, not only of things but of relationships, in which the greatest natural resource is the

human mind in synergy with the human spirit. Costa and Kallick (1995) contend, “The

new paradigm of industrial management emphasizes a trusting environment in which

growth and empowerment of the individual are the keys to unlocking corporate success”

(p. 66). This is the same paradigm that will allow and encourage educators to re-envisage

and re-frame the mission, vision, outcomes, and assessment of schools in order to align

them with “modern, relevant policies, practices, and philosophies consistent with the

concept of multiple intelligences in a quantum world” (p. 67). In all this change, program

evaluation is certainly a legitimate activity. However, it is a means to improving

education and learning; it is not an end in itself. Reform and improvement initiatives as a

vehicle for developing personal efficacy, flexibility, adaptability, craftsmanship, high

personal standards, consciousness, metacognition, interdependence, and sense of

community will be the bulwark of the shifting paradigm (Costa & Kallick, 1995). Student

evaluation will also be as important an influence as external assessment. Costa and

Kallick (1995) outline this transition in educational change and program assessment in

Table 3.

Table 3. Existing and Desired States of Change and Assessment.

From the existing state:

• Bureaucratic institution that fosters dependence based on external evaluation offered as summative rather than formative.

• The assumption that change takes place by mandating new cognitive maps and training people to use them.

• Assessments that limit the frame of reference by which people will judge the system.

• Assessments that impose external models of reality.

• Assessments that communicate that knowledge is outside the learner.

• Assessments that signal that personal knowledge and experience are of little worth.

• Conceptions of curriculum, instruction, and assessment as separate entities in a system; each aspect of the system that is assessed is considered to be separate and discrete.

• Individual and organizational critique perceived as negative and a barrier to change.

To a desired state:

• A system that recognizes the necessity for those who are being assessed to be part of the evaluation process, to become self-evaluating.

• A system that encourages continuous external feedback to be used for ongoing, self-correcting assessment.

• Operating within people’s maps of reality (personal knowledge) and creating conditions for people to examine and alter their own internal maps.

• Assessments that assist learners in understanding, expanding, and considering alternative frames.

• Assessments that allow different demonstrations of strengths, abilities, and knowledge.

• Assessments that allow the capacity to make meaning of the massive flow of information and to shape raw data into patterns that make sense to the individual.

• Assessments of knowledge being produced from within the learner; communicating that the learner’s personal knowledge and experience is of great worth.

• Assessment is an integral component for all learning at all levels and for all individuals who compose the system; all parts of the system are interconnected and synergistic.

• Critique is perceived as a necessary component of quality for individual and organizational decision-making.

In the educational system of today, accountability is not an option. When

systematic testing outside the classroom emerged, it changed forever the nature of

educational assessment (Brandt, 2000). Assessment was no longer a private tool for the

classroom. It became an instrument of public policy. The politicization of achievement

tests made them, potentially, a means to bring about change. Well-intentioned policy

makers, members of legislatures, and politicians who advocate external accountability

measures as the engine of rapid and cost effective reform believe that positive things will

happen to schools and children as a result.

Authentic assessments showcase demonstrations of strengths, abilities, and

knowledge. Accountability in the future ought to focus on what students actually know

and can do, rather than on how much they know compared to others. Otherwise,

evaluation as school improvement is a tautological phrase (Hopkins, 1989). Different

forms of assessment must be used. Performance assessment needs to play a more

prominent role in large scale assessment. The degree to which an assessment reflects the

instruction should become a major indicator of quality in teaching and learning. In short,


there is a new role on the horizon for assessment, one which overrides other limited goals

such as accountability and classification, as it helps to provide more and better education

for the learner (Brandt, 2000). When student achievement is tied to reform, those school

systems that are being assessed need to be a part of a continuous, empowering process.

Public officials must allow for teachers collectively enhancing their professional efficacy,

“the essential foundation stone for school improvement” (Hopkins, 1989, p.194). In

powerful contrast to traditional external approaches, evaluations with an empowerment

component can serve as catalysts to influence, clarify, expand, and improve more

traditional forms of evaluation. Proven initiatives such as the Manitoba School

Improvement Project, the Improving the Quality of Education for All Project and, now, the

Alberta Initiative for School Improvement, represent a paradigm shift in which

enlightened, willing educators can dedicate themselves to promoting social change,

democratic participation, and shared decision making. Evaluators and other participants

can help foster interdependence and professional growth, and all can contribute to the

cultivation of a community of learners (Fetterman, 2001).

Finally, in a summary of their recent text, Posavac and Carey (1997) reaffirm the

importance of interpersonal relations and the attitude of evaluators as they work with

stakeholders. The authors offer the following statements for the guidance and conduct of

program evaluations:

• Humility won’t hurt.

• Impatience may lead to disappointment.

• Recognize the importance of the evaluator’s perspective.

• Work on practical questions.

• Work on feasible issues.

• Avoid data addiction.

• Make communications accessible.

• Seek evaluation [of the work of the evaluators].

• Encourage the development of a learning culture. (p. 262)

References

Aitken, A., Gunderson, T., & Wittchen, E. (2000). AISI and the superintendent: Opportunities for new relationships. Paper presented at the annual meeting of the Canadian Society for Studies in Education Symposium, Edmonton, Alberta.

Adams, P. (2002). Incentive-based school improvement: Searching for a philosopher’s stone. Unpublished manuscript.

Alberta Learning. (1999a). Alberta Initiative for School Improvement administrative handbook. Alberta: Ministry of Learning.

Alberta Learning. (1999b). Framework for the Alberta Initiative for School Improvement. Alberta: Ministry of Learning.

Alberta Learning. (2000). Alberta Initiative for School Improvement opportunities and challenges. Alberta: Ministry of Learning.

Alberta Teachers’ Association. (1989). School and program evaluation: A manual for teachers. Edmonton, AB: The Alberta Teachers’ Association.

Anderson, S., & Ball, S. (1978). Profession and practice of program evaluation. San Francisco, CA: Jossey-Bass.

Ary, D., Jacobs, L., & Razavieh, A. (1979). Introduction to research in education. New York: Holt, Rinehart and Winston.

Barth, R. (1990). Improving schools from within. San Francisco: Jossey-Bass.

Barth, R. (2001). Learning by heart. San Francisco: Jossey-Bass.

Becker, H. (1970). Sociological work: Method and substance. Chicago: Aldine.

Booi, L., Hyman, C., Thomas, G., & Couture, J-C. (2000). AISI opportunities and challenges from the perspective of the Alberta Teachers’ Association. Paper presented at the annual meeting of the Canadian Society for the Study of Education, Edmonton, Alberta.

Bonß, W., & Hartmann, H. (1985). Konstruierte Gesellschaft rationale Deutung – Zum Wirklichkeitscharakter soziologischer Diskurse. In Bonß, W., & Hartmann, H. (Eds.), Entzauberte Wissenschaft: Zur Realität und Geltung soziologischer Forschung. Göttingen: Schwartz.

Borg, W., & Gall, M. (1983). Educational research. New York: Longman.

Borg, W., & Gall, M. (2003). Educational research: An introduction (5th ed.). New York: Longman.

Brandt, R. (2000). Education in a new era. Alexandria, Virginia: Association for Supervision and Curriculum Development.

Cimbricz, S. (2002). State mandated testing and teachers’ beliefs and practices. Educational Policy Analysis Archives, 10(2).

Conley, D. (1999). Roadmap to restructuring: Charting the course of change in American education. Eugene, Oregon: ERIC Clearinghouse on Educational Management.

Cooley, W. W., & Lohnes, P. R. (1976). Evaluation research in education. New York, NY: Irvington.

Costa, A., & Kallick, B. (Eds.). (1995). Assessment in the learning organization: Shifting the paradigm. Alexandria, Virginia: Association for Supervision and Curriculum Development.

Cronbach, L. (1982). Designing evaluations of educational and social programs. San Francisco, CA: Jossey-Bass.

Covaleskie, J. (2002). Two cheers for standardized testing. International Electronic Journal for Leadership in Learning, 6(2).

David-Evans, M. (2000). AISI opportunities and challenges: The government’s view. Paper presented at the annual meeting of the Canadian Society for Studies in Education Symposium, Edmonton, Alberta.

Denzin, N., & Lincoln, Y. (Eds.). (2000). Handbook of qualitative research (2nd ed.). London, UK: Sage.

DuFour, R., & Eaker, R. (1998). Professional learning communities: Best practices for enhancing student achievement. Virginia: Association for Supervision and Curriculum Development.

Durkheim, E. (1982). The rules of sociological method. London, UK: Macmillan.

Earl, L. (2000). AISI: A bold venture in school reform. Paper presented at the annual meeting of the Canadian Society for Studies in Education Symposium, Edmonton, Alberta.

Eisner, E. (1986). The primacy of experience and the politics of method. Lecture delivered at the University of Oslo, Norway.

Eisner, E. (1997). The promise and perils of alternate forms of data representation. Educational Researcher, 26(6).

Eisner, E., & Peshkin, A. (Eds.). (1990). Qualitative inquiry in education: The continuing debate. New York: Columbia University Teachers College.

Elkind, D. (2001). The cosmopolitan school. Educational Leadership, 58(4).

Elliott, J. (1991). Action research for educational change. Philadelphia: Open University Press.

Fern, E. (1982a). The use of focus groups for idea generation: The effects of group size, acquaintanceship, and moderator on response quality and quantity. Journal of Marketing Research, 19.

Fern, E. (1982b). Why do focus groups work: A review and integration of small group process theories. Advances in Consumer Research, 8.

Fern, E. (1983). Focus groups: A review of some contradictory evidence, implications, and suggestions for future research. In Bagozzi, R., & Tybout, A. (Eds.), Advances in Consumer Research. Reading: Addison-Wesley.

Fern, E. (2001). Advanced focus group research. London: Sage.

Fetterman, D. (2001). Foundations of empowerment evaluation. Thousand Oaks, California: Sage.

Flick, U. (2002). An introduction to qualitative research. London, UK: Sage.

Foucault, M. (1980). Truth and power. In Gordon, C. (Ed.), Power/knowledge: Selected interviews and other writings by Michel Foucault. New York: Pantheon.

Fullan, M. (2001). The new meaning of educational change. New York: Teachers College Press.

Gardner, M. (1993). The great Samoan hoax. Skeptical Inquirer, 17(2).

Gay, L., & Airasian, P. (2003). Educational research: Competencies for analysis and applications. New Jersey: Pearson.

Geertz, C. (1973). The interpretation of cultures. London: Fontana.

Glaser, B., & Strauss, A. (1967). The discovery of grounded theory: Strategies for qualitative research. New York: Aldine.

Glesne, C. (1999). Becoming qualitative researchers. New York: Longman.

Gredler, M. E. (1996). Program evaluation. Englewood Cliffs, NJ: Prentice-Hall.

Harris, A., & Young, J. (1999). Comparing school improvement programs in the United Kingdom and Canada: Lessons learned. Retrieved November 11, 2002, from www1.worldbank.org/education/est/resources/case%20studies/UK&Can-SchoolImp.doc

Hertling, E. (2000). Evaluating the results of whole-school reform (Report No. EDO-EA-00-06). ERIC Digest Number 140. Eugene, OR: ERIC Clearinghouse on Educational Management. (ERIC Document Reproduction Service No. ED446345)

Hoffer, E. (1972). Reflections on the human condition. New York: Harper-Collins. As cited in DuFour, R. (1998). Professional learning communities at work. Virginia: Association for Supervision and Curriculum Development.

Hopkins, D. (1989). Evaluation for school development. Buckingham, England: Open University Press.

James, W. (1925). Talks to teachers on psychology, and to students on some of life’s ideals. London, UK: Longman.

Kahneman, D., Slovic, P., & Tversky, A. (Eds.). (1997). Judgment under uncertainty: Heuristics and biases. New York: Cambridge University Press.

Kerlinger, F. (1977). The influence of research on educational practice. Educational Researcher, 6(6).

Kohn, A. (1999). The schools our children deserve. New York: Houghton Mifflin.

Lee, T. (1999). Using qualitative methods in organizational research. Thousand Oaks: Sage.

Mason, J. (2002). Qualitative researching. Thousand Oaks: Sage.

Madaus, G., Scriven, M., & Stufflebeam, D. (1984). Evaluation models: Viewpoints on educational and human services evaluation. Boston, MA: Kluwer.

Manzer, R. (1994). Public schools and political ideas: Canadian educational policy in historical perspective. Toronto: University of Toronto Press.

Marzano, R., Pickering, D., & McTighe, J. (1993). Assessing student outcomes: Performance assessment using the dimensions of learning model. Alexandria, Virginia: Association for Supervision and Curriculum Development.

Maturana, H. (1991). Science and daily life: The ontology of scientific explanations. In Steier (Ed.), Research and reflexivity. London, UK: Sage.

Moerman, M. (1974). Accomplishing ethnicity. In Turner, R. (Ed.), Ethnomethodology. Harmondsworth: Penguin.

Neuman, L. (1997). Social research methods: Qualitative and quantitative approaches. Boston: Allyn and Bacon.

Morgan, D. (1997). Focus groups as qualitative research. Thousand Oaks: Sage.

Patton, M. (2002). Qualitative research and evaluation methods (3rd ed.). Thousand Oaks: Sage.

Popham, J. (1988). Educational evaluation. Englewood Cliffs, NJ: Prentice Hall.

Popham, J. (1999). Why standardized tests don’t measure educational quality. Educational Leadership, 56(6).

Popkewitz, T., & Brennan, M. (Eds.). (1998). Foucault’s challenge: Discourse, knowledge, and power in education. New York: Teachers College Press.

Posavac, E., & Carey, R. (1997). Program evaluation: Methods and case studies. New Jersey: Prentice Hall.

Radwanski, G. (1987). Ontario study of the relevance of education and the issue of dropouts. Toronto: Ministry of Education.

Rapple, B. (1994). Payment by results: An example of assessment in elementary education from nineteenth century Britain. Education Policy Analysis Archives, 2(1).

Reason, P., & Marshall, J. (1987). Research as personal process. In Griffin, V., & Boud, D. (Eds.), Appreciating adults’ learning. Center for the Study of Organizational Change and Development: University of Bath Press.

Reichardt, C., & Rallis, S. (1994). The relationship between the qualitative and quantitative research traditions. In Reichardt, C., & Rallis, S. (Eds.), The qualitative-quantitative debate: New perspectives. San Francisco: Jossey-Bass.

Rebien, C. (1997). Development assistance evaluation and the foundations of program evaluation. Evaluation Review, 21(4).

Rogers, T. (2000). Potential and challenges of the Alberta Initiative for School Improvement. Paper presented at the annual meeting of the Canadian Society for the Study of Education, Edmonton, Alberta.

Sanders, J. (2000). The program evaluation standards: How to assess evaluations of educational programs. Thousand Oaks, CA: Corwin.

Schmoker, M. (2000). The results we want. Educational Leadership, 57(5).

Scott, D., & Usher, R. (2002). Understanding educational research. New York: Routledge.

Scriven, M. (1993). Hard-won lessons in program evaluation: New directions for program evaluation. Tennessee: Jossey-Bass.

Shadish, W., Cook, T., & Leviton, L. (1991). Foundations of program evaluation: Theories of practice. Newbury Park, CA: Sage.

Silverman, D. (2000). Research and social theory. In Seale, C. (Ed.), Researching society and culture. London, UK: Sage.

Smith, M., Stevenson, D., & Li, C. (1998). Voluntary national tests would improve education. Educational Leadership, 55(6).

Somekh, B. (2001). The role of evaluation in ensuring excellence in communications and information technology initiatives. Education, Communication and Information, 1(1), 75-98.

Stein, J. (2002). The cult of efficiency. Toronto: Anansi.

Texas Education Agency. (1994). Accountability manual: The 1994-1995 accountability rating system for Texas public schools and school districts. Austin: Office of Policy Planning and Evaluation.

Tyler, R., Gagne, M., & Scriven, M. (1967). Perspectives of curriculum evaluation. Chicago: Rand McNally & Company.

Tulenko, J. (2001). Frontline: An interview with James Popham. Retrieved December 10, 2002, from www.pbs.org/wg...ows/schools/interviews/popham.html

Weber, M. (1919). Wissenschaft als Beruf. In Winkelmann, J. (Ed.). (1988). Max Weber: Gesammelte Aufsätze zur Wissenschaftslehre. Tübingen: Mohr.

Wengraf, T. (2001). Qualitative research interviewing. London: Sage.

Wholey, J., Duffy, H., Fukumoto, J., Scanlon, J. W., Berlin, M., Copeland, W., & Zelinsky, J. (1972). Proper organizational relationships. In C. H. Weiss (Ed.), Evaluating action programs: Readings in social action and education. Boston: Allyn & Bacon.

Williams, A., & Katz, L. (2001). The use of focus group methodology in education: Some theoretical and practical considerations. International Electronic Journal for Leadership in Learning, 5(3). Retrieved from http://www.ucalgary.ca/~iejll.

Wolcott, H. (1992). Posturing in qualitative inquiry. In LeCompte, M., Millroy, W., & Preissle, J. (Eds.), The handbook of qualitative research in education. Toronto: Academic Press.

Woods, P. (1986). Inside schools: Ethnography in educational research. London, UK: Routledge and Kegan Paul.

Worthen, R., Sanders, J., & Fitzpatrick, J. (1997). Program evaluation: Alternative approaches and practical guidelines. White Plains, New York: Longman.


Yutang, L. (1937). The importance of living. New York: William Morrow.


Appendix A: An Analysis of Values-Orientation Study Types (True Evaluation). From Madaus, Scriven, & Stufflebeam (1984).

Definition: studies that are designed primarily to assess some object’s worth.

Study types: accreditation/certification; policy studies; decision-oriented studies; consumer-oriented studies; client-centered studies; connoisseur-based studies.

Accreditation/Certification. Advance organizers: accreditation/certification guidelines. Purpose: to determine whether institutions, programs, and personnel should be approved to perform specified functions. Source of questions: accrediting/certifying agencies. Main questions: Are institutions, programs, and personnel meeting minimum standards, and how can they be improved? Typical methods: self-study and visits by expert panels to assess performance in relation to specified guidelines. Pioneer: College Entrance Examination Board (1901). Developers: Cooperative Study of Secondary School Standards (1933).

Policy Studies. Advance organizers: policy issues. Purpose: to identify and assess the potential costs and benefits of competing policies for a given institution or society. Source of questions: legislators, policy boards, and special interest groups. Main questions: Which of two or more competing policies will maximize the achievement of valued outcomes at a reasonable cost? Typical methods: Delphi, experimental and quasi-experimental design, scenarios, forecasting, and judicial proceedings. Pioneer: Rice. Developers: Coleman, Jenks, Clarke, Owens, Wolf.

Decision-oriented studies. Advance organizers: decision situations. Purpose: to provide a knowledge and value base for making and defending decisions. Source of questions: decision makers (administrators, parents, students, teachers), their constituents, and evaluators. Main questions: How should a given enterprise be planned, executed, and recycled in order to foster human growth and development at a reasonable cost? Typical methods: surveys, needs assessments, case studies, advocate teams, observation, and quasi-experimental and experimental design. Pioneers: Cronbach, Stufflebeam. Developers: Alkin, Ashburn, Brickell, Estes, Guba, Merriman, Ott, Reinhard.

Consumer-oriented studies. Advance organizers: societal values and needs. Purpose: to judge the relative merits of alternative educational goods and services. Source of questions: society at large, consumers, and the evaluator. Main questions: Which of several alternative consumable objects is the best buy, given their costs, the needs of the consumers, and the values of society at large? Typical methods: checklists, needs assessment, goal-free evaluation, experimental and quasi-experimental design, modus operandi analysis, and cost analysis. Pioneer: Scriven. Developer: Glass.

Client-centered studies. Advance organizers: localized concerns and issues. Purpose: to foster understanding of activities and how they are valued in a given setting and from a variety of perspectives. Source of questions: community and practitioner groups in local environments and educational experts. Main questions: What is the history and status of a program, and how is it judged by those who are involved with it and those who have expertise in program areas? Typical methods: case study, adversary reports, sociodrama, responsive evaluation. Pioneer: Stake. Developers: McDonald, Rippey, and Guba.

Connoisseur-based studies. Advance organizers: evaluators’ expertise and sensitivities. Purpose: to critically describe, appraise, and illuminate an object. Source of questions: critics and authorities. Main questions: What merits and demerits distinguish an object from others of the same general kind? Typical methods: systematic use of refined perceptual sensitivities and various ways of conveying meaning and feelings. Pioneer: Eisner. Developers: Guba, Sanders.