The Development of Educational Evaluation
Extract from my book:
Assessment of the Sudan School Certificate English Examinations (http://www.lulu.com/content/hardcover-book/assessment-of-the-sudan-school- certificate-English examinations/7583003)
We can begin this chapter with a quotation from Sax (1980:5), who tells us that our
ancestors used principles of measurement to build shelters and tools, select a mate,
kill prey, and fashion clothing long before the advent of educational and
psychological tests. The Ancient Egyptians devised sophisticated, complex
measurements to construct the pyramids. By 2700 B.C. they had mastered geometrical
concepts and were able to use measurements in three dimensions. The Bible also
attests to the early use of measurement: Noah was commanded to build an ark 300
cubits long, 50 cubits wide, and 30 cubits high (Genesis 6:15). We can trace the
historical perspective of educational assessment and evaluation with another
quotation, this one from Horace Mann, the famous American lawyer who lived in the
19th century and who abandoned law for education: "We venture to predict that the
mode of examination, by printed questions and written answers, will constitute a new
era in the history of our schools" (Ebel 1972:5). So Mann can be considered the first
educator who called for test authentication.
Tests and measurements of one kind or another have played a bigger role in human
history than is generally recognized. In fact, among the earliest records of the use of
various testing devices are those found in the Bible, although they generally have no
direct reference to education. One illustration will suffice. Ross (1963:27) narrated
the biblical story: "And the Gileadites took the passages of Jordan before the
Ephraimites: and it was so, that when those Ephraimites which were escaped said, Let
me go over; that the men of Gilead said unto him, Art thou an Ephraimite? If he said,
Nay; then said they unto him, Say now Shibboleth: and he said Sibboleth: for he could
not frame to pronounce it right. Then they took him, and slew him at the passages of
Jordan: and there fell at that time of the Ephraimites forty and two thousand"
(Judges 12:5-6).
McNamara (2000:68) and Hughes (1995:37) quoted the same story, and the latter
commented that, in general, the more important the decision based on a test, the longer
the test should be. Jephthah used the pronunciation of the word 'shibboleth' as a test to
distinguish his own men from the Ephraimites, who could not pronounce the (sh). Those
who failed the test were executed. Any of Jephthah's own men killed in error might
have wished for a longer, more reliable test. The development of measurement can be
traced from 257 B.C. in China, where an extensive system of written examinations of
educational achievement formed the basis for admission and promotion in the civil
service of ancient China. The system persisted until this century and was at least partly
responsible for maintaining the internal stability of that society, and its relatively high
level of culture, for over two millennia. It provided an alternative to the more usual
organization of ancient society, in which power and privilege were hereditary
prerogatives (Ebel 1972:3). When universities were established in Europe in the
Renaissance, examinations were largely oral and frequently took the form of public
disputation on controversial questions. The Society of Jesus, founded in 1540,
placed a high value on education and scholarship. Departing from the popular practice
of the times, the Jesuits insisted on the use of written examinations. In 1599 the Society
issued a comprehensive statement of the theory and practice of instruction. The
statement included a detailed set of rules for the conduct of written school
examinations which, apart from the fact that it is in Latin, could be used in an
examination room today.
Then in 1836, as a result of competition and friction between universities, the
University of London was chartered to serve primarily as an examining and
degree-certifying authority (Ross 1963:29). Then in 1845, Horace Mann, who was a
towering figure in the development of public education in the United States and who
took his responsibilities as Secretary of the Massachusetts Board of Education very
seriously, became involved in a controversy with the Boston schoolmasters over the
effectiveness of some of their methods. Mann felt a special need for more adequate,
more objective evidence of pupils' achievement than oral examinations could provide,
and held that written examinations would have a number of advantages. He shared this
idea with another American who realized both the value and the limitations of
examinations, Emerson E. White, who wrote in 1886: 'It may be stated as a general fact
that school instruction and study are never much wider or better than the tests by which
they are measured.' He enumerated several special advantages of the written test
(Ross 1963:29).
Intelligence testing was started in America by W. Stanley Jevons in 1874, and in 1879
Wilhelm Wundt in Germany began to do experimental psychological studies. In
1912 another important idea, suggested by Stern, was that of representing
intelligence as the ratio of mental age to chronological age. This concept, for which
Stern suggested the term "mental quotient", was later adopted by Terman as the
familiar Intelligence Quotient (IQ).
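The ratio described above can be made concrete with a short sketch. It assumes the conventional scaling in which the Intelligence Quotient is the mental quotient multiplied by 100. The function names and the sample ages are illustrative only and are not taken from the sources cited.

```python
def mental_quotient(mental_age: float, chronological_age: float) -> float:
    """Ratio of mental age to chronological age (Stern's 'mental quotient')."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age


def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: the mental quotient multiplied by 100 (assumed conventional scaling)."""
    return 100 * mental_quotient(mental_age, chronological_age)


# Hypothetical child: mental age 10 years, chronological age 8 years.
print(ratio_iq(10, 8))  # 125.0
```

A child whose mental age equals his chronological age scores exactly 100 on this scale, which is why 100 serves as the reference point.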
The distinctive contribution of the English to the measurement of intelligence has
been that of statistical methods as a tool for the analysis of test results. Sir Francis
Galton in 1883 outlined a method of studying free association by quantitative methods.
But his most notable contribution was in statistical analysis, where he suggested,
among other things, a graphical method of representing correlations, as cited from
Ryan in Ross (1963:31). These ideas were developed by his pupils, such as Karl
Pearson and Charles E. Spearman. Spearman developed his well-known two-factor
theory of intelligence on the basis of statistical analysis. Cyril Burt, who had been a
leader in introducing and adapting Binet's work in Great Britain, was in 1913
officially appointed school psychologist, possibly the first person to occupy that
position.
The French have long been leaders in abnormal psychology. This brings us to the
most important name in the history of intelligence testing, Alfred Binet. He
contributed a technique of scale construction and another consisting of test
situations selected according to predetermined criteria and standardized. The date 1905
is important, therefore, because it marked the appearance of the first scale for the
measurement of intelligence, which, crude as it was, has served as the pattern for
subsequent tests and scales the world over. Earlier, in 1890, the concept of mental tests
had taken shape in the precise measurement of certain sensory, motor and basic mental
faculties such as visual and auditory sensitivity. This was first done by James McKeen
Cattell, who thought there should be a direct relation between a person's elemental
processes and his ability to use higher mental processes such as reasoning, critical
thinking, and creative imagination.
Achievement Tests: What is an achievement test? It is an ability test designed to
appraise what the individual has learned to do as a result of planned previous
experience or training, often that provided in school (Thorndike 1969:643). Another
definition is that an achievement test is an objective examination that measures
educationally relevant skills or knowledge about such subjects as reading, spelling, or
mathematics (www.eric_digest/index:2003).
The first textbook in educational measurement, by Dr. Edward L. Thorndike, appeared
in 1903. "Learning", as Thorndike believed, does not consist in the cultivation of
faculties such as memory, willpower, reasoning or imagination. It consists in the
formation of vast numbers of specific connections, the strength of which is governed
by the laws of readiness, exercise and effect (Ebel 1972:11). Thorndike was the father
of educational measurement, as Ayres put it in Ross (1963:39). Thorndike
was the one who made the distinction between the 'inventor' and the 'father' of the
movement. Although Thorndike's publications on statistical methods were influential
in education, he was not responsible for the early standard tests. The first test was the
Stone Arithmetic Test, which was published in 1908, and the first scale was the
Thorndike Handwriting Scale, which was announced in 1909 and published in 1910.
The idea of the standardized test made educators discover for the first time just how bad
the existing measurements were (Ross 1963:39). Beginning in 1910, several studies
of the unreliability of examinations were carried out. A distinction should be made
between the limitations of school marks and the limitations of school
examinations. The need for reform in the college marking of examinations was
forcibly brought to public attention by the research of Max Meyer, as cited in
Ross (1963:39), who reported on marks collected from forty instructors over a
period of five years at the University of Missouri and found astonishing variations.
Franklin W. Johnson (1911) found a similar condition in the
University of Chicago High School in his 'Study of High-School Grades'.
In 1918 Thorndike published what has proved to be probably the most influential
paper that has ever appeared on educational measurement. It began with the well-known
dictum "whatever exists at all exists in some amount." And as he looked to the
future, Thorndike saw it conditioned by a series of 'ifs': "if those who object to
quantitative thinking in education will set themselves to work to understand it; if those
who criticize its presuppositions and methods will do actual experimental work to
improve its general logic and detailed procedures; if those who are now at work in
devising and in using means of measurement will continue their work, the next
decades will bring sure gain in both theory and practice". This statement by
Thorndike was followed by many developments in the field of measurement.
Objective Test: The decades between the two wars were years of rapid development
in the techniques and uses of educational measurement. There were many figures,
such as A. McCall, who seems to have been the first to suggest the objective test
(Ross 1963:47). There were also Edward K. Strong, and Carl Brigham, who developed
for the College Entrance Examination Board an objective test of general verbal and
quantitative skills, called the Scholastic Aptitude Test, to supplement the essay tests of
subject-matter achievement. The year 1912 witnessed the first attempt to measure
character, by a test designed by G. G. Fernald, followed by Voelker in 1921, who
devised some actual test situations for measuring character.
By far the most ambitious attempt so far made is that of the Character Education Inquiry
under the direction of Hartshorne and May, which extended over five years from 1924
to 1929. Their method was to select representative and varied life situations which would
afford a valid index of the totality of the character of the individual (Ebel 1972:15). In
the mid-1930s a small group of directors of state testing programs began meeting
each fall in New York City following the annual conference of the Educational
Records Bureau and the Cooperative Test Service. These meetings evolved into an
annual invitational conference on testing problems. Measurement specialists present
papers on topics of current interest, which are then published in a book of proceedings
(ibid.).
Achievement Tests: In 1904 appeared E. L. Thorndike's 'An Introduction to the
Theory of Mental and Social Measurements', which made available for the first time
to American students the statistical techniques necessary for educational research and
measurement. In 1914, Truman Kelley's 'Educational Guidance' introduced education
workers to the alluring possibilities of partial and multiple correlation. In 1918
appeared 'The Seventeenth Yearbook of the National Society for the Study of
Education', in which an essay by Thorndike contained the famous quotation "whatever
exists at all exists in some amount."
The year 1927 witnessed the publication of Thorndike's 'The Measurement of
Intelligence' and C. E. Spearman's 'The Abilities of Man', which represented a distinct
point of view. In 1930 a collection of books appeared, the most important of which was
the 'Bibliography of Mental Tests and Rating Scales' by Oscar Buros. It and the
'Personality Tests' of 1933, 1934 and 1935, 1949 and 1950 were milestones in the history
of educational measurement, as Ross put it. In 1947 appeared Lee J. Cronbach's
'Essentials of Psychological Testing' and Robert L. Thorndike's 'Personnel Selection:
Test and Measurement Techniques'. In 1949 appeared Frank S. Freeman's 'Theory and
Practice of Psychological Testing', followed by E. F. Lindquist's 'Educational
Measurement' in 1950. In 1956 appeared the 'Taxonomy of Educational Objectives',
which marked the beginning of a new era in educational measurement.
The Taxonomy of Educational Objectives: In 1956 Benjamin S. Bloom and others
initiated and developed the concept of the taxonomy of educational objectives in their
book 'Taxonomy of Educational Objectives', first published in that year by the
David McKay Company in New York. Three types of objectives were identified:
cognitive, affective and psychomotor. Because the cognitive taxonomy has
become especially well known and has had considerable impact in stimulating the
development of tests that measure more than knowledge, the researcher will
summarize its two important books, 'Taxonomy of Educational Objectives, Book 1:
Cognitive Domain' and 'Book 2: Affective Domain', edited by Benjamin S. Bloom
(et al.). The first book was issued in 1956, and the edition at hand is that of 1979.
The second book was issued in 1964, and we are dealing with the 1971 edition
published by the David McKay Company, New York.
Educational Objectives in Pupils' Evaluation: Efforts to translate the needs of
young people into educational objectives are typically directed towards producing
precise statements that identify the observable changes in pupils' behaviour that
should take place if the learning experiences are successful. Unquestionably such
objectives are excellent starting points for the development of the curriculum, the
planning of teaching strategies and the construction of tests. Eisner, in Ahmann (1975),
identifies two types of objectives, namely instructional and expressive. The former
specify unambiguously the particular pupil behaviour to be acquired as a result of
learning. The latter do not.
The search for a precise definition of educational objectives and goals has been, and
still is, a major field of research. The efforts of researchers have yielded very good
attempts such as 'The Taxonomy of Educational Objectives' by Benjamin S.
Bloom and others. This attempt has contributed widely to the development of
education since it was first published in 1956. It has contributed to the development of
curriculum design and teaching methods as well as student measurement and
evaluation. On the advice of the supervisor, and owing to the importance of
this 'Taxonomy', the researcher will make a brief summary of it, as this
would help many other researchers to make use of this attempt in their future studies.
The Three Domains of the Taxonomy:
Cognitive Objectives: emphasize remembering or reproducing something which has
presumably been learned, as well as objectives which involve the solving of
intellective tasks, for which the individual has to determine the essential problem and
then reorder given material or combine it with ideas, methods, or procedures
previously learned.
Affective Objectives: emphasize a feeling tone, an emotion or a degree of acceptance
or rejection. Affective objectives vary from simple attention to selected phenomena to
complex but internally consistent qualities of character and conscience. A large number
of such objectives are found in the literature expressed as interests, attitudes,
appreciations, values and emotional sets or biases.
Psychomotor Objectives: emphasize some muscular or motor skill, some
manipulation of material and objects or some act which requires neuromuscular
co-ordination. When found, they were mostly related to handwriting, speech and to
physical education, trade, and technical courses.
The Taxonomy as a Classifying Device: The major task in setting up any kind of
taxonomy is that of selecting appropriate symbols, giving them precise and usable
definitions, and securing the consensus of the group which is to use them. Similarly,
developing a classification of educational objectives requires the selection of an
appropriate list of symbols to represent all the major types of educational outcomes.
Next, there is the task of defining these symbols with sufficient precision to permit
and facilitate communication about these phenomena among teachers, administrators,
curriculum workers, testers, educational research workers and others who are likely to
use the taxonomy. Finally, there is the task of trying out the classification and securing
the consensus of the educational workers who wish to use the taxonomy.
The Cognitive Domain
It is classified into six classes: knowledge, and five classes of intellectual abilities and skills.
Knowledge: it involves the recall of specifics and universals, the recall of methods and
processes, or the recall of a pattern, structure or setting.
Knowledge of specifics: The recall of specific and isolable bits of information. The emphasis is on symbols with concrete referents. This material, which is at
a very low level of abstraction, may be thought of as the elements from which more
complex and abstract forms of knowledge are built.
Knowledge of ways and means of dealing with specifics: Knowledge of the ways of organizing, studying, judging, and criticizing. This includes the methods
of inquiry, the chronological sequences, and the standards of judgment within a
field, as well as the patterns of organization through which the areas of the fields
themselves are determined.
Knowledge of the universals and abstractions in a field: Knowledge of the major schemes and patterns by which phenomena and ideas are organized. These are
the large structures, theories and generalizations which dominate a subject field or
which are quite generally used in studying phenomena or solving problems. These
are at the highest levels of abstraction and complexity.
Intellectual Abilities and Skills: Abilities and skills refer to organized modes of
operation and generalized techniques for dealing with materials and problems. The
materials and problems may be of such a nature that little or no specialized and technical
information is required. Such information as is required can be assumed to be part of the
individual's general fund of knowledge. Other problems may require specialized and
technical information at a rather high level, such that specific knowledge and skill in
dealing with the problem and the materials are required. The abilities and skills objectives
emphasize the mental processes of organizing and reorganizing material to achieve a
particular purpose. The materials may be given or remembered. The abilities are
classified under five classes, as follows:
1. Comprehension: This represents the lowest level of understanding. It refers to a
type of understanding or apprehension such that the individual knows what is
being communicated and can make use of the material or idea being communicated
without necessarily relating it to other material or seeing its fullest implications.
Comprehension consists of three subclasses: translation, interpretation and
extrapolation.
2. Application: The use of abstractions in particular and concrete situations. The
abstractions may be in the form of general ideas, rules of procedure, or generalized
methods. The abstractions may also be technical principles, ideas, and theories which
must be remembered and applied.
3. Analysis: The breakdown of a communication into its constituent elements or parts
such that the relative hierarchy of ideas is made clear and/or the relations
between the ideas expressed are made explicit. Such analyses are intended to clarify
the communication, to indicate how the communication is organized, and the way in
which it manages to convey its effects, as well as its basis and arrangement. It
includes three subclasses: analysis of elements, analysis of relationships, and analysis of
organizational principles.
4. Synthesis: The putting together of elements and parts so as to form a whole. This
involves the process of working with pieces, parts, elements, etc., and arranging and
combining them in such a way as to constitute a pattern or structure not clearly there
before. It has three subclasses: production of a unique communication,
production of a plan or proposed set of operations, and derivation of a set of abstract
relations.
5. Evaluation: Judgments about the value of material and methods for given
purposes. Quantitative and qualitative judgments about the extent to which material
and methods satisfy criteria. Use of a standard of appraisal. The criteria may be those
determined by the student or those which are given to him. Evaluation includes two
subclasses: judgment in terms of internal evidence, and judgment in terms of external
criteria, which means evaluation of material with reference to selected or remembered criteria.
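The six cognitive classes described above (knowledge plus the five abilities and skills) form an ordered hierarchy, and that ordering can be expressed as a simple data structure. The following is only a minimal sketch: the class name, the helper function and the sample test items are hypothetical illustrations, not part of the Taxonomy itself.

```python
from enum import IntEnum


class CognitiveLevel(IntEnum):
    """The six classes of the cognitive domain, ordered from lowest to highest."""
    KNOWLEDGE = 1
    COMPREHENSION = 2
    APPLICATION = 3
    ANALYSIS = 4
    SYNTHESIS = 5
    EVALUATION = 6


def highest_level(levels):
    """Return the highest cognitive class among the given levels."""
    return max(levels)


# Hypothetical test items, each tagged with the class it is meant to measure.
items = {
    "Recall the date of the first intelligence scale": CognitiveLevel.KNOWLEDGE,
    "Apply a rule to a new problem": CognitiveLevel.APPLICATION,
    "Judge an essay against given criteria": CognitiveLevel.EVALUATION,
}

# List the items from the lowest class to the highest.
for text, level in sorted(items.items(), key=lambda pair: pair[1]):
    print(level.value, level.name.title(), "-", text)
```

Tagging items this way lets a test constructor check at a glance whether a paper measures more than the knowledge class.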
Classes of the Affective Domain:
1. Receiving (attending): Here we are concerned that the learner be sensitized to the
existence of certain phenomena and stimuli and that he be willing to receive or attend
to them. It is categorized into three subclasses: awareness, willingness to receive, and
controlled or selected attention.
2. Responding: The term is used to indicate the desire that a child become sufficiently
involved in or committed to a subject, phenomenon, or activity that he will seek it out
and gain satisfaction from working with it or engaging in it. It has three
sub-categories: acquiescence in responding, willingness to respond and satisfaction in
response.
3. Valuing: At this level we are not concerned with the relationships among values but
rather with the internalization of a set of specified, ideal values. This category will be
found appropriate for many objectives that use the term "attitude" as well as "value".
It has three sub-categories: acceptance of a value, preference for a value and
commitment.
4. Organization: This category is intended as the proper classification for
objectives which describe the beginnings of the building of a value system. It is
subdivided into two levels: conceptualization of a value and organization of a value system.
5. Characterization by a value or value complex: At this level of internalization the
values already have a place in the individual's value hierarchy, are organized into some
kind of internally consistent system, and have controlled the behaviour of the individual
for a sufficient time that he has adapted to behaving this way; an evocation of the
behaviour no longer arouses emotion or affect except when the individual is
threatened or challenged. It has two sub-categories: generalized set and
characterization. For further details see "The Taxonomy of Educational
Objectives" by Benjamin S. Bloom (196:176-193).
The Value of the Taxonomy in Education: The impact of the taxonomies on
educational thinking is indeed powerful. They assist appreciably in
the task of helping teachers and other educational specialists discuss their curricular
and evaluation problems with greater precision. The three taxonomies represent the
total framework of educational objectives for all types of educational institutions
(Ahmann 1975:4). As stated by Metzger in Orlich (1998:88), the taxonomies have been
used in general curriculum design (Pratt 1994), to provide stimulating
experiences for preschoolers through technology (Morgan 1996), and in test
construction (McLaughlin and Philips 1991); Marks, Vitek and Allen used
the taxonomies to relate data obtained from a satellite remote sensing exercise.
Perhaps the taxonomy's greatest contribution has been in the development of a
professional language. Teachers and administrators who describe and analyze
instruction know that terms such as knowledge level and higher levels of learning
will be understood by educators everywhere. This universal vocabulary, reflecting a
specialized body of knowledge, was an essential step in the professionalizing of
teaching (ibid. 1998:89).
Psychological Evidence Supporting Bloom's Taxonomy: Does Bloom's Taxonomy
make sense psychologically? Evidence from a number of sources supports the idea
that an increased level of processing means better student learning. Perhaps the most
fundamental piece of evidence is that teacher actions influence academic tasks,
which ultimately influence learning: Doyle (1983), Nickerson (1985),
Wakefield (1996).
The Outcomes of Taxonomies on American Education: The taxonomy has played
a major role in shaping American education. It has helped in designing the curriculum,
developing teaching methodology, and assessing students' performance, and it has
helped research workers to do their job in the best professional
way. As stated in Orlich (1998:73-77), one recent important attempt to set goals for
education was the National Education Goals for the Year 2000, the product of an
education summit called by President George Bush in October 1989. All of the
nation's governors attended that historic meeting (including Bill Clinton, who was
then governor of Arkansas).
New Models for the Cognitive Taxonomy: Bloom's taxonomy has provided a
number of useful insights about teaching and learning in the classroom since 1956.
Raymond Nickerson (1985), as mentioned in Orlich (1998:89), wrote a paper,
"Understanding Understanding", that triggered a serious re-examination of the nature
of understanding and its role in Bloom's taxonomy. A close look at the
traditional analogues of the cognitive taxonomy will explain to us the interaction of
all elements of the cognitive domain. We can see the analogues of the taxonomy in
these shapes.
(Fig.1) Taxonomy as a staircase, rising from knowledge through comprehension, application, analysis and synthesis to evaluation
(Fig.2) Taxonomy as a ladder, rising from Knowledge through Comprehension, Application, Analysis and Synthesis to Evaluation
However, the researcher believes that the taxonomy has made it, and still makes it,
easier for most people in educational fields to classify, identify and apply educational
goals clearly and objectively. But only a few have dealt to some extent with the taxonomy
in one phase or another. We can mention here some of the Sudanese scholars who have
treated this part of the taxonomy, e.g. Mohammed Abdul Al-Fatah Shaheen (1983),
Abdel-Rahiem Ahmed Salim (1985), Mohammed Hassan Sinadah (1986) and Farouq
Mohammed Ahmed A/Asalam, who advised and provided the researcher with
valuable authorities and references in the field of evaluation and measurement, and
who took on the burden of supervising this research.
It is worth mentioning that Bloom did not stand alone in the field of taxonomies.
There are other contributions as well: David R. Krathwohl collaborated with
Bloom and others (1964) in "Taxonomy of Educational Objectives: The Classification
of Educational Goals, Handbook 2: The Affective Domain". There is also
Raymond Nickerson (1985), as mentioned in Orlich (1998:89). Arnold B. Arons
(1988) also examined concepts similar to Nickerson's, and his studies created more
speculation about the role comprehension plays in learning. Many other researchers
contributed pieces to this dilemma, such as Wittrock (1986), Jones (1985), Ennis
(1985), Beyer (1988), Whimbey (1984), and Haller, Child and Walberg (1988); see
Orlich (1998:88). Gilbert Sax (1980:) also pointed to a taxonomy prepared by Anita
Harrow (1972), who wrote about the psychomotor domain.
Criticisms of the Taxonomies: Despite its widespread acceptance and use, Bloom's
taxonomy has raised some persistent questions. One question is its
incomprehensiveness: some critics, such as Furst (1994) in Orlich (1991:89), see
the taxonomy as too narrow and as not including all the
important outcomes taught in our schools. Others raised questions about the sequence
of the levels, and whether the levels in
the hierarchy are discrete or overlapping. For some purposes, the taxonomies are not
sufficiently precise, as they have been used in general curriculum design (Ahmann
1975:34).