REVISION OF THE LOGICAL REASONING SUBTEST
OF THE CALIFORNIA TEST OF MENTAL MATURITY
THESIS
Presented to the Graduate Council of the
North Texas State University in Partial
Fulfillment of the Requirements
For the Degree of
MASTER OF SCIENCE
By
Patrice M. Ryan, B. A.
Denton, Texas
December, 1986
Ryan, Patrice M., Revision of the Logical Reasoning
Subtest of the California Test of Mental Maturity.
Master of Science (Industrial/Organizational Psychology),
December, 1986, 43 pp., 4 tables, references, 23 titles.
The purpose of the study was to develop a revision of
the logical reasoning section of the California Test of
Mental Maturity which increases its discriminative ability
while maintaining an acceptable measure of reliability.
Subjects were 102 students of general psychology classes
at North Texas State University. All were administered
the Logical Reasoning section of the California Test of
Mental Maturity in its original form and an experimental
revision of it (LRTR). The Wesman Personnel Classifica-
tion Test was administered at the same time to demonstrate
the tests' construct validity. Pearson product-moment
correlations, item and homogeneity analyses were run.
Results indicated that the revised test correlated sig-
nificantly with the original test and the WPCT. Internal
validity of the revised test was satisfactory, showing an
improvement over the original test in terms of clarity,
reliability and homogeneity.
TABLE OF CONTENTS

LIST OF TABLES

REVISION OF THE LOGICAL REASONING SUBTEST OF THE
CALIFORNIA TEST OF MENTAL MATURITY

    Introduction
    Method
        Subjects
        Materials
        Procedure
    Results
    Discussion

Appendix
References
LIST OF TABLES

Table
1. Descriptive Statistics
2. Correlation Matrix
3. Mean Difficulty Scores, Logical Reasoning Test Revision
4. Summary of Mean Difficulty Scores Data
REVISION OF THE LOGICAL REASONING SUBTEST
OF THE CALIFORNIA TEST OF MENTAL MATURITY
The importance of the use of standardized, valid, and
reliable tests has long been recognized in the psychologi-
cal community. Within the psychology field, tests have
been used for diagnostic purposes; for educational assess-
ment and placement; for career and personal growth and
development; and for personnel selection, evaluation and
placement. Cascio (1978) discussed the role of psychologi-
cal assessment in the area of personnel selection. He
described it as an important tool to be used in the
decision-making process, and placed the measurement of
individual differences at the heart of personnel selection.
Dunnette (1962) described an ideal world where the aim of
those involved in personnel selection would be:
To place all persons on jobs perfectly suited to
them and to society. This aim assumes that each
person should use his abilities, temperament, and
motivations in the best possible way for him; it
also assumes that society will make the best possible
use of its total manpower resources. (p. 2)
In order to make the most effective decisions regard-
ing employment and placement within an organization, an
adequate assessment of an individual's knowledge, skills
and abilities is essential. This assessment must be done in
a consistent, systematic and objective manner in order to
offer the maximum degree of fairness to all candidates under
consideration for a position.
In 1972 the Equal Employment Opportunity Act was
passed into law. According to its major tenets, it became
unlawful for any employment practice to unfairly treat or
otherwise discriminate against an individual because of his/her
race, color, religion, sex, or national origin, and the
job-relatedness of any test used as a part of a selection
procedure had to be demonstrable (McCormick & Ilgen, 1980).
The Equal Employment Opportunity Act also gave the Equal
Employment Opportunity Commission (EEOC), which had been
created under Title VII of the Civil Rights Act of 1964,
considerable power to enforce the new laws. In 1978 the
EEOC drew up the Uniform Guidelines
on Employee Selection Procedures. Since the institution
of these changes, the field of Industrial/Organizational
Psychology, and specifically the area of psychological
assessment, has been challenged in the courts with
accusations of unfairness toward minorities. Resnick and
Resnick (1982) noted that psychological assessment
was the first area of applied psychology to
receive legal attention in attempts to restrict its use.
This attention from the media, business, and legal com-
munities has had varied results. More and more care has
been taken in designing selection procedures which will
provide an employer with qualified candidates for a position
and not unfairly favor one group over another. Due to the
Uniform Guidelines' lack of clarity in its definition of
validity and job relatedness, it has been very difficult to
demonstrate an acceptable standard in light of the differ-
ing interpretations of these topics by judges, lawyers,
government officials, and psychologists. Thus, the valida-
tion process has become a long and tedious one, attempting
to decrease the potential for litigation against the hiring
institution, while avoiding unfair selection situations.
Finally, there appears to have been a decrease in the over-
all use of valid selection procedures. Boehm (1982)
reported what may be a significant result of the increased
public attention and government intervention in the field
of personnel testing. She studied the research in the
Industrial/Organizational field which had been conducted
during the twenty-year period spanning the pre- and post-EEOC
changes. She reviewed all the articles published from 1960
through 1979 in two professional journals--Personnel
Psychology and Journal of Applied Psychology. She noted
that while there was a significant increase in the number of
articles published during this time frame dealing with
employment-related topics, there was also a significant
decrease in the number of publications reporting the results
of validation studies. She suggested that the attention
from the private and government sectors has had both a
positive and negative influence on employment-related
research. While the public attention has driven an increase
in the need for, and interest in, this area of research, it
has also caused a reluctance to conduct research in
the specific area of validation. Boehm felt that government
intervention had injected a fear factor into the scientific
community which could have a dramatic, far-reaching
impact on the future of the field of Industrial/Organizational
psychology.
Because psychological assessment is a dynamic field,
psychologists whose work is involved with tests used in
personnel selection are called on now more than ever to
continually evaluate these instruments and the circumstances
in which they are used.
The feature that distinguishes reputable work in
personnel selection from that of the mass of self-
styled "psychologists", "personnel experts", and
other quacks is that the reputable worker in the
field is continuously concerned with testing, verify-
ing, and improving the adequacy of his procedures.
(Thorndike, 1949, p. 2)
The present study was developed with these goals in
mind. The test under investigation was the California
Test of Mental Maturity (CTMM). It has been used extensive-
ly in the field of education as a screening and selection
instrument and has been found to be of use in industrial
settings, usually as a part of a selection battery (Sullivan,
Clark & Tiegs, 1963). Originally developed in 1936, the
CTMM is a paper-and-pencil test designed to give information
on capacities which the authors claim are basic to learning
(e.g., problem solving and responding to new situations).
The authors measure the rate and scope of mental development
in terms of four factors derived through factor analysis of
the test items. These factors are: Logical Reasoning
(measured by Opposites, Similarities and Analogies tests),
Verbal Concepts (Verbal Comprehension tests), Numerical
Reasoning (Numerical Values and Number Problems tests), and
Memory (measured by the Delayed Recall test). Seven subtests
are grouped into two sections, Language and Non-language,
which differentiate in general between responses to stimuli
that are essentially nonverbal or pictorial, and traditional-
ly verbal in terms of the means of presentation.
The CTMM has been used in connection with various job
classes within industry. Of particular relevance to this
present study is a series of reports on a five-year
research endeavor conducted for the Federal Aviation Admin-
istration (FAA) at the Aeronautical Center in Oklahoma City
(Cobb, 1962; Cobb, 1964; Trites & Cobb, 1964; Trites, 1964).
Throughout the series, the CTMM was used with a variety of
other commercially marketed instruments, supervisors'
ratings, and medical history information to predict success-
in-training and on-the-job effectiveness for persons
involved in the basic FAA Air Traffic Controller (ATC) train-
ing program.
The initial research project was a follow-up study of
subjects who had been tested five years earlier while still
in the FAA training program. At that time, a variety of
information was collected on the trainees by means of
psychological tests, medical historical information, and
class instructors' evaluations at the time of training.
The results of the study showed that (1) psychological
tests could be used as predictors of on-the-job effective-
ness, (2) instructors' evaluations during training
accurately predicted on-the-job performance years later,
(3) those trainees who entered ATC training at a later
chronological age than their peers performed more poorly both
during the training program and on the job, and
finally (4) the medical history information collected from
the trainees was found to have no predictive value (Trites,
1961). This fundamental study established the usefulness
of psychological tests as predictors of future ATC success
and also pointed out further selection-related questions
regarding entry-age and type of job training which warranted
further research.
The next phase of this five-year study was to determine
the most effective psychological tests to use to predict
performance in a training course for the En Route Controller
School - a section of the FAA ATC training program. All
tests of the CTMM, College and Adult level, were administered
to students in the training program, as well as five
subtests of the Differential Aptitude Tests (DAT), seven
subtests of the Moran Repetitive Psychometric Measures (RPM)
battery, and the California Psychological Inventory (CPI).
The criterion
measures used were a combined academic-laboratory grade
obtained from the training program, and scaled objective and
subjective personality ratings. Results showed that psycho-
logical tests contributed to the accurate selection of
personnel in the ATC training program. The five test areas
which emerged as most predictive of ATC En Route training
school performance were the DAT Abstract Reasoning, Numerical
Ability and Space Relations subtests; the CTMM Analogies
test, and the specifically designed ATC problems (paper-and-
pencil problems) (Cobb, 1962). Thus it was demonstrated
that success-in-training could be predicted by standardized,
psychological tests, while personality measures were not
shown to be predictive.
This five-year study proceeded to investigate various
factors which could later predict job performance. One of
these factors was the type of pre-employment job experience
the trainees might have had prior to enrolling in the ATC
training program. The experimenters hypothesized that
trainees having any air traffic-related experience would
perform better both in training and on the job. The subjects
were divided into two groups according to the type of ATC
work for which they were being trained, either Terminal or
En Route. The trainees' scores on the five aptitude tests
found to be most valid at predicting success in training
for the En Route training course were used. These tests were
the Space Relations, Numerical Ability and Abstract Reasoning
subtests of the DAT; the Analogies test from the CTMM, and
ATC problems designed by the FAA training center. The
trainee's level of education was also considered a variable
along with the aptitude test scores. The trainee's aptitude
test scores and educational level were compared with the
trainee's academic grade average, the laboratory grade
average, a combined academic and lab grade, a supervisor
rating, and a pass-fail determination for the training
course. The aptitude tests were chosen from the larger bat-
tery of tests used in the initial study of this research
project as they had been found most valid as a composite for
prediction of En Route training course criteria. Only one
test of the CTMM Logical Reasoning subtest, Analogies, was
used in this study. It was found to differentiate between
the Terminal and En Route samples. With respect to the
En Route sample, the CTMM Analogies test showed significant
(p < .01) positive correlation with the following variables:
combined academic and laboratory grades (r = .28); training
course pass-fail determination (r = .18); academic grade
average (r = .30); and laboratory grade average (r = .23).
It showed a positive correlation with supervisor ratings
(r = .14), which was significant at the p < .05 level. A
significant (p < .01) negative correlation (r = -.14) was
found between the Analogies test and age.
Correlating the CTMM Analogies test with the criterion
measures for the Terminal sample produced a significant
positive correlation with the academic grade acquired
through the training program (r = .22, p < .01).
For this sample no other correlations were significant. The
data suggested that other tests might be more useful in
providing accurate predictions, especially for the Terminal
section of training (Trites & Cobb, 1964).
Cobb (1964) attempted to determine which of the aptitude
and personality measures originally administered to a
sample of En Route trainees could be used as predictors of
success-in-training and future job performance effectiveness
for trainees in the Terminal training program. The results
of the study showed that a nonverbal abstract reasoning, or
induction factor, and a number-facility aptitude were most
heavily represented in the prediction equations derived for
each training group. The difference between the En Route
and Terminal groups appeared to be represented by a factor
related to verbal abstract reasoning best measured by the
CTMM Inferences test.
Research in the FAA study continued in an effort to
determine those aptitude and/or personality factors which
varied with age and to evaluate their effects on training
success and eventual on-the-job performance. The original
aptitude and personality test data collected from the
trainees at the outset of the study were evaluated in terms
of having either failed the FAA training course, success-
fully passed the training course and at a period of ten
months after completion of training were still employed by
the FAA, or passed the training and been separated within
the first year of employment. Results showed significant
(p < .01) negative correlations between age and scores on
the CTMM Analogies (r = -.08) and Similarities (r = -.15) tests.
F-test values were provided for differences in the adjusted
and unadjusted means of all aptitude tests. The Analogies
test had an F-test value significant (p < .001) for both the
adjusted (M = 7.7) and unadjusted means (M = 11.7). The
F-test values for the Opposites test were significant (p < .01)
for unadjusted means (M = 5.1), and at the p < .05 level for
adjusted means (M = 3.5). The Similarities test was significant
(p < .05) for unadjusted means only (M = 3.2). The
composite score on the CTMM Non-Language section showed a
significant negative correlation (r = -.14) with age as did
the Air Traffic Problems, Part I (Trites, 1964). Throughout
this five-year employment-related study, aptitude tests in
general, and certain tests of the CTMM specifically, were
shown to be useful in predicting job-related information.
This information included the prediction of training success
and job performance as well as differentially predicting job
performance according to the type of training program and
the trainee's pre-training experience.
In other employment-related research, King, Norrell,
and Erlandson (1959) attempted to derive a multiple regres-
sion equation to predict first term grade point averages
of students in a Police Administration curriculum. They
determined that the weighted scores on the CTMM Language
section and an internally developed Michigan State Univer-
sity reading test were the best predictors of the grade
point average criterion.
Another investigator (Topetzes, 1957) voiced his con-
cern for society in general and the disservice to all that
is done when appropriately qualified personnel are not
selected and trained for jobs to which they are best suited.
He used the CTMM along with the Wechsler-Bellevue in order
to devise a sufficiently reliable and valid predictor of
success in a Physical Medicine program at the University of
Wisconsin. As both tests used were intelligence tests, a
predictably high correlation (.63) was found between the
CTMM and the Wechsler-Bellevue. A moderately high corre-
lation (.59) between the CTMM and actual grade point average
was also found.
Again, the desire to best allocate human resources was
expressed in a study to determine the best predictors of
on-the-job success for sub-professional recreation leaders
(Parker, 1966). The skills and abilities of the professional
recreation director were being misapplied in activities
which could be sufficiently performed by para-professionals.
The para-professional incumbents were found to be signifi-
cantly above the average for the normative group in terms of
language ability as measured by the CTMM. The study indicated
that the Language section of the CTMM would be useful
in the prediction of on-the-job success as well as serving
as a pre-selection screening device.
Using the paired comparison rating technique, 33 produc-
tion supervisors in a steel production plant were ranked in
terms of their managerial effectiveness by the plant super-
intendent and two assistant supervisors (Poe & Berg, 1952).
The highest 10 were placed in the High group, and the lowest
10 were placed in the Low group. All were administered the
CTMM. The Logical Reasoning section of the CTMM was shown
to discriminate accurately (p < .05) the highly
ranked supervisors from the low ranked supervisors (t ratio =
2.25).
Lichtman (1982) used the Non-Language section of the
CTMM along with the DAT, an internally developed reading
comprehension test, a general information test, and a
biodata questionnaire to select students for a six-month
training program to prepare them for various jobs available in
nuclear power plant operations. The test battery as a whole
showed good prediction of success in the training program.
The study also concluded that there was no evidence of
differential predictions of success for minority groups.
The present project was initiated after a review of
recent relevant literature regarding the CTMM and its use
in the area of personnel selection. This review revealed
a number of areas which needed to be addressed. Despite
the findings (cited above) indicating some usefulness for the
CTMM, there was an overall scarcity of studies, especially
those using adult populations in any setting, but most
critically the scarcity of those done in employment-related
fields. Secondly, there has been a 20-year span since the
most recent revision of the test. Problems of ambiguous
artwork and poor distractors on the Logical Reasoning Sub-
test of the CTMM (Stanley, 1965) indicated another revi-
sion was in order.
This revision project was limited to the Logical
Reasoning Subtest of the CTMM, College and Adult Level 5.
This subtest consists of three separate tests: Similarities,
Opposites, and Analogies. The subtest was designed with a
pictorial presentation; however, the small picture size
caused problems of recognition for test takers. Drawings of
all fifteen test items are arranged on one page, which at
times makes some of the individual pictures that serve
as distractors very difficult to recognize. Some pictures
themselves were out of date, adding another possible source
of unnecessary confusion to a test taker. The goal of the
study was to improve upon the artwork of the original test,
clarify ambiguous distractors, and demonstrate that the validity
and reliability of the revised instrument were comparable
to those of the original as reported in the test's manual.
In addition, the original and revised Logical Reasoning
subtests were compared with the Wesman Personnel
Classification Test (WPCT; Wesman, 1965), a well-established,
commercially marketed test also designed to measure general
intelligence. It was hoped that comparing scores on the
original and revised tests with scores on the WPCT would
demonstrate the instruments' concurrent validity. The WPCT
has been used widely as a
tool for personnel selection. For example, in a study to
develop a battery of tests to be used in the selection and
evaluation of editorial personnel for a technical magazine,
it was found that both the Verbal and Numerical sections of
the WPCT effectively discriminated between those rated high
or low by their supervisors, in terms of job performance
(Abt, 1949).
The WPCT was also used along with another paper-and-pencil
test to attempt to predict successful life insurance
salesmen. The tests could not predict success at sales, but
the WPCT was shown to differentiate between managers and
agents (Baier & Dugan, 1956). Finally, when the WPCT was
used as a screening device for entry to a seven-week training
program in drafting, a significant positive relationship was
found between success-in-training, as measured by final
grade, and scores on both the Verbal and Numerical sections
of the test (Perrine, 1955).
The purpose of this project was to produce a revision
of the Logical Reasoning subtest of the CTMM to be used in
the field of personnel selection. It was hoped that the
revision would improve on the original in terms of clarity
of thought and design, and yet still maintain the reliabil-
ity and validity of the original.
Method
Subjects
Subjects were 102 undergraduate students of general
psychology classes at North Texas State University who
participated in the study on a volunteer basis. They were
first contacted by the experimenter through their Introduc-
tory Psychology classes. Participation in psychological
research earned them extra credit. The mean age was
22.34 years. Thirty-eight of the subjects were male, and
sixty-four were female.
Materials
The following tests were used in the project: the
Logical Reasoning Subtest of the CTMM in its original form;
the revised version of the Logical Reasoning Test (LRTR);
and the Wesman Personnel Classification Test (WPCT).
The Logical Reasoning Subtest of the CTMM consists of
separate Similarities, Opposites, and Analogies tests. Each
of these tests consists of 15 items, all presented in pic-
tures. The subjects are asked to discern the relationship
between a cue item and a set of four or five possible
responses.
The WPCT was selected for its brevity, ease of admin-
istration and its demonstrated high correlation with other
noted intelligence tests. It yields Verbal, Numerical and
Total scores. The Verbal score is based on an 18 minute
analogies test, in which each of the 40 items has two blanks.
The Numerical section is a ten minute arithmetic computation
test.
The LRTR is the experimental instrument to be compared
with two separate criteria, the CTMM and the WPCT. The
construction of the LRTR occurred in three phases. In
the first conceptualization stage, the new test items were
created to demonstrate either the "similar," "opposite," or
"analogous" relationship between two objects. A number of
"distractor" answers were generated for each proposed item.
In the second phase, sketches were made of all the items to
decide which were amenable to the chosen nonverbal mode.
Abstract concepts such as "dawn" or "twilight" were simply
too difficult to convey by drawing. Those remaining items
and distractors were taken to a professional illustrator who
did the final drawings. Finally, the items were assembled
and critiqued by test development specialists using the group
consensus process. These specialists were psychologists who
were involved in the development effort. Ambiguous items
and those of questionable value were deleted, although items
were not screened for sex or race bias at this point.
The final revision of the LRTR consists of 53 items.
This exceeds the length of the original CTMM measure by
eight items. Using the information collected through this
study, the least effective items will be eliminated in order
to equate the tests in terms of length. New time limits for
the revised test were determined by ratio to attain equiva-
lence with the limits of the original CTMM. The Opposites
test has 19 items with a six-minute time limit for comple-
tion. The Similarities test contains 18 items with a five-
minute limit, and the Analogies test has 16 items with a
four-minute limit (see Appendix A). Each of the separate
tests of the CTMM has 15 items with a four-minute limit.
Procedure
Each subject was provided with sufficient materials for
test completion. Each test was timed according to the
authors' instructions. In order to control for any practice
effect, the tests were administered in a counter-balanced
design. Half of the subjects took the tests in the CTMM-
WPCT-LRTR order, and the other half received them in the
reverse order.
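The counterbalanced assignment described above can be sketched
as follows. This is only an illustration: the text does not
say how subjects were allocated to the two orders, so the
alternating rule, the function name assign_orders, and the
subject IDs 1 to 102 are assumptions made for the example.

```python
def assign_orders(subject_ids):
    """Alternate subjects between the two test orders described in the text."""
    forward = ("CTMM", "WPCT", "LRTR")
    reverse = tuple(reversed(forward))
    # Subjects at even positions get the forward order and those at odd
    # positions the reverse order, so the groups differ by at most one.
    return {
        sid: (forward if i % 2 == 0 else reverse)
        for i, sid in enumerate(subject_ids)
    }


# 102 subjects with hypothetical IDs 1..102
schedule = assign_orders(range(1, 103))
n_forward = sum(1 for order in schedule.values() if order[0] == "CTMM")
print(n_forward, len(schedule) - n_forward)
```

With 102 subjects this yields two groups of 51, matching the
"half and half" split reported in the Procedure.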
Subjects were tested in small groups of no more than 10
per session. Classrooms and small meeting rooms were used
for testing sessions, ensuring comfortable seating and
adequate lighting and ventilation. Testing sessions were
conducted during the late summer and early fall, from 8:00 a.m.
to 5:00 p.m. at the subject's convenience.
Item and homogeneity analyses were run on the three
measures under consideration. Pearson product-moment
correlations were run on each individual test in order to
compare them with each other and the total test scores.
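As a rough illustration of the analyses named above, the sketch
below computes a Pearson product-moment correlation and the
correlation of each item with the total score. It assumes
responses are scored 1 (correct) or 0 (incorrect) and that
the total includes the item itself (an uncorrected item/test
correlation); the text does not state which variant was used.
The function names and the small response matrix are invented
for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either variable has no variance.
    return sxy / sqrt(sxx * syy)


def item_total_correlations(responses):
    """Correlate each item (column) with the total score (row sum)."""
    totals = [sum(row) for row in responses]
    n_items = len(responses[0])
    return [
        pearson_r([row[j] for row in responses], totals)
        for j in range(n_items)
    ]


# Hypothetical data: 4 subjects (rows) by 3 items (columns)
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
item_rs = item_total_correlations(responses)
mean_item_total = sum(item_rs) / len(item_rs)
print([round(r, 2) for r in item_rs], round(mean_item_total, 2))
```

Averaging the per-item values, as in the last lines, gives a
single homogeneity figure of the kind reported as the mean
item/test correlation.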
Results
Table 1 is a compilation of descriptive information on
the three instruments used in the study. When it was avail-
able the same information for the 1963 revision of the CTMM
is included as well (California Test Bureau, 1963). Mean
test scores and standard deviations are calculated for each
test, as well as mean item/test correlation figures which are
reported for all three instruments used in the study.
Cronbach's Alpha is calculated for the three tests as a
measure of test reliability. The 1963 revision of the CTMM
used the Kuder-Richardson Formula Number 21 to demonstrate
its reliability.
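The two reliability indices mentioned above can be sketched
as follows. The formulas are the standard textbook forms of
Cronbach's Alpha and Kuder-Richardson Formula 21; the function
names and the toy response matrix are assumptions for the
example, as is the use of 45 items (three 15-item tests) for
the CTMM Logical Reasoning section.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's Alpha from a subjects-by-items matrix of item scores."""
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def kr21(k, mean, sd):
    """Kuder-Richardson Formula 21 from test length, mean, and SD."""
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * sd ** 2))


# Toy data in which all items agree perfectly, so Alpha is 1.0
toy = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
print(round(cronbach_alpha(toy), 2))

# 1963 normative figures reported in Table 1: mean 24.00, SD 6.50
print(round(kr21(45, 24.00, 6.50), 2))
```

Under the 45-item assumption, KR-21 applied to the 1963
normative mean and standard deviation gives .75, which agrees
with the reliability reported for that revision.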
Table 1
Descriptive Statistics

                             CTMM '63(a)  CTMM(b)  LRTR(b)  WPCT(b)
Mean                            24.00      27.02    36.77    21.97
Percent Correct                 54.00      60.00    69.80    54.00
SD                               6.50      5.786    5.918    5.789
Mean item/test correlation       N/A        .22      .23      .28
Reliability                      .75(c)    .7456    .7724    .8129

(a) Normative data from 1963 revision of CTMM.
(b) Data from present sample.
(c) Reliability calculated by Kuder-Richardson Formula 21.

An average of the item/test correlations for each of
the three tests shows the WPCT to have the greatest degree
of homogeneity, with the LRTR being somewhat higher than
the CTMM administered during the study. The reported
reliability value for the LRTR is also slightly higher
than that of both the CTMM administered during the study
and the normative data collected by the publishers on the
1963 CTMM revision. This increase in the reliability value
is possibly due to the longer length of the revised test.
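One way to gauge how much of the reliability difference could
be due to length alone is the Spearman-Brown prophecy formula.
This formula is not used in the study itself; it is brought in
here only as an illustration, and it assumes that any added
items are parallel to the existing ones. The function name is
invented for the example.

```python
def spearman_brown(reliability, length_ratio):
    """Projected reliability when test length is multiplied by length_ratio."""
    return (length_ratio * reliability) / (
        1 + (length_ratio - 1) * reliability
    )


# CTMM as administered: reliability .7456 with 45 items.
# Projected reliability if lengthened to the 53 items of the LRTR:
projected = spearman_brown(0.7456, 53 / 45)
print(round(projected, 3))
```

The projection comes to about .775, close to the .7724 observed
for the LRTR, which is consistent with the suggestion that the
added length accounts for the higher reliability value.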
Due to the difference in length between the CTMM and
the LRTR, a conversion of the means into a "percentage
correct" score is necessary for accurate comparison. Of
the three tests administered in the study the LRTR has the
highest percent correct score (69.8), indicating that it
is not as difficult as the other tests. The sample's
percent correct score for the CTMM is also higher than that
achieved by the 1963 normative sample. This difference is
predictable, based on the fact that the test sample con-
sisted of college students. The LRTR showed greater varia-
bility than all the other experimental measures, though the
standard deviation of the CTMM '63 is larger than all of
these.
Table 2 shows the matrix for correlations between the
individual subtests and the total test score, and correla-
tions between total test scores of each of the three
instruments being researched.
Table 2
Correlation Matrix

[Table body not recoverable from the source.]

Inspection of the correlations reported in Table 2
shows significant positive correlations between all of the
LRTR tests (p < .001). There are also significant correlations
between some of the individual CTMM tests. The CTMM
Analogies test correlates with both the CTMM Opposites
test (r = .24, p < .05) and the CTMM Similarities test
(r = .54, p < .001). There was no significant correlation
between the CTMM Opposites and Similarities tests. With the
exception of these two uncorrelated tests, this moderate
degree of overlap within the original CTMM and within the
LRTR indicates that the separate subtests do not strongly
differentiate three unique constructs but tend to measure
the same thing. Interestingly, the subtests from the LRTR
correlate weakly or not at all with the corresponding
subtests from the original CTMM. The total test scores
correlated modestly but significantly with each other. All
the subtests of both the LRTR and the CTMM, as well as the
total test scores on both instruments, were correlated with
the WPCT. This supports the claims of the test authors of
both the CTMM and the WPCT to have designed basic measures
of intelligence. It also appears that the LRTR is a measure
of intelligence. In terms of demonstrating construct
validity, the correlation coefficients were in the moderate
range when the LRTR was correlated with the CTMM
(r = .3263) and when the LRTR was correlated with the
WPCT (r = .4210). Finally, the CTMM correlated with the
WPCT (r = .3812). All were significant, positive
correlations (p < .001).
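The significance of the three total-score correlations can be
checked with the usual t transformation of r. The sketch below
assumes that all 102 subjects contributed to each correlation
and uses a two-tailed critical value of 3.390 for 100 degrees
of freedom at the .001 level, taken from standard t tables;
both are assumptions of the example rather than details given
in the text.

```python
from math import sqrt

# Two-tailed critical t for p = .001 with df = 100 (standard tables)
T_CRIT_001_DF100 = 3.390


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero, df = n - 2."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


reported = {
    "LRTR with CTMM": 0.3263,
    "LRTR with WPCT": 0.4210,
    "CTMM with WPCT": 0.3812,
}
t_values = {label: t_from_r(r, 102) for label, r in reported.items()}
for label, t in t_values.items():
    print(label, round(t, 2), t > T_CRIT_001_DF100)
```

Under these assumptions each t value exceeds the critical
value, in line with the reported p < .001.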
Table 3 lists the mean difficulty scores by item for
each of the three LRTR subtests. It is defined as the pro-
portion of the sample population which responded with the
correct answer for each item. This concept is used to
determine the effectiveness of the LRTR items in terms of
their ability to discriminate. In order to equate the
instruments on length, it is necessary to shorten each of
these subtests so that each is of the original 15-item
CTMM length. Examination of this table indicates which
items were chosen correctly or incorrectly by the majority
of the test takers. The higher the score, the larger the
share of test takers who answered the item correctly. It is
most desirable to eliminate the items with extremely high or
low difficulty scores, since these do not discriminate well.

Table 3
Mean Difficulty Scores
Logical Reasoning Test Revision

         Test 1          Test 2          Test 3
Item     Opposites       Similarities    Analogies
  1      26.73           73.27           65.35
  2       6.93           69.31           76.24
  3      91.09           93.07           97.03
  4      67.33           61.39           94.06
  5      89.11           93.07           85.15
  6      72.28           93.07           86.14
  7      62.38           88.12           99.01
  8      88.12           90.10           82.18
  9      85.12           82.18           91.09
 10      53.47           80.20           92.08
 11      57.43           86.14           79.21
 12      70.30           93.07           73.27
 13      82.18           68.32           85.15
 14      54.46           62.38           47.52
 15      39.60           59.41           80.20
 16       7.92           34.65           40.59
 17      61.39           31.68
 18      74.26           26.73
 19      27.73
  M      58.83           71.45           79.64
Range    6.93 - 91.09    26.73 - 93.07   40.59 - 99.01
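The difficulty score described above can be computed directly from scored responses. The sketch below uses made-up data for five examinees and three items; the function names and the cutoff values of 25 and 90 percent are assumptions chosen for illustration and do not come from the study.

```python
from typing import List


def item_difficulty(responses: List[List[int]]) -> List[float]:
    """Percent of examinees answering each item correctly.

    `responses` holds one row per examinee and one 0/1 entry per item.
    """
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [
        100.0 * sum(row[j] for row in responses) / n_examinees
        for j in range(n_items)
    ]


def flag_extreme_items(difficulty: List[float], low: float = 25.0,
                       high: float = 90.0) -> List[int]:
    """1-based numbers of items whose difficulty lies outside [low, high].

    The cutoffs are hypothetical and serve only this illustration.
    """
    return [i for i, d in enumerate(difficulty, start=1) if d < low or d > high]


# Made-up scores for five examinees on three items (not thesis data).
toy_responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
]

difficulty = item_difficulty(toy_responses)
flagged = flag_extreme_items(difficulty)
print(difficulty)
print(flagged)
```

In this toy data the first item is answered correctly by everyone and the third by only one examinee in five, so both are flagged as poor discriminators, while the second item (60 percent correct) is retained.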
Table 4
Summary of Mean Difficulty Scores Data

                   Opposites       Similarities    Analogies
Mean LRTR          58.83           71.45           79.64
Mean CTMM          54.58           58.61           66.93
Range LRTR         6.93 - 91.09    26.73 - 93.07   40.59 - 99.01
Range CTMM         13.86 - 84.16   24.75 - 97.03   25.74 - 92.08
Number of items    19              18              16
Table 4 compares the difficulty levels of the LRTR
and the CTMM by individual subtest. This provides yet
another method by which to compare the two instruments. The
range for each test also appears and gives some indication
of the strengths and weaknesses of each of the individual
subtests as they compare with the CTMM. Comparing the
three subtests, it is evident that the Opposites test is the
one which will require the most revision. Though the mean
difficulty for the LRTR Opposites test is very close to that
of the CTMM, the range of scores of the LRTR is much
broader than that of the CTMM, indicating that there were
some items that were very difficult for almost all those
taking the test, and some items to which nearly everyone
responded correctly. Since this is the longest of the
three subtests, the elimination of four of the least
effective items should have a positive impact on the
instrument in terms of homogeneity and differentiation
capability.
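One way to shorten the Opposites subtest from 19 to 15 items can be sketched with the Table 3 values. The selection rule used here, dropping the items whose difficulty lies farthest from 50 percent, is an assumption made for illustration; the study does not state which rule would be applied, and the function name is hypothetical.

```python
# Opposites difficulty scores (percent correct) for items 1-19, from Table 3.
OPPOSITES = [
    26.73, 6.93, 91.09, 67.33, 89.11, 72.28, 62.38, 88.12, 85.12, 53.47,
    57.43, 70.30, 82.18, 54.46, 39.60, 7.92, 61.39, 74.26, 27.73,
]


def items_to_drop(difficulty, target_length, midpoint=50.0):
    """Return 1-based item numbers to drop so `target_length` items remain.

    Hypothetical rule: drop the items farthest from `midpoint`.
    """
    n_drop = len(difficulty) - target_length
    ranked = sorted(
        range(1, len(difficulty) + 1),
        key=lambda item: abs(difficulty[item - 1] - midpoint),
        reverse=True,
    )
    return sorted(ranked[:n_drop])


dropped = items_to_drop(OPPOSITES, target_length=15)
kept = [d for i, d in enumerate(OPPOSITES, start=1) if i not in dropped]
print("dropped items:", dropped)
print("range of remaining items:", min(kept), "-", max(kept))
```

Under this assumed rule, items 2, 3, 5, and 16 would be removed, and the range of the remaining fifteen items would narrow to 26.73 - 88.12. Whether homogeneity actually improves would still have to be checked against the item-level response data.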
Table 4 also indicates that, of the three tests, the
Similarities test is the most similar to the original CTMM
in terms of the range of scores. The ranges are very nearly
the same; however, the mean difficulty level changes the most
going from the CTMM to the LRTR. The data on the Analogies test
suggest that it may be the least difficult of all the
subtests, as indicated by the high mean values on both the
CTMM and the LRTR. Since the LRTR version of this subtest
had only sixteen items, it may need some further revision to
make it more comparable to the others in terms of difficulty.
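The comparisons drawn from Table 4 can be reproduced with simple arithmetic. The sketch below only restates the tabled means and ranges; the dictionary layout and the function name are assumptions made for illustration.

```python
# Values copied from Table 4: (mean, range low, range high) per subtest.
LRTR = {
    "Opposites": (58.83, 6.93, 91.09),
    "Similarities": (71.45, 26.73, 93.07),
    "Analogies": (79.64, 40.59, 99.01),
}
CTMM = {
    "Opposites": (54.58, 13.86, 84.16),
    "Similarities": (58.61, 24.75, 97.03),
    "Analogies": (66.93, 25.74, 92.08),
}


def compare(subtest):
    """Return (mean shift, change in range width), LRTR minus CTMM."""
    l_mean, l_low, l_high = LRTR[subtest]
    c_mean, c_low, c_high = CTMM[subtest]
    mean_shift = l_mean - c_mean
    width_change = (l_high - l_low) - (c_high - c_low)
    return mean_shift, width_change


summary = {name: compare(name) for name in LRTR}
for name, (mean_shift, width_change) in summary.items():
    print(f"{name:<13} mean shift {mean_shift:+6.2f}  "
          f"range width change {width_change:+6.2f}")

largest_mean_shift = max(summary, key=lambda s: abs(summary[s][0]))
closest_range = min(summary, key=lambda s: abs(summary[s][1]))
widened = [s for s in summary if summary[s][1] > 0]
```

With the tabled values, the Similarities subtest shows both the smallest change in range width and the largest shift in mean difficulty (12.84 points, only slightly more than the 12.71 points for Analogies), and Opposites is the only subtest whose range widened.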
Discussion
The field of testing is as dynamic and changing as the
individuals and groups which these measures attempt to
describe and evaluate. The need to continually verify
and improve the procedures and instruments used by reputable
professionals in the field of personnel selection is clearly
demonstrated through the process described in this study.
It was the goal of this project to update and
verify an out-of-date test and evaluate its potential in the
area of personnel selection in the 1980s. This report shows
that the overall goals of the project were achieved. The
LRTR shows a slightly higher reliability than the CTMM, and
its internal validity, as shown by correlations of the three
subtests with each other as well as with the total test score,
and its construct validity demonstrate its usefulness in this
field.
The LRTR's high level of internal consistency indicates
a very homogeneous test, apparently more so than the original
CTMM. The fact that the individual tests of the LRTR did not
correlate with the corresponding tests of the CTMM indicates that
the different mental functions which the CTMM authors claim
comprise the Logical Reasoning factor were not adequately
captured in the LRTR. The greater degree of homogeneity of
the LRTR may explain why its construct validity coefficient
with the WPCT (.42) is higher than the original
CTMM-WPCT correlation (.38). Since the CTMM Opposites
and Similarities tests show no significant correlation, one
might argue that at least two types of logical reasoning
abilities are being measured, thus confounding the validity
measure to a limited degree. Though neither validity
coefficient is extremely high, both are in a moderate range,
which lends support to an earlier study which shows a .63
correlation between the CTMM and the Wechsler-Bellevue
intelligence test.
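Internal consistency for items scored right or wrong is commonly summarized with the Kuder-Richardson formula 20 (KR-20). This section does not state which coefficient was used in the study, so the following is only an illustrative sketch on made-up data, with a hypothetical function name.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for 0/1 item scores.

    `responses` holds one row per examinee and one 0/1 entry per item.
    """
    n = len(responses)
    k = len(responses[0])
    # Sum over items of p * q, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        sum_pq += p * (1.0 - p)
    # Population variance of the examinees' total scores.
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum_pq / total_variance)


# Made-up scores for four examinees on three items (not thesis data).
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
reliability = kr20(toy)
print(reliability)
```

For this toy matrix the coefficient is 0.75. A more homogeneous set of items yields a value closer to 1, which is the sense in which the LRTR is described above as more homogeneous than the CTMM.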
Higher percent correct scores on the LRTR could indicate
an easier test simply in terms of item difficulty, or that
indeed the improvement in artwork and removal of ambiguous
distractors, cited as some of the original CTMM's weaknesses,
had the desired effect. This will need to be determined in
future research, as well as other issues such as the overall
effects of equating the lengths of the three LRTR tests to
the original CTMM 15-item length, determining whether the
LRTR is race or sex biased, and the predictive validity of
the LRTR in the field of personnel selection.
This preliminary study using the CTMM on an adult
population indicates its utility in several areas.
In an educational setting it might be useful in screening
and placement for adult literacy programs, or other programs
where assessment is necessary and yet language restrictions
(e.g., a non-native-speaking population or reading handicaps)
limit the use of traditionally verbal instruments. The
tests' ease of administration, short time limits, pictorial
format, high reliability, and acceptable validity show them
to be very useful as tools in a personnel selection
environment. As EEOC changes continue to alter the field of
personnel selection, these tests could be utilized wherever
a general intelligence/logical reasoning ability has been
shown to be required on the job.
Appendix A
Logical Reasoning Test Revision
TEST 1
DIRECTIONS: In each row there is one picture that shows something
which is different from the other four pictures. Find
it and mark its letter.
Note. This entire test has been reduced in size by 26%
from that which was administered to the subject. This was
done to fit space requirements.
[Test 1, items 1-19: pictorial items not reproduced.]
TEST 2
DIRECTIONS: The first three pictures in each row are of things which
are alike in some way. Decide how they are alike and
then find the picture to the right of the double line
that is most like them and mark its letter.
[Test 2, items 1-18: pictorial items not reproduced. Legible text
items: item 3 (B.S., M.A., Ph.D, Jr., M.D., Mrs., Rev.) and
item 15 (Mn, Al, Mg, He, O, Ag, Ne).]
TEST 3
DIRECTIONS: In each row the first two pictures show things that
are related in some way. The third picture is related
in the same way to one of the four pictures to the right
of the double line. Find it and mark its letter.
[Test 3, items 1-16: pictorial items not reproduced.]
References
Abt, L. E. (1949). A test battery for selecting technical
magazine editors. Personnel Psychology, 2, 75-91.
Baier, D. E., & Dugan, R. D. (1956). Tests and performances
in a sales organization. Personnel Psychology,
75-91.
Boehm, V. R. (1982). Are we validating more but publishing
less? (The impact of governmental regulation on
published validation research--an exploratory
investigation). Personnel Psychology, 35(1), 175-187.
Cascio, W. F. (1978). Applied psychology in personnel
management. Reston, VA: Reston Publishing Co., Inc.,
59-65.
Cobb, B. B. (1962). Problems in air traffic management II,
prediction of success in air traffic controller school.
Aerospace Medicine, 33, 702-713.
Cobb, B. B. (1964). Problems in air traffic management IV,
identification and potential of aptitude test measures
for selection of tower air traffic controller trainees.
Aerospace Medicine, 35(2), 1019-1027.
Dunnette, M. (1966). Personnel selection and placement.
New York: Wadsworth Publishing Co., Inc.
King, P., Norrell, G., & Erlandson, F. L. (1959). The
prediction of academic success in a police administration
curriculum. Educational and Psychological Measurement,
19, 649-651.
Lichtman, R. J., & Dyak, J. D. (1982). Test validities for
minorities. The Human Equation in Electric Power Plants
--a Symposium. Memphis State University.
McCormick, E. J., & Ilgen, D. (1980). Industrial
Psychology (7th ed.). Englewood Cliffs, NJ: Prentice-
Hall, Inc.
Parker, A. D. (1966). Projections for the selection,
training and retention of sub-professional recreation
leaders based on an analysis of personality, interest,
aptitude, and preference data. Unpublished doctoral
dissertation, University of Illinois.
Perrine, M. W. (1955). The selection of drafting trainees.
Journal of Applied Psychology, 39, 57-61.
Poe, W. A., & Berg, I. A. (1952). Psychological test
performance of steel industry production supervisors.
Journal of Applied Psychology, 36, 234-237.
Resnick, L. B., & Resnick, D. P. (1982). Testing in
America: The current challenge. International Review
of Applied Psychology, 31(1), 75-90.
Stanley, J. C. (1965). Review of the California Test of
Mental Maturity. In O. K. Buros (Ed.), The Sixth
Mental Measurements Yearbook (pp. 694-697). Highland
Park, NJ: The Gryphon Press.
Sullivan, E. T., Clark, W. W., & Tiegs, E. W. (1963). Test
Manual California Test of Mental Maturity 1963 Revision.
Thorndike, R. L. (1949). Personnel selection--Test and
measurement techniques. New York: J. Wiley.
Topetzes, N. J. (1957). A program for the selection of
trainees in physical medicine. Journal of Experimental
Education, 25, 263.
Trites, D. K. (1961). Problems in air traffic management
I. Longitudinal prediction of effectiveness of air
traffic controllers. Aerospace Medicine, 32, 1112.
Trites, D. K., & Cobb, B. B. (1964). Problems in air
traffic management IV. Comparison of pre-employment,
job-related experience with aptitude tests as predictors
of training and job performance of air traffic control
specialists. Aerospace Medicine, 35(1), 428-436.
Trites, D. K. (1964). Problems in air traffic management
VI. Interaction of training entry date with intellectual
and personality characteristics of air traffic control
specialists. Aerospace Medicine, 35(2), 1184.
Wesman, A. G. (1965). Wesman PCT Manual 1965 Revision.
New York: The Psychological Corporation.