
Educational Research Review 5 (2010) 25–49


The impact of instructional development in higher education: The state-of-the-art of the research

Ann Stes a,∗, Mariska Min-Leliveld b, David Gijbels a, Peter Van Petegem a

a University of Antwerp, Institute for Education and Information Sciences, Venusstraat 35, 2000 Antwerpen, Belgium
b Leiden University, ICLON, P.O. Box 905, 2300 AX Leiden, The Netherlands

Article info

Article history:
Received 6 November 2008
Received in revised form 29 June 2009
Accepted 9 July 2009

Keywords:
Instructional development
Higher education
Impact

Abstract

In this article we give a systematic review of the nature and design of earlier research into the impact of instructional development in higher education. Studies are clustered on the basis of the level of outcome that was measured, meaning that a different synthesis technique is used than in prior reviews on the same topic. In addition, we address some questions related to the differential impact of initiatives with varied duration, format, or target group, because these questions were left unanswered in earlier reviews. The results of our review provide a guide to improve studies of instructional development in order to get more insight into the real impact at different levels (teachers’ learning, teachers’ behavior, the institution, and the students). Some evidence is found of the influence of the duration and nature of instructional development on its impact.

© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

Instructional development in higher education has become an important topic in recent years. In spite of its acknowledged importance, evaluations have generally been limited to measures of participants’ satisfaction: little is known about the impact on daily teaching practice (Eison & Stevens, 1995; Norton, Richardson, Hartley, Newstead, & Mayes, 2005; Wilson & Berne, 1999). What teachers learn from instructional development remains unclear (Fishman, Marx, Best, & Tal, 2003). In this article, we will summarize the findings of prior reviews into the effects of instructional development in higher education. We will address some questions left unanswered in previous reviews as well as describe the state-of-the-art of the nature (in terms of levels of outcome) and design of existing research. In the past, terminology regarding instructional development was used inconsistently (Freeth, Hammick, Koppel, Reeves, & Barr, 2003; Taylor & Rege Colet, 2009). Taylor and Rege Colet (2009) defined the different related terms very clearly. We will begin by referring to their study in order to clarify what we mean in this article by instructional development.

Instructional development can be described as any initiative specifically planned to enhance course design so that student learning is supported (Taylor & Rege Colet, 2009). This definition excludes: (a) curriculum development, which focuses on the development and improvement of programs of study as a whole (Cook, 2001); and (b) organizational development, which focuses on creating institutional policies and structures that foster an effective learning and teaching environment (Taylor & Rege Colet, 2009).

∗ Corresponding author at: University of Antwerp, Centre of Excellence in Higher Education (ECHO), Venusstraat 35, 2000 Antwerpen, Belgium. Tel.: +32 3 265 43 53; fax: +32 3 265 45 01.

E-mail addresses: [email protected] (A. Stes), [email protected] (M. Min-Leliveld), [email protected] (D. Gijbels), [email protected] (P. Van Petegem).

1747-938X/$ – see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.edurev.2009.07.001


Professional development, faculty development and academic development are terms related to instructional development as well; however, each concept has its own specific focus. Whereas instructional development explicitly aims to develop faculty in their role as a teacher, professional development concerns the entire career development of a faculty member and is not limited to teaching, but also considers research and social services (Centra, 1989). The terms academic development and faculty development have the same focus as the concept of professional development, but they also include the aspect of organizational development as described above. While the term academic development is used in Australasian and British contexts, the term faculty development is common in North America (Taylor & Rege Colet, 2009). The concept of educational development is used by Taylor and Rege Colet (2009) to indicate the whole range of development activities as described above: instructional, curriculum, organizational, professional, academic, and faculty development.

2. Past syntheses of instructional development research

Five prior reviews related to the effects of instructional development in higher education (Levinson-Rose & Menges, 1981; McAlpine, 2003; Prebble et al., 2004; Steinert et al., 2006; Weimer & Lenze, 1998) informed the current synthesis.

Levinson-Rose and Menges (1981) reported on a synthesis of 71 studies (from the mid-sixties to 1980) of interventions to improve college teaching using narrative review. Studies were clustered on the basis of the kind of intervention they described, using the following five categories: (1) grants for faculty projects; (2) workshops and seminars; (3) feedback from student ratings; (4) practice with feedback; and (5) concept-based training on the basis of films or videotapes illustrating educationally relevant concepts. The effect of the intervention as reported in the study was labeled as: (a) self-reported change in teacher attitude; (b) tested or observed change in teacher knowledge; (c) observed change in teacher skill; (d) self-reported change in student attitude; or (e) tested or observed change in student learning. Studies reporting only on effects belonging to one or both of the first two categories were excluded from the review. The results indicated positive effects for 78% of the interventions studied. However, this percentage diminished to 62% when taking into account only those studies where high confidence could be placed in the results. Levinson-Rose and Menges (1981) concluded that quite a lot of the research into the impact of instructional development up to 1980 was of low quality. Individual differences between participants in a development initiative and/or between students were seldom taken into account. Studies were not comparable with regard to the way dependent variables were operationalized or measured. Qualitative data to reveal deeper levels of experience as well as collaborative research were missing.

Weimer and Lenze (1998) updated the review of Levinson-Rose and Menges (1981) by focusing on literature published in the eighties. Their review considered five kinds of interventions: (1) workshops, seminars and programs; (2) consultation; (3) instructional grants; (4) resource materials such as newsletters, manuals or sourcebooks; and (5) colleagues helping colleagues. While two of these categories (workshops/seminars and grants) were identified by Levinson-Rose and Menges (1981), the others were not. This was explained by the fact that the character of instructional development had undergone some changes during the eighties (Weimer & Lenze, 1998). Weimer and Lenze replaced two of the categories identified by Levinson-Rose and Menges (1981) (namely practice with feedback and concept-based training) by two new ones (resource materials and collegial help), while one, feedback from student ratings, was adapted (i.e., Weimer and Lenze took into account only feedback in combination with a form of consultation). The studies reviewed were not just clustered on the basis of the kind of intervention they described: if a study described the effect of an intervention meant for a specific target group, this was also taken into account. A distinction was made between interventions targeting new faculty and those targeting teaching assistants. Reported effects were categorized using the same five categories used by Levinson-Rose and Menges (1981). Whereas Levinson-Rose and Menges (1981) excluded studies reporting only effects concerning change in self-reported teacher attitude and/or in tested or observed teacher knowledge, Weimer and Lenze (1998) did not. Overall, the results of their review were inconclusive about the effects of instructional development in higher education. Weimer and Lenze (1998), as well as Levinson-Rose and Menges (1981) 17 years before, concluded that more research, and especially research with greater sophistication in empirical design, should be conducted. Types of inquiry other than quantitative surveys should be used. The effect of instructional interventions on specific faculty groups should be studied. Related fields of knowledge (e.g., regarding adult learning, diffusion of innovation or motivation) should be taken into account.

McAlpine (2003) addressed the question of how instructional development initiatives in higher education can best be evaluated. Seven studies (published between 1983 and 2002) reporting on the impact of an instructional workshop at the level of students and/or the institution (not only at the level of the participants) were reviewed. The studies were compared on the basis of characteristics of the workshop described (i.e., aim, extent of generic and voluntary character, number of participants, duration, and kind of activities) as well as on the basis of methodology (i.e., design, focus of evaluation, and instruments used). The categorization described by Levinson-Rose and Menges (1981) and Weimer and Lenze (1998) was used to compare the studies regarding their focus of evaluation. In addition, an extra category described by Kreber and Brook (2001) concerning effects on the culture of the organization was included. The description of the studies made clear that in some cases other dependent variables were also examined, namely the learning experiences of students, their study approaches, and the teaching conceptions of teachers (i.e., the way teachers think about teaching). All studies reviewed paid careful attention to rigorous research design; in each case comparison was made to a control group, and three of the seven studies used a pretest/posttest design. It is possible that her consideration only of studies examining an impact beyond the level of the individual participants is why McAlpine found studies with higher-quality design than those found by Levinson-Rose and Menges (1981) and Weimer and Lenze (1998).


McAlpine (2003) concluded that measuring the impact of instructional development initiatives, especially impact that goes beyond the level of the individual participants, is not easy. Future research should concentrate on the development of instruments to measure the impact at the level of students’ learning and/or the institution.

At the request of the Ministry of Education of New Zealand, Prebble et al. (2004) synthesized the research about the impact of student support services and instructional development programs on student outcomes in higher education. Part of their report gave an overview of the research evidence for the effects of instructional development, referring to the earlier reviews by Levinson-Rose and Menges (1981) and Weimer and Lenze (1998). The main focus of the report was studies published between 1990 and 2004. In order to cluster the studies on the basis of the kind of intervention they described, the categories identified by Levinson-Rose and Menges (1981) and Weimer and Lenze (1998) were adapted to take into account new developments in the field (Prebble et al., 2004). The following categories were used: (1) short training courses; (2) in situ training, whereby an activity is built to meet the objectives of a specific and entire academic group; (3) consulting, peer assessment and mentoring; (4) student assessment of teaching; and (5) intensive staff development. For each category, reported effects were labeled using the categories determined by Levinson-Rose and Menges (1981). As in the review by McAlpine (2003), an extra category concerning effects at the level of the institution was included. On the basis of the studies reviewed, Prebble et al. (2004) came to the conclusion that short training courses tend to have only limited impact on actual teaching practice and should be reserved for disseminating institutional policy information or for training in concrete techniques. Other forms of instructional development were viewed more favorably: in situ training, (peer) consulting, student assessments, and intensive programs were described as potentially leading to significant improvements in the quality of teaching and student learning. Although Prebble et al. (2004) pointed out limitations and shortcomings of the published research, as did Levinson-Rose and Menges (1981) and Weimer and Lenze (1998), they paid less attention to these in reporting their synthesis of studies.

A discipline-specific review was conducted by Steinert et al. (2006). They synthesized the existing evidence regarding the effects of instructional development interventions in medical education, covering the period 1980–2002. They did not use the categories described by Levinson-Rose and Menges (1981), Weimer and Lenze (1998), or Prebble et al. (2004) to cluster the 53 studies included in their review on the basis of the kind of intervention. Instead they used the following categorization: (1) workshops; (2) short courses; (3) seminar series; and (4) longitudinal programs and fellowships. Here, the collective, course-like types of instructional interventions were considered, and were distinguished on the basis of their duration. Studies describing the effects of instructional interventions such as grants, student feedback, consultation, or in situ training were categorized as other and were given less consideration. A slightly adapted version of Kirkpatrick’s (1994) model of educational outcomes was used to label the outcomes of the interventions. The following levels of outcome were taken into account: (1) reaction; (2) learning, further subdivided into (2a) change in attitudes and (2b) change in knowledge or skills; (3) behavior; and (4) results, further subdivided into (4a) change in the organization and (4b) change among students, residents or colleagues. We recognized these categories as those introduced by Levinson-Rose and Menges (1981); however, they are supplemented with the levels reaction, behavior, organization, and residents and colleagues. The level of organization was also considered by McAlpine (2003) as well as by Prebble et al. (2004). Steinert et al. (2006) concluded that the literature on medical education suggested high satisfaction of teachers with instructional development initiatives and positive changes in teachers’ attitudes, knowledge, skills and behavior following participation in an instructional development activity. Like McAlpine (2003), they suggested that future research should concentrate on measuring the impact at the level of students’ learning and/or the institution, because the impact at these levels was not as frequently investigated. Steinert et al. (2006) also addressed the issue of the methodological quality of the reported studies and concluded that many of the studies employed weak designs. More rigorous research studies should be conducted with randomized controlled trials, comparison groups, qualitative methods, or mixed designs. Evaluation methods other than self-assessments and survey questionnaires should be developed and tested for validity and reliability. Comparable measures should be used across studies. In addition, the durability of change and the interaction between different factors should be given increased attention. In short, Steinert et al. (2006) suggested that new methodologies to assess impact should be developed by collaborating systematically across programs and institutions.

3. The need for a new synthesis of the instructional development literature

The first reason for conducting a new synthesis of the instructional development literature is to apply a new synthesis technique. The earlier reviews summarized above (Levinson-Rose & Menges, 1981; McAlpine, 2003; Prebble et al., 2004; Steinert et al., 2006; Weimer & Lenze, 1998) made a plea for more collaboration across programs and institutions in order to enhance the knowledge of the field. However, in these reviews studies were clustered on the basis of the type of intervention they described. Research in other domains has indicated that taking into account the level of measured outcome and the research designs used in the original studies can result in more fine-grained findings (e.g., Dochy, Segers, Van den Bossche, & Gijbels, 2003; Gijbels, Dochy, Van den Bossche, & Segers, 2005). Our synthesis will cluster studies on the basis of the level of outcome that is measured and will attend explicitly to the research design used. The state-of-the-art of the nature (in terms of levels of outcome) and design of existing research will be described. This will provide a guide for researchers to use similar methods across studies and a comparable measurement of dependent variables, so that studies can build upon one another and collaboration across programs and institutions can be more easily achieved.


Our first research question is: What is the nature (in terms of levels of outcome) and design of instructional development research in higher education? To answer this question we will look at the literature from a similar angle as did McAlpine (2003). She compared studies on the basis of research methodology. However, she only reviewed studies reporting on the impact of an instructional workshop and only considered the impact at the level of students and/or the institution. Studies examining the impact solely at the level of the participants were not taken into account. Moreover, McAlpine (2003) did not report a systematic literature search. Our synthesis will focus on all types of instructional development and will also consider studies solely examining the impact on participants. A systematic literature search will also be reported. Earlier reviews (Levinson-Rose & Menges, 1981; McAlpine, 2003; Prebble et al., 2004; Weimer & Lenze, 1998) pointed out the importance of rigorous research designs. As far back as 1981, Levinson-Rose and Menges made a plea for impact research drawing not only on participant self-reports, but also measuring actual changes in performance. Steinert et al. (2006) synthesized the existing evidence regarding the effects of instructional development in medical education, covering the period 1980–2002. They predicted an increase in well-designed studies researching behavioral and systems outcomes in the first five years of the twenty-first century. Our synthesis aims to examine whether the suggestions of earlier reviews were put into practice.

A second reason for conducting a new synthesis of the instructional development literature is to address some questions related to the differential impact of initiatives with varied duration, format, or target group, because these questions were left unanswered in the earlier reviews summarized above. An important aim of these reviews was to give an overview of the effects of different kinds of instructional development. Earlier evidence and conclusions on such effects from Levinson-Rose and Menges (1981) and Weimer and Lenze (1998) were updated by Prebble et al. (2004) and, with regard to medical education, by Steinert et al. (2006). However, their reviews still left unanswered some questions that are relevant to investigate, because they have practical implications regarding the implementation of instructional development in higher education. We therefore aim to determine whether the following four questions can now be answered.

1. Do instructional development initiatives extended over time have more positive outcomes than one-time events?

Although Weimer and Lenze (1998) analyzed the research on workshops along the dimension of varying length, no conclusions about the effect of length on the final impact of the instructional development initiative were drawn. The reviews by McAlpine (2003) and Steinert et al. (2006) suggested that longer programs, extended over time, tended to produce more positive outcomes than one-time interventions. However, it was concluded that further investigation was needed to test the hypothesis that longer interventions may have more positive outcomes.

2. Do collective, course-like instructional development initiatives have outcomes comparable to those of initiatives that are alternative in nature?

The reviews by Levinson-Rose and Menges (1981), Weimer and Lenze (1998), Prebble et al. (2004) and Steinert et al. (2006) all clustered studies on the basis of the kind of intervention they described. Each review used a different categorization in order to take into account new developments in the field. However, collective, course-like interventions appeared in every categorization; they seem to be a constant in the field of instructional development. This led us to wonder if their impact is comparable to that of the alternative initiatives whose specific character seems to change over time (e.g., research grants regarding teaching, peer learning).

3. Do instructional development initiatives targeting teaching assistants or new faculty have more positive outcomes than initiatives with another or no specific target group?

Weimer and Lenze (1998) remarked that targeting teaching assistants and new faculty stems from two assumptions, namely that these target groups have: (a) little or no teaching experience, resulting in lower teaching quality; and (b) no tenure yet, so that it is easier to encourage them to engage in instructional development. However, Weimer and Lenze (1998) stated that these assumptions were not tested and that it remained unclear if the beginning of the teaching career is indeed the best time to intervene.

4. Do instructional development initiatives targeting a discipline-specific group have outcomes comparable to discipline-general initiatives?

Because initiatives are sometimes organized as discipline-specific and at other times given a generic character, we would like to examine whether targeting a discipline-specific group makes any difference with regard to the impact.

In answering these four questions, the results for our first research question regarding the nature and design of instructional development research in higher education will be taken into account.

4. Levels of outcome of instructional development

Because our synthesis will cluster studies on the basis of the level of outcome measured, it was important to take into account as broad as possible a range of measures of effects. The categorization of levels of outcome used by Steinert et al. (2006), which was an adaptation of Kirkpatrick’s (1994) model of educational outcomes, seemed the most appropriate. It encompassed all levels described in the reviews by Levinson-Rose and Menges (1981), Weimer and Lenze (1998), McAlpine (2003), and Prebble et al. (2004), broadening them with the levels reaction, behavior, and residents and colleagues. Holton (1996) questions whether participants’ reactions (i.e., their views on the instructional development learning experience, its organization, presentation, content, teaching methods, materials and quality of instruction; Steinert et al., 2006) can be considered a measure of impact.


Table 1
Kirkpatrick’s model for evaluating outcomes of instructional development.

Change within teachers
  Learning
    Change in attitudes: change in attitudes towards teaching and learning
    Change in conceptions: change in conceptions of (i.e., in ways of thinking about) teaching and learning
    Change in knowledge: acquisition of concepts, procedures and principles
    Change in skills: acquisition of thinking/problem-solving, psychomotor and social skills
  Behavior: transfer of learning to the workplace

Institutional impact: wider changes in the organization, attributable to the instructional development intervention

Change within students
  Change in perceptions: change in students’ perceptions of the teaching and learning environment
  Change in study approaches: change in students’ approaches to studying
  Change in learning outcomes: improvement in students’ performance as a direct result of the instructional development

Note: Kirkpatrick’s (1994) model was modified by Steinert et al. (2006). It was further adapted for this review.

Holton comes to the conclusion that “participants’ reactions should be removed from evaluation models as a primary outcome of training” (Holton, 1996, p. 10); at most, they can be considered an intervening variable. Guskey (2000) likewise describes participants’ reactions as ‘happiness indicators’, indicating how well participants liked the intervention without addressing the issue of attaining change. Participants’ reactions to instructional development do not contribute to a clear picture of its real impact (Weimer & Lenze, 1998). We therefore decided to exclude this level from our review. Because colleagues are part of teachers’ institutional context, we further adapted the model described by Steinert et al. (2006) by considering change among colleagues as an example of institutional impact. Taking into account the categorization introduced by Levinson-Rose and Menges (1981), as well as the extra dependent variables noted in McAlpine’s (2003) review, we subdivided the category change among students into the following subcategories: students’ perceptions, students’ study approaches, and students’ learning outcomes; and we added the subcategory change in teachers’ conceptions to the level of learning. We excluded change among residents since it is a level of outcome typical of medical education, not applicable to other disciplines. In order to structure the categorization somewhat further, the category change within teachers was added, with the levels of learning and behavior belonging to this category. Table 1 presents our final list of outcomes of interest for this synthesis.

5. Methods for research synthesis

5.1. Literature search procedures

Because of the inconsistent use of terminology in the past (Freeth et al., 2003; Taylor & Rege Colet, 2009), the literature search for studies into the impact of instructional development in higher education was based on a variety of terms that can refer to instructional development. Based on the conceptualization in a recent study by Taylor and Rege Colet (2009) and our first readings of studies, we composed a list of the following keywords: instructional development, instructional training, academic development, faculty development, faculty training, professional development, educational development, educational training, staff development, pedagogical training, and university teacher.

In February 2008, we conducted searches of the electronic database ERIC. Each of the above-mentioned terms was in turn required to appear in the title, in combination with the term ‘teaching’ required to appear in the abstract. The latter restriction served to obtain the focus we wanted and to exclude, for example, the large number of studies regarding the professional development of doctors or researchers. We did not limit the search by time or by source of publication; our search strategy was meant to uncover both published and unpublished research to prevent publication bias.
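As an illustration of this search strategy (a sketch of ours, not part of the original study), the title/abstract combinations can be generated as follows; the title:/abstract: field syntax and the build_queries helper are hypothetical and do not reproduce ERIC’s actual query interface.

```python
# Illustrative sketch only: field syntax is hypothetical, not ERIC's real query language.
TITLE_TERMS = [
    "instructional development", "instructional training", "academic development",
    "faculty development", "faculty training", "professional development",
    "educational development", "educational training", "staff development",
    "pedagogical training", "university teacher",
]

def build_queries(title_terms, abstract_term="teaching"):
    """One query per keyword: the keyword must appear in the title and
    'teaching' must appear in the abstract."""
    return [f'title:"{term}" AND abstract:"{abstract_term}"' for term in title_terms]

for query in build_queries(TITLE_TERMS):
    print(query)
```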

The search resulted in 1833 citations. We read the abstracts of these citations, except when the titles indicated that the studies would be excluded from the synthesis (e.g., studies of instructional development in elementary or secondary education). After examining abstracts for relevance to the synthesis based on the criteria described in the next section, 101 articles were selected to be screened. Of those, we were able to find 80 studies; as we did not limit the search to time or source, 21 studies were not traceable despite several attempts to contact the authors.

5.2. Inclusion criteria

For a study to be included in the research synthesis, the following criteria had to be met:

1. Studies had to concern an instructional development initiative for teachers in higher education. An instructional development initiative was defined as an initiative specifically planned to enhance course design so that student learning is supported (Taylor & Rege Colet, 2009).


2. Studies had to examine the effect of the instructional development initiative as a central object of study (we were not interested in studies that only marginally touched upon the idea). For purposes of this synthesis, we defined effect as a change at one of the levels described in Table 1. As mentioned before, studies that only investigated participants’ reactions to the initiative (i.e., their views on the instructional development learning experience, its organization, presentation, content, teaching methods or quality of instruction; Steinert et al., 2006) were excluded.

3. Studies had to systematically report the gathering and analysis of empirical data. Non-empirical reflections or considerations were excluded. Because the focus of this synthesis was a rigorous exploration of the nature and design of existing research into all kinds of instructional development in higher education, studies had to include sufficient information about the research methodology. Studies that reported only findings were excluded.

Thirty-seven articles met the criteria for inclusion. Two of these articles concerned the same study, published in two different journals; for this review, the study was considered only once, so 36 articles were included in our analysis. Reasons for exclusion were that, after careful reading of the article, it became clear that the study did not target instructional development as defined for this synthesis (n = 6), concerned an instructional development initiative but not one for teachers in higher education (n = 11), or did not examine a change at one of the levels described in Table 1 (n = 8). A further 18 studies were excluded because they did not report (or did not report systematically) on the gathering and analysis of empirical data.
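The selection numbers reported above are internally consistent; the short check below (our illustration, not part of the original article; the variable names are ours) makes the arithmetic explicit.

```python
# Consistency check of the screening numbers reported in Sections 5.1 and 5.2.
articles_selected_for_screening = 101
articles_retrieved = 80
not_traceable = articles_selected_for_screening - articles_retrieved   # 21

exclusions = {
    "not instructional development as defined here": 6,
    "not aimed at teachers in higher education": 11,
    "no change at a level described in Table 1": 8,
    "no (systematic) report of empirical data gathering/analysis": 18,
}
met_inclusion_criteria = articles_retrieved - sum(exclusions.values())  # 80 - 43 = 37
included_in_analysis = met_inclusion_criteria - 1                       # two articles, one study: 36

assert (not_traceable, met_inclusion_criteria, included_in_analysis) == (21, 37, 36)
```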

5.3. Coding of studies

Each study was coded for information about the outcomes measured, the research design, the instructional development that was implemented, and the results. The outcomes measured were coded using the different sub-levels given in Table 1.

We described the research design as experimental, quasi-experimental, or other. To be classified as experimental, teachers had to be randomly assigned to treatment or control/comparison groups. Studies classified as quasi-experimental did not randomly assign teachers to treatment and control/comparison groups, but often used procedures to equate or match the different groups. Designs were also coded for whether there was some kind of pretest prior to the instructional development intervention, or only a posttest. With regard to the posttest, a distinction was made between a posttest immediately after the end of the intervention and a delayed one. Where outcomes were measured by means other than participant self-report, the way they were measured was described.

Instructional development information included the intervention duration, the nature of the intervention (e.g., residential course, one-on-one support), and the target group. With regard to its duration, an intervention was coded as a one-time event or as extended over time. We coded the nature of the intervention using three categories: collective and course-like (e.g., workshop, short course, seminar or seminar series, longitudinal program), alternative (e.g., instructional grant, practice with feedback, feedback from student ratings, individual consultation, provision of resource materials, peer coaching, in situ training of a group), and a hybrid form of the first two categories (e.g., a workshop followed by one-on-one support). With regard to the target group, distinctions were made between teaching assistants, new faculty, and another/no specific target group. In addition, targeting of a discipline-specific group was also noted.

Results for each outcome measure were coded using the following categories: 0 (no indication of impact), + (indication of impact), +/0 (partial indication of impact), ? (unclear whether there is an indication of impact or not).

We remark that an inconsistent and variable use of terms (in particular regarding the outcomes measured and the instructional development information) complicated the coding. However, whenever possible, the authors’ terminology was used.
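To make the coding scheme concrete, the sketch below shows one possible way to represent a single coded outcome as a record. It is our illustration, not the authors’ coding form; the field names, value sets and example row are assumptions based on Table 1, Section 5.3, and Table 2.

```python
from dataclasses import dataclass
from typing import Optional

# Value sets taken from Table 1 and Section 5.3; the representation itself is illustrative.
OUTCOME_LEVELS = (
    "learning: attitudes", "learning: conceptions", "learning: knowledge", "learning: skills",
    "behavior", "institutional impact",
    "students: perceptions", "students: study approaches", "students: learning outcomes",
)
RESULT_CODES = ("0", "+", "+/0", "?")  # no / full / partial / unclear indication of impact

@dataclass
class CodedOutcome:
    study: str
    outcome_level: str           # one of OUTCOME_LEVELS
    design: str                  # "experimental", "quasi-experimental", or "other"
    pretest: bool                # any pretest prior to the intervention?
    posttest: str                # "immediate", "delayed", or "both"
    duration: str                # "one-time event" or "extended over time"
    nature: str                  # "collective/course-like", "alternative", or "hybrid"
    target_group: Optional[str]  # e.g. "teaching assistants", "new faculty", or None
    discipline_specific: bool
    result: str                  # one of RESULT_CODES

# Example, paraphrasing one row of Table 2:
example = CodedOutcome(
    study="Bennett and Bennett (2003)", outcome_level="learning: attitudes",
    design="other", pretest=True, posttest="immediate",
    duration="extended over time", nature="collective/course-like",
    target_group=None, discipline_specific=False, result="+",
)
```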

5.4. Coding procedures

Three coders were involved in the coding procedure. All three had practical experience in instructional development in higher education as well as expertise in educational research methodology. First, each of them independently coded a few studies. Difficulties or lack of clarity in the use of the coding form were discussed in order to improve coding consistency. In the next phase, all studies were independently coded by one of the three coders. However, if a coder felt unsure about the coding of a specific aspect, it was discussed until a consensus among all coders was reached.

6. Results

In this section we briefly describe the research design and results of the studies included in our review. Studies are clustered on the basis of the level of outcome that was measured. First we will look at the studies measuring change within teachers, namely at the level of teachers’ learning (teachers’ attitudes, conceptions, knowledge, and skills) and teachers’ behavior. Second, the studies examining the institutional impact of instructional development will be described. We will end with an overview of the studies measuring change within students, namely at the level of their perceptions of teaching, study approaches, and learning outcomes. Quantitative, qualitative and mixed-method studies are distinguished.


6.1. Studies measuring change within teachers

6.1.1. Impact on teachers’ learning

The literature search located 27 studies that examined the impact of instructional development in higher education on the level of teachers’ learning described in Table 1 (i.e., on teachers’ attitudes, conceptions, knowledge, and/or skills). Some of the important characteristics and outcomes of each study are presented in Table 2.

6.1.1.1. Impact on teachers’ attitudes. Sixteen studies examined the impact of instructional development on teachers’ attitudes: six quantitatively, six qualitatively, and four using mixed methods.

Quantitative studies. Six quantitative studies investigated the effect on teachers’ attitudes. In four of them (Bennett & Bennett, 2003; Dixon & Scott, 2003; McShannon & Hynes, 2005; Sydow, 1998) positive effects were reported. In none of these studies was a control/comparison group used. Only the study by Bennett and Bennett (2003) used a pretest/posttest design. In the studies by Dixon and Scott (2003), McShannon and Hynes (2005), and Sydow (1998) the reported positive attitudinal impact of instructional development concerned impact as perceived by the teachers at the end of their participation in instructional development. In the study by Braxton (1978) a questionnaire was used to examine teachers’ concern with the method of teaching as well as teachers’ attitudes regarding the idea that faculty can learn teaching methods and skills. No pretest was used; results for neither a control nor comparison group were obtained. Scores on the questionnaire scales as obtained at the end of the instructional development were reported; however, the impact of instructional development on teachers’ attitudes remains unclear. Nasmith, Saroyan, Steinert, Lawn, and Franco (1995) used a quasi-experimental design. Interviews were conducted six months to five years after the end of the instructional development intervention. In these interviews teachers were also asked about the period before their participation in the instructional development; in this way retrospective pretest data were collected. Chi-square analyses of interview answers revealed that the change from pretest to posttest in the experimental group of teachers with regard to their enjoyment of teaching small groups was not significantly different from that in the control group.

Qualitative studies. Six qualitative studies examined the impact of instructional development on teachers’ attitudes (Finkelstein, 1995; Harnish & Wild, 1993; Howland & Wedman, 2004; Nursing faculty development, 1980; Pololi et al., 2001; Stepp-Greany, 2004). The qualitative analysis was based on several sources. None of the studies used a quasi-experimental design. Only Harnish and Wild (1993) reported a pretest. Their analysis of interview data, presented by way of 4 cases, revealed that in general teachers were more inclined to self-reflect at the end of instructional development. One of the cases showed an increase in teachers’ confidence. Howland and Wedman (2004) analyzed interview data as well; 21 teachers were questioned after the end of instructional development. The authors concluded that teachers’ commitment to the value of technology in class was reinforced. Pololi et al. (2001) used an open-ended survey (n = 58) as well as a focus group (n = 7) to analyze teachers’ perceptions of the extent to which participation in instructional development had changed their attitudes. Increased thinking about student-centered learning was reported. Stepp-Greany (2004) analyzed participants’ logs as well as supervisor’s notes to conclude that teaching assistants who participated in a project of team teaching and experiential learning seemed to value experiential instruction as well as the written planning process of teaching. In addition, participants seemed to be more satisfied than they would have been in a regular class without team teaching. The two remaining studies (Finkelstein, 1995; Nursing faculty development, 1980) did not rely on participants’ perceptions to analyze the impact of instructional development on teachers’ attitudes. In Finkelstein’s study (1995), instructional developers analyzed teachers’ syllabi and concluded that at the end of a peer learning project, faculty appreciated their influence over students (e.g., students’ motivation) more. In the final report of Nursing faculty development (1980), site visits at the end of the instructional development project by its coordinator as well as by assigned evaluators revealed increased faculty awareness of individual approaches to learning and of the needs of targeted student populations.

Mixed-method studies. Four studies examined the impact of instructional development on teachers’ attitudes both quantitatively and qualitatively (Fidler, Neururer-Rotholz, & Richardson, 1999; Kahn & Pred, 2002; Postareff, Lindblom-Ylänne, & Nevgi, 2007; Sydow, 1998). In the part of a study by Sydow (1998) that considered the impact of research grants regarding teaching, the quantitative analysis of assessment form data as well as the qualitative analysis of 10% of the proposals and final reports of the grants indicated that participants had feelings of professional reinvigoration upon completion of these research grants. Quantitative survey data in the study by Fidler et al. (1999) revealed that 33% of 60 respondents felt more committed to instructional excellence at the end of instructional development; 60% of 53 respondents felt more sensitive to students’ non-academic needs, and 38% reported an increased understanding of students’ academic needs. Interview data corroborated these results. In the study by Kahn and Pred (2002) a pretest/posttest design was used with quantitative survey data. A perceived increase in seeing benefits to incorporating technology was reported as a result. The data assembled by way of an e-mail survey four months after the end of the instructional development were analyzed qualitatively; they revealed that teachers were curious to learn more. The study by Postareff et al. (2007) was the only one of the four studies in this rubric that used comparison groups: teachers who had received more instructional training were compared with those receiving less or none. Analysis of quantitative data on a self-efficacy scale revealed that teachers who had more credits of instructional training reported higher self-efficacy than teachers with fewer credits (p < .05). In an interview, teachers reported that the instructional training made them more aware of their teaching methods.


6.1.1.2. Impact on teachers’ conceptions. Eight studies examined the impact of instructional development on teachers’ conceptions, four quantitatively and four qualitatively.

Quantitative studies. Four quantitative studies investigated the impact on teachers’ conceptions (Gibbs & Coffey, 2004; Hubball, Collins, & Pratt, 2005; Nasmith et al., 1995; Postareff et al., 2007). All studies included a control/comparison group; however, in the study of Hubball et al. (2005) results for this group were not reported. Three studies (Gibbs & Coffey, 2004; Hubball et al., 2005; Nasmith et al., 1995) used a pretest/posttest design. In two studies (Gibbs & Coffey, 2004; Nasmith et al., 1995) the posttest was delayed (one year and six months to five years, respectively, after the end of the instructional development). The study by Nasmith et al. (1995) reported an increase from pretest to posttest in the experimental group with regard to defining a small group session; in the control group a decrease was found. The study by Hubball et al. (2005) revealed a change in teaching perspectives due to training for 17 out of 30 teachers. For each of five teaching perspectives the scores of the experimental group as a whole were significantly different at the posttest from those at the pretest. Postareff et al. (2007) compared teachers who had received more instructional training with those receiving less or none. Analysis of quantitative data collected with the Approaches to Teaching Inventory (ATI; Prosser & Trigwell, 1999) revealed that teachers who had more credits of instructional training scored higher on the conceptual change scale than teachers with fewer credits (p < .005). Gibbs and Coffey (2004) used the ATI as well. They found a mixed effect of instructional development on teachers’ conceptions. A statistically significant effect of instructional development was found for teachers’ conceptual change approach (p < .05), both when comparing the posttest ATI scores of the experimental and the control group and when comparing the pretest and posttest scores of the experimental group, but not for their information transmission approach.

Qualitative studies. Four qualitative studies examined the impact of instructional development on teachers’ conceptions (DeWitt et al., 1998; Medsker, 1992; Pololi et al., 2001; Slavit, Sawyer, & Curley, 2003). None of the studies incorporated a control/comparison group or took pretest data into account in the analysis. In the research by Pololi et al. (2001) an analysis of documents written (by participants or the facilitator) during the process of instructional development revealed that course participants valued learner-centered learning a great deal at the end of the instructional development. Medsker (1992) described the results of a telephone interview using three cases. In one of the cases a change towards a more student-centered teaching philosophy was reported. The study by Slavit et al. (2003) described three teacher cases as well. The first case described a change of perspective in the role of technology. DeWitt et al. (1998) concluded that teachers (n = 6) had idiosyncratic beliefs at the end of instructional development.

Mixed-method studies. No studies were found that investigated the impact of instructional development on teachers’ conceptions using a mixed-methods approach.

6.1.1.3. Impact on teachers’ knowledge. Twelve studies examined the impact of instructional development on teachers’ knowledge: five quantitatively, four qualitatively, and three using a mixed-methods approach.

Quantitative studies. Five studies (Claus & Zullo, 1987; Dixon & Scott, 2003; Nasmith et al., 1995; Quirk, DeWitt, Lasser, Huppert, & Hunniwell, 1998; Sheets & Henry, 1984) investigated quantitatively the impact of instructional development on teachers’ knowledge. Nasmith et al. (1995) used a quasi-experimental design. In the part of the study considering not only a delayed posttest (from six months to five years later), but also retrospective pretest data (obtained by asking participants to think back to the period before their participation in instructional development), a positive effect was found with regard to awareness of group dynamics. A knowledge test concerning the analysis of ten teaching scenarios was administered only as a delayed posttest. The global score as well as the scores for eight of the ten scenarios were higher in the experimental group than in the control group, but not significantly different. Only 20 teachers (10 experimental and 10 control teachers) took the test; effect sizes were not considered. The research by Sheets and Henry (1984) incorporated a pretest, a posttest and a delayed posttest (six months later). A written test on cognitive outcomes taken by 14 teachers revealed a general improvement in knowledge from pretest to posttest and from pretest to delayed posttest. However, the mean score decreased from posttest to delayed posttest. The study by Quirk et al. (1998) used a pretest/posttest/delayed posttest design (delay of three months) as well. A self-assessment questionnaire revealed significantly positive changes in familiarity with 9 of 11 concepts (p < .01) from pretest to posttest. The familiarity was retained three months later (p < .01). In the study by Claus and Zullo (1987) the impact of instructional development on teachers’ knowledge was positive as well: scores on a knowledge test were significantly higher after the instructional development than before (p < .001). It should be noted that only six respondents took the knowledge test. Dixon and Scott (2003) did not use a quasi-experimental design, nor did they incorporate a pretest. Participants were asked for their perceptions of accumulation of knowledge during the instructional development initiative once it had ended. Most of the 19 respondents judged the impact of the initiative on their knowledge as positive.

Qualitative studies. The impact of instructional development on teachers’ knowledge was investigated qualitatively in four studies (Addison & VanDeWeghe, 1999; Harnish & Wild, 1993; Slavit et al., 2003; Stepp-Greany, 2004). None of the studies incorporated a control/comparison group. The study by Stepp-Greany (2004) was the only one that used data other than that reported by the participants in the instructional development. In this study an analysis of supervisors’ notes and teaching assistant logs revealed that the teaching assistants who participated in instructional development gained insights about the importance of rubrics, forms and answer keys, learned the importance of using ancillary instructional materials as well as of the physical space for learning, and increased their understanding of the collaborative process of teaching. Only Harnish and Wild (1993) reported using a pretest. In one of their four cases they reported gains in knowledge. Slavit et al. (2003)

Table 2
Characteristics and outcomes of studies examining the impact of instructional development in higher education.

Study | Intervention duration | Nature of intervention | Target group of intervention
  Outcome measure | Research design | Result

Addison and VanDeWeghe (1999) | Extended over time (? weeks) | Alternative (portfolio seminars) | Teaching assistants & full-time and part-time faculty
  Learning: Knowledge | Qualitative | +
  Behavior | Qualitative | ?

Bennett and Bennett (2003) | Extended over time (30 weeks) | Traditional | No specific target group
  Learning: Attitudes | Quantitative; pretest | +
  Behavior | Quantitative; pretest | +

Brauchle and Jerich (1998) | Extended over time (8 × 3 h) | Traditional | Teaching assistants
  Learning: Skills | Qualitative | +
  Students: Perceptions | Quantitative; quasi-experimental | +
  Students: Learning outcomes | Quantitative; quasi-experimental | +

Braxton (1978) | One-time event (1 day) | Traditional | No specific target group
  Learning: Attitudes | Quantitative | ?
  Behavior | Quantitative | +

Claus and Zullo (1987) | Extended over time (1 semester?) | Traditional | Dental medicine faculty
  Learning: Knowledge | Quantitative; pretest; knowledge test | +
  Behavior | Quantitative; pretest; observations and analysis of videos | +

DeWitt et al. (1998) | Extended over time (one year?) | Alternative (teaching practice, reflective journals, interviews, concept maps) | Teacher educators
  Learning: Conceptions | Qualitative | ?

Dixon and Scott (2003) | One-time event (2 h) | Traditional | Mostly part-time teachers
  Learning: Attitudes | Quantitative | +
  Learning: Knowledge | Quantitative | +
  Learning: Skills | Quantitative | +
  Behavior | Quantitative | +/0

Fedock et al. (1993) | Extended over time (? weeks) | Alternative (preparing and carrying out a training program for secondary school teachers) | No specific target group
  Behavior | Qualitative | +
  Institutional impact | Qualitative | +

Fidler et al. (1999) | Extended over time (?) | Hybrid form (workshop + teaching of the seminar) | Teachers teaching or having taught the freshman seminar
  Learning: Attitudes | Mixed-method | +
  Learning: Knowledge | Mixed-method | +
  Learning: Skills | Mixed-method | +
  Behavior | Mixed-method | +
  Institutional impact | Mixed-method | +

Finkelstein (1995) | Extended over time (2 years) | Alternative (peer learning) | No specific target group
  Learning: Attitudes | Qualitative; document analysis | +
  Behavior | Qualitative; interviews + document analysis | +
  Students: Perceptions | Quantitative; pretest | +
  Students: Learning outcomes | Quantitative; pretest | +/0

Gallos et al. (2005) | Extended over time (weekly sessions over a longer period of time) | Hybrid form (coaching sessions, observations & feedback) | Instructors who taught a 1st-year general chemistry course
  Behavior | Qualitative; pretest; observations + videos + records of meetings + student interviews | +


Gibbs and Coffey (2004) | Extended over time (varied) | Varied | Varied
  Behavior | Quantitative; pretest; delayed posttest; quasi-experimental; ATI | +/0
  Learning: Conceptions | Quantitative; pretest; delayed posttest; quasi-experimental; ATI | +/0
  Students: Perceptions | Quantitative; pretest; delayed posttest; SEEQ and MEQ | +
  Students: Study approach | Quantitative; pretest; delayed posttest; SEEQ and MEQ | +/0

Gibbs et al. (1988) | Extended over time (9 months) | Traditional | No specific target group
  Behavior | Qualitative; document analysis | +
  Learning: Skills | Mixed-method; partly quasi-experimental + critical thinking test | +/0

Harnish and Wild (1993) | Extended over time | Alternative (peer mentor projects) | No specific target group
  Learning: Attitudes | Qualitative; pretest | +
  Learning: Knowledge | Qualitative; pretest | +/0
  Learning: Skills | Qualitative; pretest | +/0
  Behavior | Qualitative | +
  Institutional impact | Qualitative | +

Hewson et al. (2001) | Extended over time (3 months + time for project) | Hybrid form (course, coaching & project) | Medical faculty
  Behavior | Quantitative; pretest; delayed posttest; partly students’ ratings | +

Howland and Wedman (2004) | Extended over time (2 years) | Alternative (individualized process) | College of education faculty
  Learning: Attitudes | Qualitative | +
  Learning: Knowledge | Mixed-method; pretest | +
  Learning: Skills | Quantitative; pretest | +
  Behavior | Mixed-method; pretest | +
  Students: Study approach | Quantitative; Flashlight current student inventory | ?

Hubball et al. (2005) | Extended over time (8 months) | Traditional | Experienced teachers
  Learning: Conceptions | Quantitative; pretest; quasi-experimental? | +

Kahn and Pred (2002) | One-time event (2 days) | Traditional | Faculty groups homogeneous with regard to discipline
  Learning: Attitudes | Mixed-method; partly with pretest | +
  Learning: Knowledge | Mixed-method; pretest | +
  Learning: Skills | Quantitative; pretest | +
  Behavior | Qualitative; delayed posttest | +

Page 11: The impact of instructional development in higher education: The state-of-the-art of the research

A.Stes

etal./EducationalResearch

Review5

(2010)25–49

35

McDonough (2006) Extended over time (15 weeks) Alternative (action researchseminars)

Teaching assistants Behavior Qualitative; delayedposttest; documentanalysis + evaluationforms + focus groups

+

Institutional impact Qualitative; delayedposttest; documentanalysis + evaluationforms + focus groups

+

McShannon and Hynes (2005) Extended over time (onesemester)

Alternative (classroomobservation & discussion)

Engineering & science faculty Learning: Attitudes Quantitative +

Behavior Quantitative +Students: Learning outcomes Quantitative; pretest +

Medsker (1992) Extended over time (>8 weeks) Hybrid form (workshops &one-on-one support)

No specific target group Learning: Conceptions Qualitative +/0

Behavior Qualitative +Institutional impact Qualitative +Students: Perceptions Qualitative +/0Students: Learning outcomes Mixed-method +

Michael (1993) Extended over time (? Weeks) Alternative (action research) Faculty members and students Behavior Qualitative +

Nasmith et al. (1995) One-time event (2 days) Traditional Medical faculty Learning: Attitudes Quantitative;retrospective pretest;delayed posttest;quasi-experimental

0

Learning: Conceptions Quantitative; pretest;quasi-experimental;delayed posttest

+

Learning: Knowledge Quantitative; partly withretrospective pretest;delayed posttest;quasi-experimental;partly knowledge test

+/0

Behavior Quantitative; delayedposttest;quasi-experimental;partly observations

0

Nurrenbern et al. (1999) Extended over time (1semester)

Hybrid form (workshops &meetings)

No specific target group Students: Perceptions Quantitative;quasi-experimental

0

Students: Study approach Quantitative;quasi-experimental

+

Nursing faculty development (1980) Extended over time (3 years) Hybrid form (workshops,seminars, consultations,regional meetings,publications,

Nurse faculty Learning: Attitudes Qualitative; site-visits +

Behavior Qualitative; survey + sitevisits

+

Institutional impact Qualitative; partly sitevisits

+

Students: Learning outcomes Qualitative; partly sitevisits

+/0

Pinheiro et al. (1998) Extended over time (1 yearpart-time fellowship program)

Hybrid form (workshops &mentoring)

Medical faculty Behavior Mixed-method; pretest;PALS; partly videos

+

Page 12: The impact of instructional development in higher education: The state-of-the-art of the research

36A

.Steset

al./EducationalResearchReview

5(2010)

25–49

Table 2 (Continued )

Study Intervention duration Nature intervention Target group intervention Outcome measure Research design Resulta

Pololi et al. (2001) One-time event (3 days) Traditional Medical school faculty Learning: Attitudes Qualitative +Learning: Conceptions Qualitative; document

analysis+

Institutional impact Qualitative; posttestand delayed posttest;partly documentanalysis

+

Postareff et al. (2007) Extended over time Traditional No specific target group Learning: Attitudes Mixed-method;comparison groups

+

Learning: Conceptions Quantitative;comparison groups;ATI

+

Learning: Skills Qualitative +Behavior Quantitative;

comparison groups;ATI

+

Quirk et al. (1998) One-time event (1 workshop) Traditional Community health centerpreceptors

Learning: Knowledge Quantitative; pretest;posttest and delayedposttest

+

Learning: Skills Mixed-method;pretest; sort of test

+

Behavior Quantitative; pretest;posttest and delayedposttest

+

Rakes (1982) One-time event (4 days) Traditional No specific target group Learning: Skills Quantitative; pretest;16-items test

+

Rothman and Robinson (1977) One-time event (4.5 days) Traditional New & experienced faculty Behavior Quantitative +/0

Sheets (1984) Extended over time (9 months) Traditional Medical staff of familymedicine faculty

Learning: Knowledge Quantitative; pretest;posttest and delayedposttest; written test

+

Behavior Mixed-method; twodelayed post tests;videos

+/0

Skeff et al. (1998) Extended over time (9 × 3 h) Traditional Pathology faculty Learning: Skills Quantitative; pretest;delayed posttest

+

Behavior Mixed-method;pretest; posttest anddelayed posttest; partlyvideos

+/0

Institutional impact Quantitative; posttestand delayed posttest

+

Students: Perceptions Quantitative; pretest +

Page 13: The impact of instructional development in higher education: The state-of-the-art of the research

A.Stes

etal./EducationalResearch

Review5

(2010)25–49

37

Slavit et al. (2003) Extended over time (Oneacademic year)

Hybrid form (generalworkshops & as-neededsupport)

Teacher educators Learning: Conceptions Qualitative +/0

Learning: Knowledge Qualitative +/0Learning: Skills Qualitative; report of

instructional developer+

Behavior Qualitative +Students: Learning outcomes Quantitative; partly

sort of pretest+

Stepp-Greany (2004) Extended over time Alternative (team teaching &experiential teaching)

Teaching assistants Learning: Attitudes Qualitative +

Learning: Knowledge Qualitative; documentanalysis

+

Learning: Skills Qualitative; documentanalysis

+

Behavior Qualitative; documentanalysis

+

Students: Perceptions Qualitative +Students: Study approach Qualitative; teaching

assistant logs+

Students: Learning outcomes Qualitative; partly sortof pretest

+

Sydow (1998) One-time event (1 or 2 days) Traditional No specific target group Learning: Attitudes Quantitative +Behavior Mixed-method; partly

focus groups+

Institutional impact Qualitative; delayedposttest; focus groupsand document analysis

+

Students: Learning outcomes Quantitative; survey ofteachers (!)

+

Extended over time (1semester?)

Alternative (research grant) No specific target group Learning: Attitudes Mixed-method; partlydocument analysis

+

Behavior Quantitative +Institutional impact Qualitative; delayed

posttest; focus groupsand document analysis

+

Students: Learning outcomes Quantitative; survey ofteachers (!)

+

a (0) No indication of impact; (+) indication of impact; (+/0) partial indication of impact; (?) unclear if there is an indication of impact or not.


described three teacher cases. In one of these cases, understanding how the technology worked was reported as an effect of instructional development. In the study by Addison and VanDeWeghe (1999), participants' answers to open-ended survey questions revealed an increased understanding of writing components, as well as of the interpretive nature of assessment, for many participants in instructional development.

Mixed-method studies. Three studies examined the impact of instructional development on teachers' knowledge using mixed methods (Fidler et al., 1999; Howland & Wedman, 2004; Kahn & Pred, 2002). None of these studies utilized a control or comparison group. Only Fidler et al. (1999) did not use pretest data. Their analysis of quantitative survey data revealed that 47% of 53 respondents reported a greater understanding of student services due to instructional development. Interview data supported this finding. In the study by Howland and Wedman (2004), quantitative self-report data revealed a positive change from pretest to posttest in technological knowledge. The change was statistically significant (p < .05) for four of the seven knowledge aspects. Interview data reported at the posttest revealed an increased awareness of meaningful ways to integrate technology into course content. The study by Kahn and Pred (2002) also considered an instructional development course on educational technology. Whereas the quantitative survey data made clear that at the end of the training participants perceived an increase in knowledge, the qualitative data, collected through an e-mail survey four months after the end of the training, revealed that faculty had gained insight into the educational worth and possibilities of technology.

6.1.1.4. Impact on teachers' skills. Thirteen studies examined the impact of instructional development on teachers' skills: five quantitatively, five qualitatively, and three using mixed methods.

Quantitative studies. None of the five quantitative studies of the impact of instructional development on teachers' learning of skills (Dixon & Scott, 2003; Howland & Wedman, 2004; Kahn & Pred, 2002; Rakes, 1982; Skeff, Stratos, Bergen, & Regula, 1998) used a comparison/control group. In the study by Dixon and Scott (2003), teachers judged that their participation in instructional development increased their skills regarding the creation of optimal and comfortable learning environments, time management, and enhancing student motivation and interaction. No pretest data were reported. Howland and Wedman (2004) used a self-report instrument and analyzed pretest and posttest data using a paired t-test. A statistically significant, positive change in technology skill was reported for four aspects (p < .05); for three other aspects the change was positive but not statistically significant. The study by Kahn and Pred (2002) examined change in technology skills as well. An analysis of pretest and posttest data revealed that participants felt more comfortable with educational technology at the end of instructional development than before. Rakes (1982) used a 16-item test to investigate the impact of instructional development on the teaching strategies of seven teachers. Despite the small number of respondents, a t-test on pretest and posttest data revealed a statistically significant effect (p < .001). Skeff et al. (1998) analyzed pretest and delayed posttest data (delay of five months) from a questionnaire completed by eight participants in instructional development. A statistically significant increase in awareness of teaching problems was reported (p = .002), despite the small number of teachers involved in the study.

Qualitative studies. Five qualitative studies examined teachers' skills as enhanced by instructional development (Brauchle & Jerich, 1998; Harnish & Wild, 1993; Postareff et al., 2007; Slavit et al., 2003; Stepp-Greany, 2004). None of the studies incorporated a control/comparison group. The study by Harnish and Wild (1993) was the only one that included a pretest. Four cases were described in this study. In one of them, the two teachers involved reported gains in technical skills. The study by Slavit et al. (2003) was case-based as well. For one of the three participants, the instructional developer reported that at the end of instructional development the teacher was able to provide resources without overwhelming students. For another participant, it was concluded that he had become able to post his own web sites and had made great strides in web design. In the study by Stepp-Greany (2004), an analysis of supervisor's notes and teaching assistant logs revealed that the teaching assistants who participated in instructional development gained skills in classroom organization, problem solving, decision making, and reflecting. Moreover, they learned teaching strategies, including strategies to sequence instruction appropriately. The three teachers interviewed in the study by Brauchle and Jerich (1998) perceived that their participation in instructional development enhanced their teaching ability, classroom presentation skills, and skills to improve student evaluations. An analysis of the 23 interviews in the study by Postareff et al. (2007) revealed that teachers developed reflective skills during instructional development.

Mixed-method studies. Three studies investigated the impact of instructional development on teachers' skills both quantitatively and qualitatively (Fidler et al., 1999; Gibbs, Browne, & Keeley, 1988; Quirk et al., 1998). The analysis of quantitative survey data in the study by Fidler et al. (1999) revealed that 33% of 60 respondents reported an increased competency in teaching skills due to instructional development. Interview data supported this finding. The quantitative data in the study by Gibbs et al. (1988), based on a critical thinking test administered after the end of the instructional development, revealed no differences between the experimental group and the control group regarding critical thinking and higher thought processes. On the other hand, a qualitative analysis of the summaries of group discussions that took place during the instructional development process made clear that participants reported they learned teaching skills. In the study by Quirk et al. (1998), 205 participants in instructional development were asked to analyze a role-play of an ineffective teaching situation. Pretest as well as posttest data were reported. Three expert judges categorized teachers' responses. The quantitative analysis by way of sign tests, as well as the qualitative analysis of teachers' suggestions for how to improve the role-play situation, made clear that at the end of the workshop teachers were better able to analyze an educational encounter.


6.1.2. Impact on teachers' behavior

The literature search located 31 studies that examined the impact of instructional development in higher education on the level of teachers' behavior (as described in Table 1): 12 quantitative studies, 13 qualitative studies, and 6 using a mixed-methods approach. Some of the important characteristics and outcomes of each study are presented in Table 2.

6.1.2.1. Quantitative studies. Three of the studies examining the impact on teachers' behavior quantitatively used a quasi-experimental design (Gibbs & Coffey, 2004; Nasmith et al., 1995; Postareff et al., 2007). Only Gibbs and Coffey (2004) used a pretest; Nasmith et al. (1995) reported retrospective pretest data. Gibbs and Coffey (2004) collected posttest data one year after the end of instructional development. The Approaches to Teaching Inventory (ATI; Prosser & Trigwell, 1999) was used as an instrument. Quantitative analysis of the data revealed that the experimental teachers were more student-focused in their teaching approach than the control teachers (p < .05) at the posttest. No significant difference between the two groups was found with regard to teacher-focused approach. A comparison between the pretest and posttest data of the experimental group illustrated that the experimental teachers had a higher student-focused approach after their participation in instructional development than before (p < .001). No significant difference between the two moments (post and pre) was found with regard to teacher-focused approach. Postareff et al. (2007) used the ATI as well. In their study, teachers who had more credits of instructional training were compared with those who had fewer or no credits. A quantitative analysis made clear that the more credits the teachers had obtained, the more their teaching was characterized as student-focused (p < .005). In the study by Nasmith et al. (1995), chi-square analyses were used to compare behavior as observed by the instructional developers, within as well as between the experimental and the control group. Teachers' behavior was observed from six months to five years after the end of instructional development. No significant differences between the experimental and control groups were found. Chi-square analyses were also applied to interview answers, collected six months to five years after the end of instructional development. In these interviews, teachers were asked about the period before their participation in the instructional development, yielding retrospective pretest data. The analyses revealed that the use of innovative teaching methods by the experimental teachers increased from pretest to posttest, but the change was not significantly greater than in the control group.

Of the remaining nine studies examining the impact on teachers' behavior in a quantitative way, four used a pretest/posttest design (Bennett & Bennett, 2003; Claus & Zullo, 1987; Hewson, Copeland, & Fishleder, 2001; Quirk et al., 1998). In the study by Bennett and Bennett (2003), an increased use of Blackboard from pretest (use by 15% of the 20 respondents) to posttest (use by 37% of the 20 respondents) was reported as a result of instructional development. In the study of Claus and Zullo (1987), a Wilcoxon signed rank test on the results of teaching observations, as well as of the analysis of videotaped teaching behavior, revealed a statistically significant increase in well-presented lectures from pretest to posttest (p < .05). Hewson et al. (2001) used a pretest and a delayed posttest (delay of six months) to examine the impact of instructional development on teachers' behavior. According to the results of the teaching survey, the teaching competencies of 27 teachers improved from pretest to delayed posttest, with a statistically significant improvement for 2 out of 15 competencies (p < .005). A student survey was used asking students to rate their teachers on the same 15 competencies. The correlation between the self-assessment of the teachers (i.e., the teaching survey) and students' ratings was positive for all competencies. For 5 of the 15 competencies the correlation was significant (p < .005). The students as well as the teachers rated two competencies as significantly better at the posttest than at the pretest (p < .05). In the study by Quirk et al. (1998), a pretest, a posttest, and a delayed posttest (delay of three months) were used. Self-reports of participants in the instructional development at the immediate posttest revealed that teachers anticipated an increased use of 8 out of the 11 teaching behaviors. Three months later, teachers were asked to compare their actual teaching behavior at that moment with what they had anticipated immediately after the instructional workshop. Six of the eight anticipated changes remained positive (p < .01); however, for five of them the increase was smaller than anticipated immediately after the workshop (p < .05).

Five quantitative studies investigating the impact of instructional development on teachers' behavior used neither a quasi-experimental design nor a pretest/posttest design (Braxton, 1978; Dixon & Scott, 2003; McShannon & Hynes, 2005; Rothman & Robinson, 1977; Sydow, 1998). All five used a quantitative survey to gain insight into teachers' perceptions of the impact of their participation in an instructional development initiative. In the study by Braxton (1978), 84% of the 44 respondents indicated that after the instructional development initiative they had tried another teaching method. Dixon and Scott (2003) also reported on the impact of instructional development on teachers' behavior. In their study, 13 out of 19 respondents indicated increased relevance of their teaching, 18 reported trying to move among their students, 15 reported encouraging students to ask questions, and a majority reported trying to make eye contact with students. On the other hand, eight respondents were unsure whether they were more available to students after their participation in the instructional development initiative, and one respondent reported he was not. Four respondents believed that their awareness of students' responses to their teaching style had not increased, and five remained unsure about this. The study by Rothman and Robinson (1977) found mixed effects as well. Whereas 22 respondents reported a change in teaching behavior after their participation in instructional development, 52 reported no change. McShannon and Hynes (2005) reported a positive impact of instructional development on teachers' behavior. All of the 62 respondents indicated that they used at least three of the strategies suggested during the instructional development initiative; 89% of the respondents mentioned that they used a suggested strategy once a week. High scores (4.0, 4.5, and 3.6, respectively) were obtained on a five-point Likert scale for the items "practice has changed", "practices will be used in the future", and "teaching will be further improved." In the part of the study by Sydow (1998)


that examined the effect of research grants regarding teaching, 11% of the 185 participants reported improvement in class instruction.

6.1.2.2. Qualitative studies. None of the thirteen qualitative studies examining the impact on teachers' behavior integrated a control/comparison group. Only two studies (Gallos, van den Berg, & Treagust, 2005; Harnish & Wild, 1993) reported data from a pretest. In the study by Gallos et al. (2005), qualitative analysis of observations, videotaped teaching, records of meetings, and interviews with students revealed a reduction of lecture time along with increased time for guided practice, from pretest to posttest. In one of the four cases described by Harnish and Wild (1993), teachers used each other's teaching strategies more at the end of the peer mentoring project. In another case, it was concluded that the consistency of the teacher's course had increased. Two studies used a delayed posttest (Kahn & Pred, 2002; McDonough, 2006). In the study by Kahn and Pred (2002), an e-mail survey was sent to 50 participants enrolled in an instructional development initiative four months after its completion. Their analysis revealed that 46 of the 50 teachers were utilizing technology, and evidence was found that teachers were seeking new ways to improve their teaching with technology. McDonough (2006) reported posttest data collected 13 months after the end of action research seminars. Data were obtained for seven participants from different sources: professional journal entries, reflective essays, reports about the action research project, evaluation forms, and oral focus groups. The qualitative analysis revealed new or improved practices and a greater understanding of their classes.

Of the nine remaining studies, four used interview or survey data (Addison & VanDeWeghe, 1999; Fedock, Zambo, & Cobern, 1993; Medsker, 1992; Michael, 1993). In the study by Addison and VanDeWeghe (1999), the qualitative analysis of open-ended survey data revealed that almost half of the respondents commented that they planned to discuss issues raised during the instructional development with their students and to emphasize them in class. We noted that the responses focused on future teaching behavior rather than actual teaching practice. Fedock et al. (1993) interviewed four teachers who were involved in instructional development. They concluded that teachers' behavior became more likely to include class discussions. Medsker (1992) described the results of a telephone interview with three cases. In one of the cases a behavioral change (i.e., use of slides to enhance one's lectures) was reported. In general it was concluded that teachers adopted more systematic design and measurement techniques. In a study by Michael (1993) it was reported, on the basis of interview data, that teachers (n = 50) became more reflective due to their participation in action research.

Two studies (Finkelstein, 1995; Nursing faculty development, 1980) complemented interview/survey data with an analysis of course syllabi (Finkelstein, 1995) or with site visits (Nursing faculty development, 1980). Based on the analysis of exit interviews with 22 participants involved in instructional development, Finkelstein (1995) concluded that 4 teachers could be considered "converters", 14 "moderate changers", and 4 "resisters". The qualitative analysis of the syllabi of 24 participants revealed, for most teachers, an increased diversity of teaching behaviors as well as an increase in activating teaching methods. In the final report of Nursing faculty development (1980), a written survey with open-ended questions answered by 185 participants in instructional development revealed that 61% of the teachers reported that they had changed their teaching practice during the instructional development project. Site visits corroborated this result and made clear that the use of various instructional strategies had increased. In the study by Gibbs et al. (1988), an analysis of the lesson plans of 38 teachers revealed that all participants intended to do things differently due to their participation in instructional development. The study by Slavit et al. (2003) reported on three teacher cases. In one case, a technology strand was included in the teacher's course due to the instructional development initiative, and many more technology experiments were integrated. A second case demonstrated that the teacher accessed more information on the web. In the study by Stepp-Greany (2004), the qualitative analysis of supervisor's notes and of teaching assistant logs revealed a generalization of what had been learned to regular classes in the following semester. In addition, teachers seemed to be able to get to know their students, to observe and analyze various learning styles, to provide differing strategies, and to develop and use several instruments.

6.1.2.3. Mixed-method studies. Six studies analyzed the impact of instructional development on teachers' behavior using a mixed-methods approach (Fidler et al., 1999; Howland & Wedman, 2004; Pinheiro, Rohrer, & Heimann, 1998; Sheets & Henry, 1984; Skeff et al., 1998; Sydow, 1998). None of the studies used a control or comparison group. Three studies (Fidler et al., 1999; Sheets & Henry, 1984; Sydow, 1998) did not use pretest data. In the study by Fidler et al. (1999), an analysis of quantitative survey data revealed that 47% of 53 respondents reported that their variety of teaching behaviors had increased due to instructional development, and 40% referred a greater number of students to student services. Interview data supported these findings. In a study by Sheets and Henry (1984), two delayed posttests were used: the first took place five months after the end of the instructional development session, and the second nine months after its conclusion. Videotaped teaching of 14 teachers was analyzed quantitatively (rating on a five-point scale) as well as qualitatively by the instructional developers. Only a very small improvement in teaching behavior was reported five to nine months after the instructional development. Sydow (1998) evaluated the impact of peer group conferences that took place over a period of five years. Quantitative data collected from 1894 participants revealed that 30% of them had improved their class instruction. A qualitative analysis of focus group transcripts indicated that teachers incorporated specific classroom applications.

The three remaining studies (Howland & Wedman, 2004; Pinheiro et al., 1998; Skeff et al., 1998) used a pretest/posttest design. In the study by Skeff et al. (1998), a delayed posttest (delay of five months) was also used. Whereas the immediate posttest revealed a statistically significant increase in teaching performance related to all but two of the seven seminar topics (p-value not reported), the five-month follow-up questionnaire indicated a statistically significant increase in performance


related to all topics (p-value not reported). A trained researcher analyzed teaching videotapes of seven participants. It was concluded that for five of the seven participants teaching improved from pretest to posttest. Whereas 81% of the pre-seminar tapes were classified as lecturing only, this was the case for 52% of the post-seminar tapes. The ratings on the scales "learning climate" and "promoting understanding and retention" were significantly higher for the post-seminar videotapes than for the pre-seminar ones (p < .05 and p = .01, respectively). The ratings on the scales "feedback" and "self-directed learning" were low on pre- as well as post-seminar videotapes. Howland and Wedman (2004) used a self-report instrument to analyze the impact of instructional development on teachers' behavior. At the posttest, participants integrated technology use to a greater extent than at the pretest. A statistically significant decrease in lectures or presentations was found (p < .05). Interviews at the posttest indicated that teachers (n = 21) used technology with an increased comfort level. Pinheiro et al. (1998) used the PALS (Principles of Adult Learning Scale; Conti, 1985) as an instrument to analyze change in teachers' behavior (n = 18) from pretest to posttest due to instructional development. A significant overall improvement (p = .005), as well as a significant improvement for four of the seven scales (p < .05), was found. The analysis of teaching videotapes of six participants revealed comparable findings. Five participants received higher video scores after the instructional development than before, while one participant received a lower score.

6.2. Studies measuring institutional impact

The literature search located nine studies that examined the impact of instructional development in higher education on the institutional level (as described in Table 1): one quantitatively, seven qualitatively, and one using a mixed-method design. Some of the important characteristics and outcomes of each study are presented in Table 2.

6.2.1. Quantitative study

The study by Skeff et al. (1998) was the only study that examined the institutional impact of instructional development quantitatively. No control or comparison group was used. Questionnaire data revealed that teachers were more likely to discuss teaching with colleagues right after the conclusion of the instructional development seminar than before (p < .01). This effect was still found five months later (p < .01).

6.2.2. Qualitative studies

None of the seven studies examining institutional impact qualitatively (Fedock et al., 1993; Harnish & Wild, 1993; McDonough, 2006; Medsker, 1992; Nursing faculty development, 1980; Pololi et al., 2001; Sydow, 1998) utilized a control/comparison group. In a study by Fedock et al. (1993), interviews with four teachers who participated in instructional development indicated that they initiated discussions with colleagues to change their biology curriculum. A case-based study by Harnish and Wild (1993) revealed that information and teaching materials were disseminated among colleagues to a greater extent after the instructional development project than before the project. In the final report of Nursing faculty development (1980), data collected through a written survey revealed that 64 of the 185 participants developed spin-off activities or special interest groups at the end of instructional development. Site visits revealed improved collegial and peer relationships. Medsker (1992) described the results of a telephone interview with three cases. In one of the cases, a shift on the campus in attitude towards teaching (more value placed on teaching) was mentioned. In general, it was concluded that networking with other faculty was established and that national recognition was gained for the teaching projects on which teachers worked.

Delayed posttests were used in three studies (McDonough, 2006; Pololi et al., 2001; Sydow, 1998). McDonough (2006) collected posttest data 13 months after the conclusion of the action research seminars. Data were obtained from different sources: professional journal entries, reflective essays, reports about the action research project, evaluation forms, and oral focus groups. The qualitative analysis revealed an appreciation for peer collaboration. Documents written during the process of instructional development were analyzed in a study by Pololi et al. (2001). The analysis revealed increased interdisciplinary collegiality. Three months after the end of the instructional development, an open-ended survey with the 58 participants and a focus group with seven participants revealed the same result. Sydow (1998) evaluated the impact of peer group conferences that took place over a period of five years. The qualitative analysis of focus group transcripts indicated the development of collegial networks. For the impact of research grants regarding teaching, the qualitative analysis of 10% of the grant proposals and final reports revealed that students' access to education had increased while there were cost savings for the institution.

6.2.3. Mixed-method study

The study by Fidler et al. (1999) was the only one that studied institutional impact using a mixed-methods approach. Survey data revealed that 57% of the 68 participants in instructional development reported meeting more new colleagues outside their own discipline. Interview data supported this finding.

6.3. Studies measuring change within students

The literature search located 12 studies that examined the impact of instructional development in higher education on students (as described in Table 1): on students' perceptions of the teaching and learning environment, on students' study


approaches, and/or on their learning outcomes. Some of the important characteristics and outcomes of each study are presented in Table 2.

6.3.1. Impact on students' perceptions

Seven studies examined the impact of instructional development on students' perceptions of the teaching and learning environment: five quantitatively and two qualitatively.

6.3.1.1. Quantitative studies. Two of the quantitative studies (Brauchle & Jerich, 1998; Nurrenbern, Mickiewicz, & Francisco, 1999) compared student ratings of an experimental group of teachers (who participated in instructional development) with those of a control group (who did not participate in instructional development). Brauchle and Jerich (1998) concluded that after an instructional development initiative for teaching assistants, the treatment group's student ratings (obtained from 106 students of 7 experimental assistants) were statistically significantly higher than the control group's (obtained from 140 students of 6 control assistants; p < .01). In a study by Nurrenbern et al. (1999), most of the items in the student questionnaire revealed no significant differences between the experimental group (n = 1239 students) and the control group (n = 846). The other three quantitative studies (Finkelstein, 1995; Gibbs & Coffey, 2004; Skeff et al., 1998) used a pretest/posttest design to examine the impact on students' perceptions. In the study by Finkelstein (1995), a student survey indicated that students perceived an increase in the number and quality of interactions from pretest to posttest. Gibbs and Coffey (2004) used six scales of the Student Evaluation of Educational Quality (SEEQ; Marsh, 1982) questionnaire as well as the 'Good teaching' scale of the Module Experience Questionnaire (MEQ, developed from the Course Experience Questionnaire; Ramsden, 1991) to examine students' perceptions of the teaching and learning environment. At the delayed posttest, one year after teachers' instructional training, students' ratings were significantly higher than at the pretest (p < .01). Skeff et al. (1998) found significantly higher student ratings of lecture quality and quality of syllabus notes at the posttest in comparison to the pretest (p < .05).

6.3.1.2. Qualitative studies. Only two studies (Medsker, 1992; Stepp-Greany, 2004) examined the impact of instructional development on students' perceptions qualitatively. Both studies reported a positive effect. Medsker (1992) described the results of a telephone interview using three cases. In one of the cases, better student evaluations were reported as an outcome of the teacher's instructional development. In the study by Stepp-Greany (2004), most of the 11 students who were surveyed reported that the course in question (namely, the course team-taught by the teaching assistants involved in the instructional development) was different from other courses (e.g., more interesting than other foreign language classes).

6.3.1.3. Mixed-method studies. No study was found that investigated the impact of instructional development on students' perceptions using a mixed-methods approach.

6.3.2. Impact on students' study approaches

Four studies examined the impact of instructional development on students' study approaches: three quantitatively and one qualitatively.

6.3.2.1. Quantitative studies. In a study by Howland and Wedman (2004), 132 students filled in the Flashlight current student inventory (Ehrmann & Zuniga, 1997) at the end of teachers' instructional development in educational technology. Students reported that they used a variety of technology to complete learning experiences or assignments. However, because no pretest or control/comparison group was incorporated in the study, it remains unclear to what extent the results can be attributed to the instructional development initiative. Nurrenbern et al. (1999) compared the study approaches of an experimental group of students (whose teachers participated in instructional development) with those of a control group. It was found that students in the experimental group helped each other significantly more than those in the control group (p < .05). Gibbs and Coffey (2004) used two scales of the MEQ as well as the scale "Learning" of the SEEQ to examine students' study approaches. At the delayed posttest, one year after teachers' instructional training, students' scores on surface study approach had significantly decreased in comparison to the pretest (p < .001). The increase in deep study approach was not statistically significant. The scores on the scale "Learning" of the SEEQ were higher at the posttest than at the pretest (p < .001).

6.3.2.2. Qualitative study. The study by Stepp-Greany (2004) was the only one that examined the impact on students' study approaches qualitatively. The analysis of logs of the three teaching assistants who participated in instructional development revealed that a community of learners seemed to be created among the students, which acted as a kind of support group.

6.3.2.3. Mixed-method studies. No studies were found that investigated the impact of instructional development on students' study approaches using mixed methods.

6.3.3. Impact on students' learning outcomes

Eight studies examined the impact of instructional development on students' learning outcomes: five quantitatively, two qualitatively, and one using mixed methods.


6.3.3.1. Quantitative studies. Five quantitative studies (Brauchle & Jerich, 1998; Finkelstein, 1995; McShannon & Hynes, 2005; Slavit et al., 2003; Sydow, 1998) examined the impact on students' learning outcomes. Only one of these (Brauchle & Jerich, 1998) used a control/comparison group. Students of teachers who participated in instructional development rated their learning outcomes significantly higher than students in the control group (p < .01). Two of the studies (Finkelstein, 1995; McShannon & Hynes, 2005) used a pretest/posttest design. In Finkelstein's research (1995), a student survey indicated that students' expectations of academic performance increased from pretest to posttest. However, an analysis of their academic achievement scores revealed no increase in actual performance. McShannon and Hynes (2005) surveyed 62 teachers who participated in instructional development about their perceptions of students' learning outcomes. The data revealed that teachers perceived that their students learned more than before (a score of 3.6 on a Likert scale from 1 to 5). A comparison of the actual pretest/posttest achievement and retention scores supported teachers' perceptions: a 5% increase in student achievement and a 6% increase in student retention were found. Slavit et al. (2003) compared students' choices of master's thesis topic with the choices students had made in the two years before the instructional development initiative regarding educational technology took place. It was found that while in the two previous years only two students chose a technology focus for their master's project, this increased to seven during the year teachers participated in instructional development. The perceptions of the students (who were pre-service teachers) about their learning outcomes revealed that 77% of the students strongly agreed that they learned new models for using one computer as an instructional tool in class. Additionally, 88% strongly agreed that they planned to apply what they learned in their own instructional practice. In Sydow's study (1998), teachers were surveyed about the impact of instructional development on students' learning outcomes. The impact of peer group conferences that took place over a period of five years, as well as of research grants related to teaching, was evaluated. The quantitative analysis of data from 1894 participants in the peer group conferences revealed that 24% of them perceived enhanced student learning. The impact of research grants was perceived as lower, in that only 9% of the 185 teachers involved perceived that students' learning had increased.

6.3.3.2. Qualitative studies. Two studies were found that examined the impact on students' learning outcomes qualitatively (Nursing faculty development, 1980; Stepp-Greany, 2004). Neither of the two studies used a pretest or a control/comparison group. In the final report of Nursing faculty development (1980), a written survey with open-ended questions indicated that 28% of the teachers who took a leadership role during the instructional development reported an increase in the retention of minority students, while 57% reported no change. Site visits revealed a reduction in academic attrition among students. In research by Stepp-Greany (2004), an analysis of the logs of three assistants who participated in an instructional development initiative indicated that their students showed more interest and were more satisfied than students in previous classes, while also acquiring better speaking and writing skills. The students were also surveyed in this study. An analysis of the student data revealed that almost all of them reported positive gains in understanding language or culture; more than half believed that they would be able to use what they learned later on, with the majority reporting that their attitudes towards learning were affected positively.

6.3.3.3. Mixed-method study. The study by Medsker (1992) investigated the impact on students' learning outcomes both quantitatively and qualitatively. Sixty-five students took a written exam testing their critical thinking skills before as well as after their teachers' participation in instructional development. For four of the five skills tested, students scored significantly higher at the posttest than at the pretest (p = .05). The results of telephone interviews with the teachers who participated in instructional development were described using three cases. In one of the cases, increased student involvement as well as an increased number of students doing special projects were reported. In two of the cases it was mentioned that teachers believed that their students were learning more.

7. Overview of the nature and design of the research

7.1. The nature of earlier research: levels of outcome measured

Of the 36 articles incorporated in this review, a majority examined the impact of instructional development on the participants themselves, namely the teachers: 27 examined the impact on teachers' learning and 31 the impact on teachers' behavior. An institutional impact was investigated in only 9 studies; an impact at the level of the students in 12 studies. Most studies examined the impact on more than one of the sub-levels described in Table 1. In this way, the impact of instructional development was investigated 108 times: 49 times at the level of teachers' learning (16 times on attitudes, 8 times on conceptions, 12 times on knowledge, and 13 times on skills), 31 times at the level of teachers' behavior, 9 times at the institutional level, and 19 times at the level of the students (7 times on students' perceptions, 4 times on students' study approaches, and 8 times on students' learning outcomes).

Although 31 of the 36 studies incorporated in our review examined the impact of instructional development on teachers' behavior, we noted that only 12 of these studies (Claus & Zullo, 1987; Finkelstein, 1995; Gallos et al., 2005; Gibbs et al., 1988; Hewson et al., 2001; McDonough, 2006; Nasmith et al., 1995; Nursing faculty development, 1980; Pinheiro et al., 1998; Sheets & Henry, 1984; Skeff et al., 1998; Stepp-Greany, 2004) relied on more than the self-reports of participants to draw conclusions regarding teachers' behavioral change. The plea made by Levinson-Rose and Menges (1981) in their review for impact research that draws not only on participants' self-reports but also measures actual changes in performance thus seems not to


have been fully heeded. On the other hand, we note that the seven studies reporting the impact on students' perceptions can be considered as partly measuring teachers' behavior as well, without relying on teachers' self-reports: students' perceptions of the teaching and learning environment encompass their perceptions of teachers' behavior.

Steinert et al. (2006) synthesized the existing evidence regarding the effects of instructional development in medical education, covering the period 1980–2002. They predicted an increase in studies researching behavioral and systems outcomes in the first years of the twenty-first century. Fourteen of the 36 articles incorporated in our review were published after 2000 (Bennett & Bennett, 2003; Dixon & Scott, 2003; Gallos et al., 2005; Gibbs & Coffey, 2004; Hewson et al., 2001; Howland & Wedman, 2004; Hubball et al., 2005; Kahn & Pred, 2002; McDonough, 2006; McShannon & Hynes, 2005; Pololi et al., 2001; Postareff et al., 2007; Slavit et al., 2003; Stepp-Greany, 2004). Eleven of those examined the impact on teachers' learning and 12 the impact on teachers' behavior; 2 investigated the impact at the level of the institution and 5 at the level of the students. We therefore conclude that the prediction by Steinert et al. (2006) mentioned above is not fully confirmed by our review data. Attention to the impact beyond the level of the teachers themselves remains very limited.

7.2. The design of earlier research

As mentioned earlier, the impact of instructional development at one of the sublevels mentioned in Table 1 was investigated 108 times in the 36 articles incorporated in our review: 46 times quantitatively, 44 times qualitatively, and 18 times using mixed methods. A mixed-method approach was the least represented at all four levels (teachers' learning, teachers' behavior, institutional impact, impact at the level of the students). Institutional impact was most frequently examined qualitatively (7 of the 9 times), while the impact at the level of the students was investigated most frequently quantitatively (13 of the 19 times).

A pretest was incorporated 34 times: 24 times in quantitative analyses, 4 times in qualitative analyses, and 6 times in mixed-method analyses. None of the 9 studies that examined institutional impact used a pretest. The impact on teachers' behavior was investigated by use of a pretest infrequently (9 out of 31 times). Fewer than half of the investigations of the impact on teachers' learning (18 out of 49) and on students (8 out of 19) used pretest data.

Delayed posttest data were reported 21 times: 14 times in quantitative analyses, 5 times in qualitative analyses, and 2 times in mixed-method analyses. Institutional impact was examined using a delayed posttest relatively often (4 out of 9 times). The impact on teachers' behavior was also investigated using a delayed posttest relatively more often (8 out of 31 times) than the impact on teachers' learning (7 out of 49 times) or on students (2 out of 19 times).

A control/comparison group was used only 15 times: 13 times in quantitative analyses and 2 times in mixed-method analyses. No qualitative analyses were reported in which data from a control or comparison group were incorporated. A pattern similar to that for pretest use was noted: a control/comparison group was not used in any of the 9 investigations of institutional impact. The impact on teachers' behavior was investigated using a control/comparison group infrequently as well (3 out of 31 times). Eight out of 49 investigations regarding the impact on teachers' learning and 4 out of 19 investigations regarding the impact on students used data from a control or comparison group.

An internationally known rather than self-constructed instrument was used only 8 times to investigate the impact of instructional development. The Approaches to Teaching Inventory (ATI; Prosser & Trigwell, 1999) was used in two studies (Gibbs & Coffey, 2004; Postareff et al., 2007) to examine the impact on teachers' conceptions and teachers' behavior. Pinheiro et al. (1998) used the Principles of Adult Learning Scale (PALS; Conti, 1985) to examine the impact on teachers' behavior. The Student Evaluation of Educational Quality (SEEQ; Marsh, 1982) questionnaire and the Module Experience Questionnaire (MEQ, developed from the Course Experience Questionnaire; Ramsden, 1991) were used by Gibbs and Coffey (2004) with regard to the impact on students' perceptions and study approaches. Howland and Wedman (2004) used the Flashlight current student inventory (Ehrmann & Zuniga, 1997) in their investigation of the impact on students' study approaches.
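To put these design features in perspective, they can be expressed as proportions of the 108 investigations (a simple recomputation of the counts above, rounded to whole percentages):

\[
\text{pretest: } \tfrac{34}{108} \approx 31\%, \qquad
\text{delayed posttest: } \tfrac{21}{108} \approx 19\%, \qquad
\text{control/comparison group: } \tfrac{15}{108} \approx 14\%, \qquad
\text{internationally known instrument: } \tfrac{8}{108} \approx 7\%.
\]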

Earlier reviews (Levinson-Rose & Menges, 1981; McAlpine, 2003; Prebble et al., 2004; Weimer & Lenze, 1998) made a plea for more rigorous research designs. Taking the aforementioned results into account, this plea is still relevant. More attention should be given to mixed-method studies and to designs with a pretest and/or a quasi-experimental character. Only delayed posttests can examine whether changes are durable over time. When we consider the studies appearing after 2000 in particular, we see no major changes in research design in comparison to the studies before 2000. In just the studies appearing after 2000, the impact of instructional development was investigated 46 times: 22 times quantitatively, 19 times qualitatively, and 5 times using a mixed-method approach. Pretest data were reported 18 times and delayed posttest data 9 times. A control or comparison group was incorporated 6 times. Steinert et al. (2006), who synthesized the existing evidence regarding the effects of instructional development in medical education covering the period 1980–2002, predicted an increase in well-designed studies in the first years of the twenty-first century. The data in our review give no indication that this prediction came true. As previously mentioned, McAlpine's review (2003) revealed studies with more attention to high-quality research design in comparison to the reviews by Levinson-Rose and Menges (1981), Prebble et al. (2004), and Weimer and Lenze (1998). Our current synthesis gives no evidence that this was due to the fact that McAlpine (2003) considered only studies reporting on the impact of an instructional workshop at the level of students and/or the institution (not only at the level of the participants).

Our synthesis makes clear that up until now most researchers have constructed instruments themselves to investigate the impact of instructional development. More use of the same instrument(s) to measure a certain dependent variable would facilitate the comparability of research results as well as make it easier for studies to build upon one another. Possibly this


state-of-the-art description of the nature and design of existing research will make it easier for future researchers to attain comparable measurements of one and the same dependent variable. Of course, it is also important that the character of instruments used to measure the impact of an instructional development initiative is in line with the objectives of this initiative. Therefore, instruments should be adapted to the specific context within which they are used.

8. The influence of duration, nature and target group

In our examination of the influence of duration, nature, and target group of instructional development on its impact, 35 articles were considered, together describing the impact of 36 instructional development initiatives. The article by Gibbs and Coffey (2004) was excluded because it considered different instructional development initiatives with varied goals, rationales, and training processes that, in addition, took place in different countries. In Sydow's study (1998), the impact of two kinds of interventions (peer group conferences and research grants regarding teaching) was evaluated. Because the impact of each intervention was described separately, this article was considered as describing two interventions.

8.1. The influence of the duration of instructional development on its impact

Of the 36 instructional development initiatives, only nine were a one-time event (Braxton, 1978; Dixon & Scott, 2003; Kahn & Pred, 2002; Nasmith et al., 1995; Pololi et al., 2001; Quirk et al., 1998; Rakes, 1982; Rothman & Robinson, 1977; Sydow, 1998 (part A)); the 27 others extended over time (from a couple of weeks to a couple of years). For the one-time events, the impact at one of the sublevels described in Table 1 was examined 26 times. The impact of the instructional development initiatives extended over time was investigated 80 times. Table 3 gives an overview of the extent to which an indication of impact (not taking partial indications into account, i.e., considering only the + entries in the results column of Table 2) was found at the four levels of outcome for the instructional development initiatives categorized on the basis of their duration. We find that, in comparison to the one-time events, the impact of the initiatives extended over time is less often studied at the level of participants' learning (32 out of 80 times versus 16 out of 26 times) and more often at the student level (17 out of 80 times versus 1 out of 26 times). At the level of participants' behavior, impact is more often found for the initiatives extended over time (20 out of 23 times) than for the one-time events (4 out of 7 times). At the other levels, the impact is comparable for the two kinds of instructional development. Every time that impact was investigated at the institutional level, a positive effect was found, for the one-time events as well as for the initiatives extended over time. However, institutional impact was never examined by way of a pretest/posttest or a quasi-experimental design. The less rigorous character of the research designs by which institutional impact was examined may be why only positive impact was reported.
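Expressed as proportions (recomputed from the counts in Table 3 and rounded to whole percentages), the behavioral contrast reads:

\[
\text{initiatives extended over time: } \tfrac{20}{23} \approx 87\%, \qquad
\text{one-time events: } \tfrac{4}{7} \approx 57\%.
\]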

We conclude that our synthesis gives evidence that instructional development interventions extended over time have more positive behavioral outcomes than one-time events. Because the number of studies that examined the impact of one-time events was small, further investigation of this topic is needed.

8.2. The influence of the nature of instructional development on its impact

Seventeen of the 36 instructional development initiatives had a collective, course-like character; 11 had an alternative format such as peer learning, action research, team teaching, or research grants regarding teaching (Addison & VanDeWeghe, 1999; DeWitt et al., 1998; Fedock et al., 1993; Finkelstein, 1995; Harnish & Wild, 1993; Howland & Wedman, 2004; McDonough, 2006; McShannon & Hynes, 2005; Michael, 1993; Stepp-Greany, 2004; Sydow, 1998 (part B)); 8 used a hybrid format and combined a collective course with an alternative form of instructional development (Fidler et al., 1999; Gallos et al., 2005; Hewson et al., 2001; Medsker, 1992; Nurrenbern et al., 1999; Nursing faculty development, 1980; Pinheiro et al., 1998; Slavit et al., 2003). The impact at one of the sublevels described in Table 1 was examined 46 times for the collective, course-like initiatives. The impact of the instructional development initiatives with an alternative format was investigated 36 times, and that of the hybrid formats 24 times. Table 4 gives an overview of the extent to which an indication of impact (not taking a partial indication into account, so considering only the + in the results column in Table 2) was found at the four levels of outcome for the three kinds of instructional development.

Table 3
Impact at the four levels of outcome for the instructional development initiatives categorized on the basis of their duration.

                       One-time events                  Initiatives extended over time
Level of outcome       Impact examined   Impact found   Impact examined   Impact found
Teachers' learning     16                13             32                25
Teachers' behavior     7                 4              23                20
Institutional impact   2                 2              8                 8
Students               1                 1              17                12
Total                  26                20             80                65


Table 4
Impact at the four levels of outcome for the instructional development initiatives categorized on the basis of their nature.

                       Collective, course-like format     Alternative format               Hybrid format
Level of outcome       Impact examined   Impact found     Impact examined   Impact found   Impact examined   Impact found
Teachers' learning     26                22               14                11             8                 5
Teachers' behavior     13                8                10                9              7                 7
Institutional impact   3                 3                4                 4              3                 3
Students               4                 4                8                 6              6                 3
Total                  46                37               36                30             24                18

We find that the impact of the initiatives with an alternative or hybrid character is less studied at the level of participants' learning (14 out of 36 and 8 out of 24 times, respectively, versus 26 out of 46 times) and more at the institutional level (4 out of 36 and 3 out of 24 times, respectively, versus 3 out of 46 times) and the student level (8 out of 36 and 6 out of 24 times, respectively, versus 4 out of 46 times), in comparison to that of the collective, course-like initiatives. At the level of the participants' behavior, impact was more often found for the initiatives with an alternative or hybrid character (9 out of 10 times and all 7 times, respectively) than for the collective, course-like ones (8 out of 13 times). At the student level, the opposite is true. For the collective, course-like initiatives impact at the student level was found all 4 times, while for the initiatives with an alternative or hybrid character such impact occurred only 6 out of 8 and 3 out of 6 times, respectively. Impact was found for all three kinds of instructional development every time it was investigated at the institutional level. Here too, institutional impact was not examined using pretest/posttest or quasi-experimental designs. Possibly the less rigorous character of the research designs by which institutional impact was examined can explain why only positive impacts were reported.

We conclude that our synthesis gives evidence that collective, course-like instructional development initiatives have fewer behavioral outcomes but more outcomes at the level of the students than initiatives with an alternative or hybrid format. This is surprising, because we would expect that outcomes at the level of the students can only be reached through changes in teachers' teaching behavior. The number of studies that examined the impact of initiatives with an alternative and especially with a hybrid format was small. In addition, the impact at the student level was examined infrequently. Further investigation into the differential impact of initiatives with a varied character (traditional, alternative or hybrid form) is needed.

8.3. The influence of the target group of instructional development on its impact

Because new faculty were targeted in only one initiative (Rothman & Robinson, 1977) and teaching assistants in only four (Addison & VanDeWeghe, 1999; Brauchle & Jerich, 1998; McDonough, 2006; Stepp-Greany, 2004), our data do not permit the examination of whether instructional development initiatives targeting one of these two groups have more positive outcomes than initiatives with another or no specific target group.

Of the 31 initiatives not particularly targeting new faculty or teaching assistants, 15 were oriented to a specific discipline (Claus & Zullo, 1987; DeWitt et al., 1998; Gallos et al., 2005; Hewson et al., 2001; Howland & Wedman, 2004; Kahn & Pred, 2002; McShannon & Hynes, 2005; Nasmith et al., 1995; Nursing faculty development, 1980; Pinheiro et al., 1998; Pololi et al., 2001; Quirk et al., 1998; Sheets & Henry, 1984; Skeff et al., 1998; Slavit et al., 2003). This makes it possible to examine whether discipline-specific instructional development initiatives produce more positive outcomes than discipline-general initiatives. The impact at one of the sublevels described in Table 1 was investigated 43 times for the instructional development initiatives oriented to a specific discipline and 48 times for the discipline-general ones. Table 5 gives an overview of the extent to which an indication of impact (not taking a partial indication into account, so considering only the + in the results column in Table 2) was found at the four levels of outcome for the two kinds of instructional development. The impact is comparable at all levels. Our synthesis thus gives evidence that discipline-specific instructional development interventions have an impact comparable to that of discipline-general interventions.

Table 5
Impact at the four levels of outcome for the instructional development initiatives categorized on the basis of their discipline-specific/general character.

                       Discipline-specific initiatives   Discipline-general initiatives
Level of outcome       Impact examined   Impact found    Impact examined   Impact found
Teachers' learning     22                17              21                16
Teachers' behavior     13                10              13                12
Institutional impact   3                 3               6                 6
Students               5                 3               8                 5
Total                  43                33              48                39


9. Discussion and conclusion

9.1. Limitations of generalization

Our literature search for studies of the impact of instructional development in higher education was not limited to published sources. In this way we tried to avoid a biased representation. Of the studies that met the inclusion criteria, 9 were unpublished: 4 were conference papers and 5 research reports. The other 27 studies were published as journal articles. It is striking that 30 out of the 36 studies in our synthesis were American; only 3 were European, 1 Asian, and 1 Australian. The study by Gibbs and Coffey (2004) considered instructional development in different countries. The fact that a majority of studies were American constitutes a possible limitation on the generalizability of the synthesis findings to other continents.

There is evidence that instructional development interventions that extend over time have more positive behavioral outcomes than one-time events; collective, course-like instructional development initiatives seem to have fewer positive behavioral outcomes but more outcomes at the student level than initiatives with an alternative or hybrid format. However, because the number of studies in our database that examined the impact of one-time events and of initiatives with an alternative or hybrid format was small, the evidence found should be viewed as suggestive and subject to further investigation.

9.2. Future research

Further investigation into the differential impact of initiatives with a varied duration and/or a varied format is but one of several fruitful avenues for future research that may be gleaned from our current review. Research into the impact of instructional development targeting specific faculty groups such as teaching assistants or new faculty has been rather scarce in the past and constitutes an area for further study as well. In fact, future research not focusing on one specific educational feature of instructional development (e.g., the duration, the form, or the target group), but considering the core characteristics of instructional development initiatives (e.g., the theoretical foundation, goals, and content) in their internal connection, would be even more worthwhile.

In order to make the most sense of the literature, the present review was written from a mainly narrative perspective. Although the meta-analysis technique would probably result in less in-depth information (e.g., Shih & Fan, 2009), for future research it would be worthwhile to also use more statistical analyses to synthesize the literature in the field of instructional development.
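As a purely illustrative aside (not drawn from the reviewed studies), the sketch below shows in Python what such a statistical synthesis could look like in its simplest form: a fixed-effect, inverse-variance pooling of per-study effect sizes. The study effects and standard errors are hypothetical placeholders.

```python
# A minimal fixed-effect meta-analysis sketch: pool per-study effect sizes with
# inverse-variance weights. The (effect, standard error) pairs are hypothetical.
import math

studies = [(0.45, 0.20), (0.30, 0.15), (0.60, 0.25)]

weights = [1 / se ** 2 for _, se in studies]                # inverse-variance weights
pooled = sum(w * d for (d, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"pooled effect = {pooled:.2f}, SE = {pooled_se:.2f}")
```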

With regard to the nature of research, the suggestions given in earlier reviews (Levinson-Rose & Menges, 1981; McAlpine, 2003; Prebble et al., 2004; Steinert et al., 2006; Weimer & Lenze, 1998) are still relevant. We would encourage studies researching behavioral outcomes that draw not only on the self-reports of participants, but also measure actual changes in performance, for example by way of observations of videotaped teaching. Students' perceptions of a teacher's teaching form an alternative way to get an indication of the teacher's actual classroom behavior. Attempts to capture the effects at the institutional or the student level would be very worthwhile as well. Of course it remains important to evaluate the impact at a level that is in line with the objectives, and so with the core focus, of the instructional development. The impact of an initiative aimed at enhancing students' learning can best be evaluated at the level of the students; instructional development attempting to change the institutional culture is best evaluated on its institutional impact.

Much insight could be gained from well-designed studies with a pretest, an experimental character, and/or a mixed-method approach. Given the state of earlier research, qualitative studies using a pretest and/or a quasi-experimental design would be very innovative. As remarked by Levinson-Rose and Menges (1981), future research should pay particular attention to individual differences between participants in an instructional development initiative and/or between their students. The long-term effects of instructional development remain a terrain for future study as well.

However, those who want to implement research evaluating the impact beyond the level of the participants' learning by way of a rigorous research design face difficult challenges. First, random assignment is difficult to achieve in an applied setting such as higher education, especially when participation in instructional development is voluntary. A quasi-experimental design whereby the control group consists of teachers who do not want to participate but who are otherwise comparable to the experimental group, for example with regard to demographic characteristics, is a realistic alternative. Working with different groups who participated more or less in instructional development, as Postareff et al. (2007) did, is possible as well. Second, if only a modest number of teachers participate in instructional development as implemented in an institution, research into its impact will have low power to detect effects, especially with quantitative analyses. Attention to effect sizes, so as to detect not only statistically significant effects but also effects that are interesting from a practical point of view, can help to overcome this problem. The incorporation of qualitative data as a complementary source of information is an interesting solution as well. In addition, researchers could think about collaboration across institutions so that more large-scale impact research with high numbers of participants in instructional development can be conducted. A difficulty in this approach is that it should also consider the varied institutional contexts, because these can influence the final impact of the instructional development.
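For readers unfamiliar with effect sizes for this kind of outcome, the short Python sketch below illustrates one common option, Cohen's h for the difference between two proportions; the proportions used are hypothetical and are not taken from the reviewed studies.

```python
# Illustrative sketch: Cohen's h expresses the practical magnitude of a
# difference between two proportions, independently of sample size.
import math

def cohens_h(p1: float, p2: float) -> float:
    """Effect size for the difference between two proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

# Hypothetical example: 70% of trained teachers versus 50% of a comparison
# group show the targeted change in teaching behavior.
print(round(cohens_h(0.70, 0.50), 2))  # ~0.41, a small-to-medium effect
```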

Up until now, research reports have often given only vague descriptions of what the initiative is about. In order to be able to draw a stable conclusion on the features of the best development initiatives (preferably considered in their internal connection), it would be helpful if each individual study described in detail the concrete instructional development whose impact is being examined. This would also allow studies to build upon one another: for instructional development initiatives with a similar focus, the same dependent variables should be measured in similar ways. Instruments developed in impact research and tested for validity and reliability should be used across studies, if necessary adapted to the specific context in which they are used.

9.3. Conclusion

With regard to the nature of research, our synthesis reveals that more attention should be given to studies researching behavioral outcomes, drawing not only on self-reports of participants, but also measuring actual changes in performance. Attempts to capture the effects at an institutional or student level would be very worthwhile as well. Much insight could be gained from well-designed studies with a pretest, a quasi-experimental character, and/or a mixed-method approach. The long-term effects of instructional development remain a terrain for future study too. Use of the same instruments would facilitate the comparability of research results as well as make it easier for studies to build upon one another. Our synthesis gives some evidence that the duration and the nature of instructional development influence its impact.

Acknowledgment

We wish to thank Daniel L. Dinsmore from the University of Maryland for his feedback on an earlier draft of this article.

References

(References marked with an asterisk indicate studies included in the present review.)

*Addison, J., & VanDeWeghe, R. (1999). Portfolio-based assessment and professional development. English Education, 32, 16–33.
*Bennett, J., & Bennett, L. (2003). A review of factors that influence the diffusion of innovation when structuring a faculty training program. Internet and Higher Education, 6, 53–63.
*Brauchle, P. E., & Jerich, K. F. (1998). Improving instructional development of industrial technology graduate teaching assistants. Journal of Industrial Teacher Education, 35, 67–92.
*Braxton, J. M. (1978, May). Impact of workshops for instructional improvement: The results of an evaluation of a component of a faculty development program. Paper presented at the annual Association for Institutional Research Forum, May 21–25, Houston, TX.
Centra, J. A. (1989). Faculty evaluation and faculty development in higher education. In J. C. Smart (Ed.), Higher education: Handbook of theory and research (pp. 155–179). New York: Agathon Press.
*Claus, J. M., & Zullo, T. G. (1987). An adaptive faculty development program for improving teaching skills. Journal of Dental Education, 51(12), 709–712.
Conti, G. (1985). Assessing teaching style in adult education: How and why. Lifelong Learning, 8(8), 7–11, 28.
Cook, C. E. (2001). The role of a teaching centre in curricular reform. To Improve the Academy, 19, 217–231.
*DeWitt, P., Birrell, J. R., Egan, M. W., Cook, P. F., Oslund, M. F., & Young, J. R. (1998). Professional development schools and teacher educators' beliefs: Challenges and change. Teacher Education Quarterly, 25(2), 63–80.
*Dixon, K., & Scott, S. (2003). The evaluation of an offshore professional-development programme as part of a university's strategic plan: A case study approach. Quality in Higher Education, 9, 287–294.
Dochy, F., Segers, M., Van den Bossche, P., & Gijbels, D. (2003). Effects of problem-based learning: A meta-analysis. Learning and Instruction, 13(5), 533–568.
Ehrmann, S., & Zuniga, R. (1997). The Flashlight Current Student Inventory. Corporation for Public Broadcasting. Retrieved May 12, 2008, from http://www.tltgroup.org.
Eison, J., & Stevens, E. (1995). Faculty development workshops and institutes. In W. A. Wright & Associates, Teaching improvement practices. Boston, MA: Anker.
*Fedock, P. M., Zambo, R., & Cobern, W. W. (1993). The professional development of college science professors as science teacher educators. Paper presented at NARST, April 1993, Atlanta, GA.
*Fidler, P. P., Neururer-Rotholz, J., & Richardson, S. (1999). Teaching the freshman seminar: Its effectiveness in promoting faculty development. Journal of the First-Year Experience & Students in Transition, 11(2), 59–74.
*Finkelstein, M. (1995). Assessing the teaching and student learning outcomes of the Katz/Henry faculty development model. Report written by the New Jersey Institute for Collegiate Teaching and Learning, South Orange.
Fishman, B. J., Marx, R. W., Best, S., & Tal, R. T. (2003). Linking teacher and student learning to improve professional development in systemic reform. Teaching and Teacher Education, 19, 643–658.
Freeth, D., Hammick, M., Koppel, I., Reeves, S., & Barr, H. (2003). A critical review of evaluations of interprofessional education. London: Higher Education Academy Learning and Teaching Support Network for Health Sciences and Practice. Available from: http://www.health.ltsn.ac.uk/publications/occasionalpaper/occasionalpaper02.pdf.
*Gallos, M. R., van den Berg, E., & Treagust, D. F. (2005). The effect of integrated course and faculty development: Experiences of a university chemistry department in the Philippines. International Journal of Science Education, 27(8), 985–1006.
*Gibbs, L. E., Browne, M. N., & Keeley, S. M. (1988). Stimulating critical thinking through faculty development: Design, evaluation, and problems. Washington, DC: Wisconsin University, Eau Claire, American Association of State Colleges and Universities.
*Gibbs, G., & Coffey, M. (2004). The impact of training of university teachers on their teaching skills, their approach to teaching and the approach to learning of their students. Active Learning in Higher Education, 5(1), 87–100.
Gijbels, D., Dochy, F., Van den Bossche, P., & Segers, M. (2005). Effects of problem-based learning: A meta-analysis from the angle of assessment. Review of Educational Research, 75(1), 27–61.
Guskey, T. R. (2000). Evaluating professional development. Thousand Oaks, CA: Corwin Press.
*Harnish, D., & Wild, L. A. (1993). Peer mentoring in higher education: A professional development strategy for faculty. Community College Journal of Research and Practice, 17(3), 271–282.
*Hewson, M. G., Copeland, H. L., & Fishleder, A. J. (2001). What's the use of faculty development? Program evaluation using retrospective self-assessments and independent performance ratings. Teaching and Learning in Medicine, 13(3), 153–160.
Holton, E. F., III. (1996). The flawed four-level evaluation model. Human Resource Development Quarterly, 7(1), 5–21.
*Howland, J., & Wedman, J. (2004). A process model for faculty development: Individualizing technology learning. Journal of Technology and Teacher Education, 12, 239–263.
*Hubball, H., Collins, J., & Pratt, D. (2005). Enhancing reflective teaching practices: Implications for faculty development programs. The Canadian Journal of Higher Education, 35, 57–81.
*Kahn, J., & Pred, R. (2002). Evaluation of a faculty development model for technology use in higher education for late adopters. Computers in the Schools, 18, 127–150.
Kirkpatrick, D. L. (1994). Evaluating training programs: The four levels. San Francisco, CA: Berrett-Koehler Publishers.
Kreber, C., & Brook, P. (2001). Impact evaluation of educational development programmes. International Journal of Academic Development, 6(2), 96–102.
Levinson-Rose, J., & Menges, R. J. (1981). Improving college teaching: A critical review of research. Review of Educational Research, 51, 403–434.
Marsh, H. W. (1982). SEEQ: A reliable, valid and useful instrument for collecting students' evaluations of university teaching. British Journal of Educational Psychology, 52, 77–95.
McAlpine, L. (2003). Het belang van onderwijskundige vorming voor studentgecentreerd onderwijs: de praktijk geëvalueerd [The importance of instructional development for student centered teaching: An examination of practice]. In N. Druine, M. Clement, & K. Waeytens (Eds.), Dynamiek in het hoger onderwijs. Uitdagingen voor onderwijsondersteuning [Dynamics in higher education: Challenges for teaching support] (pp. 57–71). Leuven: Universitaire Pers.
*McDonough, K. (2006). Action research and the professional development of graduate teaching assistants. Modern Language Journal, 90(1), 33–47.
*McShannon, J., & Hynes, P. (2005). Student achievement and retention: Can professional development programs help faculty GRASP it? Journal of Faculty Development, 20(2), 87–94.
*Medsker, K. L. (1992). NETwork for excellent teaching: A case study in university instructional development. Performance Improvement Quarterly, 5, 35–48.
*Michael, S. (1993). Crossing the disciplinary boundaries: Professional development through action research in higher education. Higher Education Research and Development, 12(2), 131–142.
*Nasmith, L., Saroyan, A., Steinert, Y., Lawn, N., & Franco, E. D. (1995). Long-term impact of faculty development workshops. Report of McGill University, Canada.
Norton, L., Richardson, J. T. E., Hartley, J., Newstead, S., & Mayes, J. (2005). Teachers' beliefs and intentions concerning teaching in higher education. Higher Education, 50, 537–571.
*Nurrenbern, S. C., Mickiewicz, J. A., & Francisco, J. S. (1999). The impact of continuous instructional development on graduate and undergraduate students. Journal of Chemical Education, 76(1), 114–119.
*Nursing Faculty Development. (1980). Final report. Atlanta, GA: Southern Regional Education Board.
*Pinheiro, S. O., Rohrer, J. D., & Heimann, C. F. L. (1998). Assessing change in the teaching practice of faculty in a faculty development program for primary care. Paper presented at the Annual Meeting of the American Educational Research Association, April 13–17, 1998, San Diego, CA.
*Pololi, L., Clay, M. C., Lipkin, J. R., Hewson, M., Kaplan, M. C., & Frankel, R. M. (2001). Reflections on integrating theories of adult education into a medical school faculty development course. Medical Teacher, 23, 276–283.
*Postareff, L., Lindblom-Ylänne, S., & Nevgi, A. (2007). The effect of pedagogical training on teaching in higher education. Teaching and Teacher Education, 23, 557–571.
Prebble, T., Hargraves, H., Leach, L., Naidoo, K., Suddaby, G., & Zepke, N. (2004). Impact of student support services and academic development programmes on student outcomes in undergraduate tertiary study: A synthesis of the research. Report to the Ministry of Education, Massey University College of Education.
Prosser, M., & Trigwell, K. (1999). Understanding learning and teaching: The experience in higher education. Suffolk: Society for Research into Higher Education & Open University Press.
*Quirk, M. E., DeWitt, T., Lasser, D., Huppert, M., & Hunniwell, E. (1998). Evaluation of primary care futures: A faculty development program for community health center preceptors. Academic Medicine, 73, 705–707.
*Rakes, T. (1982, October). Staff development for university level English faculty: Improving the teaching of reading and writing. Paper presented at the Annual Meeting of the College Reading Association, October 28–30, Philadelphia, PA.
Ramsden, P. (1991). A performance indicator of teaching quality in higher education: The course experience questionnaire. Studies in Higher Education, 16, 129–150.
*Rothman, A. I., & Robinson, S. (1977). Evaluation of a training course. Canadian Journal of Higher Education, 7, 19–35.
*Sheets, K. J., & Henry, R. C. (1984). Assessing the impact of faculty development programs in medical education. Journal of Medical Education, 59, 746–748.
Shih, T., & Fan, X. (2009). Comparing response rates in e-mail and paper surveys: A meta-analysis. Educational Research Review, 4(1), 26–40.
*Skeff, K. M., Stratos, G. A., Bergen, M. R., & Regula, D. P. (1998). A pilot study of faculty development for basic science teachers. Academic Medicine, 73, 701–704.
*Slavit, D., Sawyer, R., & Curley, J. (2003). Filling your plate: A professional development model for teaching with technology. TechTrends, 47, 35–38.
Steinert, Y., Mann, K., Centeno, A., Dolmans, D., Spencer, J., Gelula, M., et al. (2006). A systematic review of faculty development initiatives designed to improve teaching effectiveness in medical education: BEME Guide No. 8. Medical Teacher, 28(8), 497–526.
*Stepp-Greany, J. (2004). Collaborative teaching in an intensive Spanish course: A professional development experience for teaching assistants. Foreign Language Annals, 37, 417–426.
*Sydow, D. L. (1998). Outcomes of the VCCS professional development initiative: 1993–1998. Big Stone Gap, VA: Mountain Empire Community College.
Taylor, L., & Rege Colet, N. (2009). Making the shift from faculty development to educational development: A conceptual framework grounded in practice. In A. Saroyan & M. Frenay (Eds.), Building teaching capacities in higher education: A comprehensive international model. Sterling, VA: Stylus Publishing.
Weimer, M., & Lenze, L. F. (1998). Instructional interventions: A review of the literature on efforts to improve instruction. In R. Perry & J. Smart (Eds.), Effective teaching in higher education (pp. 205–240). New York: Agathon Press.
Wilson, S. M., & Berne, J. (1999). Teacher learning and the acquisition of professional knowledge: An examination of research on contemporary professional development. In A. Iran-Nejad & P. D. Pearson (Eds.), Review of research in education (pp. 173–209). Washington, DC: AERA.