Upload
ngoduong
View
212
Download
0
Embed Size (px)
Citation preview
DI
SC
US
SI
ON
P
AP
ER
S
ER
IE
S
Forschungsinstitut zur Zukunft der ArbeitInstitute for the Study of Labor
When the Cat Is Near, the Mice Won’t Play:The Effect of External Examiners in Italian Schools
IZA DP No. 6629
June 2012
Marco BertoniGiorgio BrunelloLorenzo Rocco
When the Cat Is Near, the Mice Won’t Play: The Effect of External Examiners in
Italian Schools
Marco Bertoni University of Padova
and CEP
Giorgio Brunello
University of Padova, IZA and CESifo
Lorenzo Rocco
University of Padova
Discussion Paper No. 6629 June 2012
IZA
P.O. Box 7240 53072 Bonn
Germany
Phone: +49-228-3894-0 Fax: +49-228-3894-180
E-mail: [email protected]
Any opinions expressed here are those of the author(s) and not those of IZA. Research published in this series may include views on policy, but the institute itself takes no institutional policy positions. The Institute for the Study of Labor (IZA) in Bonn is a local and virtual international research center and a place of communication between science, politics and business. IZA is an independent nonprofit organization supported by Deutsche Post Foundation. The center is associated with the University of Bonn and offers a stimulating research environment through its international network, workshops and conferences, data service, project support, research visits and doctoral program. IZA engages in (i) original and internationally competitive research in all fields of labor economics, (ii) development of policy concepts, and (iii) dissemination of research results and concepts to the interested public. IZA Discussion Papers often represent preliminary work and are circulated to encourage discussion. Citation of such a paper should account for its provisional character. A revised version may be available directly from the author.
IZA Discussion Paper No. 6629 June 2012
ABSTRACT
When the Cat Is Near, the Mice Won’t Play: The Effect of External Examiners in Italian Schools*
We use a natural experiment to show that the presence of an external examiner in standardized school tests reduces the proportion of correct answers in monitored classes by 5.5 to 8.5% – depending on the grade and the test – with respect to classes in schools with no external monitor. We find that the effect of external monitoring in a class spills over to other classes in the same school. We argue that the negative effect of external supervision is due to reduced cheating (by students and/or teachers) rather than to distraction from having a stranger in the class. JEL Classification: C31, H52, I2 Keywords: education, testing, external monitoring Corresponding author: Marco Bertoni Centre for Economic Performance London School of Economics and Political Science Houghton Street London WC2A 2AE United Kingdom E-mail: [email protected]
* The authors are grateful to Daniele Checchi, Ifty Hussain, Edwin Leuven, Marco Manacorda, Guy Michaels, Steve Pischke, Olmo Silva and the audience at the 2012 LSE-CEP Annual Conference for comments and suggestions. We also thank Patrizia Falzetti and Paolo Sestito (INVALSI) for help with data collection. Financial support by the Ministry of Italian Universities (PRIN contract n. 2009MAATFS_002) is gratefully acknowledged. All errors are our own.
2
1.Introduction
A problem with test – based accountability systems in education is that they generate
incentivesforteachers,studentsandschooladministratorsto“game”thesysteminorderto
obtainbetterresults.Onemechanismforinflatingtestscoresisoutrightcheating.Empirical
analysis of cheating behaviour is scarce1. In their influential study, Jacob andLevitt (2003)
developanalgorithmfordetectingteachercheatingthatcombinesinformationonunexpected
test score fluctuationsandsuspiciouspatternsofanswers for students ina class.They find
thatasmallfractionofChicagoteachersrespondedtoaccountabilitypressuresbycompleting
studentexaminationsinanattempttoimproveobservedstudentoutcomes.
In this paper,we take adifferent approach and start from theobservation that strategic
manipulation by teachers, students and administrators can be substantially reduced if an
externalexaminerisactivelyengagedinmonitoringentirelyorinpartthetestprocess.Using
a natural experiment designed by the Italian central test administrator (INVALSI), which
assigned external examiners to randomly selected classes and schools with the task of
monitoringstudentstakingthetestandreportingresults2,wecomparetestoutcomesinthe
classes with an external examiner with the outcomes in other classes, where the test was
administered by a local teacher, and conclude that the gap measures cheating in un‐
monitoredclasses.
Our study contributes to the literature on school accountability in twomain directions.
First, we show that the introduction of external examiners has a significant effect on
measured test scores in an environmentwhere there are incentives tomanipulate results.
Second,wedocumentthatthemonitoringeffectsofhavinganexternalexaminerspilloverto
un‐monitored classes of the same school. We decompose the overall effect of external
monitoring ‐whichwemeasureas thedifference in the average rateof correct answers in
monitoredclassesandinclassesofun‐monitoredschools‐intoadirectandanindirecteffect.
Thedirecteffect isthedifferenceinthetestperformancebetweenclasseswithandwithout
externalexaminersbelongingtoschoolsselectedforexternalmonitoring.Theindirecteffect
is instead thedifference inperformancebetweenun‐monitored classes in a schoolwith an
externalexaminerandun‐monitoredclassesinschoolswithoutexternalexaminers.
1SeeFiglioandLoeb,2011,forareviewoftherecentliterature.2Thesetestsaretakenbytheuniverseofprimarysecondandfifthgradestudents.INVALSIsampledanumberofclassesandschoolsforexternalmonitoringtoobtainreliabledata,speedupdatacollectionandverificationandprepareanannualreportonthestateofprimaryeducationinItaly.
3
Weestimatethathavinganexternalexaminerreducesthepercentageofcorrectanswers
by3.6to5.4percentagepoints‐dependingonthegradeandthetest‐whichcorrespondsto
5.5 to8.5%of theaveragescore inclassesbelonging toschoolswithnoexternalexaminer.
Theestimateddirect effect ranges from2.8 to4.2percentagepoints (4.3 to6.6%), and the
residualindirecteffectfrom0.8to1.2percentagepoints(1.2to1.9%).
Using additional supporting evidence on the psychological conditions of students before
and during the test and on the distribution of answers within classes, we argue that the
negative impact of thepresenceof an external examineronmeasured test scores is due to
reduced cheating (by students and/or teachers) rather than to the negative effects of
distraction fromhavingastranger in theclass.Wediscuss twoalternativereasonswhythe
effectsofmonitoringspreadfromthemonitoredclasstotheotherclassesinthesameschool.
Thefirstisthatthepresenceofanexternalexaminerintheschoolactsasadisciplinarydevice
alsoonstudentsandteachersinotherclassesofthesameschoolbecauseofthefearthatthe
examinermayroamabout.Thesecondisthatteachersdislikeexcessivedispersioninaverage
testscoreswithinthesameschool,becauseoftheconflictsitcouldgenerate.
We find that theestimatedoverall effectofexternal supervision is significantlyhigher in
theschoolslocatedinSouthernItalythaninNorthernschoolsandinschoolswhereclasssize
is smaller and the proportion of tenured teachers is higher. We show that territorial
differencesareassociatedtodifferencesinsocialcapital,evenaftercontrollingfordifferences
inGDPpercapitaandunemploymentrates.
Thepaperisorganizedasfollows:Section2reviewstherelevantliteratureandSection3
describesthedesignoftheINVALSItestandthedataset.Theempiricalstrategyispresented
inSection4.Themainempiricalresults,afewrobustnesschecksandextensionsarereported
inSection5,6and7,respectively.Conclusionsfollow.
2.Reviewoftheliterature
Aside from outright cheating studied by Jacob and Levitt (2003), the literature has
identified several indirect ways that teachers and school administrators can use to
manipulatestudentresults.Ontheonehand, Jacob(2005),Figlio(2006),FiglioandGetzler
(2006),CullenandReback(2006)andHussain(2012)investigatewhetherschoolsengagein
strategic manipulation of the composition of the pool of tested students by excluding low
ability students, either by reclassifying them as disabled or by strategically using grade
4
retention anddisciplinary suspensions.On the other hand, Figlio andWinicki (2005) show
that during testing periods some schools increase the caloric intake provided by school
cafeterias so as to boost students’ performance. Attempts to increase test scores by taking
psycho‐stimulant drugs are documented for theUS by Bokhari and Schneider (2011),who
showthatthediagnosisof“attentiondeficit/hyperactivitydisorder”ismorefrequentinstates
wheretherearestrongeraccountabilitylaws.
Toourknowledge,wearethefirstinthisliteraturetoinvestigatecheatingbylookingatthe
directandindirecteffectsofhavingexternalexaminersmonitorteachersandstudentsduring
thetest.Thepresenceofindirecttreatmenteffectshasbeenalreadyuncoveredinabroader
literature.Heckman,LalondeandSmith(1999), for instance,discusshowpolicyeffectsmay
spread to those not directly participating in the programme mainly because of general
equilibriumorspill‐overeffects.MiguelandKremer(2004)evaluatebothdirectandexternal
effectsofaKenyanprogrammeaimedattreatingintestinalwormsinfectionamongprimary
school kids. In a similar fashion, Angelucci and De Giorgi (2009) evaluate the effects of
Progresa, aMexican aid programme based on cash transfers, and stress the importance of
estimating indirect treatment effects on the ineligibles when there are social interactions
betweeneligibleandineligibleindividuals.
3.TheDesignofINVALSIServizioNazionalediValutazione(SNV)TestsandtheData
INVALSI3 standardized tests in Italian and maths were introduced in Italian primary
schoolsin20084toevaluateschoolproductivity(intermsofvalueadded).Thesetestsarenot
formallyhigh‐stakes,becausetheallocationofresourcestoschools,thesalaryofteachersand
theschoolcareerofstudentsdonotexplicitlydependontestoutcomes.Evenso,pressureto
performwellinthetestshasbeenhighbothbecauseofthewidespreadexpectationthatthey
mightbeusedatsomepoint toevaluateteachersandbecause theschoolreputationwasat
stake.While results of the tests are not publicly available, schools and their principals can
access the resultsof their students anddecide tomake thempublic,which creates another
incentivetoperformwell.
Since 2008 the tests have been administered every year. In this paper,we focus on the
2009/2010wavebecauseofitspeculiardesignfeatures.First,thiswavewasthefirsttotest3 INVALSI is the National Institute for the Evaluation of the Education System, in charge of the design andadministrationofstandardizededucationtestsinItaly.4SeeLawDecreen.147–2007,andMinistryofEducationandResearchDecreen.74and76–2009.
5
andcollectdata for theentirepopulationof Italianprimaryschoolstudents in theirsecond
and fifth grade. Second, andmost important for our purposes, in 2000 randomly selected
classes‐outofapopulationofabout30000‐thetestwasadministeredinthepresenceofan
external examiner5,whohad twomain tasks: a)bepresent in the classduring the test and
monitor its correct implementation; b) report student answers on the dedicated answer
sheets and transmit them to INVALSI. In the other classes, the test was administered by
teachersoftheschool(butnotoftheclassandnotinthesubjecttested),andreportingwas
donejointlywiththeteacheroftheclass.Weusetherandomselectionofclassesasanatural
experimenttoestimatetheeffectsofexternalmonitoringontestoutcomes.
Classesassignedtoexternalmonitoringweresampledusingatwo‐stagessamplingscheme,
stratified by region6. In the first stage, a pre‐determined number of schools in each region
wererandomlyselectedbyprobabilisticsampling,withprobabilityofinclusionproportional
toschoolsize,measuredbythetotalnumberofstudentsenrolledinthetestedgrades.Inthe
secondstage,oneortwoclasseswithineachtreatedschoolwereselectedbysimplerandom
sampling7.Table1 shows foreachgrade the totalandsamplednumberofprimaryschools,
classesandpupils:about18%ofallprimaryschoolsandcloseto7%ofallclassesandpupils
inthesecondandfifthgradewereselectedtohaveanexternalexaminerduringthetest.
Wehaveaccesstothedatacontainingthe individualanswerstothequestionsofthetest
givenby thesecondand fifthgradeprimaryschool studentswho took the INVALSI tests in
2009/2010.For thesestudents,wealsohavedataon individualmarks in Italianandmaths
duringthesemesterbeforethetestwastakenandonparentalbackground,bothprovidedby
schooloffices.Exclusively for fifth graders, INVALSIuseda studentquestionnaire to collect
additionaldataonparentalbackgroundandthefeelingsandmotivationduringthetests.We
obtained fromINVALSIadditional informationonschoolandclasscharacteristics, including
the number of students enrolled in each class and in each school for each tested grade,
whethertheschoolispublicorprivate,theproportionoftenuredteachersineachschooland,
onlyforfifthgradestudents,anindexofindividualeconomic,socialandculturalstatus(ESCS)
8.
5External examinerswere selectedby INVALSIand theRegional SchoolingAuthoritiesmainlyamong retiredteachersandactiveteachersemployedinnon‐primaryschools.6RegionValled’AostaandtheProvinceofBolzanoautonomouslydecidedtohaveallclassesassignedtoexternalmonitoring.For this reason,weexcludethemfromthe followinganalysis.OtherdatamanagementoperationsaredescribedintheAppendix.7Theprecisenumberofsampledclassesdependsonschoolsize.8 Available variables for all students include also: 1) at the school level: whether the school offers full timeschedule;2)attheclasslevel:classsizemeasuredbothasthenumberofstudentsenrolledintheclassandasthenumberofstudentswhowerepresentatthetest,fullorparttimeschedule(measuredintermofthescheduleof
6
4.IdentificationandEstimation
We define the following three potential outcomes at the class level: Y00 if the class was
assignedtoaschoolwithnoexternalobserver(anuntreatedclassinanuntreatedschool),Y11
incaseofdirectmonitoring(atreatedclass inatreatedschool)andY01 if theclasswasnot
monitoredbyanexternalexaminerbutbelonged toaschoolwhereat leastoneotherclass
wasmonitored (anuntreated class in a treated school)9.Bydesign, all classesof untreated
schoolsareun‐monitored.
LetthedummyvariableSj takethevalueoneifschool jhasbeenassignedtoschool‐level
treatment(andzerootherwise)andthedummyCitakevalueoneifclassihasbeenassigned
toclass‐leveltreatment(andzerootherwise).TheobservedoutcomeYijforclassiinschoolj
canberepresentedintermsofpotentialoutcomesasfollows:
Yij (1 S j )Y00 S jCiY11 S j (1Ci)Y01 (1)
We are interested in the identification and estimation of a) the average direct effect of
monitoringE[Y11‐Y01];b)theaverageindirecteffectofmonitoringE[Y01‐Y00];c)theaverage
totaleffectofmonitoringE[Y11‐Y00],whereE[.]isthemeanoperator.
The samplingprocedure–described in INVALSI (2010a)–has the following features: a)
withina region, twoschoolsof thesamesize (i.e., samenumberof studentsenrolled in the
secondandfifthgrade)havethesameprobabilityofbeingassignedtoschool‐leveltreatment;
b) two treated schools of same size have the same probability of being assigned to the
selectionof oneor two classesper grade for externalmonitoring; c) two classesof a given
gradebelongingtotwodifferenttreatedschoolswiththesamesizehavethesameprobability
ofbeingmonitoredifthenumberofclassesinthegradeisthesameinthetwoschools.
Thisprocedure implies thatwehave conditional randomization,meaning that a) in each
region, theassignment to school ‐ level treatment is random, conditional on the sizeof the
school,measured by the number of students enrolled in the second and fifth grade; b) the
themedian student in the class, to avoidmeasurement errors); 3) at individual level: gender, place of birth,citizenship,attendanceofpre‐primaryschool,age,employment,educationandnationalityofparents.For fifthgradestudentsonlywehaveinformationon:whetherthestudentathomehasownbedroom,internetaccess,anencyclopaedia,owndesk,acomputerandaplace fordoinghomework, thenumberofbooks in thehouse, thenumberofsiblings,whethersheliveswithbothparentsornot,thelanguagespokenathome,whethershegetshelpwithherhomeworkornot.9Potentialoutcomesarestochasticvariables,withrealization(Y00,Y01,Y11)ijforclassiinschoolj.
7
assignmenttoclass‐leveltreatmentforaclassofagivengradeinatreatedschoolisrandom
conditionalonthesizeoftheschool,measuredbothbythenumberofstudentsenrolledinthe
secondandfifthgradeandbythenumberofclassesintheselectedgrade.
LetRDbeavectorofregionaldummies,RSjavectorofregionaldummiesinteractedwith
thesizeofschoolj,RCjavectorofregionaldummiesinteractedwiththenumberofclassesina
givengradeinschooljanddefinethevectorRasR=[RD,RS,RC].Conditionalrandomization
ineachgradeimpliesthat
Y00, Y01, Y11 S j, Ci | R (2)
When(2)holds,theeffectsofexternalmonitoringaregivenby
E[Yij |Ci 1,S j 1,R] E[Yij |Ci 0,S j 1,R] E[Y11 Y01 | R] (3)
E[Yij |Ci 0,S j 1,R] E[Yij |Ci 0,S j 0,R] E[Y01 Y00 | R] (4)
E[Yij |Ci 1,S j 1,R] E[Yij |Ci 0,S j 0,R] E[Y11 Y00 | R] (5)
LetXbeavectorofcovariatesattheschool,classandindividual level.Table2showsthe
means and standarddeviations of these covariates (PanelA) aswell as of other covariates
usedinSection7(PanelB)forthesampleoffifthgradersattendingthemathstest.Wetestfor
successfulrandomizationbycheckingwhether thevariables invectorXarebalanced in the
treatmentandcontrolsub‐samples.Althoughwehavedata forsecondandfifthgraders,we
focushereinafteronthelattertosavespace.Someresultsforsecondgradersareshowninthe
Appendix.Totestforbalancingweconsiderbothdifferencesbetweentreatedanduntreated
schools and differences between treated and untreated classeswithin treated schools. For
eachcovariateinvectorXwerun
X j t j RDRSj j (6)
forbetween‐schoolsand
Xij tij R ij (7)
8
forwithin‐school randomization,where t are dummy variables for treatment at the school
andclass level.Table3reportsthepointestimatesof theβcoefficientsandthesignificance
level of the test of the hypothesis H0: β=0 for each covariate in X. Since balancing is not
attained for the number of students enrolled in a class, which is greater among treated
classes,we include this variable as a covariate in all our regressions. Turning to individual
variables, although for some covariateswe detect statistically significant differences across
the various groups, the point estimates show that these differences are virtually zero in
almost all cases. Prudentially, we add these variables as covariates in our regressions to
eliminatetheriskofunbalancingandtoincreaseprecision.
Weevaluatetheeffectsofexternalmonitoringonclassperformanceinthe(maths)testby
estimating
Yij CijS j S j Rj Xij ij (8)
where the dependent variable is the percentage of correct answers in the class and the
standarderrorsarerobustandweightedwiththenumberofstudentsintheclass.Thedirect,
indirectandoveralleffectofexternalmonitoringaregivenby
a) directeffect:
E[Yij |Ci 1,S j 1,R,X] E[Yij |Ci 0,S j 1,R,X] (9)
b) indirecteffect:
E[Yij |Ci 0,S j 1,R,X] E[Yij |Ci 0,S j 0,R,X] (10)
c) totaleffect:
E[Yij |Ci 1,S j 1,R,X] E[Yij |Ci 0,S j 0,R,X] (11)
5.Results
Table4presentsourestimatesof(8)forfifthgradersandthemathstest10.Thefirstcolumn
inthetableconsidersallItalianregions,andtheremainingcolumnsshowestimatesbymacro
area(North,CentreandSouth).Wefindthathavinganexternalexaminerintheclassreduces
the percentage of correct answers by 3.59 percentage points, which corresponds to a 5.5
percent decline with respect to the mean score in untreated schools. Close to 80 percent
10ResultsforItalianandsecondgradersarequalitativelysimilarandareshownintheAppendix.
9
(2.79/3.59)ofthistotaleffectisdirect,andtheremaining20percent(0.81/3.59)isindirect.
AsshowninTableA.2intheAppendix,thetotaleffectissomewhatlargerforsecondgraders
(5.4 percentage points, or 8.5%of the average score in untreated schools). The size of the
total,directandindirecteffectsvarieswiththemacroareaandishighestinSouthernregions
(‐8.9%)andlowestinNorthernItaly(‐2.6%).
Why are test results worse on average in classes with the external examiner? One
possibilityisthatyoungstudentsaredistractedbythepresenceofastrangerintheclassand
under‐performasaconsequence.Theotherpossibility is thateitherstudentsorteachers in
classes without the external examiner engage in outright cheating11. We believe that the
secondoneistheexplanation,forthefollowingthreereasons.First,thereisnoevidencethat
students in classeswith the external examiner are negatively affected in their feelings and
motivation to complete the test properly. In a questionnaire filled up by fifth graders
participatingtothetest inclasseswithandwithouttheexternalexaminer,INVALSIaskeda
set of motivational questions aimed at capturing the psychological status of the students
duringthetest,whichincludedagreeingornotwiththefollowingsentences:a)Iwasalready
anxious before starting the test; b) I was so nervous I couldn’t find the answers; c) while
answering,IfeltlikeIwasdoingbadly;d)whileanswering,Iwascalm.Table5presentsthe
resultsofestimatingequation(8)whenthedependentvariableisthepercentageofstudents
intheclassagreeingwitheachofthefourstatementsabove.Wefindnoevidencethatbeingin
aclasswithanexternalexaminerincreasedanxietyornervousness.Quitetheopposite,there
issomemildevidencethatstudentsintheseclasseswerelessnervousandcalmerduringthe
test.
Second,we examine thedistribution of resultswithin classes. In the absence of external
controls, the teacher can communicate the correct answers to students or change their
answersintheanswersheet,orstudentscansimplycopyfromeachother.Ifoutrightcheating
bystudentsand/orteacherswastakingplaceintheclasseswithouttheexternalexaminer,we
shouldfindthatintheseclasses–ceterisparibus‐thestandarddeviationandthecoefficient
of variation of test results are lower than in classes with the external examiner, where
cheating is minimized or altogether absent. While distraction can reduce average
performance, it isnotobviousthatitreducesitsvariability.Table6showstheeffectsofthe
11Weimplicitlyassumethatexternalexaminershavenoincentivetocheatandtocolludewithschoolteachersandprincipalsinordertoboostschoolresults.Insupportofthisassumption,INVALSI(2010a)usedaprocedureto detect cheating inmonitored classes and concluded that therewas no evidence of cheating. The cheatingdetectionalgorithmisdescribedinINVALSI(2010b).
10
presenceofanexternalexamineronthewithin–classstandarddeviationandcoefficientof
variationofthepercentageofcorrectanswers,aswellasonthebottomquartile,medianand
topquartileofthedistributionoftestscoreswithinclasses.
ThetablefocusesontheresultsofthemathstesttakenbyfifthgradersinSouthernItaly,
where the gap in thepercentageof correct answers betweenmonitored andun‐monitored
classes is largest.We find that inclasseswith theexternalexaminer thestandarddeviation
and the coefficient of variation of results are about 10% and 20% higher than in un‐
monitoredclasses.Thereisalsoevidencethatthepresenceoftheexternalexamineraffectsto
a higher extent the performance of students in the lower quartile of the distribution of
outcomes, in linewith the expectation that cheating typically helps low performers.When
comparedwithstudentsinuntreatedschools,havinganexternalexaminerreducesthescore
ofthesestudentsbya12.9%(‐7.62/59.0).Thiseffect isstrongerforsecondgradestudents,
whereitreachesastriking18.7%.
Lastbutnotleast,wecomputeanindexofheterogeneityinthepatternofanswersgivenby
studentsineachclass.Foreachquestion,weuseamodifiedversionoftheHerfindahlIndex12
H 1 sa
2
a1
A
11
A
. (12)
wheresa is thewithin‐classshareofstudentswhochoseanswer“a” inthesetAofpossible
answers13. The variable H ranges between 0 and 1, with higher values signalling a more
heterogeneouspatternofanswerstoagivenquestion.Weobtainanoverallmeasureof the
heterogeneity of answers in the class by averagingH across all questions in the test. The
lower thismeasure the higher the likelihood that systematic cheating has occurred in the
class. Table 7 reports the estimates of equation (8)when thedependent variable isH, and
showsthatheterogeneityissignificantlyhigherinclasseswiththeexternalexaminer.Wealso
findthat,asinthecaseofthepercentageofcorrectanswersintheclass,theeffectsofexternal
monitoringon theheterogeneityofanswers increasesignificantlymoving fromNorthernto
SouthernItaly(columns(2)to(4)).
12SeeINVALSI(2010b).13Wetreatmissingvaluesasaseparatecategory.Answerstoopenquestionswithaunivocallycorrectanswerwerecodedascorrect,incorrectormissing.
11
An interesting and novel result of our analysis is that external examiners affect
performancenotonlyintheclasstheysupervisebutalsoinotherclassesofthesameschool.
This indirecteffectofmonitoring inschool testshasnotbeendetectedbeforeanddeserves
further explanation. One interpretation is that teachers administering the test in the same
schoolwheretheexternalexaminerispresentareafraidtobemonitoredbythissupervisor
and therefore restrain their cheating activities. This interpretation relies on irrational
behaviour, because teachers were informed before the test that the external examiner’s
mandatewasrestrictedtotherandomlyselectedclass.
An alternative explanation is that teachers dislike excessive dispersion in average test
scoreswithin the same school, because suchdispersion couldgenerate conflictswithother
teachers. To illustrate, consider a schoolwhere a single class is supervised by an external
examiner.Ifteachersadministeringthetestintheotherclassescheatfreely,theseclasseswill
lookmuchbetterthanthesupervisedclass,wherecheatingisrestrained.Thismaygenerate
conflictswiththeteacherinchargeofthesupervisedclass.Toreducetheseconflicts,teachers
inun‐monitoredclassesmaybeinducedtorestraintheircheating.
6.Robustnesschecks
In this sectionwe investigatewhether ourmain results are robust to several sensitivity
checks.First,sincethedependentvariableofourmainestimatesisafraction(thepercentage
of correct answers in the class)we implement the GLM estimator proposed by Papke and
Wooldridge (1996) to dealwith fractional dependent variables. Estimatedmarginal effects,
showninTableA.4intheAppendix,areinlinewiththebaselineestimatesinTable4.
Second,weexploitthecensusnatureofourdataandthefactthatweobservealmostthe
entirepopulationofstudentsineachgradetoapplyafinitepopulationcorrectiontostatistical
inference.Results(TableA.5intheAppendix)arequalitativelyunchangedwithrespecttothe
baseline,butprecisionincreasessignificantly.
Third,wedrop all observable covariatesnot required for balancing. Since assignment to
treatment does not depend on observables, finding differences between the estimates that
includeandexcludecovariatesisasymptomofstrategicmanipulationofthecompositionof
thepoolof testedstudents.Results inTableA.6 in theAppendixdonotprovideanystrong
evidence in this direction. Finally, we test directly for differences in absenteeism across
treatmentstatuses,usingasdependentvariable thepercentageofstudentsabsent fromthe
12
test in each class.Again,differences inbehaviour across the threegroupsareminimal (see
TableA.7intheAppendix).
7.Extensions
Inthisfinalsectionweaskwhethertheeffectsofhavinganexternalexaminervarywitha)
classsize;b)whethertheschoolispublicorprivate;c)thepercentageoftenuredteachersin
theschool;d)anindicatoroftheaverageparentalbackgroundofthestudentsintheclass;e)
measuresofsocialcapitalintheprovincewheretheschoolislocated.Descriptivestatisticsfor
thesevariablesareshowninTable2.
On the one hand, if student cheating is easier in larger classes, we should find that the
overall effect of having an external examiner increaseswith class size. On the other hand,
largerclassescouldincreasethecostofcheatingbyteachersorcouldreducetheeffectiveness
of external supervision. In this case, the overall effect should be smaller in larger classes.
Column(1)inTable8presentsourestimateswhenboththedirectandtheindirecteffectare
interacted with class size14. The evidence suggests that the overall effect of external
supervisionissmallerinlargerclasses,inlinewiththesecondhypothesis.
Column(2)inthetableshowsthattheschooltype–publicorprivate–doesnotinfluence
inastatisticallysignificantwaytheoveralleffectofexternalsupervision.Incontrast,column
(3)showsthatthatboththedirectandtheoveralleffectofexternalmonitoringarehigherin
schools where the percentage of tenured teachers is higher. Typically, these are senior
teacherswithverysecurejobs,whoarelesswillingtoadjusttheirteachingstyletotheneeds
tostandardizedtestsandmaythereforebemorelikelytoengageincheatingandsabotaging.
Column(4)looksattheinteractionsoftheoverall,directandindirecteffectswithESCS,the
indicator of the average parental background in the class. If the incentives to engage in
cheating were higher in classeswith poor parental background, perhaps because teachers
wishtoaltruisticallycompensatetheirstudents fortheirunfavourable initialconditions,we
shouldfindthatthenegativeeffectofexternalsupervisionishigherintheseclasses.Yet,there
isnostatisticalevidencethatthisisthecase15.
Finally, we ask whether the regional differences in the size of the effects of external
monitoring are associated to the differences in the level of social capital each province is
14Inthisandinthefollowingregressionsweincludetheinteractedvariableasanadditionalcontrol.15Onepossibleexplanationisthatnotonlyteachers,butalsoexternalexaminersmaybeinducedtoengageincompensatorybehaviour.
13
endowedwith16.Guiso,SapienzaandZingales(2010)definesocialcapitalasciviccapital,or
as “...those persistent and shared beliefs and values that help a group overcome free rider
outcomes...”(p.8). They report higher levels of social capital in Northern and Central Italy
comparedtotheSouth.
We interact both the direct and the indirect effect of external monitoring with two
measures of social capital at the provincial level taken from Guiso, Sapienza and Zingales
(2004), the number of blood donations per million inhabitants in 1995 and the average
electoral participation in the referenda held in Italy between 1946 and 1987. Since social
capitalisstronglycorrelatedwithlocaleconomicconditions,asshowninFigures1.a‐1.d,we
alsointeractbotheffectswithprovincialGDPpercapitaandunemploymentratesin2009.
ResultsareshowninTable9.Inallregressions,bothsocialcapitalandthemacroeconomic
variablesarere‐scaledtovarybetween0and1.Column(1)ofthetablereportstheestimates
of the baseline model in the sub‐sample of provinces for which data on social capital are
available.Resultsare in linewith thosepresented inTable4.Column(2)and (4) showthe
interactions of the direct, indirect and overall effect of external monitoring with the two
selectedmeasuresofsocialcapital(blooddonationsandturnoutatreferenda).Wefindthat
boththedirectandtheoveralleffectaresmallerinschoolslocatedinprovinceswithahigher
socialcapital.Thisqualitativeresultremainswhenweaddtotheregressionstheinteractions
withprovincialunemploymentandGDPpercapita(columns(3)and(5)),althoughtheeffect
ofsocialcapitalissmaller.
Conclusions
Test‐based accountability systems in education may be gamed by teachers and school
administrators inordertoobtainhighermeasured levelsofperformance.Thispapershows
thathavinganexternalexaminerwhomonitorstestprocedureshasbothdirectandindirect
negativeeffectsonthemeasuredperformanceoftestedclassesandschools.Whilethedirect
effectisonthemonitoredclass,theindirecteffectisontheun‐monitoredclassesofthesame
school.
16Intheirseminalwork,Putnametal.(1993)linksdifferencesintheperformanceoflocalItaliangovernmentstoregional heterogeneity in social capital, measured in terms of local patterns of associationism, newspaperreadership and political participation. Guiso, Sapienza and Zingales (2004) show that social capital is a keydeterminantoffinancialdevelopment,andNannicinietal.(2012)studytheimpactofsocialcapitalonpoliticalaccountability. Finally, Ichino and Maggi (2000) measure civicness in terms of shirking behaviour in theworkplaceanddocumentlargeshirkingdifferentialsbetweenNorthernandSouthernItaly.
14
These results are based on a natural experiment designed by the Italian national test
administrator (INVALSI) tomonitor test procedures in a randomsampleof Italianprimary
schoolclasses.Wehaveusedrandomassignmenttotreatmenttoestimateboththedirectand
indirecteffectsofexternalmonitoring.Theformerisbasedonthecomparisonofmonitored
and un‐monitored classeswithin the same school and the latter on the comparison of un‐
monitoredclassesinschoolswithandwithouttheexternalexaminer.
Theoveralleffect(directplusindirect)ofexternalmonitoringisstatisticallysignificantand
sizeable: depending on the grade, the presence of an external examiner reduces the
percentage of correct answers in the class by 5.5 to 8.5 percentwith respect to classes in
schoolswithnoexternalmonitor.Externalmonitoringspillsovertoun‐monitoredclassesof
thesameschool,butthesizeofthisbeneficialeffectisrathersmall(about20percentofthe
overalleffect).
Using additional supporting evidence on the psychological conditions of students before
andduringthetestandonthedistributionofanswerswithinclasses,wehaveconcludedthat
thebetterperformanceofclasseswithouttheexternalexaminerisduetothemanipulationof
testoutcomesbyteachersand/orstudents,andthattheperformancegapbetweenmonitored
andun‐monitoredclassescanbeinterpretedasameasureoftheaverageintensityofcheating
takingplaceinthelatter.Wehavealsoarguedthatthespill‐overeffectofexternalmonitoring
on un‐monitored classes of the same school can be explained either by (irrational) fear of
supervisionorbyamodelwhere rational teachersadministering the testsdislikeexcessive
dispersionoftestresultswithintheschool.
15
TABLESANDFIGURES
Table 1. Total and Sampled Number of Schools, Classes and Students. INVALSI SNV Test2009/2010
Numberofschools(total)
Numberofclasses(total)
Numberof
students(total)
Numberofsampledschools
Numberofsampledclasses
Numberofsampledstudents
SecondGrade
7,700 30,175 555,347 1,385 2,000 39,299
FifthGrade 7,700 30,476 565,064 1,385 2,000 39,643
16
Table2.MeanandStandardDeviationofCovariates‐Mathstests‐VgradersPanelA Mean StDev Mean StDevGender Motheroccupation Missing(%) 0.01
0.10Missing(%) 0.20 0.40
Male(%) 0.50 0.50 Unemployedorretired(%)
0.35 0.48
Placeofbirth Employee(%) 0.31 0.46Missing(%) 0.04 0.20 Entrepreneur(%) 0.08 0.28Italy(%) 0.89 0.31 Middlemanager(%) 0.06 0.23Citizenship Fatheroccupation Missing(%) 0.02 0.15 Missing(%) 0.22 0.41Italian(%) 0.89 0.32 Unemployedorretired
(%)0.04 0.19
Firstgenerationforeigner(%)
0.05 0.22 Employee(%) 0.39 0.49
Secondgenerationforeigner(%)
0.04 0.20 Entrepreneur(%) 0.25 0.43
Pre‐primaryschool Middlemanager(%) 0.11 0.31Missing(%) 0.15 0.35 Mothereducation Yes(%) 0.83 0.37 Missing(%) 0.21 0.41Age Primary(%) 0.39 0.49Missing(%) 0.01 0.10 Secondary(%) 0.29 0.45Olderthanregular(%) 0.03 0.16 Tertiary(%) 0.11 0.32Regular(%) 0.87 0.33 Fathereducation Youngerthanregular(%) 0.09 0.29 Missing(%) 0.22 0.42Mathsgradeinprevioussemester(range:1‐10)
Primary(%) 0.43 0.49
Missing(%) 0.07 0.26 Secondary(%) 0.25 0.431‐4(%) 0.00 0.04 Tertiary(%) 0.10 0.305(%) 0.04 0.20 Mothernationality 6‐7(%) 0.38 0.48 Missing(%) 0.09 0.288‐10(%) 0.51 0.50 Italian(%) 0.80 0.40Italiangradeinprevioussemester(range:1‐10)
Fathernationality
Missing(%) 0.07 0.25 Missing(%) 0.09 0.291‐4(%) 0.00 0.04 Italian(%) 0.82 0.395(%) 0.04 0.19 Privateschool 0.05 0.236‐7(%) 0.41 0.49 Fulltimescheduleclass 0.23 0.428‐10(%) 0.48 0.50 Numberofstudents
enrolledinclass19.00 4.65
(continues)
17
(continued) Mean StDev Mean StDevHasownbedroom Numberofsiblings Missing(%) 0.03 0.17 Missing(%) 0.02 0.15Yes(%) 0.55 0.50 0(%) 0.15 0.36Hasinternetaccess 1(%) 0.55 0.50Missing(%) 0.03 0.16 2(%) 0.20 0.40Yes(%) 0.76 0.43 3 (%) 0.05 0.21Hasanencyclopedia 4ormore(%) 0.03 0.17Missing(%) 0.03 0.16 Liveswith Missing(%) 0.71 0.46 Missing(%) 0.02 0.15Hasowndesk Bothparents(%) 0.86 0.35Missing(%) 0.02 0.15 Oneparentonly(%) 0.06 0.24Yes(%) 0.85 0.36 Bothparents
alternatively(%)0.05 0.22
HasaPC Others(%) 0.01 0.08Missing(%) 0.03 0.16 Languagespokenat
home
Yes(%) 0.75 0.43 Missing(%) 0.04 0.21Hasaplaceforhomework Italian(%) 0.73 0.44Missing(%) 0.03 0.16 Dialect(%) 0.15 0.36Yes(%) 0.84 0.37 Other(%) 0.07 0.25Numberofbooksathome Helpwithhomework Missing(%) 0.04 0.20 Missing(%) 0.07 0.260‐10(%) 0.12 0.33 Nohomework(%) 0.01 0.0711‐25(%) 0.25 0.43 Nohelpneeded(%) 0.20 0.4026‐100(%) 0.31 0.46 Parents(%) 0.45 0.50101‐200(%) 0.15 0.36 Siblings(%) 0.12 0.32>200(%) 0.12 0.33 Privateteacher(%) 0.03 0.17 Other(%) 0.04 0.20 Noone(%) 0.09 0.28PanelB Mean St.Dev. Mean St.Dev. Blooddonations 0.03 0.21Tenuredteachersintheschool(%)
90.33 9.13 Averageturnoutatreferenda(%)
80.28 8.37
ClassaverageESCSindex ‐0.045 0.51 Provincial unempl. rate(2009)
7.95 3.69
Classsize 16.93 4.64 ProvincialpercapitaGDP(2009)
23.84 5.60
Notes:Thetablereportsthemeanandstandarddeviationofthecovariates includedintheregressions.Thesestatisticsarebasedonindividual,schoolandclassleveldata.Exceptforthenumberofstudentsenrolledineachclass,allPanelAvariableshavebeencategorizedasdummyvariables.Classsizereferstothenumberofstudentsattendingthetest.Blooddonationsarethenumberofbloodbagspermillionofinhabitantsintheprovince.PercapitaGDPismeasuredinthousandsofeuro.SeetheAppendixforfurtherdetails.
18
Table3‐BalancingTests.First(betweenschools)andsecondstage(withinschool)randomization.‐Mathstests‐Vgraders.PanelA Between
schoolsWithinschool
Betweenschools
Withinschool
Privateschool(%) 0.003 Motheroccupation Fulltimeschedule(%) 0.015 0.011 Missing(%) ‐0.010 ‐0.023***Numberofstudentsenrolledinclass 0.039 0.425*** Unemployedorretired(%) 0.005 0.010*Gender Employee(%) 0.002 0.004Missing(%) 0.005*** 0.021*** Entrepreneur(%) 0.001 0.005**Male(%) ‐0.003 ‐0.007* Middlemanager(%) 0.002 0.003Placeofbirth Fatheroccupation Missing(%) ‐0.014*** ‐0.025*** Missing(%) ‐0.010 ‐0.023***Italy(%) 0.014*** 0.026*** Unemployedorretired(%) ‐0.001 0.000Citizenship Employee(%) 0.000 0.014**Missing(%) ‐0.008*** ‐0.011*** Entrepreneur(%) 0.010** 0.007Italian(%) 0.008** 0.010*** Middlemanager(%) 0.002 0.003Firstgenerationforeigner(%) ‐0.001 0.000 Mothereducation Secondgenerationforeigner(%) 0.002 0.002 Missing(%) ‐0.011 ‐0.029***Pre‐primaryschool Primary(%) 0.006 0.018***Missing(%) ‐0.019** ‐0.008 Secondary(%) 0.002 0.010**Yes(%) 0.018** 0.008 Tertiary(%) 0.002 0.001Age Fathereducation Missing(%) 0.006*** 0.019*** Missing(%) ‐0.011 ‐0.026***Olderthanregular(%) 0.000 0.001 Primary(%) 0.011 0.016**Regular(%) ‐0.008*** ‐0.016*** Secondary(%) ‐0.001 0.008*Youngerthanregular(%) 0.001 ‐0.003 Tertiary(%) 0.001 0.002Mathsgrade Mothernationality Missing(%) ‐0.023*** ‐0.008 Missing(%) ‐0.015** ‐0.012**1‐4(%) 0.000 ‐0.001** Italian(%) 0.013** 0.011*5(%) 0.001 0.000 Fathernationality 6‐7(%) 0.010** 0.007 Missing(%) ‐0.015** ‐0.011**8‐10(%) 0.012** 0.002 Italian(%) 0.013** 0.008Italiangrade Missing(%) ‐0.023*** ‐0.006 1‐4(%) 0.000 0.000 5(%) 0.000 0.001 6‐7(%) 0.007 0.004 8‐10(%) 0.015** 0.002 PanelB Between
schoolsWithinschool
Betweenschools
Withinschool
Hasownbedroom Numberofsiblings Missing(%) ‐0.007*** ‐0.008*** Missing(%) ‐0.007*** ‐0.007***Yes(%) ‐0.002 0.004 0(%) 0.000 0.000Hasinternetaccess 1(%) 0.004 0.006*Missing(%) ‐0.006** ‐0.007*** 2(%) 0.002 0.001Yes(%) 0.006* 0.007** 3(%) 0.001 0.000Hasanencyclopedia 4ormore(%) 0.000 0.000Missing(%) ‐0.006** ‐0.006*** Liveswith Yes(%) 0.008* 0.015*** Missing(%) ‐0.008*** ‐0.008***Hasowndesk Bothparents(%) 0.008*** 0.006*Missing(%) ‐0.006** ‐0.006*** Oneparentonly(%) 0.000 0.001Yes(%) 0.004 0.007** Bothparentsalternatively(%) 0.000 0.002HasaPC Others(%) 0.000 0.000Missing(%) ‐0.006** ‐0.006*** Languagespokenathome Yes(%) 0.008*** 0.010*** Missing(%) ‐0.008*** ‐0.008***Hasaplaceforhomework Italian(%) 0.003 0.006Missing(%) ‐0.006** ‐0.006*** Dialect(%) 0.004 0.002Yes(%) 0.007*** 0.006* Other(%) 0.001 0.000Numberofbooksathome Helpwithhomework Missing(%) ‐0.008*** ‐0.007*** Missing ‐0.007*** ‐0.005*0‐10(%) ‐0.001 0.000 Nohomework(%) ‐0.001** ‐0.001***11‐25(%) ‐0.003 0.000 Nohelpneeded(%) ‐0.002 0.006*26‐100(%) 0.001 0.005 Parents(%) 0.005* 0.000101‐200(%) 0.005** 0.003 Siblings(%) 0.003** ‐0.001>200(%) 0.006** 0.000 Privateteacher(%) ‐0.001 0.001 Other(%) 0.002** ‐0.001 Noone(%) 0.000 0.002Notes: the table shows the point estimates of the balancing tests between andwithin schools.We compute school or class averages ofindividual variables and test for balancing using regressions (5) and (6). Full time schedule refers to schools offering this option in thebetween schools analysis and to the schedule of the single class in thewithin school analysis. The variables in Panel A are available forstudentsinbothgrades.ThevariablesinPanelBareavailableonlyforfifthgradestudents.Allregressionsareweightedbythenumberofstudentsineachschoolorclass.Robuststandarderrors.One,twoandthreestarsforstatisticalsignificanceatthe10,5and1percentlevel.
19
Table4.WeightedOLSestimatesoftheEffectsofExternalMonitoring.Mathstests–Vgrade.Dependentvariable:PercentageofCorrectAnswersintheClass. (1) (2) (3) (4) Italy North Centre South DirectEffectx100 ‐2.79*** ‐0.99*** ‐2.27*** ‐4.92*** (0.27) (0.30) (0.52) (0.56)IndirectEffectx100 ‐0.81*** ‐0.70*** ‐0.73** ‐1.04*** (0.19) (0.20) (0.34) (0.40)OverallEffectx100 ‐3.59*** ‐1.69*** ‐2.99*** ‐5.96*** (0.250) (0.276) (0.484) (0.501) Observations 27,325 11,541 4,886 10,898R‐squared 0.97 0.99 0.98 0.96 CovariatesinTable2PanelA
Yes Yes Yes Yes
Mean‐UntreatedSchoolsx100
65.1 63.9 64.0 66.8
Notes: all regressions include the number of students enrolled in the class, regional dummies and regionaldummies interactedwith school size andwith the number of fifth grade classes in the school. Estimates areweighted by class size. Robust standard errors in parentheses. One, two and three stars for statisticalsignificanceatthe10,5and1percentlevelofconfidence.
20
Table 5. Weighted OLS estimates of the Effects of External Monitoring on StudentPsychologicalConditions.Maths tests–Vgrade.Dependentvariable:PercentageofPositiveAnswersintheClass. (1) (2) (3) (4)
DirectEffectx100 0.25 ‐0.92*** ‐0.08 0.64 (0.43) (0.29) (0.41) (0.41)IndirectEffectx100 0.25 0.01 0.36 ‐0.01 (0.28) (0.19) (0.26) (0.26)OverallEffectx100 0.50 ‐0.90*** 0.28 0.63* (0.39) (0.272) (0.38) (0.37) Observations 27,141 27,142 27,141 27,140R‐squared 0.94 0.75 0.92 0.93 CovariatesinTable2panelA
Yes Yes Yes Yes
Mean‐UntreatedSchoolsx100
61.0 19.2 50.7 53.1
Notes:seeTable4.Ineachcolumn,thedependentvariableisthepercentageofstudentsintheclasswhoagreewiththefollowingsentences:1)Iwasalreadyanxiousbeforestartingthetest;2)IwassonervousIcouldn’tfindthe answers;3)while answering, I felt like Iwasdoingbadly; 4)while answering, Iwas calm. Studentswithmissinganswershavebeendropped fromtheestimationsample (about2percentof the total).Theestimatesrefertotheentirecountry.
21
Table 6. Weighted OLS estimates of the Effects of External Monitoring on the StandardDeviation,theCoefficientofVariationandtheQuartilesoftheDistributionofCorrectAnswerswithintheClass.SouthernItaly‐Mathstests–Vgrade. (1) (2) (3) (4) (5) Standard
DeviationCoefficient
ofVariation
Firstquartile
Secondquartile
Thirdquartile
DirectEffectx100 1.34*** 3.82*** ‐6.72*** ‐5.57*** ‐4.18*** (0.17) (0.39) (0.66) (0.62) (0.56)IndirectEffectx100 0.13 0.47* ‐0.90* ‐0.91** ‐0.93** (0.12) (0.26) (0.48) (0.44) (0.39)OverallEffectx100 1.47*** 4.30*** ‐7.62*** ‐6.48*** ‐5.10*** (0.16) (0.37) (0.59) (0.56) (0.51) Observations 10,898 10,898 10,898 10,898 10,898R‐squared 0.88 0.83 0.92 0.95 0.97CovariatesinTable2panelA Yes Yes Yes Yes Yes Mean‐UntreatedSchoolsx100 12.2 20.1 59.0 67.6 75.4Notes:seeTable4.
22
Table7.WeightedOLSestimatesoftheEffectsofExternalMonitoringontheHeterogeneityofAnswersineachClass.Mathstests–Vgrade.DependentVariable:AverageHerfindhalIndexinEachClassx100. (1) (2) (3) (4) Italy North Centre South DirectEffectx100 4.35*** 1.46*** 2.99*** 8.00*** (0.37) (0.38) (0.68) (0.77)IndirectEffectx100 1.08*** 0.86*** 0.95** 1.50*** (0.26) (0.27) (0.48) (0.57)OverallEffectx100 5.43*** 2.32*** 3.94*** 9.50*** (0.34) (0.35) (0.63) (0.69) Observations 27,325 11,541 4,886 10,898R‐squared 0.94 0.98 0.96 0.88CovariatesinTable2panelA Yes Yes Yes Yes Mean‐UntreatedSchoolsx100 57.3 61.8 60.1 51.4Notes:seeTable4.
23
Table 8. Heterogeneous Effects of External Monitoring. Maths tests – V grade. Dependentvariable: Percentage of Correct Answers in the Class. Interactions of direct, indirect andoveralleffectswithclasssize,schooltype,%tenuredteachersandaverageESCS. (1) (2) (3) (4) Classsize PrivateSchool %TenuredTeachers ESCS DirectEffectx100 ‐3.41*** ‐2.79*** ‐1.34*** ‐2.65*** (0.43) (0.275) (0.33) (0.35)DirectEffect–Interactionx100 0.98* 0.132 ‐2.98*** ‐0.15 (0.55) (2.70) (0.55) (0.55)IndirectEffectx100 ‐0.94*** ‐0.815*** ‐0.66*** ‐0.67*** (0.28) (0.190) (0.22) (0.23)IndirectEffect–Interactionx100 0.22 0.498 ‐0.19 ‐0.30 (0.35) (2.17) (0.37) (0.36)OverallEffectx100 ‐4.35*** ‐3.60*** ‐2.00*** ‐3.32*** (0.003) (0.30) (0.29) (0.32)OverallEffect–Interactionx100 1.20** 0.60 ‐3.17*** ‐0.45 (0.48) (1.80) (0.49) (0.48) Observations 27,325 27,325 26,313 27,323R‐squared 0.97 0.97 0.97 0.97CovariatesinTable2panelA Yes Yes Yes Yes Mean‐UntreatedSchoolsx100 65.1 65.1 64.9 65.1Notes: In each regression the interacted variable enters also separately. Class size, proportion of tenuredteachersandclassESCSarecodedasdummyvariablestakingvalueonewhenabovethemedianandzerowhenbelow(forclassESCS,thedummytakesvalueonewhenbelowmedianandzerowhenabove).Theproportionoftenuredteachersisnotavailableforprivateschools(729classes),forthepublicschoolslocatedintheProvinceofTrento (263classes) and for fiveSicilianpublic schoolswhodidnot transmit the information (20 classes).AverageESCSisnotavailablefor2classes.Allregressionsincludethenumberofstudentsenrolledintheclass,regionaldummiesandregionaldummiesinteractedwithschoolsizeandwiththenumberoffifthgradeclassesintheschool.Estimatesareweightedbyclasssize.Robuststandarderrorsinparentheses.One,twoandthreestarsforstatisticalsignificanceatthe10,5and1percentlevelofconfidence.
24
Table 9. Weighted OLS estimates of the Effects of External Monitoring, with InteractionsbetweenSocialCapitalandExternalMonitoring.Maths tests–Vgrade.Dependentvariable:PercentageofCorrectAnswersintheClass (1) (2) (3) (4) (5)
Baseline
InteractedwithBloodDonations
InteractedwithBloodDonations
andMacroVariables
Interactedwith
Turnoverat
Referenda
InteractedwithTurnoveratReferenda&
MacroVariables DirectEffectx100 ‐2.78*** ‐4.76*** ‐3.47** ‐7.13*** ‐4.73** (0.28) (0.48) (1.47) (0.84) (1.99)InteractedDirectEffectx100 8.54*** 4.33*** 7.39*** 4.20** (1.36) (1.46) (1.17) (2.00)IndirectEffectx100 ‐0.82*** ‐0.68** ‐1.34 ‐1.05* ‐1.27 (0.19) (0.33) (1.04) (0.59) (1.44)InteractedIndirectEffectx100 ‐0.65 ‐1.35 0.42 ‐0.55 (0.90) (1.00) (0.81) (1.41)Overalleffectx100 ‐3.60*** ‐5.44*** ‐4.82*** ‐8.19*** ‐5.99*** (0.25) (0.43) (1.30) (0.73) (1.78)InteractedOverallEffectx100 7.89*** 2.99** 7.81*** 3.65** (1.19) (1.31) (1.01) (1.81) Observations 27,178 27,178 27,178 27,178 27,178R‐squared 0.97 0.97 0.97 0.97 0.97CovariatesinTable2panelA Yes Yes Yes Yes Yes Mean‐UntreatedSchoolsx100 65.1 65.1 65.1 65.1 65.1Notes:Interactedeffectsrefertotheinteractionsbetweendirect,indirectandoveralleffectsandthemeasureofsocial capital listed at the topof each column. Eachmeasure enters also as an independent covariate in eachregression.SocialcapitalmeasuresarenotavailablefortheprovincesofBellunoandIsernia(147classes).PercapitaGDPandunemploymentratesintheprovinceenterincolumns(3)and(5)bothasindependentcovariatesand interactedwith eacheffect.All regressions include thenumberof studentsenrolled in the class, regionaldummiesand regionaldummies interactedwith school sizeandwith thenumberof fifthgrade classes in theschool.Estimatesareweightedbyclasssize.Robuststandarderrorsinparentheses.One,twoandthreestarsforstatisticalsignificanceatthe10,5and1percentlevelofconfidence.
25
Figure 1. Geographical distribution of Blood Donations, Average Turnout at Referenda, theUnemploymentRateandGDPpercapitaintheItalianProvinces.
a. b.
c. d. Notes: Panel a) number of blood donations per million of inhabitants in capita in 1995. Panel b) averageturnoveratthereferendathattookplacebetween1946and1989.Panelc)Unemploymentratein2009.Paneld)GDPpercapitain2009.Thedataareorderedbyquintiles,withdarkercoloursreferringtothetopquintileofthedistribution.
26
Appendix
1) TablesTableA.1.WeightedOLSestimatesoftheEffectsofExternalMonitoring.Italiantests–Vgrade.Dependentvariable:PercentageofCorrectAnswersintheClass. (1) (2) (3) (4) Italy North Centre South DirectEffectx100 ‐2.61*** ‐1.03*** ‐2.17*** ‐4.39*** (0.21) (0.23) (0.43) (0.42)IndirectEffectx100 ‐0.67*** ‐0.38** ‐0.81*** ‐0.99*** (0.15) (0.17) (0.28) (0.31)OverallEffectx100 ‐3.28*** ‐1.41*** ‐2.98*** ‐5.37*** (0.194) (0.205) (0.393) (0.381) Observations 27,369 11,557 4,894 10,918R‐squared 0.98 0.99 0.99 0.97CovariatesinTable2panelA Yes Yes Yes Yes Mean‐UntreatedSchoolsx100 70.0 70.2 70.1 69.7Notes:seeTable4.TableA.2.WeightedOLSestimatesoftheEffectsofExternalMonitoring.Mathstests–IIgrade.Dependentvariable:PercentageofCorrectAnswersintheClass. (1) (2) (3) (4) Italy North Centre South DirectEffectx100 ‐4.20*** ‐1.57*** ‐3.09*** ‐7.50*** (0.30) (0.34) (0.55) (0.61)IndirectEffectx100 ‐1.22*** ‐0.91*** ‐1.37*** ‐1.53*** (0.22) (0.25) (0.42) (0.47)OverallEffectx100 ‐5.42*** ‐2.48*** ‐4.47*** ‐9.03*** (0.274) (0.312) (0.501) (0.547) Observations 27,012 11,724 4,905 10,383R‐squared 0.96 0.98 0.97 0.94CovariatesinTable2panelA Yes Yes Yes Yes Mean‐UntreatedSchoolsx100 62.9 59.9 61.8 66.7Notes:seeTable4.
27
TableA.3.WeightedOLSestimatesoftheEffectsofExternalMonitoring.Italiantests–IIgrade.Dependentvariable:PercentageofCorrectAnswersintheClass. (1) (2) (3) (4) Italy North Centre South DirectEffectx100 ‐3.40*** ‐1.36*** ‐2.17*** ‐6.21*** (0.28) (0.34) (0.55) (0.54)IndirectEffectx100 ‐1.04*** ‐0.71*** ‐1.25*** ‐1.33*** (0.20) (0.24) (0.39) (0.41)OverallEffectx100 ‐4.44*** ‐2.07*** ‐3.42*** ‐7.54*** (0.250) (0.301) (0.501) (0.481) Observations 27,025 11,721 4,911 10,393R‐squared 0.971 0.98 0.98 0.96CovariatesinTable2panelA Yes Yes Yes Yes Mean‐UntreatedSchoolsx100 65.9 65.0 66.2 66.7Notes:seeTable4.TableA.4.GLMestimatesoftheEffectsofExternalMonitoring.Mathstests–Vgrade.Dependentvariable:PercentageofCorrectAnswersintheClass. (1) (2) (3) (4) Italy North Centre South DirectEffectx100 ‐2.74*** ‐0.95*** ‐2.27*** ‐4.74*** (0.27) (0.30) (0.51) (0.53)IndirectEffectx100 ‐0.79*** ‐0.76*** ‐0.73** ‐0.96** (0.19) (0.20) (0.34) (0.39)OverallEffectx100 ‐3.53*** ‐1.71*** ‐3.00*** ‐5.70*** (0.242) (0.271) (0.472) (0.469) Observations 27,325 11,541 4,886 10,898CovariatesinTable2panelA Yes Yes Yes Yes Mean‐UntreatedSchoolsx100 65.1 63.9 64.0 66.8Notes:seeTable4.
28
TableA.5.OLSestimatesoftheEffectsofExternalMonitoring.Mathstests–Vgrade.Dependentvariable:PercentageofCorrectAnswersintheClass.FinitePopulationCorrection. (1) (2) (3) (4) Italy North Centre South DirectEffectx100 ‐2.89*** ‐1.08*** ‐2.35*** ‐5.05*** (0.09) (0.10) (0.17) (0.17)IndirectEffectx100 ‐0.83*** ‐0.71*** ‐0.70*** ‐1.06*** (0.06) (0.07) (0.11) (0.12)OverallEffectx100 ‐3.72*** ‐1.79*** ‐3.05*** ‐6.11*** (0.0789) (0.0880) (0.155) (0.155) Observations 27325 11541 4886 10898R‐squared 0.97 0.99 0.98 0.96CovariatesinTable2panelA Yes Yes Yes Yes Mean‐UntreatedSchoolsx100 65.1 63.9 64.0 66.8Notes: Population size: 30.310 classes. All regressions include the number of students enrolled in the class,regionaldummiesandregionaldummiesinteractedwithschoolsizeandwiththenumberoffifthgradeclassesintheschool.Robuststandarderrorsinparentheses.One,twoandthreestarsforstatisticalsignificanceatthe10,5and1percentlevelofconfidence.TableA.6.WeightedOLSestimatesoftheEffectsofExternalMonitoring.Mathstests–Vgrade.Dependentvariable:PercentageofCorrectAnswersintheClass.WithoutCovariates. (1) (2) (3) (4) Italy North Centre South DirectEffectx100 ‐2.82*** ‐0.85** ‐2.04*** ‐5.29*** (0.29) (0.33) (0.55) (0.58)IndirectEffectx100 ‐0.70*** ‐0.82*** ‐0.46 ‐0.70* (0.20) (0.23) (0.37) (0.42)OverallEffectx100 ‐3.52*** ‐1.68*** ‐2.50*** ‐5.99*** (0.264) (0.302) (0.499) (0.527) Observations 27,325 11,541 4,886 10,898R‐squared 0.97 0.98 0.98 0.95CovariatesinTable2panelA No No No No Mean‐UntreatedSchoolsx100 65.1 63.9 64.0 66.8Notes:seeTable4.
29
TableA.7.WeightedOLSestimatesoftheEffectsofExternalMonitoring.Mathstests–Vgrade.Dependentvariable:PercentageAbsentfromtheTest (1) (2) (3) (4) Italy North Centre South DirectEffectx100 ‐0.53** ‐0.50 ‐0.47 ‐0.55 (0.25) (0.41) (0.47) (0.41)IndirectEffectx100 ‐0.10 0.44 ‐0.44 ‐0.51 (0.18) (0.28) (0.35) (0.32)OverallEffectx100 ‐0.63*** ‐0.06 ‐0.91** ‐1.06*** (0.22) (0.36) (0.42) (0.36) Observations 27,325 11,541 4,886 10,898R‐squared 0.56 0.58 0.61 0.53CovariatesinTable2panelA No No No No Mean‐UntreatedSchoolsx100 11.0 10.4 11.7 11.4Notes:seeTable4.
30
2)Fromtheinitialdatasettothefinalsample
Ourdataaredrawnfromthe2009/2010waveoftheINVALSISNVsurveyofeducational
achievementsinItalianprimaryschools.ThesedataarefreelyavailablefromINVALSI.Inthis
sectionoftheappendixwebrieflydescribeourdatahandling.
1) WeexcludeValled’AostaandtheProvinceofBolzano,becauseallclassesinthese
areaswereassignedtoexternalmonitoring.
2) Wedropschoolswherethereisadifferentnumberofsecondandfifthgradeclasses
assigned to monitoring, because this outcome is inconsistent with the sampling
scheme.
3) We drop classeswith less than five students and schoolswith a single class per
gradeorwithtwoclassesifbothwereassignedtomonitoring.
To illustrate theeffectsof theseactions,weconsider themaths test for fifthgraders.For
thisgroup,thepopulationconsistsof7,700schools,30,476classesand565,064students.Our
initial dataset includes 7,541 schools, 29,811 classes and 491,421non‐disabled students in
schoolswithmore than tenstudents (smaller schoolsareexcluded fromtesting)whowere
presentduringthetestingday.DroppingdatafortheprovincesofAostaandBolzanoreduces
thetotalnumberofschoolsto7,502,with29,647classesand489,396students.Eliminationof
treatedschoolswherethereisadifferentnumberofsecondandfifthgradeclassesleavesus
with489,126studentsallocatedin29,629classesof7,498schools.Purgingoutclasseswith
less than 5 students leaves uswith 28,677 classes in 7,452 schools and a total of 486,531
students.Afterdroppingschoolswithasingleclassinthegradeorwithtwoclassesifbothare
treatedweobtainourestimationsample,whichconsistsof6,108schools,27,325classesand
462,570students.
3)Otherdata
UnemploymentandpercapitaGDPdatarefertoyear2009andaredrawnfromEUROSTAT
regionalstatisticsdatabase.Dataonblooddonationsandtheaverageturnoutatreferendaare
fromGuiso,SapienzaandZingales(2004).Theoriginaldatahavebeenre‐classifiedtomatch
INVALSIclassification,whichincludes103provinces
31
ReferencesAngelucci, M. and De Giorgi, G., 2009. Indirect Effects of an Aid Program: How Do Cash
TransfersAffectIneligibles'Consumption?AmericanEconomicReview,99(1),pp.486‐508.
Bokhari,F.A.S.andSchneider,H.,2011.SchoolAccountabilityLawsandtheConsumptionof
Psychostimulants.JournalofHealthEconomics,30(2),pp.355‐372.
Cullen, J.B. and Reback, R., 2006. Tinkering Toward Accolades: School Gaming under a
Performance Accountability System. In: Gronberg, T.J. and Jansen, D.W. (eds.),Advances in
AppliedMicroeconomics,14,pp.1‐34.
Figlio,D.N.,2006.Testing,CrimeandPunishment.JournalofPublicEconomics,90(4),pp.837‐
851.
Figlio,D.N.andGetzler,S.G,2006.Accountability,AbilityandDisability:GamingtheSystem.
In:Gronberg,T.J.andJansen,D.W.(eds.),AdvancesinAppliedMicroeconomics,14,pp.35‐49
Figlio, D. N. and Loeb, S., 2011. School Accountability. In: Hanushek, E. A., Machin, S. and
Woessmann,L.(eds.),HandbookoftheEconomicsofEducation,3,pp.383‐421.
Figlio,D.N.,Winicki, J.,2005.Foodforthought: theeffectsofschoolaccountabilityplanson
schoolnutrition,JournalofPublicEconomics,89(2),pp.381‐394.
Guiso, L., Sapienza, P. and Zingales, L., 2004. The Role of Social Capital in Financial
Development.AmericanEconomicReview,94(3),pp.526‐556.
Guiso,L.,Sapienza,P.andZingales,L.,2010.CivicCapitalastheMissingLink.NBERworking
Paper15845.
Heckman,J.J.,Lalonde,R.J.andSmith,J.A.,1999.TheEconomicsandEconometricsofActive
Labor Market Programs. In: Ashenfelter, O. C. and Card, D. (eds.), Handbook of Labor
Economics,3(1),pp.1865‐2097.
Hussain, I., 2012. Subjective Performance Evaluation in the Public Sector: Evidence from
SchoolInspections.CEEDiscussionPaper135,LondonSchoolofEconomics.
32
Ichino, A. and Maggi, G. 2000. Work Environment and Individual Background: Explaining
RegionalShirkingDifferentialsinaLargeItalianFirm”.QuarterlyJournalofEconomics,115(3),
pp.1057‐1090.
INVALSI, 2010a. Sistema Nazionale di Valutazione – A.S. 2009/2010 ‐ Rilevazione degli
apprendimenti.
INVALSI, 2010b. Esami di Stato Primo Ciclo – A.S. 2009/2010 – Prova Nazionale. Prime
Analisi.
Jacob,B.A.,2005.Accountability,incentivesandbehavior:theimpactofhigh‐stakestestingin
theChicagoPublicSchools.JournalofPublicEconomics,89(5),pp.761‐796.
Jacob, B. A. and Levitt, S., 2003. Rotten Apples: An Investigation of the Prevalence and
PredictorsofTeacherCheating.QuarterlyJournalofEconomics,118(3),pp.843‐77.
Miguel,E.andKremer,M.,2004.Worms:IdentifyingImpactsonEducationandHealthinthe
PresenceofTreatmentExternalities.Econometrica,72(1),pp.159‐217.
Nannicini, T. et al., 2012.Social Capital and Political Accountability. American Economic
Journal:EconomicPolicy,forthcoming.
Papke, L. E. and Wooldridge, J. M., 1996. Econometric Methods for Fractional Response
Variables with an Application to 401(K) Plan Participation Rates. Journal of Applied
Econometrics,11(6),pp.619‐32.
Putnam,R.D. etal., 1993.MakingDemocracyWork:CivicTraditions inModern Italy.1sted.
Princeton,NJ:PricetonUniversityPress.