Slide 1 AEREC Research Workshop Series (Oct. 29, 2008)

Identifying and Measuring the Effect of Firm Clusters Among Certified Organic Processors and Handlers

Edward C. Jaenicke, Stephan J. Goetz, Ping-Chao Wu, Carolyn Dimitri (USDA-ERS)


Slide 2

Format/Outline:

1. Formulation of Research Idea
2. Preliminary Analysis: Data and estimation check, and decision to go forward
3. Research Plan: Double check theory, proceed with empirics
4. Focus on Methods: What econometric model should we use?
5. Mistakes Along the Way: Econometrics
6. Turning the Results Into a Paper: Revisit theory and other thorny issues
7. Improving the Paper: Future research (hopefully)

Slide 3

1. Research Idea Formation

Goetz volunteers to fund a grad student for the summer IF we can find a topic that fits the Northeast Regional Center’s program.
Jaenicke and Goetz: ongoing projects, available datasets, research interests
Goetz: Can you analyze the impact of “clusters” of certified organic handling firms on firm behavior?
Jaenicke: I don’t know. (But I’ll check.)

Slide 4

2. Preliminary Analysis: Data and estimation check, and decision to go forward

Slide 5

3. Research Plan: Double check theory, proceed with empirics

Decisions: Theoretical contribution, empirical contribution, or a mixture? Proceed?
Prior: Existing literature shows very few studies measuring the impacts of clusters on firm performance

Lit review:
  Confirms prior, though some related empirical examples exist
  More empirical literature on cluster formation (not cluster impact)

Bottom line:
  The link between firm agglomeration theory and measurement of agglomeration impact is not well established.
  Difficult to build on theory.
  On the other hand, there is an empirical gap.
  If we could empirically measure the impact of firm clusters (agglomeration), we would have a paper.
  I.e.: Proceed

Slide 6

3. (Continued) Preliminary Empirics

Lots of empirical choices and decisions:

1. How should firm agglomerations or clusters be defined? Issues to consider?
  As a continuous variable – e.g., industrial intensity?
  As a binary variable – e.g., presence of a minimum # of firms within an area?
    Area?
    Minimum #?
  Answers:
    Cluster as binary variable because we have micro (firm-level) data.
    But stay flexible on the Minimum # and Area (i.e., try both county and zip code)

2. What variables should be included in a “cluster impact” equation, and how would they be justified? “Impact” on what variable(s)?
  Answers:
    Draw from available survey data.
    Plus Census data?
    Proceed by considering RHS variables as controlling factors
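
The binary cluster definition discussed on this slide could be sketched as follows. This is only an illustration under stated assumptions: `cluster_indicator` is a hypothetical helper, the county codes are made up (not survey data), and a firm is assumed to count toward its own area's total.

```python
from collections import Counter

def cluster_indicator(firm_areas, n):
    """Return a 0/1 flag per firm: 1 if the firm's area holds at least n firms.

    Assumption: the firm itself counts toward the area total.
    """
    counts = Counter(firm_areas)
    return [1 if counts[area] >= n else 0 for area in firm_areas]

# Hypothetical county codes for seven firms (illustrative only).
firm_counties = ["A", "A", "A", "B", "B", "C", "A"]

c3 = cluster_indicator(firm_counties, 3)  # county A holds 4 firms
c5 = cluster_indicator(firm_counties, 5)  # no county holds 5 firms
print(c3)
print(c5)
```

Staying flexible on the minimum # and the area then amounts to calling the same function with a different `n`, or with zip codes in place of county codes.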

Slide 7

3. (Continued) Preliminary Econometric Model

(1) $y_{ji} = \delta C_{n,i} + x_i \beta + \varepsilon_i$,

$y_j$ is a firm-level decision of type $j$
$C_{n,i}$ is a binary variable, where $n$ is the minimum # of firms in a cluster.
$x_i$ is the vector of controlling factors
$\delta$ and $\beta$ are parameters: $\delta$ is our main interest

Slide 8

3. (Continued) Preliminary Empirics

Time to go to the data and try a preliminary estimation

OLS:
  LHS ($y_j$) = Total Gross Sales
  RHS = Cluster variable ($C_{n,i}$) plus survey variables ($x_i$) such as dummy variables for types of firm, years in business, etc. More on these variables later.

Results: not so good (See word doc 1)

Slide 9

3. (Continued) Preliminary Empirics

Change in plans: Because existing literature focuses on (endogenous) cluster formation, account for endogeneity by including a cluster formation equation in the system

3SLS:
  LH1 = Total Sales
  RH1 = Cluster variable plus survey variables
  LH2 = Cluster variable
  RH2 = Cluster variable plus more survey variables

Slide 10

3. Revised Econometric Model

(1) $y_{ji} = \delta C_{n,i} + x_i \beta_1 + \varepsilon_{1i}$,

(2) $C_{n,i} = z_i \beta_2 + \varepsilon_{2i}$,

$y_j$ is a firm-level decision of type $j$
$C_{n,i}$ is a binary variable, where $n$ is the minimum # of firms in a cluster.
$x_i$ is the vector of controlling factors in the cluster impact equation
$z_i$ is the vector of controlling factors in the cluster formation equation
$\delta$, $\beta_1$, and $\beta_2$ are parameters: $\delta$ is still our main interest

We also added county-level demographics to $x$ and $z$: # of farms, land values, education-college, nonfarm per-capita income, population

Results: Seem better! (See word doc 1 again)

Slide 11

3. Preliminary Empirics: 3SLS Results

Estimates of $\delta$: (Alleged) Impacts of Clustering with Different Cluster Sizes

Dependent Variable, by Clusters ($C_n$), n = 3 to 10

Total gross sales: +*** -** -** -** -** -** -*
Total full-time employees: +** -** -*
Sales per employee: -** -* -*
Percentage of organic sales
Percentage of organic procurement: -* -* -*
Organic sales-locally: -*** -** +** +** +** +*
Organic sales-regionally: +*
Organic sales-nationally: +** -* -** -** -*
Organic sales-internationally
Organic procurement-locally: +*** +** +**
Organic procurement-regionally
Organic procurement-nationally: -** -* -** -** -**
Organic procurement-internationally

Looks good: Two “stories” emerge.
But something’s wrong: What is it?

Slide 12

3. Two Stories, and a Mistake

Two stories:
1. In many cases, the cluster variable has a significant impact on firm performance or firm decisions.
2. Varying $n$ (the minimum number of firms that defines a cluster) appears to change the impact

What’s wrong?
  We need to account for the endogeneity of $C_n$, so yes we need a system.
  But not 3SLS. Why not?
  Because $C_n$ is binary and Three Stage Least Squares would yield biased estimates.
  Equation (2) needs to be a binary-choice model (e.g., logit or probit)

Slide 13

4. Refocus on Methods: What econometric model should we use?

Back to the drawing board: revisit key features
  $C_n$ is binary and endogenous – seems relatively simple
  Are the two error terms contemporaneously correlated?

Worst case, construct a Likelihood function for (1) and (2), and program it manually.
Turns out there is a fairly large literature on this econometric model
Estimation of “Treatment Effects” (from labor economics)
  Thanks to Ping-Chao for first suggesting it.

Slide 14

4. Treatment Effects

Agronomic example:

Suppose we wanted to estimate the effect that a new pesticide (Pesticide X) had on corn yields
How would you design an agronomic experiment to measure this effect?
  Some sort of randomized plots, control the inputs, measure the output (yield): treatment effect measured by analysis of variance
Instead of experimental data, suppose you had observed data on inputs and yields from some farmers who used Pesticide X, and some who didn’t. Would you just estimate a regression with Pesticide X use on the right hand side?
Under what scenario would that method be accurate?
  If the pesticide company distributed Pesticide X to farmers randomly (as in an experimental trial), then this method would be fine.
  But, more likely, farmers self-select according to some non-random criteria (such as “prior success with new chemical use” or something similar)
Therefore, we must account for selection bias.
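
A minimal simulation sketch of the pesticide story, with made-up parameters. The true effect of Pesticide X is set to 2.0, and both the adoption rule and the error correlation (0.7) are assumptions chosen for illustration; `naive_effect` is a hypothetical helper. It compares the simple treated-minus-untreated difference in mean yield under random assignment and under self-selection.

```python
import random
import statistics

def naive_effect(n, true_effect, rho, self_select, seed=0):
    """Treated-minus-untreated difference in mean outcome for n simulated farms."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        e2 = rng.gauss(0.0, 1.0)  # unobserved factor driving adoption
        # Yield shock with correlation rho to the adoption factor.
        e1 = rho * e2 + (1 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        if self_select:
            c = 1 if e2 > 0 else 0             # farmers adopt based on e2
        else:
            c = 1 if rng.random() < 0.5 else 0  # random assignment
        y = true_effect * c + e1
        (treated if c else untreated).append(y)
    return statistics.fmean(treated) - statistics.fmean(untreated)

random_est = naive_effect(20000, 2.0, 0.7, self_select=False)
selected_est = naive_effect(20000, 2.0, 0.7, self_select=True)
print(round(random_est, 2), round(selected_est, 2))
```

Under random assignment the naive difference should land near the true 2.0. Under self-selection it should be noticeably larger: with these assumed parameters the expected bias is 2 × 0.7 × √(2/π), roughly 1.1.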

Slide 15

4. Econometric model: Treatment Effects and Evaluation

A good example is found in the Stata manual.
Topic: Women’s labor market. Similar econometric structure:

(1) $y_i = \delta C_{n,i} + x_i \beta_1 + \varepsilon_{1i}$,
(2) $C = z_i \beta_2 + \varepsilon_{2i}$,

where $\varepsilon_{1i}$ and $\varepsilon_{2i}$ have covariance matrix:
$\begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix}$

Now, let
  $y_i$ = women’s wages
  $C$ (again binary) = 1 if the woman has a college degree, 0 otherwise.

Is $C$ endogenous?
  Probably. Hence estimating (1) alone would be incorrect.

What if (1) and (2) were estimated jointly? What would $\delta$ signify?
  Normally it would be the change in $y$ due to $C$ switching from 0 to 1.
  But not in this case. Why not?
  “Selection bias”

Slide 16

4. Econometric model: Average Treatment Effect, etc.

Measured Treatment Effects come in two forms:

1. Average Treatment Effect: ATE = $E[y_1 \mid x, C = 1] - E[y_0 \mid x, C = 0]$
2. Average Treatment Effect on the Treated: ATET = $E[y_1 - y_0 \mid C = 1]$

In both cases, the conditional expectations can introduce selection bias.

With $\rho$ the correlation between the error terms of (1) and (2), and $\delta$ the coefficient on $C$:
If $\rho > 0$, $\delta$ underestimates ATE.
If $\rho = 0$, $\delta$ is an unbiased estimate of ATE.
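
The role of the error correlation can be made explicit with a short derivation. This is a sketch under assumptions not spelled out on the slide: the standard treatment-effects model with a probit selection equation, as documented for Stata's treatreg. The notation is assumed here: $\delta$ is the coefficient on $C$, $\beta_1$ and $\beta_2$ are the coefficient vectors in equations (1) and (2), $\sigma^2$ is the variance of the error in (1), the error in (2) has unit variance, $\rho$ is their correlation, and $\phi$ and $\Phi$ are the standard normal density and distribution functions.

```latex
\begin{align*}
E[y_i \mid x_i, C = 1] &= \delta + x_i\beta_1
    + \rho\sigma\,\frac{\phi(z_i\beta_2)}{\Phi(z_i\beta_2)} \\
E[y_i \mid x_i, C = 0] &= x_i\beta_1
    - \rho\sigma\,\frac{\phi(z_i\beta_2)}{1 - \Phi(z_i\beta_2)} \\
E[y_i \mid x_i, C = 1] - E[y_i \mid x_i, C = 0] &= \delta
    + \rho\sigma\,\frac{\phi(z_i\beta_2)}{\Phi(z_i\beta_2)\,\bigl(1 - \Phi(z_i\beta_2)\bigr)}
\end{align*}
```

Because the fraction in the last line is positive, the difference in conditional means exceeds $\delta$ when $\rho > 0$ and equals $\delta$ when $\rho = 0$, which matches the two cases on this slide.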

Slide 17

4. Treatment Effects: Stata Command

Stata language:
  treatreg sales_per_employee x1 x2 x3 x_etc, treat(C = z1 z2 z3 z_etc)
Key options: ML or two-step procedure

Key estimation issues:
  The Treatment Effect model seems appropriate. Will it work/converge?
  Will $\delta$ be significant? Yes mostly means clusters have an impact.
  Will $\rho$ be significant? Yes means that selection bias plays a role.
  13 choices for $y_j$. How will results vary?
  8 choices for $C_n$. Again, how will results vary?
  Which variables should be in the $x$ and $z$ vectors?

Stata: “TreatReg”. ML estimation of the following Likelihood function

Slide 18

5. Mistakes Along the Way: Re-estimation with TreatReg

See results (Word document 2):
  At first glance, our results do not look so hot.
  If you look very carefully, you’ll see an indication of another mistake on my part.

Slide 19

5. Mistakes Along the Way: Re-estimation with TreatReg

Our excitement over some preliminary success (significant estimates of $\delta$) caused me to be too hasty in choosing RHS variables in $x$ and $z$.
The mistake caused me to revisit the variable choices.
  See the “raw” survey results (Word doc 3).
  See the estimation results for one ML estimated model (Word doc 4).

Slide 20

5. Mistakes Along the Way: New Results (no mistakes?)

Signs of ML Estimates for $\delta$, with Different Cluster Definitions

Clusters, $C_n$, where n = 3 to 10

Dependent Variable                           n = 3 4     5     6     7     8     9     10
Total gross sales per employee ($ millions)  +     –***  –***  –***  –***  –***  nc    nc
Total gross sales ($ millions)               +***  +***  –***  –**   –     –     –     –
Total full-time employees                    +***  +***  –     –     –     +***  +***  +***
Percentage of organic sales                  –***  –***  –**   –**   –**   –     –**   –
Percentage of organic procurement            –***  –***  –***  –***  –***  –***  –***  –***
Organic sales – % local                      –***  –***  –**   +     +     +     –     –
Organic sales – % regional                   +***  +***  +***  +     +     +     +     +
Organic sales – % national                   +***  +     –*    –     –     +     nc    +
Organic sales – % international              –***  –***  –***  –***  –***  –***  –     –
Organic procurement – % local                +***  +***  +     –     –     –     –*    –
Organic procurement – % regional             –**   –**   –     –     –     –     –     –
Organic procurement – % national             +***  +***  +***  +***  +***  +***  +***  +***
Organic procurement – % internat’l           –***  –**   –***  –     –     –     –     –

Slide 21

5. Calculating ATE

We could use the formula on Slide 16.
Or, we could ask Stata to calculate the predicted values for y, conditional on whether the observation was treated (C = 1: yctrt) or not treated (C = 0: ycntrt).

Stata code:
  predict yctrtspe_5, yctrt
  predict ycntrtspe_5, ycntrt
  generate diffspe_5 = yctrtspe_5 - ycntrtspe_5
  summarize diffspe_5

yctrt: Stata for y conditional on treatment
ycntrt: Stata for y conditional on no treatment
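
The generate and summarize steps above amount to differencing two columns of predicted values and averaging the result. A rough Python sketch of that arithmetic follows. The predicted values are made-up numbers for illustration (not model output), and `average_treatment_effect` is a hypothetical helper.

```python
import statistics

def average_treatment_effect(y_treated_pred, y_untreated_pred):
    """Mean and standard deviation of per-observation predicted differences."""
    diffs = [t - u for t, u in zip(y_treated_pred, y_untreated_pred)]
    return statistics.fmean(diffs), statistics.stdev(diffs)

# Made-up predictions for four observations (illustrative only).
yctrt = [1.9, 2.4, 3.1, 2.0]    # predicted y conditional on treatment
ycntrt = [1.1, 1.2, 1.5, 1.0]   # predicted y conditional on no treatment

ate, spread = average_treatment_effect(yctrt, ycntrt)
print(round(ate, 2), round(spread, 2))
```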

Slide 22

5. Results: Average Treatment Effects

ATE Accounting for Selection Bias, with Different Cluster Definitions

Clusters, $C_n$, where n = 3 to 10

Dependent Variable                           n = 3    4        5        6        7        8        9        10
Total gross sales per employee ($ millions)  0.16     0.49*    1.00     1.36*    1.42*    1.48*    nc       nc
Total gross sales ($ millions)               1.23*    -2.74*   14.7*    14.2*    15.8*    11.7     5.33     9.80
Total full-time employees                    -1.26*   -21.8*   22.8     5.8      10.7     -87.0*   -80.7*   -102.1*
Percentage of organic sales                  7.6*     12.7*    12.4*    16.4*    17.1*    10.0     16.0*    22.4
Percentage of organic procurement            9.5*     15.0*    13.0*    23.1*    28.7*    25.6*    30.4*    41.7*
Organic sales – % local                      9.4*     11.4*    15.5*    3.4*     3.9*     6.0      10.2     13.9
Organic sales – % regional                   -8.5*    -11.6*   -12.0*   -4.0     -7.6     -8.3     -0.8     -4.5
Organic sales – % national                   -10.0*   -8.7     -9.1     -8.1     -3.0     -0.2     nc       -18.1
Organic sales – % international              4.1*     5.1*     6.8*     3.2*     4.5*     3.2*     -6.0     -0.4
Organic procurement – % local                2.2*     1.3      15.2     23.0     22.8*    22.7*    21.4*    13.9
Organic procurement – % regional             3.7*     1.6*     0.3      -3.8     -6.2     -14.8    5.0      9.2
Organic procurement – % national             -21.7    -24.4*   -28.2*   -34.2*   -32.4*   -30.4*   -25.7*   -28.9*
Organic procurement – % internat’l           10.1*    14.3*    18.4*    17.2*    16.5*    20.1*    5.7      12.7

Slide 24

6. Turning the Results Into a Paper: Revisit theory and other thorny issues

We believe that the previous table provides the type of results needed for an empirical contribution. (Of course we could be wrong.)
So now we need to anticipate the issues that would affect a paper’s “publish-ability”

Two* big issues (or weak spots)

1. Leading readers from “theory” on firm agglomeration and clusters to the empirical model.
  Be forthright in noting that some previous research resembles our approach, while some does not.
  Focus on econometric model (Treatment Effects) rather than theoretical model
  Brief mention of how one might try to link the econometric model to theory (e.g., a restricted profit function)

* A third issue could be additional endogeneity in the RHS of equation (1) or (2). If there’s time, I’ll revisit this issue.

Slide 25

6. Two big issues (or weak spots)

2. Providing a rationale for the choice of regressors (especially x and z)
  Admit it’s ad hoc (don’t try to skip over that)
  Here, we again try to point to prior research.
  One paper in particular categorizes regressors in a way that works well for us:

Categories for C, x and z:
  (i) agglomeration variables (C),
  (ii) urban encroachment and population characteristic variables,
  (iii) input availability variables,
  (iv) firm productivity and specialization variables,
  (v) local economic variables.

Our Table uses these categories. (See word doc 5)

Slide 26

7. Improving the Paper: Future research (hopefully)

Two main refinements: Hopefully these can wait for a sequel

1. Incorporate spatial econometrics:
  Does a neighboring firm cluster (in the next county) affect a firm’s performance?
  Try a spatial lag or a spatial error model.

2. As an alternative to the Treatment Effects model, use “Propensity Scores” to get an accurate estimate of the ATET (Average Treatment Effect on the Treated)
  Propensity scores can match treated against non-treated observations, recreating a quasi-experimental design from secondary data.
  May work better or worse than Treatment Effect models.
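
One way the propensity-score idea might look in practice is nearest-neighbor matching. This is a sketch under assumptions: the propensity scores are taken as already estimated (for example from a logit or probit of C on z), matching is one-to-one with replacement, `atet_nearest_neighbor` is a hypothetical helper, and all numbers are made up for illustration.

```python
def atet_nearest_neighbor(treated, controls):
    """ATET estimate by 1-nearest-neighbor matching on the propensity score.

    treated, controls: lists of (propensity_score, outcome) pairs.
    Each treated unit is matched (with replacement) to the control unit
    whose propensity score is closest.
    """
    gaps = []
    for p_t, y_t in treated:
        _, y_c = min(controls, key=lambda pc: abs(pc[0] - p_t))
        gaps.append(y_t - y_c)
    return sum(gaps) / len(gaps)

# Made-up (propensity score, outcome) pairs, illustrative only.
treated = [(0.80, 10.0), (0.60, 8.0), (0.30, 5.0)]
controls = [(0.75, 7.0), (0.55, 6.5), (0.35, 4.0), (0.10, 2.0)]

atet = atet_nearest_neighbor(treated, controls)
print(round(atet, 3))
```

Matching with replacement keeps every treated firm in the estimate, at the cost of reusing some controls. Other choices (calipers, kernel weights) are equally plausible.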

Slide 27

8. Thanks

Discussion and questions?

Critique of format?