Quantitative Methods For Social Sciences Lionel Nesta Observatoire Français des Conjonctures Economiques [email protected] CERAM February-March-April 2008


Quantitative Methods For Social Sciences

Lionel Nesta

Observatoire Français des Conjonctures Economiques

[email protected]

CERAM February-March-April 2008

Objective of the Course

The objective of the class is to provide students with a set of techniques to analyze quantitative data. It concerns the application of quantitative and statistical approaches, as developed in the social sciences, for future decision makers, policy makers, stakeholders, managers, etc.

All courses are computer-based classes using the SPSS statistical package. The objective is to reach a level of competence that equips the student both to read and understand the work of others and to carry out their own research.

Class Password: stmarec123

Examples

Rise in biotechnology

Should the EU fund fundamental research in biotechnology?

Has biotechnology increased the productivity of firm-level R&D?

Did it increase the speed of discovery in pharmaceutical R&D?

Increasing university-industry collaborations

Does it facilitate innovation by firms?

Does it increase the production of new knowledge by academics?

Does it modify the fundamental/applied nature of research?

Examples

Economic (productivity) growth

Does it come mainly from new firms or improving existing firms?

Is market selection operating correctly?

Why do good firms exit the market?

How does the organisation of knowledge impact on performance?

How do knowledge stock and specialisation impact on productivity?

How do firms enter into new technological fields?

Do firms diversify in new technologies/businesses purposively?

Structure of the Class

Class 1 : Descriptive Statistics

    Mean, variance, standard deviation

    Data management

Class 2 : Statistical Inference

    Distributions

    Comparison of means

Class 3 : Relationship Between Variables

    ANOVA, Chi-Square

    Correlation

Class 4 : Ordinary Least Squares (OLS)

    Correlation coefficient, simple regression

    Multiple regression

Class 5 : Extension to OLS

    Regression diagnostics

    Qualitative explanatory variables

Class 6 : Qualitative Dependent Variables

    Linear probability model

    Maximum likelihood (logit, probit)

Class 1 : Descriptive Statistics

Types of Data

Descriptive statistics is the branch of statistics that gathers all the techniques used to describe and summarize quantitative and qualitative data.

Quantitative data

    Continuous

    Measured on a scale (can take any value within its range)

    The size of the number reflects the amount of the variable

    Examples: age; wage; sales; height; weight; GDP

Qualitative data

    Discrete, categorical

    The number reflects the category of the variable

    Examples: type of work; gender; nationality

Descriptive Statistics

Any tool that summarizes data concisely is useful: graphs, charts, tables.

Quantitative data

    Graphs: scatter plots; line plots; histograms

    Central tendency

    Dispersion

Qualitative data

    Graphs: pie graphs; histograms

    Tables: frequency, percentage, cumulative percentage

    Cross tables

Central Tendency and Dispersion

A distribution is an ordered set of numbers showing how many times each occurred, from the lowest to the highest number or the reverse.

Central tendency: measures of the typical value around which the scores of a distribution cluster.

Dispersion: measures of the fluctuations around the characteristics of central tendency.

In other words, the characteristics of central tendency produce stylized facts, whereas the characteristics of dispersion indicate how representative a given stylized fact is.

Central Tendency

The mode

    The most frequent score in a distribution is called the mode.

The median

    The middle value of all observed values: 50% of observed values are higher and 50% are lower than the median.

The mean

    The sum of all of the values divided by the number of values:

    $\bar{X} = \frac{1}{N} \sum_{i=1}^{N} x_i$

The mode, the mean and the median are equal when the distribution is symmetrical and unimodal.
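The three measures can be checked on a small sample. The following is a minimal sketch using Python's standard `statistics` module rather than SPSS; the two lists of scores are made-up illustrative values, not course data.

```python
import statistics

# Hypothetical, right-skewed sample of scores (illustrative values only)
scores = [2, 3, 3, 5, 7, 8, 14]

mode = statistics.mode(scores)      # most frequent score
median = statistics.median(scores)  # middle value of the sorted scores
mean = statistics.mean(scores)      # sum of the scores divided by their number

# Hypothetical symmetrical, unimodal sample: the three measures coincide
symmetric = [1, 2, 3, 3, 3, 4, 5]
sym_mode = statistics.mode(symmetric)
sym_median = statistics.median(symmetric)
sym_mean = statistics.mean(symmetric)

print("skewed sample:   ", mode, median, mean)
print("symmetric sample:", sym_mode, sym_median, sym_mean)
```

In the skewed sample the large value 14 pulls the mean above the median, which is why the three measures differ there.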

Dispersion

The range

    Difference between the maximum and minimum values:

    $R = x_{\max} - x_{\min}$

The variance

    Average of the squared differences between data points and the mean (average quadratic deviation):

    $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{X})^2$

The standard deviation

    Square root of the variance; it therefore measures the spread of the data about the mean, in the same units as the data:

    $\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{X})^2}$
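As a worked illustration of the range, variance and standard deviation, here is a minimal Python sketch on a made-up sample (the values are hypothetical). It divides by N, as in the definitions on this slide; note that Python's `statistics.variance` divides by N - 1 instead, so the two figures differ on small samples.

```python
import math
import statistics

# Hypothetical sample (illustrative values only)
values = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(values)

mean = sum(values) / n

# Range: maximum minus minimum
value_range = max(values) - min(values)

# Variance: average squared deviation from the mean (divisor N)
variance = sum((x - mean) ** 2 for x in values) / n

# Standard deviation: square root of the variance, in the units of the data
std_dev = math.sqrt(variance)

# Cross-check against the standard library
population_variance = statistics.pvariance(values)  # divisor N
sample_variance = statistics.variance(values)       # divisor N - 1

print(value_range, variance, std_dev, sample_variance)
```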


Research Productivity in the Bio-pharmaceutical Industry EU Framework Programme 7

Stylised Facts about Modern Biotech

1. Innovations emerge from uncertain, complex processes involving knowledge and markets: roles of networks.

2. Economic value is created in many ways, globally and in geographical agglomerations.

3. Various linkages exist among diverse actors (LDFs, DBFs, Univ, Venture Capital) in innovation processes, but the firm plays a particularly important role.

4. Regulations, social structures and institutions affect on-going innovation processes as well as their impacts on society: importance of IPR.

SPSS : Statistical Package for the Social Sciences

The SPSS software

Statistical Package for the Social Sciences (1968)

Among the most widely used programs for statistical analysis in the social sciences.

Used by market researchers, health researchers, survey companies, government, education researchers, and others.

Supports data management (case selection, file reshaping, creating derived data).

Features of SPSS are accessible via pull-down menus.

The pull-down menu interface generates command syntax.

SPSS : Opening SPSS

SPSS : Importing data

Settings in the “import text” dialogue box (wizard step in parentheses):

No predefined format (1)

Delimited (2)

First line contains the variable names (2)

One observation per line; import all observations (3)

Tab delimited only (4)

Finish (6)

SPSS windows

SPSS opens the following windows automatically:

The datasheet window

    Observe, manage, modify and create data

The results window

    Everything you do is stored there

The syntax window can also be opened.

SPSS : Data sheet (1)

SPSS : Data sheet (2)

SPSS : Result / Journal

SPSS : Saving data

SPSS : working, at last!

Recoding Variables

Changing existing values to new values (biotechnologie → DBF, pharmaceutique → LDF)

[Screenshots: steps 1 to 3]
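The recoding shown in these steps amounts to a lookup from old values to new values. Outside SPSS, the same idea can be sketched in Python; the list `raw_types` is a hypothetical sample, and only the two value pairs come from the slide.

```python
# Mapping from the original values to the new codes (pairs taken from the slide)
RECODE = {"biotechnologie": "DBF", "pharmaceutique": "LDF"}

# Hypothetical raw column of firm types
raw_types = ["biotechnologie", "pharmaceutique", "biotechnologie"]

# Recode every value; an unexpected value raises KeyError instead of passing silently
firm_types = [RECODE[value] for value in raw_types]

print(firm_types)
```

Raising an error on unknown values is a design choice: it exposes misspelled categories early instead of leaving them unrecoded.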

Computing New Variables

Taking logarithm (normalization of continuous variables)

[Screenshots: steps 1 and 2]
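For readers who want to see the computation itself, here is a minimal Python sketch of deriving a log variable. The `patent` values are hypothetical, and `safe_log` is an illustrative helper, not an SPSS function. The logarithm is undefined at zero, which matters for count variables such as patent that can equal 0, so the helper returns `None` (a missing value) in that case.

```python
import math

def safe_log(x):
    """Return the natural logarithm of x, or None when x is not positive."""
    return math.log(x) if x > 0 else None

# Hypothetical patent counts, including a firm with no patents
patent = [0, 12, 286]

# New variable: natural logarithm of patent, missing where the log is undefined
lnpatent = [safe_log(x) for x in patent]

print(lnpatent)
```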

Creating Dummy Variables

Coding a category as a 0/1 indicator variable (e.g. pharma, biotech)

[Screenshots: steps 1 to 3]
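A dummy variable takes the value 1 when an observation belongs to a category and 0 otherwise. The sketch below builds two such indicators in Python from a hypothetical list of firm types; the names `biotech` and `pharma` mirror the variables in the course data, but the four observations are made up.

```python
# Hypothetical firm types for four observations
firm_types = ["DBF", "LDF", "LDF", "DBF"]

# One 0/1 indicator per category
biotech = [1 if t == "DBF" else 0 for t in firm_types]
pharma = [1 if t == "LDF" else 0 for t in firm_types]

# The mean of a dummy variable is the share of observations in that category
share_pharma = sum(pharma) / len(pharma)

print(biotech, pharma, share_pharma)
```

This is why the mean of a 0/1 variable in a descriptive statistics table can be read directly as a proportion.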

Computation of Descriptive Statistics

[Screenshots: steps 1 to 3]

Descriptive Statistics

Variable             N     Range         Minimum      Maximum       Mean          Std. deviation   Variance
patent               457   286           0            286           11.92         22.901           524.470
assets               457   35788473.97   4422.18      35792896.15   4358371.54    6086530.85       3.705E+013
rd                   457   1917997.980   858.53204    1918856.512   330236.630    405160.516       164155043889
spe                  457   2.0235309     -1.1298400   .8936909      -.056808610   .3374751802      .114
pharma               457   1             0            1             .63           .482             .232
biotech              457   1             0            1             .37           .482             .232
Valid N (listwise)   457

Splitting Database

[Screenshots: steps 1 and 2]

Descriptive Statistics (by type)

Type   Variable             N     Range       Minimum     Maximum    Mean         Std. deviation   Variance
DBF    patent               167   202         0           202        12.11        21.066           443.764
DBF    assets               167   2442619     4422.18     2447041    342934.49    478511.938       2E+011
DBF    rd                   167   495443.5    858.53204   496302.1   58116.590    88638.5347       8E+009
DBF    spe                  167   1.7544527   -1.12984    .6246127   -.10630582   .343286812       .118
DBF    pharma               167   0           0           0          .00          .000             .000
DBF    biotech              167   0           1           1          1.00         .000             .000
DBF    Valid N (listwise)   167
LDF    patent               290   286         0           286        11.81        23.929           572.609
LDF    assets               290   4E+007      218006.47   4E+007     6670709.4    6605972.68       4E+013
LDF    rd                   290   1912600     6256.248    1918857    486940.24    432514.940       2E+011
LDF    spe                  290   1.6904465   -.7967556   .8936909   -.02830504   .331330781       .110
LDF    pharma               290   0           1           1          1.00         .000             .000
LDF    biotech              290   0           0           0          .00          .000             .000
LDF    Valid N (listwise)   290
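Splitting the database and recomputing the statistics is a group-by operation: observations are partitioned by type, then each group is summarized separately. A minimal Python sketch follows, under the assumption of a tiny made-up data set; the `records` list and its values are hypothetical and unrelated to the figures in the table.

```python
import statistics
from collections import defaultdict

# Hypothetical (type, patent count) observations
records = [("DBF", 10), ("DBF", 14), ("LDF", 3), ("LDF", 9), ("LDF", 12)]

# Partition the observations by firm type
groups = defaultdict(list)
for firm_type, patents in records:
    groups[firm_type].append(patents)

# Summarize each group separately
summary = {
    firm_type: {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
    }
    for firm_type, values in groups.items()
}

for firm_type, stats in summary.items():
    print(firm_type, stats)
```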

Assignments

Compute the logarithm of all quantitative variables (patent, assets, rd) and name the results lnpatent, lnassets and lnrd, respectively.

Compute descriptive statistics for both LDFs and DBFs.

Draw conclusions by comparing means.

Logarithm

$\frac{d \log x}{dx} = \frac{1}{x} \;\Rightarrow\; d \log x = \frac{dx}{x} \approx \frac{\Delta x}{x}$

Normalization

Taking the logarithm is a transformation that often brings a skewed distribution closer to a normal distribution.

Elasticities http://en.wikipedia.org/wiki/Elasticity_(economics)

A change in log of x is a relative change of x itself.
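The statement that a change in log x is a relative change in x can be checked numerically. The sketch below uses made-up values (100, 105, 200) and shows that the approximation is close for small changes and breaks down for large ones.

```python
import math

# Small change: x rises from 100 to 105 (hypothetical values)
x0, x1 = 100.0, 105.0
relative_change = (x1 - x0) / x0          # exact relative change
log_change = math.log(x1) - math.log(x0)  # change in the logarithm

# Large change: x doubles, and the approximation is much poorer
x2 = 200.0
relative_change_large = (x2 - x0) / x0
log_change_large = math.log(x2) - math.log(x0)

print(f"small: relative {relative_change:.4f}, log {log_change:.4f}")
print(f"large: relative {relative_change_large:.4f}, log {log_change_large:.4f}")
```

The approximation is what allows coefficients in log-log models to be read as elasticities, but it should be used with care when changes are large.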

Cobb-Douglas production function