BA-course 1-8.10.2012


  • 7/29/2019 BA-course 1-8.10.2012

    1/29

    Basic Concepts of Statistics

    Prof. dr. Liliana [email protected]


    Discipline Definition

    Statistics is the science of data. It involves:

    collecting,

    classifying,

    summarizing,

    organizing,

    analyzing and

    interpreting the numerical information.


    Applications of Statistics

    Descriptive Statistics uses numerical and graphical methods:

    to look for patterns in a data set,

    to summarize the information revealed in a data set,

    to present the information in a convenient form.

    Inferential Statistics uses sample data:

    to make estimates, decisions, predictions, or other generalizations about a larger set of data.


    Basic Concepts of Statistics

    The population (collectivity) = the phenomenon to be studied (events, people, objects, transactions)

    The experimental unit (statistical unit) = the constituent element of the population (simple or complex)

    A variable (characteristic) = a property of an individual experimental unit

    The value (measurement)

    The frequency (the number of units with the same value of the characteristic)


    Fundamental Elements of Statistics

    A sample = a subset of the units of a population

    A statistical inference = an estimate, prediction, or some other generalization about a population, based on information contained in a sample

    A measure of reliability = a statement (usually quantified) about the degree of uncertainty associated with a statistical inference.
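The three definitions above can be tied together in a minimal Python sketch. The population here is hypothetical (10,000 randomly generated amounts, not course data), and the standard error is used as one common measure of reliability; other measures, such as confidence intervals, would serve the same purpose.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 transaction amounts (illustrative only).
population = [random.gauss(100, 20) for _ in range(10_000)]

# A sample is a subset of the units of the population.
sample = random.sample(population, 50)

# Statistical inference: estimate the population mean from the sample.
estimate = statistics.mean(sample)

# Measure of reliability: the standard error of the sample mean.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"estimate = {estimate:.1f}, standard error = {standard_error:.1f}")
```

A smaller standard error means the estimate is expected to lie closer to the true population mean; increasing the sample size is the usual way to reduce it.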


    Types of Data

    Quantitative data are measurements that

    are recorded on a naturally occurring

    numerical scale.

    Qualitative data can only be classified into

    categories.

    The statistical methods for describing, reporting, and analyzing data depend on the data type (quantitative or qualitative).


    Describing Qualitative Data

    A class is one of the categories into which qualitative data can be classified.

    The class frequency is the number of observations in the data set falling into a particular class.

    The class relative frequency (class percentage) is the class frequency divided by the total number of observations in the data set (multiplied by 100 to give a percentage).

    Summary of graphical descriptive methods for qualitative data:

    Bar graph,

    Pie chart,

    Pareto diagram: a column graph with the categories of the qualitative variable (the columns) arranged by height in descending order from left to right.
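The definitions of class frequency and class relative frequency can be illustrated with a short Python sketch. The observations below are hypothetical (a made-up "means of transport" variable, not course data); `collections.Counter` from the standard library does the counting.

```python
from collections import Counter

# Hypothetical observations of a qualitative variable (illustrative only).
observations = ["bus", "car", "car", "bike", "car",
                "bus", "car", "bike", "car", "bus"]

# Class frequency: number of observations falling into each class.
frequency = Counter(observations)
total = sum(frequency.values())

# Class relative frequency (percentage): class frequency / total * 100.
relative = {cls: 100 * count / total for cls, count in frequency.items()}

for cls, count in frequency.items():
    print(f"{cls:<5} {count:>3} {relative[cls]:>6.1f}%")
```

The relative frequencies always sum to 100%, which is what makes them suitable for a pie chart.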


    Bar graph

    Function    Assistant    Lecturer    Assistant professor    Professor    Total
    fi          82           82          49                     57           270

    [Figure: bar graph "The distribution of academic staff in a survey, at Transilvania University, in 2009". X axis: functions (Assistant, Lecturer, Ass. Professor, Professor); Y axis: number, 0 to 90; bar heights 82, 82, 49, 57.]


    Pie chart

    Function    Assistant    Lecturer    Assistant professor    Professor    Total
    fi*         30.4%        30.4%       18.1%                  21.1%        100%

    [Figure: pie chart "The structure of the academic staff survey, at Transilvania University, in 2009". Slice labels as shown: Assistant 31%, Lecturer 30%, Ass. Professor 18%, Professor 21%.]


    Pareto diagram

    [Figure: "The Pareto diagram of the academic staff distribution in a survey, at Transilvania University, in 2009". X axis: functions, in descending order (Assistant 30%, Lecturer 30%, Professor 21%, Ass. Professor 18%); primary Y axis: number, 0% to 35%; secondary Y axis: 0% to 100%.]


    The 40 Best Paid Executives

    Degree       fi    fi*
    Bachelors     8    20%
    Law           4    10%
    Masters       4    10%
    MBA          20    50%
    None          2     5%
    PhD           2     5%
    Total        40   100%

    Degree       fi    fi*    fi*c
    MBA          20    50%    50%
    Bachelors     8    20%    70%
    Law           4    10%    80%
    Masters       4    10%    90%
    None          2     5%    95%
    PhD           2     5%   100%
    Total        40   100%

    Source: Forbes, May 8, 2006
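The second table can be reproduced from the first with a few lines of Python. This is a minimal sketch: the counts are the ones given in the slide, while the variable names are illustrative. Classes are sorted by descending frequency, and the cumulative relative frequency (fi*c) is accumulated along the way.

```python
# Degree counts for the 40 executives, as given in the slide.
counts = {"Bachelors": 8, "Law": 4, "Masters": 4, "MBA": 20, "None": 2, "PhD": 2}
total = sum(counts.values())

# Sort classes by descending frequency (ties keep their original order).
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)

cumulative = 0
rows = []
for degree, fi in ordered:
    cumulative += fi
    # (class, frequency, relative frequency %, cumulative relative frequency %)
    rows.append((degree, fi, 100 * fi / total, 100 * cumulative / total))

for degree, fi, rel, cum in rows:
    print(f"{degree:<10} {fi:>3} {rel:>5.0f}% {cum:>5.0f}%")
```

The bars of a Pareto diagram follow the relative frequency column, and the cumulative column gives the values read on the 0% to 100% axis.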


    The 40 Best Paid Executives

    [Figure: "The Pareto diagram for degrees of 40 CEOs, in 2005". X axis: degrees, in descending order (MBA 50%, Bachelors 20%, Law 10%, Masters 10%, None 5%, PhD 5%); primary Y axis: frequency (%), 0% to 60%; secondary Y axis: 0% to 100%.]

    Source: Forbes, May 8, 2006


    The Pareto Principle

    Vilfredo Pareto (1848-1923): born in Paris; studied engineering and mathematics at the University of Turin.

    University of Lausanne, Switzerland (1896): in "Cours d'économie politique" he showed that the distribution of income and wealth in society is not random.

    The pattern appears throughout history in all societies: approximately 80% of the total wealth in a society lies with only 20% of the families.

    "The vital few and the trivial many": the Pareto principle in economics.


    Graphical Methods for Describing Quantitative Data

    Dot plots

    Stem-and-leaf displays

    Histograms


    The use of a diagram

    A diagram can be used for the following purposes:

    to summarize large sets of data (structure), or

    to focus attention on some aspect of the data, or

    to display a trend in the data over time.

    A good diagram enables the viewer to grasp in a single glance the relevant features of the data, features that wouldn't be obvious from the raw numbers themselves.

    The power that diagrams have to give us an instant impression of the data can also be abused. Diagrams can be constructed to give the impression that the data have a feature that they don't really have. The following slides show some common ways of pictorially representing (and misrepresenting) data.


    Hong Kong's soaring population


    Example

    Year    Newspaper A (thou. copies)    Newspaper B (thou. copies)
    1990    510                           1911
    1991    621                           1829
    1992    624                           1636
    1993    654                           1555
    1994    732                           1490
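Before comparing different ways of plotting these two series, it helps to know what the numbers themselves say. The following Python sketch uses only the figures from the table; the function name and the choice of percent change as a summary are illustrative assumptions, not part of the course.

```python
years = [1990, 1991, 1992, 1993, 1994]
copies_a = [510, 621, 624, 654, 732]       # Newspaper A, thousand copies
copies_b = [1911, 1829, 1636, 1555, 1490]  # Newspaper B, thousand copies

def percent_change(series):
    """Percent change from the first to the last value of a series."""
    return 100 * (series[-1] - series[0]) / series[0]

change_a = percent_change(copies_a)
change_b = percent_change(copies_b)
print(f"A: {change_a:+.1f}%  B: {change_b:+.1f}%")

# Ratio of B to A in the last year of the table.
ratio_1994 = copies_b[-1] / copies_a[-1]
print(f"B/A in 1994: {ratio_1994:.2f}")
```

Newspaper A grew and newspaper B declined, yet B still sold roughly twice as many copies as A in 1994. A graph that makes the two lines meet or cross therefore gives an impression the data do not support.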


    Scale origins: using two Y axes, one for each series

    [Figure: "Evolution of the number of copies for newspapers A and B, during 1990-1994". X axis: years, 1990 to 1994; Y axis for B: nr. of copies B (thou.), 0 to 2500; Y axis for A: nr. of copies A (thou.), 500 to 750.]


    Using two Y axes (a)

    [Figure: "Evolution of the number of copies for newspapers A and B, during 1990-1994". X axis: years, 1990 to 1994; Y axis for B: nr. of copies B (thou.), 0 to 2500; Y axis for A: nr. of copies A (thou.), 0 to 800.]


    Using two Y axes (b)

    [Figure: "Evolution of the number of copies for newspapers A and B, during 1990-1994". X axis: years, 1990 to 1994; Y axis for B: nr. of copies B (thou.), 1250 to 1950; Y axis for A: nr. of copies A (thou.), 400 to 750.]


    Using two Y axes (c)

    [Figure: "Evolution of the number of copies for newspapers A and B, during 1990-1994". X axis: years, 1990 to 1994; Y axis for B: nr. of copies B (thou.), 1450 to 1950; Y axis for A: nr. of copies A (thou.), 500 to 750.]


    Correct graph

    [Figure: "Evolution of the number of copies for newspapers A and B, during 1990-1994". X axis: years, 1990 to 1994; a single Y axis shared by both series: nr. of copies (thou.), 0 to 2500.]

    Comparative evolution of newspapers A and B, during 1990-1994


    Using plane images

    Doubling the production

    1996 1999


    The mobile phone revolution


    Using spatial images

    2 errors: the dimensions and the inflation
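The images from these slides are not reproduced here. Assuming the "dimensions" error is the usual pictogram problem (a value that doubles is drawn as an image whose every linear dimension is doubled), the distortion can be quantified with a small Python sketch. The function name is illustrative.

```python
def apparent_ratio(value_ratio, dimensions):
    """Apparent size ratio when every linear dimension is scaled by value_ratio."""
    return value_ratio ** dimensions

value_ratio = 2  # the quantity being shown doubles

print(apparent_ratio(value_ratio, 1))  # bar height only: 2
print(apparent_ratio(value_ratio, 2))  # plane image, width and height: 4
print(apparent_ratio(value_ratio, 3))  # spatial image, all three dimensions: 8
```

Under that assumption, a doubled quantity looks four times larger as a plane image and eight times larger as a spatial image, which is why a bar whose height alone varies is the safer choice.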


    Tricky comparisons

    Absolute values

    [Figure: "Governmental expenditure evolution in the U.S.A. during 1930-1984" (Wonnacott, 4th edition, p. 64). X axis: years, 1930 to 1984 in steps of 6; Y axis: billions of $, 0 to 900.]


    Correct graph: relative values (%)

    [Figure: "Governmental expenditure evolution in the U.S.A. (% of GNP) during 1930-1984" (Wonnacott, 4th edition, p. 64). X axis: years, 1930 to 1984 in steps of 6; Y axis: percent of GNP (%), 0 to 60.]


    An effective campaign?

    In 1956, the U.S. state of Connecticut began a severe crackdown on speeding drivers. The following graph shows the annual number of traffic fatalities before and after the crackdown.