7/29/2019 BA-course 1-8.10.2012
Basic Concepts of Statistics
Prof. dr. Liliana [email protected]
Discipline Definition
Statistics is the science of data. It involves:
collecting,
classifying,
summarizing,
organizing,
analyzing and
interpreting the numerical information.
Applications of Statistics
Descriptive Statistics uses numerical and graphical methods:
to look for patterns in a data set,
to summarize the information revealed in a data set,
to present the information in a convenient form.
Inferential Statistics uses sample data:
to make estimates, decisions, predictions, or other generalizations about a larger set of data.
Basic Concepts of Statistics
The population (collectivity) = the phenomenon to be studied (events, people, objects, transactions)
The experimental unit (statistical unit) = the constituent element of the population (simple or complex)
A variable (characteristic) = a property of an individual experimental unit
The value (measurement)
The frequency = the number of units with the same value of the characteristic
Fundamental Elements of Statistics
A sample = a subset of the units of a population
A statistical inference = an estimate or prediction or some other generalization about a population based on information contained in a sample
A measure of reliability = a statement (usually quantified) about the degree of uncertainty associated with a statistical inference.
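The three notions above can be illustrated with a minimal sketch. The population here is simulated and entirely hypothetical, the inference is the sample mean used as an estimate of the population mean, and the standard error of the mean stands in as one possible measure of reliability; the slides do not prescribe a particular measure.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 simulated measurements (not real data).
population = [rng.gauss(100, 15) for _ in range(10_000)]

# A sample = a subset of the units of the population.
sample = rng.sample(population, 50)

# Statistical inference: estimate the population mean from the sample.
estimate = statistics.mean(sample)

# One possible measure of reliability: the standard error of the mean.
std_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"estimated mean: {estimate:.1f} (standard error {std_error:.1f})")
print(f"population mean: {statistics.mean(population):.1f}")
```

In practice the population mean is unknown; it is printed here only because the population was simulated.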
Types of Data
Quantitative data are measurements that
are recorded on a naturally occurring
numerical scale.
Qualitative data can only be classified into
categories.
The statistical methods for describing,
reporting and analyzing the data depend on
the data type (quantitative or qualitative).
Describing Qualitative Data
A class is one of the categories into which qualitative data can be classified.
The class frequency is the number of observations in the data set falling into a particular class.
The class relative frequency (class percentage) is the class frequency divided by the total number of observations in the data set (*100).
Summary of Graphical Descriptive Methods for qualitative data:
Bar graph,
Pie chart,
Pareto diagrams: a column graph with the categories of the qualitative variable (the columns) arranged by height in descending order from left to right.
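The definitions above can be sketched in a few lines of Python. The observations are hypothetical class labels, and `frequency_table` is an illustrative name, not something defined in the course; the rows come out in descending order of frequency, which is the ordering a Pareto diagram uses.

```python
from collections import Counter


def frequency_table(observations):
    """Return (class, frequency, relative frequency in %) rows,
    sorted by descending frequency (Pareto order)."""
    counts = Counter(observations)
    total = sum(counts.values())
    # most_common() orders classes from most to least frequent.
    return [(cls, n, 100 * n / total) for cls, n in counts.most_common()]


# Hypothetical qualitative data: one class label per observation.
observations = ["B", "A", "B", "C", "B", "A", "B", "C", "A", "B"]

table = frequency_table(observations)
for cls, n, pct in table:
    print(f"{cls}: frequency={n}, relative frequency={pct:.1f}%")
```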
Bar graph

Function   Assistant   Lecturer   Assistant professor   Professor   Total
fi         82          82         49                    57          270

The distribution of academic staff in a survey, at Transilvania University, in 2009
[Figure: bar graph; x-axis: functions (Assistant, Lecturer, Ass. Professor, Professor); y-axis: number (0 to 90); bar heights 82, 82, 49, 57]
Pie chart

Function   Assistant   Lecturer   Assistant professor   Professor   Total
fi         30.4%       30.4%      18.1%                 21.1%       100%

The structure of the academic staff survey, at Transilvania University, in 2009
[Figure: pie chart; slice labels: Assistant 31%, Lecturer 30%, Ass. Professor 18%, Professor 21%]
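As a quick check, the percentages in the pie chart table can be recomputed from the class frequencies 82, 82, 49 and 57 in the academic staff survey. The sketch below also derives the angle of each pie slice as 360 degrees times the relative frequency; the variable names are illustrative.

```python
# Class frequencies (fi) from the academic staff survey table.
counts = {
    "Assistant": 82,
    "Lecturer": 82,
    "Assistant professor": 49,
    "Professor": 57,
}
total = sum(counts.values())  # 270 staff members in the survey

# Relative frequency of each class, in percent (one decimal place).
percentages = {name: round(100 * fi / total, 1) for name, fi in counts.items()}

# Angle of each pie slice: the same share applied to the full circle.
angles = {name: 360 * fi / total for name, fi in counts.items()}

for name in counts:
    print(f"{name}: {percentages[name]}% -> {angles[name]:.1f} degrees")
```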
Pareto diagram
The Pareto diagram of the academic staff distribution in a survey, at Transilvania University, in 2009
[Figure: Pareto diagram; x-axis: functions, in descending order (Assistant 30%, Lecturer 30%, Professor 21%, Ass. Professor 18%); first y-axis: number (0% to 35%); second y-axis: 0% to 100%]
The 40 Best Paid Executives

Degree       fi    fi*
Bachelors     8    20%
Law           4    10%
Masters       4    10%
MBA          20    50%
None          2     5%
PhD           2     5%
Total        40   100%

Degree       fi    fi*    fi*c
MBA          20    50%    50%
Bachelors     8    20%    70%
Law           4    10%    80%
Masters       4    10%    90%
None          2     5%    95%
PhD           2     5%   100%
Total        40   100%

Source: Forbes, May 8, 2006
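The sorted table with the cumulative column fi*c can be reproduced from the class frequencies alone. This is a minimal sketch: the counts are the ones quoted from Forbes above, while the variable names are illustrative.

```python
# Class frequencies (fi) for the degrees of the 40 executives.
counts = {"Bachelors": 8, "Law": 4, "Masters": 4, "MBA": 20, "None": 2, "PhD": 2}
total = sum(counts.values())  # 40

# Pareto order: descending frequency (ties keep their original order,
# because Python's sort is stable).
ordered = sorted(counts.items(), key=lambda item: -item[1])

rows = []
running = 0
for degree, fi in ordered:
    running += fi
    rel = 100 * fi / total       # fi*  : relative frequency in percent
    cum = 100 * running / total  # fi*c : cumulative relative frequency
    rows.append((degree, fi, rel, cum))
    print(f"{degree:<10} {fi:>3} {rel:>5.0f}% {cum:>5.0f}%")
```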
The 40 Best Paid Executives
The Pareto diagram for degrees of 40 CEOs, in 2005
[Figure: Pareto diagram; x-axis: degrees, in descending order (MBA 50%, Bachelors 20%, Law 10%, Masters 10%, None 5%, PhD 5%); first y-axis: frequency (%), 0% to 60%; second y-axis: 0% to 100%]
Source: Forbes, May 8, 2006
The Pareto Principle
Vilfredo Pareto (1848-1923)
Born in Paris
University of Turin: engineering and mathematics
Univ. of Lausanne in Switzerland (1896) - Cours d'économie politique
showed that the distribution of income and wealth in society is not random
The pattern appears throughout history in all societies: approximately 80% of the total wealth in a society lies with only 20% of the families.
"vital few and the trivial many" - the Pareto principle in economics
Graphical Methods for
Describing Quantitative Data
Dot plots
Stem-and-leaf displays
Histograms
The use of a diagram
A diagram can be used:
to summarize large sets of data (structure), or
to focus attention on some aspect of the data, or
to display a trend in the data over time.
A good diagram enables the viewer to grasp in a single glance the relevant features of the data, features that wouldn't be obvious from the raw numbers themselves.
The power that diagrams have to give us an instant impression of the data can also be abused. Diagrams can be constructed to give the impression that the data have a feature that they don't really have. The following slides show some common ways of pictorially representing (and misrepresenting) data.
Hong Kong's soaring population
Example

Years   Newspaper A (thou. pieces)   Newspaper B (thou. pieces)
1990    510                          1911
1991    621                          1829
1992    624                          1636
1993    654                          1555
1994    732                          1490
Scale origins: using two Y axes, one for each series
Number of copies evolution for newspapers A and B, during 1990-1994
[Figure: line chart of series A and B; x-axis: years (1990 to 1994); y-axis for B: nr. of pieces B (thou.), 0 to 2500; y-axis for A: nr. of pieces A (thou.), 500 to 750]
Using two Y axes (a)
Number of copies evolution for newspapers A and B, during 1990-1994
[Figure: line chart of series A and B; x-axis: years (1990 to 1994); y-axis for B: nr. of pieces B (thou.), 0 to 2500; y-axis for A: nr. of pieces A (thou.), 0 to 800]
Using two Y axes (b)
Number of copies evolution for newspapers A and B, during 1990-1994
[Figure: line chart of series A and B; x-axis: years (1990 to 1994); y-axis for B: nr. of pieces B (thou.), 1250 to 1950; y-axis for A: nr. of pieces A (thou.), 400 to 750]
Using two Y axes (c)
Number of copies evolution for newspapers A and B, during 1990-1994
[Figure: line chart of series A and B; x-axis: years (1990 to 1994); y-axis for B: nr. of pieces B (thou.), 1450 to 1950; y-axis for A: nr. of pieces A (thou.), 500 to 750]
Correct graph
Number of copies evolution for newspapers A and B, during 1990-1994
[Figure: line chart of series A and B on a single y-axis; x-axis: years (1990 to 1994); y-axis: nr. of pieces (thou.), 0 to 2500]
Comparative evolution of newspapers A and B, during 1990-1994
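The effect of the axis choice can be quantified with a small sketch. It uses the newspaper figures from the example table and two of the axis ranges from the charts above (variant (c) and the common 0 to 2500 axis); the function name `visual_span` and the idea of measuring the share of axis height covered by a series are illustrative assumptions, not part of the course.

```python
# Copies sold (thousands), 1990-1994, from the example table.
copies_a = [510, 621, 624, 654, 732]
copies_b = [1911, 1829, 1636, 1555, 1490]


def visual_span(values, axis_min, axis_max):
    """Fraction of the axis height covered by the series' total change."""
    return (max(values) - min(values)) / (axis_max - axis_min)


def relative_change(values):
    """Change from the first to the last value, in percent."""
    return 100 * (values[-1] - values[0]) / values[0]


# Variant (c): separate, truncated axes make both changes look dramatic.
print(visual_span(copies_a, 500, 750))    # A fills most of its own axis
print(visual_span(copies_b, 1450, 1950))  # B fills most of its own axis

# Common axis starting at zero: the changes appear in true proportion.
print(visual_span(copies_a, 0, 2500))
print(visual_span(copies_b, 0, 2500))

print(f"A: {relative_change(copies_a):+.1f}%  B: {relative_change(copies_b):+.1f}%")
```

With separate truncated axes, both series span more than 80% of the plot height, so A's gain and B's loss look equally large; on the common axis B's change in copies is visibly about twice A's.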
Using plane images
Doubling the production
1996 1999
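A picture-based chart of this kind can overstate a change, because the eye compares areas (or volumes) rather than heights. The sketch below assumes the picture is enlarged by the same factor in every dimension, which the slide itself does not state; the function names are illustrative.

```python
def perceived_ratio(linear_scale, dimensions):
    """Ratio of areas (dimensions=2) or volumes (dimensions=3) when every
    side of the picture is multiplied by linear_scale."""
    return linear_scale ** dimensions


def honest_linear_scale(value_ratio, dimensions):
    """Linear scale factor that makes the area or volume ratio
    equal to the ratio of the values being compared."""
    return value_ratio ** (1 / dimensions)


# Doubling the height AND the width of a plane image quadruples its area.
print(perceived_ratio(2, 2))  # 4
# Doubling every side of a spatial (3D) image multiplies its volume by 8.
print(perceived_ratio(2, 3))  # 8

# To show a doubling fairly, each side should grow by much less than 2.
print(f"{honest_linear_scale(2, 2):.3f}")  # plane image
print(f"{honest_linear_scale(2, 3):.3f}")  # spatial image
```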
The mobile phone revolution
Using spatial images
2 errors: the dimensions and the inflation
Tricky comparisons
Absolute values
Governmental expenditure evolution in U.S.A. during 1930-1984
(Wonnacott, 4th edition, p. 64)
[Figure: x-axis: years (1930 to 1984, in 6-year steps); y-axis: billions $ (0 to 900)]
Correct graph: relative values (%)
Governmental expenditure evolution in U.S.A. (% of GNP) during 1930-1984 (Wonnacott, 4th edition, p. 64)
[Figure: x-axis: years (1930 to 1984, in 6-year steps); y-axis: percent of GNP (%), 0 to 60]
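The switch from absolute to relative values amounts to dividing each year's expenditure by that year's GNP. The numbers in the sketch below are hypothetical placeholders chosen only to make the arithmetic visible; they are not the Wonnacott data, which are not reproduced in these slides.

```python
# HYPOTHETICAL values in billions of dollars (placeholders, not real data).
periods = ["early", "middle", "late"]
expenditure = [10, 100, 900]
gnp = [100, 500, 3600]

# Relative value: expenditure as a percentage of GNP in the same period.
share_of_gnp = [100 * e / g for e, g in zip(expenditure, gnp)]

for period, e, share in zip(periods, expenditure, share_of_gnp):
    print(f"{period}: {e} billion $ = {share:.0f}% of GNP")

# The absolute series grows far faster than the relative one.
absolute_growth = expenditure[-1] / expenditure[0]
relative_growth = share_of_gnp[-1] / share_of_gnp[0]
print(f"absolute growth x{absolute_growth:.0f}, relative growth x{relative_growth:.1f}")
```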
An effective campaign?
In 1956, the U.S. state of Connecticut began a severe crackdown on speeding drivers. The following graph shows the annual number of traffic fatalities before and after the crackdown.