theory ds

7/27/2019 theory ds

1)  What do you mean by decision science? How is statistics important in DS?

2)  Arithmetic mean : The arithmetic mean of a set of data is found by summing the values and then dividing the sum by the total number of values in the set. The arithmetic mean is what is commonly referred to as the average.

Geometric mean : The geometric mean is a kind of average of a set of numbers that differs from the arithmetic mean. It is well defined only for sets of positive real numbers. It is calculated by multiplying all the numbers together (call the number of numbers n) and taking the nth root of the product. A common case where the geometric mean is the correct choice is averaging growth rates.

Harmonic mean : The harmonic mean is another kind of average of a set of numbers. It is calculated by dividing the number of elements by the sum of the reciprocals of the elements. For positive numbers that are not all equal, the harmonic mean is always the lowest of the three means (harmonic < geometric < arithmetic).

Median : The middle number in a sorted list of numbers. To determine the median of a sequence of numbers, the numbers must first be arranged in order from lowest to highest. If there is an odd number of values, the median is the value in the middle, with the same number of values below and above it. If there is an even number of values, the middle pair must be identified, added together, and divided by two to find the median. The median can be used as an approximate average, and it is less sensitive to extreme values than the mean.

Mode : A statistical term that refers to the most frequently occurring number in a set of numbers. The mode is found by collecting and organizing the data in order to count the frequency of each result. The result with the highest number of occurrences is the mode of the set.
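The five measures defined above can be compared on one small data set. The following is a minimal sketch using Python's standard `statistics` module; the data values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import statistics

# Hypothetical data set of positive numbers (illustration only).
data = [2, 4, 4, 8]

# Arithmetic mean: sum of the values divided by how many there are (18 / 4).
arithmetic = statistics.mean(data)

# Geometric mean: nth root of the product (fourth root of 2*4*4*8 = 256).
geometric = statistics.geometric_mean(data)

# Harmonic mean: number of values divided by the sum of their reciprocals.
harmonic = statistics.harmonic_mean(data)

# Median: middle of the sorted list; with an even count, the average of the middle pair.
median = statistics.median(data)

# Mode: the most frequently occurring value.
mode = statistics.mode(data)

print(arithmetic, geometric, harmonic, median, mode)
```

For this data set the three means come out in the order harmonic, geometric, arithmetic, which matches the ordering stated for positive numbers.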

3)  Sampling and its process

A sample is a part of the total population. It can be an individual element or a group of elements selected from the population. Although it is a subset, it should be representative of the population and suitable for research in terms of cost, convenience, and time. The sample can be selected using a probability or a non-probability approach. A sample usually consists of various units of the population. The size of the sample is denoted by “n”.

Sampling is the act, process, or technique of selecting a representative part of a population for the purpose of determining the characteristics of the whole population. In other words, the process of selecting a sample from a population using special sampling techniques is called sampling. The sampling process itself should ensure that the selected sample is representative of the population.

Steps in Sampling Process: 

An operational sampling process can be divided into seven steps as given below:

1.  Defining the target population.

2.  Specifying the sampling frame.

3.  Specifying the sampling unit.

4.  Selection of the sampling method.

5.  Determination of sample size.

6.  Specifying the sampling plan.

7.  Selecting the sample.
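The seven steps can be traced in a short sketch. The example below assumes simple random sampling without replacement as the method; the customer identifiers, the frame of 80 reachable units, the sample size, and the seed are all hypothetical values chosen for illustration.

```python
import random

# Step 1: target population (hypothetical: 100 customer identifiers).
population = [f"customer_{i:03d}" for i in range(1, 101)]

# Step 2: sampling frame, i.e. the list of units that can actually be reached
# (assumed here to be the first 80 customers).
sampling_frame = population[:80]

# Step 3: sampling unit is a single customer.
# Step 4: sampling method is simple random sampling without replacement.

# Step 5: sample size, the "n" of the text.
n = 10

# Step 6: sampling plan, fixed here by a seed so the draw is reproducible.
rng = random.Random(42)

# Step 7: select the sample.
sample = rng.sample(sampling_frame, n)

print(sample)
```

Because `random.Random.sample` draws without replacement, every unit appears at most once, and each unit in the frame has the same chance of selection.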

4)  Methods of collection of data

Ans : The choice of method is influenced by the data collection strategy, the type of variable, the accuracy required, the collection point and the skill of the enumerator. Links between a variable, its source and practical methods for its collection (Table 6.1, Table 6.2 and Table 6.3) can help in choosing appropriate methods. The main data collection methods are:

· Registration: registers and licences are particularly valuable for complete enumeration, but are limited to variables that change slowly, such as numbers of fishing vessels and their characteristics.

· Questionnaires: forms which are completed and returned by respondents. An inexpensive method that is useful where literacy rates are high and respondents are co-operative.

· Interviews: forms which are completed through an interview with the respondent. More expensive than questionnaires, but better for more complex questions, low literacy or less co-operation.

· Direct observations: making direct measurements is the most accurate method for many variables, such as catch, but is often expensive. Many methods, such as observer programmes, are limited to industrial fisheries.

· Reporting: the main alternative to making direct measurements is to require fishers and others to report their activities. Reporting requires literacy and co-operation, but can be backed up by a legal requirement and direct measurements.

5)  Merits & Limitations of Arithmetic mean

Ans : Advantages

1: Fast and easy to calculate

2: Easy to work with and use in further analysis

Disadvantages

1: Sensitive to extreme values

2: Not suitable for time-series data

3: Works only when all values are equally important
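Two of the limitations listed above can be seen numerically. The sketch below uses hypothetical income and score figures: one extreme value pulls the arithmetic mean far from the typical value while the median barely moves, and a weighted mean differs from the plain mean when the values are not equally important.

```python
import statistics

# Hypothetical incomes; the last value is an extreme outlier.
incomes = [30, 32, 35, 31, 500]
typical = incomes[:-1]

mean_with_outlier = statistics.mean(incomes)      # pulled up by the value 500
median_with_outlier = statistics.median(incomes)  # stays near the typical incomes
mean_without_outlier = statistics.mean(typical)

# Hypothetical scores where the first counts three times as much as the second.
scores = [80, 90]
weights = [3, 1]
plain_mean = statistics.mean(scores)
weighted_mean = sum(w * s for w, s in zip(weights, scores)) / sum(weights)

print(mean_with_outlier, median_with_outlier, mean_without_outlier)
print(plain_mean, weighted_mean)
```

The plain arithmetic mean treats every value as equally important, so the weighted form is needed whenever that assumption does not hold.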

6)  Assumptions in LPP (linear programming problem)

Ans :

1.  Linearity -- the constraints and objective function are linear.

o  This requires that the value of the objective function and the response of each resource expressed by the constraints are proportional to the level of each activity expressed in the variables.

o  Linearity also requires that the effects of the value of each variable on the values of the objective function and the constraints are additive. In other words, there can be no interactions between the effects of different activities; i.e., the level of activity X1 should not affect the costs or benefits associated with the level of activity X2.

2.  Divisibility -- the values of decision variables can be fractions. Sometimes these values only make sense if they are integers; then we need an extension of linear programming called integer programming.

3.  Certainty -- the model assumes that the responses to the values of the variables are exactly equal to the responses represented by the coefficients.

4.  Data -- formulating a linear program to solve a problem assumes that data are available to specify the problem.
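The linearity and divisibility assumptions can be checked on a toy model. The sketch below is not a solver; it only evaluates a linear objective and linear constraints for two activities X1 and X2. All coefficients, capacities, and activity levels are hypothetical, and `Fraction` is used so that fractional activity levels are represented exactly.

```python
from fractions import Fraction

# Hypothetical model data (illustration only).
profit = [3, 5]                    # objective coefficients for X1 and X2
usage = [[1, 0], [0, 2], [3, 2]]   # resource use per unit of each activity
capacity = [4, 12, 18]             # available amount of each resource

def objective(x):
    """Linear objective: sum of coefficient times activity level."""
    return sum(c * xi for c, xi in zip(profit, x))

def is_feasible(x):
    """Check non-negativity and every linear resource constraint."""
    within_capacity = all(
        sum(a * xi for a, xi in zip(row, x)) <= limit
        for row, limit in zip(usage, capacity)
    )
    return all(xi >= 0 for xi in x) and within_capacity

# Divisibility: fractional activity levels are allowed.
plan = [Fraction(3, 2), Fraction(5, 2)]
other = [Fraction(1, 2), Fraction(1, 4)]

# Proportionality: doubling every activity level doubles the objective.
doubled = [2 * xi for xi in plan]
proportional = objective(doubled) == 2 * objective(plan)

# Additivity: the objective of a combined plan is the sum of the parts.
combined = [a + b for a, b in zip(plan, other)]
additive = objective(combined) == objective(plan) + objective(other)

print(objective(plan), is_feasible(plan), proportional, additive)
```

If the plan had to be a whole number of units, the fractional `plan` above would no longer be acceptable, which is the case that integer programming addresses.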

7)  Difference between mean deviation and standard deviation

Ans : The standard deviation of a probability distribution, random variable, population, or multiset of values is a measure of the spread of its values. It is usually denoted by the letter σ (lower-case sigma). It is defined as the square root of the variance.
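As a minimal sketch of the comparison the question asks for, the code below computes the population standard deviation as the square root of the variance, alongside the mean deviation, taken here to be the average absolute deviation from the arithmetic mean (an assumption about which form of mean deviation is intended). The data values are hypothetical.

```python
import math
import statistics

# Hypothetical population of eight values (illustration only).
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = statistics.mean(data)

# Variance: average of the squared deviations from the mean.
variance = sum((x - mean) ** 2 for x in data) / n

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

# Mean deviation: average of the absolute deviations from the mean.
mean_dev = sum(abs(x - mean) for x in data) / n

print(std_dev, mean_dev)
```

Squaring gives large deviations more weight, so the standard deviation is never smaller than the mean deviation computed about the same mean.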

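The arithmetic mean (or simply the mean) of a list of numbers is the sum of all the members of the list divided by the number of items in the list. The arithmetic mean is what students are taught very early to call the "average". The mean deviation is the arithmetic mean of the absolute deviations of the values from their mean, whereas the standard deviation squares the deviations before averaging and then takes the square root, so it gives more weight to large deviations.

8)  Roles of computers in DS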

9)  Types of correlation

Positive Correlation

Positive correlation occurs when an increase in one variable is accompanied by an increase in the other. The line fitted to the scatter plot is an increasing line.

Negative Correlation

Negative correlation occurs when an increase in one variable is accompanied by a decrease in the other. The line fitted to the scatter plot is a decreasing line.

No Correlation

No correlation occurs when there is no linear dependency between the variables.

Perfect Correlation

Perfect correlation occurs when there is an exact linear (functional) dependency between the variables. In this case all the points lie on a straight line.

Strong Correlation

A correlation is stronger the closer the points lie to the line.

Weak Correlation

A correlation is weaker the farther the points lie from the line.
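The types described above can be told apart by the sign and size of the correlation coefficient. The sketch below computes Pearson's r from its definition in plain Python; the helper name `pearson_r` and all data values are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariation divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

x = [1, 2, 3, 4, 5]

# Perfect positive: the points lie exactly on an increasing line.
perfect_positive = pearson_r(x, [2, 4, 6, 8, 10])

# Perfect negative: the points lie exactly on a decreasing line.
perfect_negative = pearson_r(x, [10, 8, 6, 4, 2])

# Weak positive: the points are scattered far from any line.
weak_positive = pearson_r(x, [2, 5, 1, 4, 3])

# No linear correlation, although y depends on x exactly (y = (x - 3)^2).
no_linear = pearson_r(x, [(v - 3) ** 2 for v in x])

print(perfect_positive, perfect_negative, weak_positive, no_linear)
```

The last case shows why the definition of no correlation speaks of linear dependency: the variables are fully dependent, yet the coefficient is zero because the relationship is not linear.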

10) Merits and limitations of scatter diagram

11) Assumptions of Karl Pearson's coefficient of correlation

Ans : The Pearsonian coefficient of correlation rests on two assumptions. The first is that a large number of independent contributory causes are operating in each of the two series correlated so as to produce a normal (probability) distribution. We know that such causes always operate in chance phenomena like the tossing of a coin or the throw of a die. They also operate in other types of data. For example, such forces are usually found operating in phenomena like indices of price and supply, ages of husbands and wives, and heights of fathers and sons.

The second assumption is that the forces so operating are not independent of each other but are related in a causal fashion. If the forces are entirely independent and unrelated, there cannot be any correlation between the two series. The forces must be common to both series. The height of an individual during the last ten years may show an upward trend, and his income during this period may also show a similar tendency, but there cannot be any meaningful correlation between the two series because the forces affecting them are entirely unconnected with each other. If the coefficient of correlation for such series is calculated, it may even be +0.8, indicating a very high degree of positive correlation, but such correlation is usually termed nonsense correlation because the two series are affected by sets of forces that are entirely unconnected with each other.
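The height and income example can be reproduced with made-up figures. In the sketch below both series are hypothetical ten-year records that simply trend upward; the helper `pearson_r` is written out from the definition of the coefficient and is not taken from the source.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical yearly records for one individual over ten years.
height_cm = [150, 152, 155, 157, 160, 162, 165, 167, 169, 170]
income = [10, 12, 13, 15, 18, 20, 21, 25, 27, 30]

# Both series rise over time, so the coefficient is high even though
# the forces behind height and income are unconnected.
r = pearson_r(height_cm, income)

print(round(r, 3))
```

A high coefficient here says only that both series rose over the same period; it gives no evidence of a common cause, which is why the second assumption has to be checked separately from the calculation.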

12) Merits and limitations of Spearman regression analysis

13) Difference between correlation and regression

Ans : Correlation, as the name suggests, is a measure of the degree to which two variables are related. For instance, if x and y are two variables, correlation measures the strength of the linear association between them. Regression, on the other hand, tells us the exact form of the linear association that exists between the two variables, expressed as an equation.
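The distinction can be seen by computing both quantities from the same data. The sketch below, on hypothetical values, obtains a single correlation coefficient r (the strength of the linear association) and a least-squares line y = intercept + slope * x (the form of that association). Ordinary least squares is assumed as the regression method.

```python
import math

# Hypothetical paired observations (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products about the means.
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
s_xx = sum((a - mean_x) ** 2 for a in x)
s_yy = sum((b - mean_y) ** 2 for b in y)

# Correlation: one number between -1 and +1 measuring strength.
r = s_xy / math.sqrt(s_xx * s_yy)

# Regression: the equation of the best-fitting line.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# The fitted equation can be used to predict y for a given x.
predicted_at_6 = intercept + slope * 6

print(round(r, 4), slope, intercept, predicted_at_6)
```

Correlation is symmetric in x and y, while the regression line changes if the roles of the two variables are swapped.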

14) What do you mean by probability and how do you calculate it?

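Ans : Probability is the chance that something will happen, that is, how likely it is that some event will occur. Sometimes you can measure a probability with a number, such as "10% chance of rain", or you can use words such as impossible, unlikely, possible, even chance, likely and certain. When all outcomes are equally likely, the probability of an event is calculated as the number of favourable outcomes divided by the total number of possible outcomes.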
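One common way to calculate a probability, assuming all outcomes are equally likely, is to divide the number of favourable outcomes by the total number of possible outcomes. The sketch below applies that rule to a fair six-sided die; the function name `classical_probability` is hypothetical.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Favourable outcomes divided by all equally likely outcomes."""
    favourable = sum(1 for outcome in sample_space if outcome in event)
    return Fraction(favourable, len(sample_space))

# Sample space for one roll of a fair six-sided die.
die = [1, 2, 3, 4, 5, 6]

p_even = classical_probability({2, 4, 6}, die)     # an "even chance" event
p_seven = classical_probability({7}, die)          # an impossible event
p_any_face = classical_probability(set(die), die)  # a certain event

print(p_even, p_seven, p_any_face)
```

The three results correspond to the words even chance, impossible, and certain, with every other probability falling between 0 and 1.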
