Dr. John P. Abraham Professor University of Texas Pan American Statistics in Theses


Page 1: Statistics in Theses

Dr. John P. Abraham, Professor

University of Texas Pan American

Statistics in Theses

Page 2: Statistics in Theses

Describe an egg
Students try to do this

Page 3: Statistics in Theses

Differences in description
Children’s view
Adults’ view
Shopper’s view
Seller’s view
Producer’s view
Chicken’s view
Biologist’s view
Dietician’s view
Chemist’s view

Page 4: Statistics in Theses

Measurements
You need to describe using some measurements
Errors in measurements

Page 5: Statistics in Theses

Descriptive statistics
Summarizing a collection of data in a clear and understandable way.
Numerical
Graphical

Page 6: Statistics in Theses

Numerical descriptive statistics
Spread
  Range
  Semi-interquartile range
  Standard deviation
Central tendency
  Mean
  Median
  Mode
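
The measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch: the helper name `describe` and the sample `scores` are hypothetical, and the semi-interquartile range is taken as half the distance between the first and third quartiles as returned by `statistics.quantiles` with its default method.

```python
import statistics

def describe(values):
    """Return measures of central tendency and spread for a list of numbers."""
    # Quartile cut points (default 'exclusive' method); only Q1 and Q3 are needed.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "semi_interquartile_range": (q3 - q1) / 2,
        "std_deviation": statistics.stdev(values),  # sample standard deviation
    }

scores = [2, 4, 4, 5, 7, 9, 11]  # hypothetical measurements
summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Other quartile conventions exist, so the semi-interquartile range may differ slightly between software packages.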

Page 7: Statistics in Theses

Inferential Statistics
Infer about a population based on a sample
Infer about the future based on the past

Page 8: Statistics in Theses

Hypothesis testing using variables
A variable is a characteristic of an object of study that can be measured.
The measurements will differ from one object to another.
Can be quantitative or qualitative
Can be independent or dependent
Can be continuous or discrete (for example, when we create a 1 to 5 ranking)

Page 9: Statistics in Theses

Necessity for control
What is a control group?

A control group study uses a control group to compare to an experimental group in a test of a causal hypothesis.

The control and experimental groups must be identical in all relevant ways except for the introduction of a suspected causal agent into the experimental group.

For example, if 'C' causes 'E', when we introduce 'C' into the experimental group but not into the control group, we should find 'E' occurring in the experimental group at a significantly greater rate than in the control group.

Significance is measured by relation to chance: if an event is not likely due to chance, then its occurrence is significant.

Page 10: Statistics in Theses

Double blind study
A control group test where neither the evaluator nor the subject knows which items are controls.

A randomized test is one that randomly assigns items to the control and the experimental groups.

The purpose of controls, double-blind, and randomized testing is to reduce error, self-deception and bias.
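
Random assignment can be sketched in a few lines of Python. The function name `randomize`, the subject IDs, and the fixed seed are assumptions made for illustration; a real study would follow its own allocation protocol.

```python
import random

def randomize(subjects, seed=None):
    """Randomly split subjects into a control group and an experimental group."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)          # every ordering is equally likely
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject IDs 1..20
control, experimental = randomize(range(1, 21), seed=42)
print("control:", sorted(control))
print("experimental:", sorted(experimental))
```

Because chance alone decides who lands in which group, neither the researcher nor the subjects can steer the assignment.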

Page 11: Statistics in Theses

Placebo
Many control group studies use a placebo in control groups to keep the subjects in the dark as to whether they are being given the causal agent that is being tested.

For example, both the control and experimental groups will be given identical looking pills in a study testing the effectiveness of a new drug. Only one pill will contain the agent being tested; the other pill will be a placebo.

In a double-blind study, the evaluator of the results would not know which subjects got the placebo until his or her evaluation of observed results was completed. This is to avoid evaluator bias from influencing observations and measurements.

Page 12: Statistics in Theses

Inferential statistics
We use inferential statistics to make inferences from our data to more general conditions; we use descriptive statistics simply to describe what's going on in our data.

We use inferential statistics to make judgments of the probability that an observed difference between groups is a dependable one or one that might have happened by chance in this study.

Page 13: Statistics in Theses

T-test
Compare the average performance of two groups on a single measure to see if there is a difference.

You might want to know whether there is a difference between girls and boys in their math abilities.

Whenever you wish to compare the average performance between two groups you should consider the t-test for differences between groups.

Page 14: Statistics in Theses

Is there a difference?

Page 15: Statistics in Theses

How about here?

Page 16: Statistics in Theses

T-test example
The Acme Company has developed a new battery. The engineer in charge claims that the new battery will operate continuously for at least 7 minutes longer than the old battery.

To test the claim, the company selects a simple random sample of 100 new batteries and 100 old batteries. The old batteries run continuously for an average of 190 minutes with a standard deviation of 20 minutes; the new batteries, an average of 200 minutes with a standard deviation of 40 minutes.

Test the engineer's claim that the new batteries run at least 7 minutes longer than the old. Use a 0.05 level of significance. (Assume that there are no outliers in either sample.)

See next slide

Page 17: Statistics in Theses

4 steps needed
(1) State the hypotheses, (2) formulate an analysis plan, (3) analyze sample data, and (4) interpret results.

State the hypotheses. The first step is to state the null hypothesis and an alternative hypothesis, where μ1 is the mean running time of the new batteries and μ2 that of the old batteries.
Null hypothesis: μ1 - μ2 >= 7
Alternative hypothesis: μ1 - μ2 < 7

Formulate an analysis plan. For this analysis, the significance level is 0.05. Using sample data, we will conduct a two-sample t-test of the null hypothesis.

Analyze sample data. Using sample data, we compute the standard error (SE), degrees of freedom (DF), and the t-score test statistic (t), where x1 and x2 are the sample means, s1 and s2 the sample standard deviations, n1 and n2 the sample sizes, and d the hypothesized difference.
SE = sqrt[ (s1^2 / n1) + (s2^2 / n2) ] = sqrt[ (40^2 / 100) + (20^2 / 100) ] = sqrt(20) = 4.472
t = [ (x1 - x2) - d ] / SE = [ (200 - 190) - 7 ] / 4.472 = 3 / 4.472 = 0.67

Interpret results. Since the P-value (0.75) is greater than the significance level (0.05), we cannot reject the null hypothesis.
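
The numbers in the battery example can be checked with a short Python sketch. The function name `two_sample_t` is hypothetical. The degrees of freedom use the Welch approximation for unequal variances, and the P-value is approximated with the standard normal distribution, which is an assumption that is reasonable here only because the degrees of freedom are large (about 146); an exact value would need the t-distribution.

```python
import math
from statistics import NormalDist

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2, d):
    """Return (SE, DF, t) for a two-sample t-test of H0: mu1 - mu2 >= d."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2           # variance of each sample mean
    se = math.sqrt(v1 + v2)                          # standard error of the difference
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))  # Welch DF
    t = ((mean1 - mean2) - d) / se                   # t-score test statistic
    return se, df, t

# New batteries: mean 200, SD 40, n 100. Old batteries: mean 190, SD 20, n 100.
se, df, t = two_sample_t(200, 40, 100, 190, 20, 100, d=7)

# Lower-tail P-value, P(T < t), approximated by the normal distribution (assumption).
p_approx = NormalDist().cdf(t)

print(f"SE = {se:.3f}, DF = {df:.2f}, t = {t:.2f}, P is approximately {p_approx:.2f}")
```

Since the approximate P-value is well above 0.05, the sketch agrees with the conclusion that the null hypothesis cannot be rejected.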

Page 18: Statistics in Theses

Standard Score
Problem: A national achievement test is administered annually to 3rd graders. The test has a mean score of 100 and a standard deviation of 15. If Jane's z-score is 1.20, what was her score on the test?

From the z-score equation, we know
z = (X - μ) / σ
where z is the z-score, X is the value of the element, μ is the mean of the population, and σ is the standard deviation.

Solving for Jane's test score (X), we get
X = (z * σ) + μ = (1.20 * 15) + 100 = 18 + 100 = 118
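
The same conversion can be written as two small Python functions, one for each direction. The function names are hypothetical and chosen only for this sketch.

```python
def score_from_z(z, mean, sd):
    """Convert a z-score back to a raw score: X = z * sigma + mu."""
    return z * sd + mean

def z_from_score(x, mean, sd):
    """Convert a raw score to a z-score: z = (X - mu) / sigma."""
    return (x - mean) / sd

jane_score = score_from_z(1.20, mean=100, sd=15)
print(jane_score)                               # Jane's raw score
print(z_from_score(jane_score, 100, 15))        # round trip back to the z-score
```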

Page 19: Statistics in Theses

Probability
Mathematically, the probability that an event will occur is expressed as a number between 0 and 1. Notationally, the probability of event A is represented by P(A).

A coin is tossed three times. What is the probability that it lands on heads exactly one time?

If you toss a coin three times, there are a total of eight possible outcomes. They are: HHH, HHT, HTH, THH, HTT, THT, TTH, and TTT. Of the eight possible outcomes, three have exactly one head. They are: HTT, THT, and TTH. Therefore, the probability that three flips of a coin will produce exactly one head is 3/8 or 0.375.
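
The counting argument can be reproduced by listing every outcome in Python. This is a small sketch that assumes a fair coin, so all eight outcomes are equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All sequences of three flips, each flip being H or T.
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# Keep the outcomes with exactly one head.
favorable = [o for o in outcomes if o.count("H") == 1]

probability = Fraction(len(favorable), len(outcomes))
print(outcomes)
print(favorable)
print(probability, float(probability))
```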

Page 20: Statistics in Theses

ANOVA (Analysis of Variance)
Gives a statistical test of whether the means of several groups are all equal.

MANOVA (multivariate analysis of variance)
Used when there is more than one dependent variable.
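
As an illustration of the idea behind one-way ANOVA, the F statistic compares the variation between group means with the variation inside the groups. The sketch below is an assumption-laden example: the function name `one_way_anova_f` and the three groups of data are hypothetical, and it computes only the F statistic, not the P-value.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of numbers."""
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical measurements
f_stat = one_way_anova_f(groups)
print(f_stat)
```

A large F suggests that at least one group mean differs from the others; judging significance requires comparing F with the F-distribution for (k - 1, n - k) degrees of freedom.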

Page 21: Statistics in Theses

Correlation
Statistical correlation is a statistical technique which tells us if two variables are related.

If the change in one variable is accompanied by a change in the other, then the variables are said to be correlated. We can therefore say that family income and family expenditure, price and demand are correlated.

Correlate variables that have a meaningful connection rather than arbitrary ones: one could compute 'r' between shoe size and intelligence, or between height and income. Irrespective of the value of 'r', the result makes no sense and is hence termed a chance or nonsense correlation.

Page 22: Statistics in Theses

r Value

In general, r > 0 indicates a positive relationship, r < 0 indicates a negative relationship, and r = 0 indicates no linear relationship (independent variables have r = 0, but r = 0 alone does not prove that the variables are independent). Here r = +1.0 describes a perfect positive correlation and r = -1.0 describes a perfect negative correlation.

Value of r                        Strength of relationship
-1.0 to -0.5 or 1.0 to 0.5        Strong
-0.5 to -0.3 or 0.3 to 0.5        Moderate
-0.3 to -0.1 or 0.1 to 0.3        Weak
-0.1 to 0.1                       None or very weak
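
A short Python sketch shows how r might be computed and then labeled with the strength bands above. The function names and the income and expenditure figures are hypothetical. Because the bands share their boundary values, the sketch assumes a boundary value such as 0.5 belongs to the stronger category.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def strength(r):
    """Label r using the strength bands (boundaries go to the stronger band)."""
    size = abs(r)
    if size >= 0.5:
        return "Strong"
    if size >= 0.3:
        return "Moderate"
    if size >= 0.1:
        return "Weak"
    return "None or very weak"

income = [10, 20, 30, 40, 50]       # hypothetical family income
expenditure = [8, 15, 24, 30, 41]   # hypothetical family expenditure
r = pearson_r(income, expenditure)
print(round(r, 3), strength(r))
```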

Page 23: Statistics in Theses
Page 24: Statistics in Theses

Analysis of Covariance
ANOVA mixed with regression analysis
ANCOVA tests whether certain factors have an effect on the outcome variable after removing the variance for which quantitative predictors (covariates) account.

Suppose you analyze the results of a clinical trial of three types of treatment of a disease - "Placebo", "Drug 1", and "Drug 2". The results are three sets of survival times, corresponding to patients from the three treatment groups. The question of interest is whether there is a difference between the three types of treatment in the average survival time.

Page 25: Statistics in Theses

ANCOVA cont.
You might use analysis of variance to answer this question. But, if you have supplementary information, for example, each patient's age, then analysis of covariance allows you to adjust the treatment effect (survival time, in this case) to a particular age, say, the mean age of all patients. Age in this case is a "covariate": it is not related to treatment, but can affect the survival time. This adjustment allows you to reduce the observed variation between the three groups caused not by the treatment itself but by variation of age.

Page 26: Statistics in Theses

Regression Analysis
Regression analysis provides a "best-fit" mathematical equation for the relationship between the dependent variable (response) and independent variable(s) (covariates).

In linear regression, the function is a linear (straight-line) equation. For example, if we assume the value of an automobile decreases by a constant amount each year after its purchase, and for each mile it is driven, we can create a formula to find the value.
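
A least-squares fit of a straight line can be sketched in plain Python. For brevity this example uses a single predictor, the age of the car in years, and leaves out mileage; the function name `fit_line` and the price data are hypothetical and were chosen so that the value drops by exactly the same amount each year.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

ages = [0, 1, 2, 3, 4]                          # years since purchase (hypothetical)
values = [20000, 18000, 16000, 14000, 12000]    # car value in dollars (hypothetical)

slope, intercept = fit_line(ages, values)
predicted_at_6 = intercept + slope * 6          # extrapolated value after six years
print(slope, intercept, predicted_at_6)
```

With real data the points would not fall exactly on a line, and the fitted equation would describe the trend that best matches them.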

Page 27: Statistics in Theses

Summarize the course
Why use SharePoint services?

You will have several faculty members on your committee.

All will have to comment on and correct your thesis.

Best way to make appointments with many people.

One central repository for all your files.
Different versions are kept. In case of a mistaken edit, you can go back.

Page 28: Statistics in Theses

Why review different theses
Discussed style
Discussed chapters
Discussed content
How to get ideas for your research from suggestions

Page 29: Statistics in Theses

References
Discussed different types of references and what is acceptable and what is not.
Discussed plagiarism at length
Discussed how to quote and how to cite

Page 30: Statistics in Theses

Theses and Project
Differences
Similarities
Report writing

Page 31: Statistics in Theses

Formal research studies
Hypothesis formulation
Collect raw data
Conduct statistical analysis
Make conclusions
Report