Probability models- the Normal especially

Preview:


• probability models- the Normal especially

• checking distributional assumptions

[Figure: histogram (density scale) and Normal Q-Q plot of FS per 100 ml, SEPA location code 4556.]

[Figure: histogram (density scale) and Normal Q-Q plot of log10(FS) per 100 ml for the same site.]
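The figures above appear to be R output; a rough Python equivalent (with simulated stand-in counts, since the SEPA data file is not supplied here) might look like this sketch.

```python
# Sketch: histogram and Normal Q-Q plot of bacterial counts, raw and log10-transformed.
# 'fs' is simulated stand-in data; in practice read the FS counts for the site of interest.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(1)
fs = 10 ** rng.normal(loc=1.0, scale=0.4, size=100)   # right-skewed counts per 100 ml

fig, axes = plt.subplots(2, 2, figsize=(8, 6))
for row, (x, label) in enumerate([(fs, "FS"), (np.log10(fs), "log10(FS)")]):
    axes[row, 0].hist(x, bins=15, density=True)
    axes[row, 0].set(title=f"Histogram of {label}", ylabel="Density")
    stats.probplot(x, dist="norm", plot=axes[row, 1])  # Normal Q-Q plot
    axes[row, 1].set_title(f"Normal Q-Q Plot: {label}")
plt.tight_layout()
plt.show()
```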

[Figure: empirical cumulative distribution function Fn(x) of FS at Site 9320 on the log10 scale, marking a theoretical percentile of 1.47 and an empirical percentile of 1.75.]

Theoretical vs empirical percentile, FS at Site 9320:

Log scale: theoretical 1.47, empirical 1.75

Directive scale: theoretical 29.5, empirical 56.2
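The Directive-scale values are simply the log-scale percentiles back-transformed; a quick check, using the values from the table above:

```python
# Back-transform log10-scale percentiles to the original counts-per-100 ml scale.
theoretical_log10, empirical_log10 = 1.47, 1.75
print(10 ** theoretical_log10)  # ~29.5 per 100 ml
print(10 ** empirical_log10)    # ~56.2 per 100 ml
```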

Modelling Continuous Variables: checking normality

• Normal probability plot

• Should show a straight line

• the p-value of a formal test is also reported (null hypothesis: the data are Normally distributed)

[Figure: Minitab Normal probability plot of variable C1 with Anderson-Darling test output: N = 100, Mean ≈ 0.44, StDev = 1.015, AD = 0.361, P-Value ≈ 0.12.]
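A comparable check in Python, using scipy rather than Minitab, could look like the sketch below; the C1 values are simulated stand-ins.

```python
# Anderson-Darling and Shapiro-Wilk checks of Normality for a single variable.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
c1 = rng.normal(loc=0.44, scale=1.0, size=100)   # stand-in for the C1 column

ad = stats.anderson(c1, dist="norm")
print("AD statistic:", round(ad.statistic, 3))
print("critical values:", ad.critical_values)    # compare the statistic with these

w, p = stats.shapiro(c1)                         # null: data are Normally distributed
print("Shapiro-Wilk p-value:", round(p, 3))      # large p-value -> no evidence against Normality
```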

• another statistic- the estimated standard error

Statistical inference

• Confidence intervals

• Hypothesis testing and the p-value

• Statistical significance vs real-world importance

• a formal statistical procedure- confidence intervals

Confidence intervals- an alternative to hypothesis testing

• A confidence interval is a range of credible values for the population parameter. The confidence coefficient is the percentage of times that the method will in the long run capture the true population parameter.  

• A common form is: sample estimate ± 2 × estimated standard error (a sketch is given below)
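A minimal sketch of that interval for a sample mean, using simulated data (2 is the usual rough multiplier; 1.96 is the exact Normal value for 95%):

```python
# 95% confidence interval for a population mean: estimate +/- ~2 * estimated standard error.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=50, scale=10, size=40)        # stand-in sample

mean = x.mean()
se = x.std(ddof=1) / np.sqrt(len(x))             # estimated standard error of the mean
lower, upper = mean - 2 * se, mean + 2 * se
print(f"mean = {mean:.2f}, 95% CI approx ({lower:.2f}, {upper:.2f})")
```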

• another formal inferential procedure- hypothesis testing

Hypothesis Testing

• Null hypothesis: usually ‘no effect’

• Alternative hypothesis: ‘effect’

• Make a decision based on the evidence (the data)

• There is a risk of getting it wrong!

• Two types of error:

– Type I: reject the null when we shouldn’t

– Type II: don’t reject the null when we should
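As a concrete illustration (not from the slides), a one-sample t-test of a ‘no effect’ null hypothesis:

```python
# One-sample t-test: null hypothesis that the population mean equals 50.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.normal(loc=53, scale=10, size=30)        # simulated data in which an effect exists

t, p = stats.ttest_1samp(x, popmean=50)
print(f"t = {t:.2f}, p-value = {p:.3f}")
# Small p-value -> reject the null; rejecting a true null would be a Type I error,
# failing to reject a false null would be a Type II error.
```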

Significance Levels

• We cannot reduce probabilities of both Type I and Type II errors to zero.

• So we control the probability of a Type I error.

• This controlled probability is referred to as the significance level; the evidence against the null in the data is summarised by the p-value, which is compared with it.

• Generally a p-value below 0.05 is considered a reasonable risk of a Type I error (‘beyond reasonable doubt’).
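A small simulation (illustrative only) showing that testing at the 5% level controls the Type I error rate at about 5% when the null really is true:

```python
# Simulate many datasets generated under the null and count false rejections at alpha = 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, rejections, n_sim = 0.05, 0, 2000
for _ in range(n_sim):
    x = rng.normal(loc=50, scale=10, size=30)    # null is true: the mean really is 50
    _, p = stats.ttest_1samp(x, popmean=50)
    rejections += p < alpha
print("estimated Type I error rate:", rejections / n_sim)   # close to 0.05
```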

Statistical Significance vs. Practical Importance

• Statistical significance is concerned with the ability to discriminate between treatments given the background variation.

• Practical importance relates to the scientific domain and is concerned with scientific discovery and explanation.

Power

Power is related to the Type II error:

power = 1 − P(Type II error)

Aim: keep the power as high as possible.
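Power can be estimated by simulation: generate data in which an effect really exists and count how often the test rejects the null. The effect size, sample size and significance level below are illustrative.

```python
# Estimate power by simulation: probability of rejecting the null when an effect exists.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
alpha, rejections, n_sim = 0.05, 0, 2000
for _ in range(n_sim):
    x = rng.normal(loc=55, scale=10, size=30)    # true mean is 55, the null says 50
    _, p = stats.ttest_1samp(x, popmean=50)
    rejections += p < alpha
print("estimated power:", rejections / n_sim)    # power = 1 - P(Type II error)
```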

Statistical models

• Outcomes or Responses: these are the results of the practical work and are sometimes referred to as ‘dependent variables’.

• Causes or Explanations: these are the conditions or environment within which the outcomes or responses have been observed; they are sometimes referred to as ‘independent variables’, but are more commonly known as covariates.

• relationships- linear or otherwise

Correlations and linear relationships

• Pearson correlation (see the sketch below)

• Strength of linear relationship

• Simple indicator lying between –1 and +1

• Check your plots for linearity
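A minimal sketch of computing a Pearson correlation and checking the plot, with simulated data:

```python
# Pearson correlation: strength of *linear* association, lying between -1 and +1.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(size=50)
y = 0.8 * x + rng.normal(scale=0.5, size=50)

r, p = stats.pearsonr(x, y)
print(f"r = {r:.2f}, p-value = {p:.3f}")
plt.scatter(x, y)                                # always check linearity visually
plt.show()
```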

Gene correlations

[Figure: four scatterplots of gene-expression measurements with their Pearson correlations: mBadSpl vs RAG1Spl (corr 0.9), mBcl2Spl vs mBadSpl (corr 0.5), mBclxLN vs RAG1Spl (corr 0.03) and mBadLN vs RAG1Spl (corr -0.56).]

Interpreting correlations

• The correlation coefficient is a measure of the strength of the linear association between two variables.

• If the relationship is non-linear, the coefficient can still be calculated and may appear sensible, so beware: plot the data first (see the example below).
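For example (an illustration, not taken from the slides), a perfect but non-linear relationship can produce a Pearson correlation close to zero, which only a plot of the data would reveal:

```python
# A strong but non-linear relationship: the Pearson correlation is misleadingly small.
import numpy as np
from scipy import stats

x = np.linspace(-3, 3, 101)
y = x ** 2                                       # exact, but non-linear, dependence
r, _ = stats.pearsonr(x, y)
print(f"Pearson r = {r:.2f}")                    # about 0 despite perfect dependence
```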

A matrix plot

[Figure: matrix plot (pairwise scatterplots) of pH (pH units), O2 %sat (%), BOD (ATU) (mg/L), Ammonia as N (mg/L), o-Phos as P (mg/L) and Fe (mg/L).]

[Figures: scatterplots of Ammonia as N (mg/L) vs Fe (mg/L), o-Phos as P (mg/L) vs Fe (mg/L), and o-Phos as P (mg/L) vs Ammonia as N (mg/L).]

Correlations

• P and N, 0.228 (p-value 0.001)

• Fe and N, 0.174 (p-value 0.008)

• Fe and P, 0.605 (p-value 0.000)

• all highly significant, but do the scatterplots support this interpretation?

• points tend to be clustered in the bottom left corner of each plot

• there are one or two observations well separated from the cluster

• both features might suggest a transformation (try logs; see the sketch below)
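A sketch of re-checking a correlation after a log transformation; the variable names echo the determinands above, but the data are simulated stand-ins:

```python
# Compare the Pearson correlation before and after a log10 transformation
# of two skewed, positive determinands (simulated stand-ins for Fe and o-Phos).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
log_fe = rng.normal(loc=0.0, scale=0.6, size=200)
log_p = 0.4 * log_fe + rng.normal(scale=0.5, size=200)
fe, p_conc = 10 ** log_fe, 10 ** log_p           # raw scale: skewed, with a few large values

print("raw scale:   r =", round(stats.pearsonr(fe, p_conc)[0], 2))
print("log10 scale: r =", round(stats.pearsonr(log_fe, log_p)[0], 2))
```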

[Figures: scatterplots of log P vs log Fe, log N vs log Fe, and log P vs log N.]

Correlations

• logP, logN: 0.167 (p-value 0.012)

• logFe, logN: 0.134 (p-value 0.043)

• logP, logFe: 0.380 (p-value 0.000)

• what is a statistical model?

Statistical models

• In experiments, many of the covariates are determined by the experimenter, but some may be aspects that the experimenter cannot control yet are still relevant to the outcomes or responses.

• In observational studies, these are usually not under the control of the experimenter but are recorded as possible explanations of the outcomes or responses.

Specifying a statistical model

• Models specify the way in which outcomes and causes link together, e.g.:

• Metabolite = Temperature

• The = sign does not indicate equality in a mathematical sense, and there should be an additional term on the right-hand side, giving the formula:

• Metabolite = Temperature + Error

statistical model interpretation

• Metabolite = Temperature + Error

• The outcome Metabolite is explained by Temperature and by other things that we have not recorded, which we call Error.

• The data-analysis task is then to find out whether the effect of Temperature is ‘large’ in comparison with that of Error, so that we can say whether or not the Metabolite we observe is explained by Temperature (see the sketch below).
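A sketch of fitting ‘Metabolite = Temperature + Error’ as an ordinary least squares linear model in Python with statsmodels; the variable names and data below are illustrative:

```python
# Fit Metabolite = Temperature + Error as an ordinary least squares linear model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
df = pd.DataFrame({"Temperature": rng.uniform(5, 25, size=60)})
df["Metabolite"] = 2.0 + 0.3 * df["Temperature"] + rng.normal(scale=1.0, size=60)  # simulated

fit = smf.ols("Metabolite ~ Temperature", data=df).fit()
print(fit.summary())   # is the Temperature effect large relative to the Error variation?
```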

summary

• hypothesis tests and confidence intervals are used to make inferences

• we build statistical models to explore relationships and explain variation

• the modelling framework is a general one – general linear models, generalised additive models

• assumptions should be checked.

Recommended