Mathematical Statistics - Wiki

  • 8/16/2019 Mathematical Statistics - Wiki

    Illustration of linear regression on a data set.

    Regression analysis is an important part of 

    mathematical statistics.

    Mathematical statistics

    From Wikipedia, the free encyclopedia

    Mathematical statistics is the application of 

    mathematics to statistics, which was originally

    conceived as the science of the state — the collection

    and analysis of facts about a country: its economy, land,

    military, population, and so forth. Mathematical

    techniques which are used for this include mathematical

    analysis, linear algebra, stochastic analysis, differential

    equations, and measure-theoretic probability theory.[1][2]

    Contents

    1 Introduction
    2 Topics
      2.1 Probability distributions
        2.1.1 Special distributions
      2.2 Statistical inferences
      2.3 Regression
      2.4 Nonparametric statistics
    3 Statistics, mathematics, and mathematical statistics
    4 See also
    5 References
    6 Additional reading

    Introduction

    Statistical science is concerned with the planning of studies, especially with the design of randomized

    experiments and with the planning of surveys using random sampling. The initial analysis of the data from

    properly randomized studies often follows the study protocol.

    Of course, the data from a randomized study can be analyzed to consider secondary hypotheses or to

    suggest new ideas. A secondary analysis of the data from a planned study uses tools from data analysis.

    Data analysis is divided into:

    descriptive statistics - the part of statistics that describes data, i.e. summarises the data and their typical properties.

    inferential statistics - the part of statistics that draws conclusions from data (using some model for the data): for example, inferential statistics involves selecting a model for the data, checking whether the data fulfill the conditions of a particular model, and quantifying the involved uncertainty (e.g. using confidence intervals).
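
    The split between describing a sample and inferring beyond it can be sketched in a few lines of Python. This is only an illustration: the data values are made up, and the interval uses a normal approximation, which is an assumption that would be replaced by a Student's t quantile for a sample this small.

```python
import math
import statistics

# Hypothetical sample of measurements (illustrative data, not from the article).
sample = [4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0]

# Descriptive statistics: summarise the sample itself.
mean = statistics.fmean(sample)
sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: quantify uncertainty about the population mean.
# Assumption: normal approximation, using the 97.5% standard normal quantile.
z = statistics.NormalDist().inv_cdf(0.975)
half_width = z * sd / math.sqrt(len(sample))
confidence_interval = (mean - half_width, mean + half_width)

print(f"mean={mean:.3f}, sd={sd:.3f}, approx. 95% CI={confidence_interval}")
```

    The first two quantities only describe the eight observations; the interval is a statement about the population they are assumed to come from.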

    While the tools of data analysis work best on data from randomized studies, they are also applied to other

    kinds of data, for example from natural experiments and observational studies, in which case the

    inference is dependent on the model chosen by the statistician, and so subjective.[3]

    Mathematical statistics has been inspired by and has extended many options in applied statistics.

    Topics

    The following are some of the important topics in mathematical statistics:[4][5]

    Probability distributions

    A probability distribution assigns a probability to each measurable subset of the possible outcomes of a

    random experiment, survey, or procedure of statistical inference. Examples are found in experiments whose

    sample space is non-numerical, where the distribution would be a categorical distribution; experiments

    whose sample space is encoded by discrete random variables, where the distribution can be specified by a

     probability mass function; and experiments with sample spaces encoded by continuous random variables,

    where the distribution can be specified by a probability density function. More complex experiments, such

    as those involving stochastic processes defined in continuous time, may demand the use of more general

     probability measures.
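
    As a minimal sketch of the two most common ways of specifying a distribution, the Python below defines a probability mass function for a fair die and a probability density function for a normal variable. The function names and the integration grid are illustrative choices, not part of the article.

```python
import math
from fractions import Fraction


def die_pmf(k: int) -> Fraction:
    """Probability mass function of a fair six-sided die."""
    return Fraction(1, 6) if 1 <= k <= 6 else Fraction(0)


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Probability density function of a normal distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# A pmf assigns probabilities to events by summation.
total_mass = sum(die_pmf(k) for k in range(1, 7))
prob_even = sum(die_pmf(k) for k in (2, 4, 6))

# A pdf assigns probabilities by integration; here a midpoint rule on [-8, 8].
step = 0.001
area = sum(normal_pdf(-8 + (i + 0.5) * step) for i in range(16000)) * step
```

    The pmf sums to one over the sample space, while the pdf integrates to one (here only approximately, because the integral is computed numerically on a truncated range).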

    A probability distribution can either be univariate or multivariate. A univariate distribution gives the

     probabilities of a single random variable taking on various alternative values; a multivariate distribution (a

    joint probability distribution) gives the probabilities of a random vector (a set of two or more random

    variables) taking on various combinations of values. Important and commonly encountered univariate

     probability distributions include the binomial distribution, the hypergeometric distribution, and the normal

    distribution. The multivariate normal distribution is a commonly encountered multivariate distribution.
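
    Two of the univariate distributions named above can be written down directly from their counting definitions. The sketch below is illustrative only; the urn with 20 balls, 5 of them red, is a hypothetical example.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k): k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeometric_pmf(k: int, population: int, successes: int, draws: int) -> float:
    """P(X = k): k successes in `draws` draws made without replacement."""
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


# Hypothetical urn: 20 balls, 5 of them red; 4 balls are drawn.
with_replacement = [binomial_pmf(k, 4, 5 / 20) for k in range(5)]
without_replacement = [hypergeometric_pmf(k, 20, 5, 4) for k in range(5)]
```

    Drawing with replacement gives the binomial probabilities and drawing without replacement gives the hypergeometric ones; each list sums to one.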

    Special distributions

    Normal distribution (Gaussian distribution), the most common continuous distribution

    Bernoulli distribution, for the outcome of a single Bernoulli trial (e.g. success/failure, yes/no)

    Binomial distribution, for the number of "positive occurrences" (e.g. successes, yes votes, etc.) given a fixed total number of independent occurrences

    Negative binomial distribution, for binomial-type observations but where the quantity of interest is the number of failures before a given number of successes occurs

    Geometric distribution, for binomial-type observations but where the quantity of interest is the number of failures before the first success; a special case of the negative binomial distribution

    Discrete uniform distribution, for a finite set of values (e.g. the outcome of a fair die)

    Continuous uniform distribution, for continuously distributed values

    Poisson distribution, for the number of occurrences of a Poisson-type event in a given period of time

    Exponential distribution, for the time before the next Poisson-type event occurs

    Gamma distribution, for the time before the next k Poisson-type events occur

    Chi-squared distribution, the distribution of a sum of squared standard normal variables; useful e.g. for inference regarding the sample variance of normally distributed samples (see chi-squared test)

    Student's t distribution, the distribution of the ratio of a standard normal variable and the square root of a scaled chi squared variable; useful for inference regarding the mean of normally distributed samples with unknown variance (see Student's t-test)

    Beta distribution, for a single probability (real number between 0 and 1); conjugate to the Bernoulli distribution and binomial distribution
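
    Several entries in this list are related to one another. As one small check, the Python sketch below compares the geometric distribution (failures before the first success) with the negative binomial distribution when exactly one success is required. The function names and the value p = 0.3 are illustrative assumptions.

```python
from math import comb, isclose


def negative_binomial_pmf(failures: int, successes_needed: int, p: float) -> float:
    """P(exactly `failures` failures occur before the `successes_needed`-th success)."""
    return (
        comb(failures + successes_needed - 1, failures)
        * p**successes_needed
        * (1 - p) ** failures
    )


def geometric_pmf(failures: int, p: float) -> float:
    """P(exactly `failures` failures occur before the first success)."""
    return p * (1 - p) ** failures


p = 0.3  # hypothetical success probability
agree = all(
    isclose(geometric_pmf(k, p), negative_binomial_pmf(k, 1, p)) for k in range(50)
)
```

    With one required success the binomial coefficient equals one, so the two formulas coincide term by term.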

    Statistical inferences

    Statistical inference is the process of drawing conclusions from data that are subject to random variation,

    for example, observational errors or sampling variation.[6] Initial requirements of such a system of 

    procedures for inference and induction are that the system should produce reasonable answers when applied

    to well-defined situations and that it should be general enough to be applied across a range of situations.

    Inferential statistics are used to test hypotheses and make estimations using sample data. Whereas

    descriptive statistics describe a sample, inferential statistics infer predictions about a larger population that

    the sample represents.

    The outcome of statistical inference may be an answer to the question "what should be done next?", where

    this might be a decision about making further experiments or surveys, or about drawing a conclusion before

    implementing some organizational or governmental policy. For the most part, statistical inference makes

     propositions about populations, using data drawn from the population of interest via some form of random

    sampling. More generally, data about a random process is obtained from its observed behavior during a

    finite period of time. Given a parameter or hypothesis about which one wishes to make inference, statistical

    inference most often uses:

    a statistical model of the random process that is supposed to generate the data, which is known when randomization has been used, and

    a particular realization of the random process; i.e., a set of data.
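
    These two ingredients, a model and a realization, can be made concrete with a small simulation. The sketch below assumes a model of independent Bernoulli trials, draws one realization with a fixed seed, and then estimates the parameter and tests a hypothesis about it with an exact binomial test. The parameter values, the seed, and the function names are hypothetical choices for illustration.

```python
import random
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_p_value(successes: int, n: int, p0: float) -> float:
    """Two-sided p-value: total probability, under p0, of every outcome
    that is no more likely than the observed one."""
    observed = binomial_pmf(successes, n, p0)
    return sum(
        binomial_pmf(k, n, p0)
        for k in range(n + 1)
        if binomial_pmf(k, n, p0) <= observed * (1 + 1e-9)  # tolerance for rounding
    )


# Statistical model (assumed): n independent Bernoulli(p) trials.
rng = random.Random(0)  # fixed seed so the realization is reproducible
true_p, n = 0.7, 50     # hypothetical parameter and sample size

# One realization of the random process, i.e. a set of data.
data = [rng.random() < true_p for _ in range(n)]

p_hat = sum(data) / n                                # point estimate of p
p_value = exact_binomial_p_value(sum(data), n, 0.5)  # test of the hypothesis p = 0.5
```

    In practice only the data are observed; the simulation makes the model explicit so that the estimate can be compared with the value that generated it.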

    Regression

    In statistics, regression analysis is a statistical process for estimating the relationships among variables. It

    includes many techniques for modeling and analyzing several variables, when the focus is on the

    relationship between a dependent variable and one or more independent variables. More specifically,

    regression analysis helps one understand how the typical value of the dependent variable (or 'criterion

    variable') changes when any one of the independent variables is varied, while the other independent

    variables are held fixed. Most commonly, regression analysis estimates the conditional expectation of the

    dependent variable given the independent variables – that is, the average value of the dependent variable

    when the independent variables are fixed. Less commonly, the focus is on a quantile, or other location

     parameter of the conditional distribution of the dependent variable given the independent variables. In all

    cases, the estimation target is a function of the independent variables called the regression function. In

    regression analysis, it is also of interest to characterize the variation of the dependent variable around the

    regression function which can be described by a probability distribution.
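
    For a single independent variable and a regression function assumed to be a straight line, the conditional mean can be estimated by ordinary least squares. The following is a minimal sketch under that assumption; the data points and the name fit_line are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical observations of an independent and a dependent variable.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)

# Variation of the dependent variable around the fitted regression function.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
residual_variance = sum(r * r for r in residuals) / (len(xs) - 2)
```

    The fitted line estimates the regression function, and the residual variance summarises how the observations scatter around it.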

    Many techniques for carrying out regression analysis have been developed. Familiar methods such as linear

    regression and ordinary least squares regression are parametric, in that the regression function is defined in

    terms of a finite number of unknown parameters that are estimated from the data. Nonparametric regression

    refers to techniques that allow the regression function to lie in a specified set of functions, which may be

    infinite-dimensional.
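
    One example of a nonparametric technique, chosen here for illustration and not named in the article, is the Nadaraya-Watson kernel estimator, which estimates the regression function at a point by a locally weighted average of the observed responses. The sketch below assumes a Gaussian kernel and a hand-picked bandwidth; both are illustrative choices.

```python
import math


def kernel_regression(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate of E[Y | X = x0] using a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


# Hypothetical data: a noise-free sine curve on a grid, to keep the example simple.
xs = [i / 10 for i in range(63)]
ys = [math.sin(x) for x in xs]

estimate = kernel_regression(math.pi / 2, xs, ys, bandwidth=0.3)
```

    No finite set of coefficients describes the fitted curve; its shape is determined by the data and the bandwidth, and the smoothing pulls the estimate at the peak slightly below the true value of 1.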

    Nonparametric statistics

    Nonparametric statistics are statistics not based on parameterized families of probability distributions.

    They include both descriptive and inferential statistics. The typical parameters are the mean, variance, etc.

    Unlike parametric statistics, nonparametric statistics make no assumptions about the probability

    distributions of the variables being assessed.

     Non-parametric methods are widely used for studying populations that take on a ranked order (such as

    movie reviews receiving one to four stars). The use of non-parametric methods may be necessary when data

    have a ranking but no clear numerical interpretation, such as when assessing preferences. In terms of level

    of measurement, non-parametric methods are suited to "ordinal" data.
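
    A rank-based comparison of two groups of star ratings might look like the following sketch. It computes the Mann-Whitney U statistic and an exact permutation p-value by enumerating every way of relabelling the pooled ratings; the ratings and function names are hypothetical, and the enumeration is only practical for small samples.

```python
from itertools import combinations


def mann_whitney_u(xs, ys):
    """U statistic: number of pairs where x beats y, counting ties as one half."""
    return sum((x > y) + 0.5 * (x == y) for x in xs for y in ys)


def exact_permutation_p_value(xs, ys):
    """Two-sided p-value obtained by enumerating every relabelling of the pooled data."""
    pooled = list(xs) + list(ys)
    n, m = len(xs), len(ys)
    centre = n * m / 2  # value of U expected when the groups do not differ
    observed = abs(mann_whitney_u(xs, ys) - centre)
    hits = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        a = [pooled[i] for i in range(n + m) if i in chosen]
        b = [pooled[i] for i in range(n + m) if i not in chosen]
        total += 1
        hits += abs(mann_whitney_u(a, b) - centre) >= observed
    return hits / total


# Hypothetical star ratings (1 to 4) given to two films.
film_a = [4, 3, 4, 2, 4]
film_b = [2, 1, 3, 2, 1]

u = mann_whitney_u(film_a, film_b)
p = exact_permutation_p_value(film_a, film_b)
```

    Only the ordering of the ratings enters the calculation, so the result does not depend on treating "four stars" as twice "two stars".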

    As non-parametric methods make fewer assumptions, their applicability is much wider than the

    corresponding parametric methods. In particular, they may be applied in situations where less is known

    about the application in question. Also, due to the reliance on fewer assumptions, non-parametric methods

    are more robust.

    Another justification for the use of non-parametric methods is simplicity. In certain cases, even when the

    use of parametric methods is justified, non-parametric methods may be easier to use. Due both to this

    simplicity and to their greater robustness, non-parametric methods are seen by some statisticians as leaving

    less room for improper use and misunderstanding.

    Statistics, mathematics, and mathematical statistics

    Mathematical statistics has substantial overlap with the discipline of statistics. Statistical theorists study and

    improve statistical procedures with mathematics, and statistical research often raises mathematical

    questions. Statistical theory relies on probability and decision theory.

    Mathematicians and statisticians like Gauss, Laplace, and C. S. Peirce used decision theory with probability

    distributions and loss functions (or utility functions). The decision-theoretic approach to statistical inference

    was reinvigorated by Abraham Wald and his successors,[7][8][9][10][11][12][13] and makes extensive use of

    scientific computing, analysis, and optimization; for the design of experiments, statisticians use algebra and

    combinatorics.
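
    The role of a loss function can be illustrated with a small numerical sketch: for a given set of equally likely outcomes, the action that minimises expected squared loss is the mean, while the action that minimises expected absolute loss is a median. The outcomes, the grid of candidate actions, and the function names below are illustrative assumptions.

```python
import statistics


def expected_loss(action, outcomes, loss):
    """Average loss of `action` when each outcome is equally likely."""
    return sum(loss(action, y) for y in outcomes) / len(outcomes)


def squared_loss(action, outcome):
    return (action - outcome) ** 2


def absolute_loss(action, outcome):
    return abs(action - outcome)


# Hypothetical outcomes, including one extreme value.
outcomes = [1.0, 2.0, 2.0, 3.0, 10.0]

# Candidate actions on a grid from 0.00 to 10.00.
candidates = [i / 100 for i in range(1001)]

best_squared = min(candidates, key=lambda a: expected_loss(a, outcomes, squared_loss))
best_absolute = min(candidates, key=lambda a: expected_loss(a, outcomes, absolute_loss))
```

    The extreme outcome pulls the squared-loss choice upward but leaves the absolute-loss choice at the median, which shows how the choice of loss function changes the optimal decision.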

    See also

    Asymptotic theory (statistics)

    References

    1. Kannan, D.; Lakshmikantham, V., eds. (2002). Handbook of stochastic analysis and applications. New York: M. Dekker. ISBN 0824706609.

    2. Schervish, Mark J. (1995). Theory of statistics  (Corr. 2nd print. ed.). New York: Springer. ISBN 0387945466.

    3. Freedman, D.A. (2005) Statistical Models: Theory and Practice, Cambridge University Press. ISBN 978-0-521-67105-7

    4. Hogg, R. V., A. Craig, and J. W. McKean. "Intro to Mathematical Statistics." (2005).

    5. Larsen, Richard J. and Marx, Morris L. "An Introduction to Mathematical Statistics and Its Applications"

    (2012). Prentice Hall.

    6. Upton, G., Cook, I. (2008) Oxford Dictionary of Statistics, OUP. ISBN 978-0-19-954145-4

    7. Wald, Abraham (1947). Sequential analysis. New York: John Wiley and Sons. ISBN 0-471-91806-7. "See Dover reprint, 2004: ISBN 0-486-43912-7"

    8. Wald, Abraham (1950). Statistical Decision Functions. John Wiley and Sons, New York.

    9. Lehmann, Erich (1997). Testing Statistical Hypotheses (2nd ed.). ISBN 0-387-94919-4.

    10. Lehmann, Erich; Casella, George (1998). Theory of Point Estimation (2nd ed.). ISBN 0-387-98502-6.

    11. Bickel, Peter J.; Doksum, Kjell A. (2001). Mathematical Statistics: Basic and Selected Topics 1 (Second (updated printing 2007) ed.). Pearson Prentice-Hall.

    12. Le Cam, Lucien (1986). Asymptotic Methods in Statistical Decision Theory. Springer-Verlag. ISBN 0-387-96307-3.

    13. Liese, Friedrich and Miescke, Klaus-J. (2008). Statistical Decision Theory: Estimation, Testing, and Selection.

    Springer.

    Additional reading

    Borovkov, A. A. (1999). Mathematical Statistics. CRC Press. ISBN 90-5699-018-7

    Virtual Laboratories in Probability and Statistics (Univ. of Ala.-Huntsville) (http://www.math.uah.edu/stat/)

    StatiBot (http://www.trigonella.ch/statibot/english/), interactive online expert system on statistical tests.

    Retrieved from "https://en.wikipedia.org/w/index.php?title=Mathematical_statistics&oldid=721010747"

    Categories: Statistical theory Actuarial science

    This page was last modified on 19 May 2016, at 07:41.

    Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.
