Variables estadísticas, distribuciones conjuntas, marginales...

Variables estadísticas, distribuciones conjuntas,

marginales y condicionales

Mike Wiper

Departamento de Estadística

Universidad Carlos III de Madrid

Grado en Estadística y Empresa

Mike Wiper Métodos bayesianos Estadística y Empresa 1 / 24

Objetivo

Refamiliarizarnos con las variables estadísticas y la ley de la probabilidad total y elteorema de Bayes para variables

Variables aleatorias

Una variable aleatoria es una función que ascribe un valor (numérico) al resultadode un experimento.

Variables discretas

Para una variable discreta, X , tomando valores x1, x2, ... se de�nen:

La función de probabilidad o masa P(X = x) tal que∑

i P(X = xi ) = 1.

La función de distribución FX (x) tal que:

FX (x) = P(X ≤ x) =∑xi<x

P(X = xi ).

Los momentos: E [g(X )] =∑

i g(xi )P(X = xi ).

Ejemplo: la distribución de Poisson

Una variable discreta, X , sigue una distribución de Poisson con parámetro λ > 0 si

P(X = x) =λxe−λ

x!para x = 0, 1, 2, ...

Se tiene E [X ] = V [X ] = λ.

Variables continuas

Para una variable continua, Y , tomando valores x ∈ R, se tiene:

La función de distribución FY (y) tal que:

FY (y) = P(Y ≤ y).

La función de densidad fY (y) = dF (y)dy tal que∫ y

−∞fY (y) dy = FY (y).

Momentos: E [g(Y )] =∫∞−∞ g(y)fY (y) dy .

Ejemplo: la distribución gamma

La variable continua, Y sigue una distribución gamma con parámetros α, β > 0 si

fY (y) =βα

Γ(α)yα−1e−βy para y > 0

donde Γ(·) es la función gamma:

Γ(α) =∫∞0

uα−1e−udu.

Γ(α + 1) = αΓ(α) y Γ(α) = (α− 1)! si α es un número entero.

E [Y ] =α

βV [Y ] =

Distribuciones conjuntas

Para dos (o más) variables discretas, la distribución conjunta es la funciónP(X = x ,Y = y) tal que

P(X = x ,Y = y) = 1,

P(X = x ,Y = y) = P(Y = y), la distribución marginal de Y ,∑y

P(X = x ,Y = y) = P(X = x), la distribución marginal de X .

En el caso de variables continuas, sustituimos sus respectivos sumatorios porintegrales.

Ejemplo

Sea Y > 0 una variable continua y X ∈ {0, 1, 2, ...,∞} una variable discreta yde�nimos la distribución conjunta de X e Y :

fX ,Y (x , y) =βα

Γ(α)

yα+x−1e−(β+1)y

x!, donde α, β > 0.

¾Cómo sabemos que es una distribución válida?

¾Es no negativa? X

Si sumamos sobre x e integramos sobre y , tiene que dar 1.

Ejemplo

Sea Y > 0 una variable continua y X ∈ {0, 1, 2, ...,∞} una variable discreta yde�nimos la distribución conjunta de X e Y :

fX ,Y (x , y) =βα

Γ(α)

x!, donde α, β > 0.

¾Cómo sabemos que es una distribución válida?

¾Es no negativa? X

Si sumamos sobre x e integramos sobre y , tiene que dar 1.

Distribuciones condicionales

La distribución condicional de una variable (discreta) X dada otra variable(discreta) Y es:

P(X = x |Y = y) =P(X = x ,Y = y)

P(Y = y),

suponiendo que P(Y = y) > 0.

La esperanza condicional de una función g(x , y) es

E [g(x , y)|Y = y ] =∑x

g(x , y)P(X = x |Y = y).

Ejemplo

Sea Y ∼ Gamma(α, β) y X |Y = y ∼ Poisson(y). Luego, por la ley de lamultiplicación:

fX ,Y (x , y) = P(X = x |Y = y)fY (y)

=y xe−y

Γ(α)yα−1e−βy

Γ(α)

nuestra distribución conjunta anterior.

La media y varianza marginal

Supongamos que en nuestro ejemplo, queremos calcular E [X ] y V [X ].

Suena complicado porqué en principio, tendríamos que evaluar la distribuciónmarginal de X primero o computar a través de la distribución conjunta. Porejemplo:

E [X ] =∑x

∫ ∞∞

xfX ,Y (x , y) dy .

¾Existe una manera más fácil de hacer el cálculo?

La ley de las esperanzas iteradas

Para dos variables X e Y , la ley de las esperanzas iteradas dice que

E [X ] = E [E [X |Y ]].

Demostración

Existe otra descomposición semejante para la varianza:

V [X ] = E [V [X |Y ]] + V [E [X |Y ]].

Demostración

Ejemplo

Sea X |Y = y ∼ Poisson(y) con Y ∼ Gamma(α, β). Calculamos la media yvarianza de X .

E [X ] = E [E [X |Y ]]

= E [Y ] ¾Por qué?

V [X ] = E [V [X |Y ]] + V [E [X |Y ]]

= E [Y ] + V [Y ]

=α(1 + β)

Ejemplo

E [X ] = E [E [X |Y ]]

V [X ] = E [V [X |Y ]] + V [E [X |Y ]]

= E [Y ] + V [Y ]

=α(1 + β)

Ejemplo

E [X ] = E [E [X |Y ]]

V [X ] = E [V [X |Y ]] + V [E [X |Y ]]

= E [Y ] + V [Y ]

=α(1 + β)

Ejemplo

E [X ] = E [E [X |Y ]]

V [X ] = E [V [X |Y ]] + V [E [X |Y ]]

= E [Y ] + V [Y ]

=α(1 + β)

La ley de la probabilidad total para variables

P(X = x) =∑y

P(X = x |Y = y)P(Y = y) si Y es discreta,

P(X = x) =

∫ ∞−∞

P(X = x |Y = y)fY (y) dx si Y es continua.

El segundo caso es más interesante en la mayoría de aplicaciones.

Ejemplo

Sea X |Y ∼ Poisson(Y ) con Y ∼ Gamma(α, β). Calculamos la distribuciónmarginal de X .

P(X = x) =

∫P(X = x |Y = y)fY (y) dy

∫ ∞0

y xe−y

Γ(α)yα−1e−βy dy

Γ(α)x!

∫ ∞0

yα+x−1e−(β+1)y dy

¾El integrando parece a una fórmula que hemos visto?

Ejemplo

P(X = x) =βα

Γ(α)x!

∫ ∞0

yα∗−1e−β

∗y dy

donde α∗ = α + x , β∗ = β + 1.

El integrando es otra distribución gamma sin su constante de integración.

P(X = x) =βα

Γ(α)x!

Γ(α∗)

β∗α∗

∫ ∞0

β∗α∗

Γ(α∗)xα

∗−1e−β∗x dy

=Γ(α∗)

Γ(α)x!

β∗α∗

=Γ(α + x)

Γ(α)x!

(1− 1

β + 1

para x = 0, 1, 2, ...

que es una distribución binomial negativa. DBN

Ejemplo

P(X = x) =βα

Γ(α)x!

∫ ∞0

yα∗−1e−β

∗y dy

donde α∗ = α + x , β∗ = β + 1.

P(X = x) =βα

Γ(α)x!

Γ(α∗)

β∗α∗

∫ ∞0

β∗α∗

Γ(α∗)xα

=Γ(α∗)

Γ(α)x!

β∗α∗

=Γ(α + x)

Γ(α)x!

(1− 1

β + 1

para x = 0, 1, 2, ...

Ejemplo

P(X = x) =βα

Γ(α)x!

∫ ∞0

yα∗−1e−β

∗y dy

donde α∗ = α + x , β∗ = β + 1.

P(X = x) =βα

Γ(α)x!

Γ(α∗)

β∗α∗

∫ ∞0

β∗α∗

Γ(α∗)xα

=Γ(α∗)

Γ(α)x!

β∗α∗

=Γ(α + x)

Γ(α)x!

(1− 1

β + 1

para x = 0, 1, 2, ...

El teorema de Bayes para variables

P(Y = y |X = x) =P(X = x |Y = y)P(Y = y)

P(X = x)

=P(X = x |Y = y)P(Y = y)∑i P(X = x |Y = yi )P(Y = yi )

fY |X (y |X = x) =P(X = x |Y = y)fY (Y = y)

P(X = x)

=P(X = x |Y = y)fY (Y = y)∫∞−∞ P(X = x |Y = y)fY (y) dy

Ejemplo

En nuestro ejemplo:

fY |X (y |x) =P(X = x |Y = y)fY (Y = y)

P(X = x)

y xe−y

x!βα

Γ(α+x)Γ(α)x!

(1− β

)xββ+1

=(β + 1)α+x

Γ(α + x)yα+x−1e−(β+1)y

¾Qué distribución es?

Y |X = x ∼ Gamma(α + x , β + 1).

Ejemplo

En nuestro ejemplo:

fY |X (y |x) =P(X = x |Y = y)fY (Y = y)

P(X = x)

y xe−y

x!βα

Γ(α+x)Γ(α)x!

(1− β

)xββ+1

=(β + 1)α+x

Γ(α + x)yα+x−1e−(β+1)y

¾Qué distribución es?

Y |X = x ∼ Gamma(α + x , β + 1).

Resumen y siguiente sesión

En esta sección hemos repasado algunos de los cálculos necesarios para evaluardistribuciones conjuntas, condicionales y marginales.

En la siguiente sesión vamos a recordar las características de la inferenciafrecuentista.

Apéndice: demostración de la ley de esperanzas

iteradas

Supongamos que X e Y son continuas. Entonces:

E [X ] =

∫xfX (x) dx

∫fX ,Y (x , y) dy dx

∫ ∫xfX |Y (x |y)fY (y) dy dx

∫ {∫xfX |Y (x |y) dx

}fY (y) dy

∫E [X |Y = y ]fY (y) dy

= E [E [X |Y ]]

Apéndice: demostración de descomposición de la

varianza

Supongamos que X e Y son continuas. Entonces:

V [X ] =

∫(x − E [X ])2fX (x) dx

∫ ∫(x − E [X ])2fX |Y (x |y)fY (y) dy dx

∫ {∫(x − E [X |y ] + E [X |y ]− E [X ])2fX |Y (x |y) dx

}fY (y) dy

∫ {∫(x − E [X |y ])2fX |Y (x |y) dx

}f (y) dy

∫ {∫(x − E [X |y ])(E [X |y ]− E [X ])fX |Y (x |y) dx

}fY (y) dy

∫ {∫(E [X |y ]− E [X ])2fX |Y (x |y) dx

}fY (y) dy

V [X ] =

∫V [X |y ]fY (y) dy +

∫(E [X |y ]− E [X ])

{∫(x − E [X |y ])fX |Y (x |y) dx

}fY (y) dy +∫

(E [X |y ]− E [X ])2fY (y) dy

= E [V [X |Y ]] + 0 +

∫(E [X |y ]− E [E [X |Y ]])2fY (y) dy

= E [V [X |Y ]] + V [E [X |Y ]]

Apéndice: la distribución binomial negativa

Una variable discreta, X , sigue una distribución binomial negativa con parámetrosr > 0 y 0 < p < 1 si:

P(X = x) =Γ(r + x)

Γ(r)x!(1− p)rpx para x = 0, 1, 2, ...

Cuando r ∈ {0, 1, 2, ...} se tiene:

P(X = x) =

(x + r − 1

)(1− p)rpx .

Se tiene E [X ] = r p1−p y V [X ] = r p

(1−p)2 .

Variables estadísticas, distribuciones conjuntas, marginales...

Documents

AL x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x

Outlier Detection in Multivariate Time Series by ...halweb.uc3m.es/.../ingles/2006JASA_galeano_tsay.pdf · Outlier Detection in Multivariate Time Series ... For multivariate time

X Priloaje kUkXO $OXQL BRSPREORU - sancraiu.ro · 2014. 12. 10. · x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x

Statistical Education for Engineers: An Initial Task Force ...halweb.uc3m.es/esp/Personal/personas/dpena/publications/ingles/... · Taylor & Francis, Ltd. and American ... gineers

Outlier detection in ARIMA and seasonal ARIMA models by ...halweb.uc3m.es/.../ingles/2012Scottbook_Outliers_galeano.pdf · Outlier detection in ARIMA and seasonal ARIMA models by

Tema 5 An alisis de regresi on (segunda parte)halweb.uc3m.es/esp/Personal/personas/aarribas/eng/docs/...Ilustraremos su uso con un ejemplo cl asico. Consideremos los cuatro conjuntos

Persistence and Kurtosis in GARCH and Stochastic ...halweb.uc3m.es/esp/Personal/personas/dpena/articles/...GARCH models are very popular for representing the dynamic evolution of the

Class 4: Analysis of univariate data: measures of …halweb.uc3m.es/esp/Personal/personas/mwiper/docencia...Statistics Measures of location There are three standard measures of location

Chapter 5: Probability models - UC3Mhalweb.uc3m.es/esp/Personal/personas/mwiper/docencia/English/... · Applied Statistics Chapter 5: Probability models 1. ... 1 2 3 n 1 Every probability

Stratiﬁed Random Sampling - Departamento de …halweb.uc3m.es/esp/Personal/personas/jmmarin/esp/MetQ/Talk3.pdf · Stratiﬁed Random Sampling •Sometimes in survey sampling certain

[ZÌ Mx x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x

Rural camp report new version2 - Loyola College, Chennai...î ,qwurgxfwlrq x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x

Wegweiserstandort (Pfeilwegweiser) ) Wegweiserstandort ...€¦ · x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x xx x x xx x xx x x x x x x x x x x x x x x x x x

A revisit to low volatility anomaly 2€¦ · &217(176 ,1752'8&7,21 x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x

æ ò Y - WKO.at9714]-NEKP... · ï d ] o í x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x

Recomendado!!halweb.uc3m.es/esp/Personal/personas/aarribas/eng/docs...spss 12.0 for Members & Donors Mailing Lists Bucr Track-mo Developer Page Conferences Search Documentation Manuals

Dynamic priority allocation via restless bandit marginal ...halweb.uc3m.es/jnino/eng/pubs/top07.pdf162 J. Niño-Mora Keywords Priority allocation · Stochastic scheduling ·Index policies

Chapter 3: Analysis of bivariate data - Departamento …halweb.uc3m.es/esp/Personal/personas/mwiper/docencia/...Introduction to Statistics Chapter 3: Analysis of bivariate data 1

Regresión y modelos lineales [5mm] [height=1.5in]../Images ...halweb.uc3m.es/esp/Personal/personas/mwiper/docencia/Spanish/In… · Regresión y modelos lineales Mike Wiper Departamento

Probability - UC3Mhalweb.uc3m.es/esp/Personal/personas/mwiper/docencia/English/PhD... · Probability Chance is a part of our everyday lives. Everyday we make judgements based on probability: