Artificial Intelligence · Probabilistic Inference · Inferential Models · Benchmark Problems
Uncertain Inference and Artificial Intelligence
Chuanhai Liu
March 3, 2011
Prepared for a Purdue Machine Learning Seminar
Acknowledgement
◮ Prof. A. P. Dempster for intensive collaborations on the Dempster-Shafer theory.
◮ Jianchun Zhang, Ryan Martin, Duncan Ermini Leaf, Zouyi Zhang, Huiping Xu, Jing-Shiang Hwang, Jun Xie, and Hyokun Yun for collaborations on a variety of IM research projects.
◮ NSF support for a joint project with Jun Xie on large-scale multinomial inference and its applications in genome-wide association studies.
References
Martin, R. and Liu, C. (2011), Inferential Models, and the references therein.
A possible textbook (Liu and Martin, 2012+, Inferential Models: Reasoning with Uncertainty) having the following features:
◮ A prior-free and valid probabilistic inference system, which is promising for serious applications of statistics.
◮ Fully developed valid probabilistic inferential methods for textbook problems.
◮ A large collection of applications to modern, challenging, and large-scale statistical problems.
◮ Deeper understanding of existing schools of thought and their strengths and weaknesses.
◮ Satisfactory solutions to well-known benchmark problems, including Stein’s paradox and the Behrens-Fisher problem.
◮ A direct attack on the source of uncertainty, which makes learning and teaching easier and more enjoyable.
Abstract
It is difficult, perhaps, to believe that artificial intelligence can be made intelligent enough without a valid probabilistic inferential system as a critical module. After a brief review of existing schools of thought on uncertain inference, we introduce a valid probabilistic inferential framework termed inferential models (IMs). With several simple and benchmark examples, we discuss potential applications of IMs in artificial intelligence in general and machine learning in particular.
Artificial intelligence · Machine learning · Learning from data
What is it? An answer from the web
Artificial Intelligence (AI) is the area of computer science focused on creating machines that can engage in behaviors that humans consider intelligent.
The ability to create intelligent machines has intrigued humans since ancient times,
and today with the advent of the computer and 50 years of research into AI
programming techniques, the dream of smart machines is becoming a reality.
Researchers are creating systems which can mimic human thought, understand speech,
beat the best human chess player, and countless other feats never before possible.
Is the answer precise?
If not, blame Google’s machine learning algorithms.
What is it? An answer from the web
Machine learning has been central to AI research from the beginning. Unsupervised
learning is the ability to find patterns in a stream of input. Supervised learning
includes both classification and numerical regression. Classification is used to
determine what category something belongs in, after seeing a number of examples of
things from several categories. Regression takes a set of numerical input/output
examples and attempts to discover a continuous function that would generate the
outputs from the inputs. In reinforcement learning the agent is rewarded for good
responses and punished for bad ones. These can be analyzed in terms of decision
theory, using concepts like utility. The mathematical analysis of machine learning
algorithms and their performance is a branch of theoretical computer science known as
computational learning theory.
The inference problem
◮ Input:
1. Data x: observed values of the observable quantities X ∈ X.
2. Assertion A: statements on θ ∈ Θ, the unknown quantities.
3. Association between X and θ.
For example, x is a sample from the population characterized by the cdf Fθ(·).
◮ Output:
1. Probabilistic uncertainty assessments on the truth or the falsity of A given X = x.
2. Plausible regions for θ and its functions.
Intelligence and uncertainty · Probability models · Statistical models · Existing schools of thought
Uncertain inference
is critical to AI — No?
One (simple) kind of uncertain inference
Probability models: A probability model has a meaningful/valid probability distribution assumed to be adequate for everything. In particular, θ has a valid marginal distribution that can be operated on via the usual probability calculus to derive valid, e.g., marginal and conditional posterior distributions.
Subjective Bayesian: Philosophically, every Bayesian is subjective.
◮ Bayes was not Bayesian.
◮ What’s wrong? Nothing is wrong: you make the decision and (you or your clients) should take the consequences.
Statistical models
Statistical models: In what follows, we consider the cases where you don’t have valid distributions for everything, which we refer to as statistical models. Here θ is taken to be unknown.
“Objective” Bayesian — a personal view
The idea can be viewed as using magic priors to approximate (ideal) frequentist results.
Remarks:
◮ Assertion-specific priors: certain priors can work for certain assertions on θ.
◮ Large-sample theory: it really concerns the case where uncertainty goes away; think about both normality and vanishing variances in very-high-dimensional problems.
◮ Robust Bayesian: the ‘worst case scenario’ thinking ultimately leads the Bayesian to a non-Bayesian school.
Existing schools of thought
◮ Bayes: for it to work, it really requires valid priors.
◮ Fiducial: it is very interesting. It is wrong (but better than Bayes[?]).
◮ Dempster-Shafer: as an extension of both Bayes and fiducial, it requires valid independent individual components that are probabilistically meaningful.
For example, individual components are specified with fiducial probabilities.
◮ Frequentist: starting with specified rules and criteria, it invites the “guess and check” approach to uncertain inference. If so, is it very appealing?
For example, 24+ methods for 2×2 tables and penalty-based methods.
Remarks
◮ These existing methods are useful.
◮ All these schools of thought fail for many “benchmark” examples, such as the many-normal-means, Behrens-Fisher, and constrained parameter problems.
◮ Thinking outside the box may be necessary for new generations.
A valid probabilistic inference framework · Two simple examples · Predictive random sets · One sample test
The likelihood insufficiency principle
◮ Likelihood alone is not sufficient for probabilistic inference.
◮ An unobserved but predictable quantity, called the auxiliary (a-)variable, must be introduced for predictive/probabilistic inference.
Remark: Bayes makes θ predictable. Is it credible/valid?
The “No Validity, No Probability” principle?
◮ Notation: denote by Px(A) the probability for the truth of A given the observed data x.
◮ Definition (validity). An inferential framework is said to be valid if, for all A ⊂ Θ, PX(A), as a function of X, satisfies
    PX(A) ≤ Unif(0, 1)   (stochastically)
under the falsity of A, i.e., under the truth of Ac, the negation of A.
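This condition can be checked by simulation. A minimal sketch, assuming the normal-model plausibility pl_x({θ0}) = 2(1 − Φ(|x − θ0|)) (derived later for X ∼ N(θ, 1)) plays the role of PX(A) for a singleton assertion; it is exactly Unif(0, 1) when A is true and stochastically smaller when A is false:

```python
import math, random

def Phi(x):
    # standard normal cdf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def plausibility(x, theta0):
    # pl_x({theta0}) = 2(1 - Phi(|x - theta0|)) for X ~ N(theta, 1)
    return 2.0 * (1.0 - Phi(abs(x - theta0)))

random.seed(1)
n = 100_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]  # true theta = 0

# A = {theta = 0} is true: pl_X is exactly Unif(0, 1)
pl_true = [plausibility(x, 0.0) for x in xs]
print(sum(pl_true) / n)  # close to 0.5, the uniform mean

# A = {theta = 2} is false: P(pl_X <= alpha) >= alpha for every alpha
pl_false = [plausibility(x, 2.0) for x in xs]
print(sum(p <= 0.05 for p in pl_false) / n)  # well above 0.05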
The Inferential Model (IM) framework
IM is valid and consists of three steps:
Association-step: associate X and θ with an a-variable z to obtain the mapping
    ΘX(z) ⊆ Θ   (z ∼ πz)
consisting of the candidate values of θ given X and z.
Prediction-step: predict z with a credible predictive random set (PRS) Sθ, i.e.,
    P(Sθ ∌ z) ≤ Unif(0, 1)   (stochastically), where z ∼ πz.
Combination-step: combine x and Sθ to obtain Θx(Sθ) = ∪z∈Sθ Θx(z) and compute the evidence
    ex(A) = P(Θx(Sθ) ⊆ A)   and   ex(Ac) = P(Θx(Sθ) ⊆ Ac),
with 1 − ex(Ac) called the plausibility of A.
X ∼ N(θ, 1)
A-step. X = θ + z, where z ∼ N(0, 1).
P-step. Sθ = [−|Z|, |Z|], where Z ∼ N(0, 1).
C-step. ex(A) and ex(Ac) with Θx(Sθ) = [x − |Z|, x + |Z|].
Example
Figure: Plausibility of the assertion A = {θ : θ = θ0}, indexed by θ0, given x = 1.96. Note ex(θ0) = 0.
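A sketch reproducing the plausibility curve in the figure: Θx(S) = [x − |Z|, x + |Z|] intersects {θ0} iff |Z| ≥ |x − θ0|, giving plausibility 2(1 − Φ(|x − θ0|)):

```python
import math

def Phi(x):
    # standard normal cdf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def pl(x, theta0):
    # Theta_x(S) = [x - |Z|, x + |Z|] hits {theta0} iff |Z| >= |x - theta0|,
    # so pl_x(theta0) = P(|Z| >= |x - theta0|) = 2(1 - Phi(|x - theta0|))
    return 2.0 * (1.0 - Phi(abs(x - theta0)))

x = 1.96
for theta0 in (-2.0, 0.0, 1.96, 4.0, 6.0):
    print(theta0, round(pl(x, theta0), 4))
# pl peaks at 1 when theta0 = x, and pl(1.96, 0) = 0.05, so the 95%
# plausibility region is the familiar interval x +/- 1.96
```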
X ∼ Binomial(n, θ)
This is a homework problem for Stat 598D.
Efficiency
See the Stat 598D lecture notes on Statistical Inference. Let b(z) be a continuous function and define
    S = {z : b(z) ≤ b(Z)}   (Z ∼ πz).
Then
    P(S ∌ z) ∼ Unif(0, 1)   (z ∼ πz).
We can use this result to construct credible PRS.
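A Monte Carlo check of this result, taking πz = N(0, 1) and b(z) = |z| (so that P(S ∌ z) = P(|Z| < |z|) = 2Φ(|z|) − 1 in closed form):

```python
import math, random

def Phi(x):
    # standard normal cdf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def non_coverage(z):
    # with b(z) = |z| and Z ~ N(0,1): P(S does not contain z)
    #   = P(b(Z) < b(z)) = P(|Z| < |z|) = 2 Phi(|z|) - 1
    return 2.0 * Phi(abs(z)) - 1.0

random.seed(7)
n = 200_000
u = [non_coverage(random.gauss(0.0, 1.0)) for _ in range(n)]

print(sum(u) / n)                    # ~0.5, the Unif(0,1) mean
print(sum(v <= 0.3 for v in u) / n)  # ~0.3, the Unif(0,1) cdf at 0.3
```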
Combining information: Conditional IMs
Example (a textbook example). Consider the association model
    Xi = θ + zi   (zi iid∼ N(0, 1), i = 1, ..., n).
Write
    X̄ = θ + z̄   and   Xi − X̄ = zi − z̄   (i = 1, ..., n).
Predict z̄ conditional on the observed a-quantities {zi − z̄ : i = 1, ..., n}. This leads to the simplified conditional IM:
A-step. X̄ = θ + u/√n, where u ∼ N(0, 1).
P-step. S = [−|U|, |U|], where U ∼ N(0, 1).
C-step. Θx(S) = [X̄ − |U|/√n, X̄ + |U|/√n].
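A sketch of the resulting plausibility function and its 95% plausibility interval (the hard-coded constant is Φ−1(0.975); the endpoints sit at plausibility exactly α = 0.05):

```python
import math, random

def Phi(x):
    # standard normal cdf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def conditional_im(xs):
    # A-step reduces to Xbar = theta + u/sqrt(n), u ~ N(0,1); the PRS
    # [-|U|, |U|] gives pl(theta0) = 2(1 - Phi(sqrt(n) |xbar - theta0|))
    n = len(xs)
    xbar = sum(xs) / n
    pl = lambda t0: 2.0 * (1.0 - Phi(math.sqrt(n) * abs(xbar - t0)))
    q = 1.959963984540054  # Phi^{-1}(0.975), for the 95% interval
    return pl, (xbar - q / math.sqrt(n), xbar + q / math.sqrt(n))

random.seed(3)
xs = [random.gauss(2.0, 1.0) for _ in range(25)]  # true theta = 2
pl, (lo, hi) = conditional_im(xs)
print(round(lo, 3), round(hi, 3))
print(pl(lo))  # the interval endpoints have plausibility alpha = 0.05
```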
Efficient inference: Marginal IMs
Example (another textbook example). Consider the association model
    Xi = η + σzi   (zi iid∼ N(0, 1), i = 1, ..., n).
Let θ = (η, σ2) ∈ Θ = R × R+ and write
    X̄ = η + σz̄,   s2x = σ2 s2z,   and   (X − X̄1)/sx = (z − z̄1)/sz.
Predict z̄ and s2z conditional on the observed a-quantities (z − z̄1)/sz. This leads to the simplified conditional IM:
A-step. X̄ = η + sx u/√n and s2x = σ2 s2z, where u ∼ tn−1(0, 1) ⊥ s2z ∼ χ2n−1.
P-step. S = [−|U|, |U|] × [0, ∞], where U ∼ tn−1(0, 1).
C-step. Θx(S) = [X̄ − |U|sx/√n, X̄ + |U|sx/√n] × [0, ∞].
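A sketch of the C-step interval for η, which coincides numerically with the classical t-interval. To stay stdlib-only, the t quantile is obtained by Monte Carlo (scipy.stats.t.ppf would be the usual exact choice):

```python
import math, random, statistics

def t_quantile(df, p, n_mc=100_000, seed=11):
    # Monte Carlo quantile of Student-t_df via z / sqrt(chi2_df / df)
    # (scipy.stats.t.ppf(p, df) would be the usual exact choice)
    rng = random.Random(seed)
    draws = sorted(
        rng.gauss(0.0, 1.0)
        / math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) / df)
        for _ in range(n_mc)
    )
    return draws[int(p * n_mc)]

def marginal_im_interval(xs, alpha=0.05):
    # C-step: [xbar - |U| s_x/sqrt(n), xbar + |U| s_x/sqrt(n)], U ~ t_{n-1};
    # the level-(1 - alpha) plausibility interval is the classical t-interval
    n = len(xs)
    xbar, sx = statistics.fmean(xs), statistics.stdev(xs)
    half = t_quantile(n - 1, 1 - alpha / 2) * sx / math.sqrt(n)
    return xbar - half, xbar + half

random.seed(5)
xs = [random.gauss(10.0, 2.0) for _ in range(10)]  # eta = 10, sigma = 2
lo, hi = marginal_im_interval(xs)
print(round(lo, 2), round(hi, 2))
```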
Model selection via AI (or by AS — Artificial Statistician)?
Consider choosing a model from a collection of models, including, e.g., normal for simplicity (and efficiency) and non-parametric for robustness.
See Jianchun Zhang’s PhD thesis for an IM-based method.
Simpson’s paradox · The Behrens-Fisher problem · Stein’s paradox · A meta-analysis problem
2 × 2 tables
Example (kidney stone treatment, Steven et al (1994))

Table 1. Small stones              Table 2. Large stones
Treatment  Success  Failure        Treatment  Success  Failure
A          81       6              A          192      71
B          234      26             B          55       25

For making an intelligent decision, there are (at least) two things to consider.
Prediction: condition on the stone type.
Estimation: combine data if possible.
Thus, check the homogeneity of each of the two tables:

Table 3. Treatment A               Table 4. Treatment B
Stone type  Success  Failure       Stone type  Success  Failure
Small       81       6             Small       234      26
Large       192      71            Large       55       25
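The reversal behind Simpson’s paradox is visible directly from these counts; a quick sketch using only the table data:

```python
# Kidney stone data (success, failure), from Tables 1-2
data = {
    ("A", "small"): (81, 6),   ("A", "large"): (192, 71),
    ("B", "small"): (234, 26), ("B", "large"): (55, 25),
}

def rate(s, f):
    # success rate
    return s / (s + f)

# Treatment A wins within each stone type ...
for stone in ("small", "large"):
    print(stone, round(rate(*data[("A", stone)]), 3),
          round(rate(*data[("B", stone)]), 3))

# ... but pooling the tables reverses the comparison (Simpson's paradox)
totA = tuple(map(sum, zip(data[("A", "small")], data[("A", "large")])))
totB = tuple(map(sum, zip(data[("B", "small")], data[("B", "large")])))
print("pooled", round(rate(*totA), 3), round(rate(*totB), 3))  # 0.78 vs 0.85
```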
Evidence for and against homogeneity of treatments
For each of Table 3 and Table 4, compute
1. e(homogeneous),
2. ē(homogeneous) (the plausibility), and
3. the 95% plausibility interval for the odds ratio.
Remarks.
1. Simpson’s paradox is related more to wrong statistical analysis, i.e., modeling, than to the inferential method(?). How can this be done in AI?
2. Some relevant statistical thoughts:
◮ increase the precision of prediction via conditioning, and
◮ increase the precision of estimation via pooling.
Can some basics like these be integrated into AI?
Numerical results
Figure: Plausibilities for the log odds ratios of Tables 3 and 4, which show that pooling makes no sense in this example.
Comparing two normal means with unknown variances
This is a common, controversial, and practically useful textbook example (Bayes and fiducial do not work well); see Martin, Hwang, and Liu (2010b).
Many-normal-means
The association model:
    Xi = µi + zi   (zi iid∼ N(0, 1), i = 1, ..., n).
The problem of interest is to infer ‖µ‖.
A very important example for understanding inference (Bayes and fiducial do not work); see Martin, Hwang, and Liu (2010b).
Many-normal-means
The usual model for the observables X1, ..., Xn:
    µi iid∼ N(θ, σ2)   (i = 1, ..., n)
and
    Xi | µ ind∼ N(µi, s2i)   (i = 1, ..., n)
with known positive s21, ..., s2n, where µ = (µ1, ..., µn) and (θ, σ2) ∈ R × R+ are unknown. Here, we are interested in inference about σ2.
Since there is rarely meaningful prior knowledge in practice, there has been tremendous interest in choosing Bayesian priors.
Many-normal-means
The sampling model for the observable quantities is
    Xi ind∼ N(θ, σ2 + s2i)   (i = 1, ..., n).
For simplicity, to motivate ideas, consider the case with known θ = 0, that is,
    Xi ind∼ N(0, σ2 + s2i)   (i = 1, ..., n).
An association model is given by
    ∑i X2i/(σ2 + s2i) = V   (sum over i = 1, ..., n)
and
    [∑i X2i/(σ2 + s2i)]−1/2 (X1/√(σ2 + s21), ..., Xn/√(σ2 + s2n)) = U,
where V ∼ χ2n ⊥ U ∼ Unif(On).
Many-normal-means
Specify the predictive random set, which predicts u⋆ alone:
    S = {(v, u) : |Fn(v) − .5| ≤ |Fn(V) − .5|}.
This is a constrained parameter inference problem.
Remark. Validity is not a problem, but efficient inference is not straightforward. It requires considering generalized conditional IMs, a challenging topic under investigation!
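A sketch of the plausibility function for σ2 implied by this PRS: the centered set above yields pl(σ2) = 1 − |2Fn(v(σ2)) − 1|, where v(σ2) = ∑i x2i/(σ2 + s2i). The χ2n cdf below uses the Wilson-Hilferty approximation (scipy.stats.chi2.cdf would be exact), and the data are simulated for illustration, not taken from the slides:

```python
import math, random

def Phi(x):
    # standard normal cdf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def chi2_cdf(v, n):
    # Wilson-Hilferty approximation to the chi-square_n cdf
    # (scipy.stats.chi2.cdf would be exact)
    return Phi(((v / n) ** (1 / 3) - (1 - 2 / (9 * n))) * math.sqrt(9 * n / 2))

def pl_sigma2(sigma2, xs, ss):
    # A-step: V = sum_i x_i^2/(sigma2 + s_i^2) ~ chi^2_n; the centered PRS
    # |F_n(v) - .5| <= |F_n(V) - .5| yields pl(sigma2) = 1 - |2 F_n(v) - 1|
    v = sum(x * x / (sigma2 + s * s) for x, s in zip(xs, ss))
    return 1.0 - abs(2.0 * chi2_cdf(v, len(xs)) - 1.0)

# hypothetical simulated data: theta = 0, true sigma^2 = 4, known s_i
random.seed(9)
ss = [0.5 + 0.1 * i for i in range(20)]
xs = [random.gauss(0.0, math.sqrt(4.0 + s * s)) for s in ss]
for s2 in (0.5, 4.0, 100.0):
    print(s2, round(pl_sigma2(s2, xs, ss), 3))
```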