Truth-conduciveness Without Reliability: A Skeptical Derivation of Ockham’s Razor

Kevin T. Kelly
Department of Philosophy
Carnegie Mellon University
www.cmu.edu


Naivete

Lo! An apple.

Skeptical Hypothesis

Lo! An apple.

Maybe you are a brain in a vat. Everything would look the same.

poof

Retrenchment

That’s not a serious possibility.

You have the burden of proof.
It’s remote.
It’s implausible.
It’s distant from the actual world.
You’re not in my community.
Who cares about the worst case?

Unsatisfying

Possibilities delimited a priori: circular account.
Possibilities delimited a posteriori: how do we seek knowledge?

So there!

Zen Approach

Don’t rush to defeat the demon.
Get to know him extremely well.
Justification may be located in the demon’s power rather than in his weakness.

Grrrr!

The Zen of Computation

Algorithms are justified by efficiency.
Efficiency means you couldn’t do better.
You couldn’t do better due to a demonic argument (the halting problem, etc.).

Scientific Theory Choice

Which theory is true?

Ockham Says:

Choose the simplest!

Skeptical Hypothesis

Maybe a complex theory is true but the data are simple.

Puzzle

An indicator must be sensitive to what it indicates.

[Figure, two builds: simple; complex]

But Ockham’s razor always points at simplicity.

[Figure, two builds: simple; complex]

Meno

If we know that the truth is simple, we don’t need Ockham’s razor.

[Figure label: simple]

If we don’t know that the truth is simple, what good is Ockham’s razor?

[Figure label: complex]

Some Standard Responses

Simple Theories are Virtuous

Testable (Popper, Glymour)
Unified (Friedman, Kitcher)
Explanatory (Harman)
Symmetrical (Malament)
Compress data (Rissanen)
Interesting (Vitanyi)

But the Truth Might Not be Virtuous

To conclude that a theory is true because it is virtuous is wishful thinking (van Fraassen).

Overfitting (Akaike, Sober, Forster)

Empirical estimates based on complex models have greater mean squared distance from the truth.

[Figure labels: Truth; clamp; Pop! Pop!]
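The overfitting claim can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the Akaike analysis itself: the true curve is taken to be the constant Y = 0, the noise is Gaussian, and a constant model (one free parameter) is compared with a straight-line model (two free parameters), both fit by least squares. All function names, the sample size, and the number of trials are hypothetical choices.

```python
import random

def fit_constant(xs, ys):
    """Least-squares constant model: returns fitted values at each x."""
    mean_y = sum(ys) / len(ys)
    return [mean_y for _ in xs]

def fit_line(xs, ys):
    """Least-squares straight line: returns fitted values at each x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return [mean_y + slope * (x - mean_x) for x in xs]

def mean_squared_distance_from_truth(fit, trials=2000, n=10, seed=0):
    """Average squared distance between fitted values and the assumed truth Y = 0."""
    rng = random.Random(seed)  # same seed, so both models see the same samples
    xs = [float(i) for i in range(n)]
    total = 0.0
    for _ in range(trials):
        ys = [rng.gauss(0.0, 1.0) for _ in xs]  # truth (0) plus noise
        fitted = fit(xs, ys)
        total += sum(f ** 2 for f in fitted) / n
    return total / trials

mse_constant = mean_squared_distance_from_truth(fit_constant)
mse_line = mean_squared_distance_from_truth(fit_line)
print(mse_constant, mse_line)
```

In this setup the straight-line model spends its extra parameter fitting noise, so its estimates land farther from the truth on average than those of the constant model.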

Does Not Aim at True Theory

...even if the simple theory is known to be false…

clamp

Four eyes!

Miracle Argument (Putnam, Rosenkrantz)

Simple data would be a miracle in a complex world.
Simple data would be expected in a simple world.

[Figure: planetary retrograde motion, with Mars, Earth, and the Sun. Simple theory: lapping. Complex theory: epicycle.]

However…

Simple data would not be a miracle if the complex theory’s parameter were set near […];

[Figure labels: Complex theory (epicycle); Simple theory (lapping)]

The Real Miracle

Ignorance about model:
$p(S) \approx p(C)$;

+ Ignorance about parameter settings within theories:
$p(C(\theta) \mid C) \approx p(C(\theta') \mid C)$.

= Knowledge about parameter settings across theories:
$p(C(\theta)) \ll p(S)$.

Is it knognorance or Ignoredge?

CP

The Ellsberg Paradox

Urn: 3 ball colors p, q, r with these frequencies: 1/3, ?, ?

Human betting preferences:
p > q
p or r < q or r (!)

Diagnosis

p: 1/3 (knowledge)
q, r: ?, ? (ignorance)

Robust Bayesianism (Levi, Kadane, Seidenfeld)

Credence is a range of probabilities (knowledge about p, ignorance about q and r):

p     q     r
1/3   0     2/3
1/3   1/3   1/3
1/3   2/3   0
. . .

Choose the act with highest worst-case expected value.

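Worst-case Expected Values

[Figure: bets on the urn (1/3, ?, ?) compared by worst-case expected value. For p against q: 1/3 > 0, the worst case for q being the distribution (1/3, 0, 2/3). The second comparison is marked <.]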
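The worst-case rule can be checked by direct computation on the standard three-color Ellsberg bets. The sketch below makes several assumptions that are not in the slides: each bet pays 1 on a win and 0 otherwise, the credal set is approximated by a finite grid of distributions with p fixed at 1/3, and the names `CREDAL_SET`, `BETS`, and `worst_case_expected_value` are hypothetical.

```python
from fractions import Fraction

# Credal set: p is known to be 1/3; q ranges over [0, 2/3] and r = 2/3 - q.
# A finite grid of candidate distributions stands in for the full range.
STEPS = 20
CREDAL_SET = [
    (Fraction(1, 3), Fraction(2, 3) * k / STEPS, Fraction(2, 3) * (STEPS - k) / STEPS)
    for k in range(STEPS + 1)
]

# Each bet pays 1 if the drawn ball has one of the listed colors, else 0.
# Payoff tuples are ordered (p, q, r).
BETS = {
    "p": (1, 0, 0),
    "q": (0, 1, 0),
    "p or r": (1, 0, 1),
    "q or r": (0, 1, 1),
}

def worst_case_expected_value(payoffs):
    """Minimum expected payoff over every distribution in the credal set."""
    return min(
        sum(prob * pay for prob, pay in zip(dist, payoffs)) for dist in CREDAL_SET
    )

worst = {name: worst_case_expected_value(pay) for name, pay in BETS.items()}
for name, value in worst.items():
    print(name, value)
```

Under these assumptions the bet on p beats the bet on q in the worst case, while the bet on q or r beats the bet on p or r, which matches the commonly reported human preferences.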

Whither Ockham?

Since you don’t really know that complex worlds won’t produce simple data, shouldn’t your ignorance include distributions concentrated on such possibilities?

I prefer ignoredge.

In Any Event

The coherentist foundations of Bayesianism have nothing to do with short-run truth-conduciveness.

Temptation

If only the probabilities $p(C(\theta') \mid C)$ were chances rather than opinions.
Then the alleged miracle would be a proper miracle.

Proof of God (R. Koons 1999)

1. Natural chance is determined by the fundamental theory of natural chance.
2. If Ockham’s razor reliably infers the theory of natural chance, the chance that a complex theory of natural chance would have its parameters set to produce simple data must be low.
3. But since natural chance is determined by the free parameters of the fundamental theory of natural chance, the parameter setting is not governed by natural chance.
4. Hence, it must be governed by non-natural chance.
5. Holy water is available at the exit.

Moral

The basic point is right.

Solution:

1. Keep naturalism.
2. Keep fundamental scientific knowledge.
3. Dump short-run reliability as explication of truth-conduciveness.

Externalist Magic

Simplicity informs via hidden causes or tracking mechanisms.

[Figure: three diagrams, each linking Simple and B(Simple), with labels G; Leibniz, evolution; Kant; Ouija board.]

Practice and data are the same.
Knowledge vs. non-knowledge depends on hidden causes.
By Ockham’s razor, better to explain Ockham’s razor without the hidden causes.

Metaphysicians for Ockham?

With Friends Like Those…

The Last Gasp: Convergence

Bayes (washing out of the prior)
BIC (Schwarz)
Structural Risk Minimization (Vapnik, Harman)
TETRAD (Spirtes, Glymour, Scheines)

[Figure labels: Complexity; truth; Plink!; Blam!]

Logic is Backwards

Ockham methods are sufficient for convergence.
But every finite variant of a convergent method converges (Salmon).
So Ockham’s razor is not necessary for convergence.

Alternative ranking

truth

Truth Conduciveness

Reliability (indication or tracking): too strong. Circles or magic required.
Convergence: too weak. Doesn’t single out simplicity.
“Straightest” convergence: just right?

[Figure labels: Simple; Complex]

Truth-conduciveness as Straightest Convergence

[Figure labels: Simple; Complex]

Ancient Roots

"Living in the midst of ignorance and considering themselves intelligent and enlightened, the senseless people go round and round, following crooked courses, just like the blind led by the blind." Katha Upanishad, I. ii. 5, c. 600 BCE.

Retraction

New output does not entail previous output.

[Figure: outputs at times t and t + 1, with the retracted content marked.]
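The definition of retraction lends itself to a short sketch. The code below assumes, purely for illustration, that each output is modeled as the set of possible worlds it leaves open, so that one output entails another exactly when its set is a subset of the other. The world names and the example run are hypothetical.

```python
def entails(new_output, old_output):
    """A proposition entails another when its set of worlds is a subset."""
    return new_output <= old_output

def count_retractions(outputs):
    """Count the steps t -> t + 1 at which the new output fails to entail the old."""
    return sum(
        1 for old, new in zip(outputs, outputs[1:]) if not entails(new, old)
    )

# Hypothetical run over worlds w1..w4: each output is the set of worlds left open.
run = [
    frozenset({"w1", "w2", "w3", "w4"}),  # no commitment yet
    frozenset({"w1", "w2"}),              # strengthening: not a retraction
    frozenset({"w3"}),                    # drops earlier content: retraction
    frozenset({"w3"}),                    # repeats: not a retraction
]
print(count_retractions(run))
```

On this reading, merely strengthening an answer costs nothing; only giving up previously asserted content counts.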

Eliminate Needless Retractions

Truth

Necessary Retractions are Virtuous

Truth

Demon’s Role as Justifier

Truth

I can force every convergent method to retract this often, so your retractions are justified by my power.

Eliminate Needless Delays to Retractions

[Figure: a theory with its corollaries and applications.]

Easy Comparisons

At least as bad = at least as many retractions, at least as late.

[Figure: retractions plotted against time.]
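One possible reading of this comparison can be written as a small function. The sketch below is an assumption, not the talk's own formal definition: a method's performance is represented as an increasing sequence of retraction times, and sigma counts as at least as bad as tau when tau's retractions can be matched, in order, to retractions of sigma that occur no earlier. The name `at_least_as_bad` is hypothetical.

```python
def at_least_as_bad(sigma, tau):
    """True if sigma has at least as many retractions as tau, each at least as late.

    sigma and tau are increasing sequences of retraction times. Aligning the
    last len(tau) entries of sigma with tau is the most favorable matching
    when both sequences are increasing.
    """
    if len(sigma) < len(tau):
        return False
    tail = sigma[len(sigma) - len(tau):]
    return all(s >= t for s, t in zip(tail, tau))

print(at_least_as_bad((2, 5, 9), (1, 4)))  # more retractions, and later
print(at_least_as_bad((1, 4), (2, 5, 9)))  # fewer retractions
print(at_least_as_bad((1, 2), (1, 3)))     # same count, but one is earlier
```

The ordering is partial: two sequences can each fail to be at least as bad as the other, which is why only the easy comparisons are settled this way.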

Worst-case Retraction Time Bounds

[Figure: sequences of retraction times with a worst-case bound such as (1, 2, ∞).]

Empirical Complexity

Hopeless ideas:
Syntactic length
Computational incompressibility

By what miracle do notational conventions indicate truth?

Empirical Complexity

Close but no cigar:
Free parameters
Broken symmetries

Meno, I want simplicity itself, not parts of simplicity.

Empirical Complexity

Empirical complexity of T in P = the length of the maximum path (T1, …, Tn, T) of answers in P that the demon can force from an arbitrary convergent method.

[Figure: forced path T1, T2, T3, T. Caption: “Keep up!”]

Polynomial Order

Data = open intervals around Y at rational values of X.

Demon shows flat line until convergent method takes the bait. (Zero degree curve)
Then switches to tilted line until convergent method takes the bait. (First degree curve)
Then switches to parabola until convergent method takes the bait … (Second degree curve)
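The demon's strategy can be sketched as a toy game. The code below abstracts heavily from the slides and every detail in it is an assumption: the interval-valued data are replaced by a simple count of the "effects" (nonzero higher-order terms) revealed so far, the method is a hypothetical one that always guesses the lowest degree compatible with those effects, and the demon raises the degree by one each time the method takes the bait.

```python
def ockham_method(effects_seen):
    """Guess the lowest polynomial degree compatible with the effects seen so far."""
    return effects_seen

def run_demon(method, max_degree, horizon):
    """Feed data stage by stage; raise the degree each time the method takes the bait."""
    effects_seen = 0  # degree of the curve the demon is currently showing
    outputs = []
    for _ in range(horizon):
        guess = method(effects_seen)
        outputs.append(guess)
        took_bait = guess == effects_seen
        if took_bait and effects_seen < max_degree:
            effects_seen += 1  # switch to a curve of the next degree
    return outputs

def count_changes(outputs):
    """Number of times the method abandons one degree for another."""
    return sum(1 for a, b in zip(outputs, outputs[1:]) if a != b)

outputs = run_demon(ockham_method, max_degree=3, horizon=6)
print(outputs, count_changes(outputs))
```

In this toy version the method is forced through degrees 0, 1, 2, 3 in turn, so a truth of degree 3 costs three changes of mind.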

Complexity can be Complex

[Figure: theories T2, T3, T4, T5, T7, T8 arranged by complexity levels 0 to 3. Complexity given e.]

Complexity Relative to Data

[Figure, two builds: theories T2, T3, T4, T5, T7, T8, then T2, T4, T5, T7, arranged by complexity levels 0 to 3. Complexity given e + e’.]

Timed Retraction Bounds

$r(M, e, n)$ = the least timed retraction bound for worlds satisfying theories of complexity n and producing finite input history e.

[Figure: bounds for method M across empirical complexity 0, 1, 2, 3, . . .]

M is Efficient at e

For each convergent M’ that agrees with M along finite input history e, and for each complexity n:

$r(M, e, n) \leq r(M', e, n)$

[Figure: M compared with M’ across empirical complexity 0, 1, 2, 3, . . .]

M is Strongly Beaten at e

There exists convergent M’ that agrees with M up to the end of e, such that for each complexity n:

$r(M, e, n) > r(M', e, n)$.

[Figure: M compared with M’ across empirical complexity 0, 1, 2, 3, . . .]

M is Weakly Beaten at e

There exists convergent M’ that agrees with M up to the end of e, such that:

For each n, $r(M, e, n) \geq r(M', e, n)$;
There exists n such that $r(M, e, n) > r(M', e, n)$.

[Figure: M compared with M’ across empirical complexity 0, 1, 2, 3, . . .]
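The difference between being strongly and weakly beaten can be made concrete with a small comparison routine. The sketch below simplifies under a loud assumption: each bound r(M, e, n) is represented by a single integer (for example, a retraction count) rather than a timed bound, and only finitely many complexity classes are listed. The function names and the example numbers are hypothetical.

```python
def is_strongly_beaten(r_m, r_rival):
    """M does worse than the rival in every complexity class."""
    return all(a > b for a, b in zip(r_m, r_rival))

def is_weakly_beaten(r_m, r_rival):
    """M never does better than the rival and does worse in some class."""
    never_better = all(a >= b for a, b in zip(r_m, r_rival))
    somewhere_worse = any(a > b for a, b in zip(r_m, r_rival))
    return never_better and somewhere_worse

# Hypothetical bounds indexed by empirical complexity 0, 1, 2.
print(is_strongly_beaten([1, 2, 3], [0, 1, 2]))  # worse everywhere
print(is_weakly_beaten([1, 2, 3], [1, 2, 2]))    # tied, then worse at n = 2
print(is_weakly_beaten([1, 2, 3], [1, 2, 3]))    # identical bounds
```

Every strongly beaten method is also weakly beaten, so never being weakly beaten is the more demanding standard.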

Demons for Ockham

Ockham’s Razor

Don’t select a theory unless it is uniquely simplest in light of experience.

[Figure, two builds: with T2, T4, T5, T7 at complexity levels 0 to 3 the output is “?”; with only T2 and T7 left the output is T7.]

Stalwartness

Don’t retract your answer while it remains uniquely simplest.

[Figure: T2 and T7 at complexity levels 0 to 3; outputs T7, T7.]
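Ockham’s razor and stalwartness can be combined in one simple selection rule. The sketch below is a hypothetical illustration: it assumes the empirical complexity of each theory still compatible with the data is already given as a number, and it always outputs the uniquely simplest theory when there is one. The principle as stated also permits answering "?" in that case, so this is only the most decisive method that satisfies it. The complexity values in the example are made up.

```python
def ockham_answer(complexity_given_data):
    """Return the uniquely simplest theory, or "?" if there is none.

    complexity_given_data maps each theory still compatible with the data
    to its empirical complexity relative to that data.
    """
    if not complexity_given_data:
        return "?"
    lowest = min(complexity_given_data.values())
    simplest = [t for t, c in complexity_given_data.items() if c == lowest]
    return simplest[0] if len(simplest) == 1 else "?"

# Hypothetical complexity assignments, for illustration only.
print(ockham_answer({"T4": 0, "T5": 0, "T2": 1, "T7": 2}))  # tie at the bottom
print(ockham_answer({"T7": 0, "T2": 1}))                    # unique simplest
```

Because the rule depends only on the current complexity ranking, it repeats its answer for as long as that answer stays uniquely simplest, which is what stalwartness asks for.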

Argument Sketch

No matter what convergent M has done in the past, nature can force M to produce each answer down an arbitrary effect path, arbitrarily often.
Nature can also force violators of Ockham’s razor or stalwartness either into an extra retraction or a late retraction in each complexity class.

Ockham Efficiency Theorem

Let M converge to the true theory in problem P. The following are equivalent:
M is always Ockham and stalwart in P;
M is always efficient in P;
M is never weakly beaten in P.

Policy Retractions

Many explanations have been offered to make sense of the here-today-gone-tomorrow nature of medical wisdom — what we are advised with confidence one year is reversed the next — but the simplest one is that it is the natural rhythm of science.

(Do We Really Know What Makes us Healthy, NY Times Magazine, Sept. 16, 2007).

Causal Inference

Causal graph theory: more correlations, more causes.
Idealized data = list of conditional dependencies discovered so far.
Anomaly = the addition of a conditional dependency to the list.

[Figure: partial correlations; S and G(S).]

Causal Axioms (Pearl, Glymour)

1. Screening off: X is statistically independent of its non-descendants given its parents.
2. No invisible causes: The only true independence relations are those entailed by condition 1.

[Figure: node X with parents P1, P2, non-descendants N1, N2, and descendant D.]

Forcible Sequence of Causal Theories

[Figure, four builds: causal graphs with labels X1, X2, X3, W and Y1 through Y5.]

Moral

In counterfactual prediction, form of model matters and retractions are unavoidable.
Ockham efficiency agrees very closely with best contemporary practice.
Maybe that’s all there is to it.

Conclusions

Ockham’s razor is necessary for staying on the straightest path to the truth.
Does not reliably point at or indicate the truth.
Demonstrably works without circles, evasions, or magic.
Such a theory is motivated in counterfactual inference and estimation.