On the Hamiltonicity Gap and doubly stochastic matrices

On the Hamiltonicity Gap and DoublyStochastic Matrices*

Vivek S. Borkar,1 Vladimir Ejov,2 Jerzy A. Filar2

1School of Technology and Computer Science, Tata Institute of FundamentalResearch, Mumbai 40005, India; e-mail: [email protected]

2School of Mathematics and Statistics, The University of South Australia, MawsonLakes, SA 5095, Australia; e-mails: [email protected];[email protected]

Received 23 July 2006; accepted 17 April 2008; received in final form 29 April 2008Published online 10 September 2008 in Wiley InterScience (www.interscience.wiley.com).DOI 10.1002/rsa.20237

ABSTRACT: We consider the Hamiltonian cycle problem embedded in singularly perturbed (con-trolled) Markov chains. We also consider a functional on the space of stationary policies of the processthat consists of the (1,1)-entry of the fundamental matrices of the Markov chains induced by thesepolicies. We focus on the subset of these policies that induce doubly stochastic probability transitionmatrices which we refer to as the “doubly stochastic policies.” We show that when the perturbationparameter, ε, is sufficiently small, the minimum of this functional over the space of the doubly sto-chastic policies is attained at a Hamiltonian cycle, provided that the graph is Hamiltonian. We alsoshow that when the graph is non-Hamiltonian, the above minimum is strictly greater than that in aHamiltonian case. We call the size of this difference the “Hamiltonicity Gap” and derive a conservativelower bound for this gap. Our results imply that the Hamiltonian cycle problem is equivalent to theproblem of minimizing the variance of the first hitting time of the home node, over doubly stochasticpolicies. © 2008 Wiley Periodicals, Inc. Random Struct. Alg., 34, 502–519, 2009

Keywords: Hamiltonian cycle; controlled Markov chains; optimal policy; singular perturbation

1. INTRODUCTION

We consider the following version of the Hamiltonian cycle problem: given a directed graph,find a simple cycle that contains all vertices of the graph (Hamiltonian cycle (HC)) or prove

Correspondence to: Jerzy A. Filar*Supported by ARC Grant (DP0666632); V.S. Borkar was supported in part by a grant from the Department ofScience and Technology, Government of India.© 2008 Wiley Periodicals, Inc.

502

HAMILTONICITY GAP AND DOUBLY STOCHASTIC MATRICES 503

that HC does not exist. With respect to this property—Hamiltonicity—graphs possessingHC are called Hamiltonian.1

This article is a direct continuation of a recent contribution [6] by the present authors.It considerably strengthens the results reported in [6] and introduces a new notion of“Hamiltonicity Gap” that differentiates between the Hamiltonian and non-Hamiltoniangraphs.

More generally, this contribution can be seen as belonging to a line of research [13], [1],[15], [8], [11], [12] which aims to exploit the tools of controlled Markov decision chains(MDPs)2 to study the properties of a famous problem of combinatorial optimization: theHamiltonian Cycle Problem (HCP).

In particular, in [14] and [9] it was shown that Hamiltonian cycles of a graph can becharacterized as the minimizers of a functional based on the fundamental matrices of Markovchains induced by deterministic policies in a suitably perturbed MDP, provided that the valueof the perturbation parameter, ε, is sufficiently small. Furthermore, it was conjectured thatthe minimum over the space of deterministic policies would also constitute the minimumover the space of all stationary policies. The above conjecture remains open under theperturbation studied in [9] and [14].

However, in the present article, we extend the results of Borkar et al. [6] to prove that underanother linear, but symmetric, singular perturbation introduced in [6], the same elementof the fundamental matrix—considered as a function on the the space of all stationarypolicies that induce doubly stochastic probability transition matrices—is minimized at aHamiltonian cycle, provided that the graph is Hamiltonian. Interestingly, the minimumvalue of this objective function is a function of only the total number of nodes and theperturbation parameter and is strictly less than the corresponding minimum value of anynon-Hamiltonian graph with the same number of nodes. We name the difference betweenthe values of these global minima, the “Hamiltonicity gap” and claim that its presence hasa theoretical—but not necessarily practical—importance.

We believe that the theoretical importance of the Hamiltonicity gap stems from the factthat it enables us to differentiate between Hamiltonian and non-Hamiltonian graphs onthe basis of the minimum value of an objective function of a mathematical program thatlends itself to an appealing probabilistic/statistical interpretation; namely a constant plusthe scaled variance of the first return time of node 1. Hence, the Hamiltonian cycle problemis equivalent to the problem of minimizing the variance of the first hitting time of the homenode, over doubly stochastic policies.

Next, we shall briefly differentiate between our approaches and some of the best relatedlines of research. Many of the successful classical approaches of discrete optimization to theHamiltonian cycle problem focus on solving a linear programming “relaxation” followed byheuristics that prevent the formation of subcycles. In our approach, we embed a given graphin a singularly perturbed MDP in such a way that we can identify Hamiltonian cycles andsubcycles with exhaustive and nonexhaustive ergodic classes of induced Markov chains,respectively. Indirectly, this allows us to exploit a formidable array of properties of Markovchains, including those of the corresponding fundamental matrices. It should be mentionedthat algorithmic approaches to the HCP based on embeddings of the problem in Markov

1The name of the problem owes to the fact that Sir William Hamilton investigated the existence of such cycles onthe dodecahedron graph [20].2The acronym MDP stems from the alternative name of Markov decision processes.

Random Structures and Algorithms DOI 10.1002/rsa

504 BORKAR, EJOV, AND FILAR

decision processes are also beginning to emerge (e.g., see Andramonov et al. [1], Filar andLasserre [15] and Ejov et al. [10]).

Note that this approach is essentially different from that adopted in the study of randomgraphs where an underlying random mechanism is used to generate a graph (e.g., seeKarp’s seminal paper [17]). In our approach, the graph that is to be studied is given andfixed but a controller can choose arcs according to a probability distribution and with asmall probability (due to a perturbation) an arc may take you to a node other than its “head.”Of course, random graphs played an important role in the study of Hamiltonicity, a strikingresult to quote is that of Robinson and Wormald [19] who showed that with high probabilityk-regular graphs are Hamiltonian for k ≥ 3.

2. FORMULATION AND PRELIMINARIES

As in Borkar et al. [6], our dynamic stochastic approach to the HCP considers a movingobject tracing out a directed path on the graph � with its movement “controlled” by afunction f mapping the set of nodes V = V(�) = {1, 2 . . . , s} of � into the set of arcsA = A(�) of �. We think of this set of nodes as the state space of a controlled Markovchain � = �(�) where for each state/node i, the action space A(i) := {a|(i, a) ∈ A} is inone-to-one correspondence with the set of arcs emanating from that node, or, equivalently,with the set of endpoints of those arcs.

Illustration: Consider the complete graph �5 on five nodes (with no self-loops) and thinkof the nodes as the states of an MDP (denoted by �) with initial state 1, and of the arcsemanating from a given node as the actions available at that state. In a natural way theHamiltonian cycle c1 : 1 → 2 → 3 → 4 → 5 → 1 corresponds to the “deterministiccontrol” f1 : {1, 2, 3, 4, 5} → {2, 3, 4, 5, 1}, where f1(2) = 3 corresponds to the controllerchoosing arc (2,3) in state 2 with probability 1. The Markov chain induced by f1 is givenby the “zero-one” transition matrix P(f1) which, clearly, is irreducible. On the other hand,the union of two subcycles: 1 → 2 → 3 → 1 and 4 → 5 → 4 corresponds to the con-trol f2 : {1, 2, 3, 4, 5} → {2, 3, 1, 5, 4} which identifies the Markov chain transition matrixP(f2) (see later) containing two distinct ergodic classes. This leads to a natural embed-ding of the Hamiltonian cycle problem in a Markov decision problem �. The latter MDPhas a multichain ergodic structure which considerably complicates the analysis. However,this multichain structure can be “disguised”—but not completely lost—with the help of a“singular perturbation.” For instance, we could easily replace P(f2) with Pε(f2):

P(f2) =

0 1 0 0 00 0 1 0 01 0 0 0 00 0 0 0 10 0 0 1 0

and Pε(f2) =

ε 1 − 4ε ε ε ε

ε ε 1 − 4ε ε ε

1 − 4ε ε ε ε ε

ε ε ε ε 1 − 4ε

ε ε ε 1 − 4ε ε

.



The above perturbation is singular because it altered the ergodic structure of P(f2) bychanging it to an irreducible (completely ergodic) Markov Chain Pε(f2). Furthermore, itleads to a doubly stochastic matrix whose rows and columns both add to one. This has beenachieved as follows: Each row and column of the unperturbed s × s matrix (a permutationmatrix) has exactly one 1 and the rest zeros. We replace the 1’s by 1 − (s − 1)ε for some0 < ε < 1/(s−1) and the zeros by ε to get the doubly stochastic perturbation of the matrix.The precise definition of the doubly stochastic perturbation is given later in this section.

Next, we introduce the notion of a Markov chain induced by a stationary randomizedpolicy (also called a randomized strategy or a randomized control). The latter can be definedby an s × s stochastic matrix f with entries representing probabilities f (i, a) of choosing apossible action a at a particular state i whenever this state is visited. Of course, f (i, a) = 0,whenever a �∈ A(i). Randomized policies compose the strategy space FS. The discretenature of the HCP focuses our attention on special paths which our moving object can traceout in �. These paths correspond to the subspace FD ⊂ FS of deterministic policies arisingwhen the controller at every fixed state chooses some particular action with probability1 whenever this state is visited (f1 and f2 are instances of the latter). To illustrate thesedefinitions consider the simple case where fλ is obtained from the strictly deterministicpolicy f2 by the “controller” deciding to randomize at node 4 by choosing the arcs (4, 5)

and (4, 3) with probabilities f (4, 5) = 1 − λ and f (4, 3) = λ, respectively. The perturbedtransition probability matrix of the resulting policy fλ is given by

Pε(fλ) =

ε 1 − 4ε ε ε ε

ε ε 1 − 4ε ε ε

1 − 4ε ε ε ε ε

ε ε λ − 5ελ + ε ε 5ελ − λ − 4ε + 1ε ε ε 1 − 4ε ε

.

This is because(1 − λ)ε + λ(1 − 4ε) = λ − 5ελ + ε,

and(1 − λ)(1 − 4ε) + λε = 5ελ − λ − 4ε + 1.

As λ ranges from 0 to 1 the Markov chain ranges from the one induced by f2 to oneinduced by another deterministic policy. An attractive feature of an irreducible Markovchain P is the simple structure of its Cesaro-limit (or, stationary distribution) matrix P∗. Itconsists of an identical row-vector πT representing the unique solution of the linear systemof equations: πT P = πT , πT 1 = 1, where 1 is an s-dimensional column vector with unityin every entry.

2.1. The Fundamental Matrix

We consider a Markov Chain {Xn} on a finite state space S = {1, 2, . . . , s}, with a transitionmatrix P = [[p(i, j)]], assumed to be irreducible. Let πT = [π1, . . . , πs] denote its uniqueinvariant probability vector and define its stationary distribution matrix by

P∗ = limN↑∞

1

N + 1

N∑n=0

Pn =

πT

πT

...πT

.



For the sake of completeness, we recall probabilistic expressions for the elements of the

fundamental matrix G = [[g(i, j)]] = (I − P + P∗)−1, I being the s × s identity matrix.Let

e1 = [1, 0, . . . , 0]T ∈ Rs, 1 = [1, 1, . . . , 1]T ∈ R

s,

and define column and row vectors w and zT, respectively, as

w := Ge1, zT := eT1 G.

Let Ei[·] denote the expectation when X0 = i and τi = min{n > 0 : Xn = i} (= ∞ if theset on the right is empty). The following theorem is proved in [6].

Theorem 1. If P is irreducible, the elements of the fundamental matrix G = [[g(i, j)]]are given by:

g(i, i) = Ei[τi(τi + 1)]2(Ei[τi])2

,

g(i, j) = Ei[τi(τi + 1)]2(Ei[τi])2

− Ei[τj]Ei[τi] , i �= j.

2.2. Doubly Stochastic Perturbations of Graphs

A stochastic matrix P is called doubly stochastic if the transposed matrix PT is also stochastic(i.e., row sums of P and column sums of P add up to one). A doubly stochastic deterministicpolicy is one that induces a doubly stochastic probability transition matrix where units areinserted in place of arcs selected by that policy and zeroes in all other places. Hence a Markovchain induced by such a policy has a probability transition matrix that is a permutationmatrix.

For any randomized policy f the corresponding s × s perturbed probability transitionmatrix Pε(f ) has entries defined by the rule (for 0 < ε < 1

s )

Pε(f )(i, j) := (1 − sε)f (i, j) + ε. (1)

Thus every stationary policy f induces not only a perturbed (irreducible) probability tran-sition matrix Pε(f ), but also the accompanying stationary distribution matrix P∗

ε (f ) and thefundamental matrix Gε(f ). Recall that for a doubly stochastic matrix, the stationary distri-bution is simply πj = 1

s for all j, where πj(f ) := πj is the jth element of any row of P∗ε (f ).

Since πj = 1Ej [τj ] (e.g., see Theorem 5.3.2, p. 96, of [5]), we have Ef

j [τj] = s.

Notational remark: Given the underlying graph, a selection of a particular policy funiquely determines a Markov chain. Thus the various probabilistic quantities associatedwith that chain, in the sense prescribed above, are synonymous with the selection of thatpolicy. To simplify the notation, in many parts of this article the explicit dependence ofthese quantities on the policy f will be suppressed.

Denote by Dε the convex set of perturbed doubly stochastic matrices obtained by takingthe closed convex hull of the finite set Dε

e corresponding to the perturbed doubly stochasticdeterministic policies. We also write Dε

e as the disjoint union Dεd ∪Dε

h where Dεh corresponds

to the Hamiltonian cycles and Dεd to disjoint unions of short cycles that cover the graph.



Double stochasticity eliminates any other possibilities. Define the matrix norm ‖A‖∞ =maxi,j |a(i, j)| for a square matrix A = [[a(i, j)]]. Call P ∈ Dε an η0-perturbation of aHamiltonian cycle if there exists a P ∈ Dε

h such that ‖P − P‖∞ < η0 for a prescribedη0 > 0. Let Dε

p denote the set of such P. For a policy f inducing a doubly stochasticprobability transition matrix Pε(f ) and the fundamental matrix Gε(f ), define

wε1(f ) := Gε(f )(1, 1),

the top-left entry of Gε(f ). As mentioned above, when the policy f is given and fixed weshall write simply: wε

1 := wε1(f ).

The following are proved in [6].

Lemma 1.a. For a policy f inducing a doubly stochastic probability transition matrix we have

wε1 = 1

s2

s∑i=1

Ei[τ1] (2)

= (s + 1)

2s+ 1

2s2E1[(τ1 − s)2].

Furthermore, for policies corresponding to Dεh, the above reduces to3

wε1 = s + 1

2s+ O(ε). (3)

b. For Pε ∈ Dεd , wε

1 → ∞ as ε ↓ 0.c. For Hamiltonian graphs and sufficiently small ε > 0, all minima of wε

1 on Dεe are

attained on Dεp.

Theorem 2. For Hamiltonian graphs and sufficiently small ε > 0, all minima of wε1 on

Dε are attained on Dεp.

3. HAMILTONIAN CYCLES ARE GLOBAL MINIMIZERS

Clearly, the preceding theorem would be considerably strengthened if it could be shown thatthe minima of wε

1 on Dε are attained at doubly stochastic matrices induced by Hamiltoniancycles, not merely in their “sufficiently small” neighborhoods. This section is dedicated tothe proof of this fact.

3.1. Directional Derivatives

We shall now derive an expression for the directional derivative of our objective functionwε

1. Let P0, P1 denote two doubly stochastic matrices in Dε and for 0 ≤ λ ≤ 1, set P(λ) =λP1 + (1 − λ)P0.4 Correspondingly define Vλ(i) := Ei[τ1], i ∈ S, where the dependence

3Note that the superscript in wε1 emphasizes the dependence on ε.

4This is equivalent to a “line segment” fλ = λf1 + (1 − λ)f0 in the policy space.



of the distribution of τ1 on the parameter λ is implicit. Also, let P0, P1, P(λ) denote sub-matrices derived, respectively, from P0, P1, P(λ) by deletion of their first row and column.Similarly, for vectors, tilde will denote truncations resulting from the omission of the firstentry. For instance, Vλ = [Vλ(2), . . . , Vλ(s)]T , 1 = [1, · · · , 1]T ∈ R

s−1.From Lemma 1.3., p. 46, of [4], we know that Vλ is the unique solution to

Vλ = 1 + P(λ)Vλ. (4)

That is,

Vλ = (I − P(λ))−11. (5)

Let {Xn} denote a Markov chain governed by P(λ) whose entries will be denoted by p(j|i)and let I{E} denote the indicator function of any event E .

Lemma 2. Let νλ(i) = E[∑τ1m=0 I{Xm = i}] when the initial distribution is the uniform

distribution. Then νTλ := [νλ(2), . . . , νλ(s)] is the unique solution to

νTλ = 1

s1T + νT

λ P(λ), (6)

that is,

νTλ = 1

s1T (I − P(λ))−1. (7)

Proof. Let ζ1 = min{n ≥ 0 : Xn = 1} (= ∞ if this set is empty). Let νjλ(i) =

Ej[∑ζ1m=0 I{Xm = i}]. Then for j �= 1, ζ1 = τ1 and ν

jλ(i) = Ej[∑ζ1

m=0 I{Xm = i}] =Ej[∑τ1

m=0 I{Xm = i}]. Consider i, j �= 1. Note that Mn = ∑nm=1(I{Xm = i}− p(i|Xm−1)), n ≥

1, is a martingale. By the optional sampling theorem (e.g., see [7] p. 85 or [5] p. 45),

Ej

[τ1∧N∑m=1

(I{Xm = i} − p(i|Xm−1))

]= 0

for N ≥ 1, where τ1 ∧ N := min{τ1, N}. Letting N ↑ ∞ and using the easily establishedfact that E[τ 2

1 ] < ∞ (which implies uniform integrability of the sum above as N varies),we have

Ej

[τ1∑

m=1

(I{Xm = i} − p(i|Xm−1))

]= 0.



Thus

νjλ(i) = Ej

[τ1∑

m=0

I{Xm = i}]

= δij + Ej

[τ1∑

m=1

I{Xm = i}]

= δij + Ej

[τ1∑

m=1

p(i|Xm−1)

]= δij + Ej

[τ1−1∑m=0

p(i|Xm)

]

= δij + Ej

τ1−1∑

m=0

∑k �=1

p(i|k)I{Xm = k} (since Xm �= 1 for m < τ1)

= δij + Ej

∑

k �=1

τ1−1∑m=0

p(i|k)I{Xm = k}

= δij +∑k �=1

Ej

[τ1∑

m=0

I{Xm = k}]

p(i|k)(since Xτ1 = 1 �= k)

= δij +∑k �=1

νjλ(k)p(i|k).

Since ν1λ(i) = 0 for i �= 1, we also have

ν1λ(i) = 0 = δi1 +

∑k �=1

ν1λ(k)p(i|k).

Multiply both sides by 1s and sum over j. Then

s∑j=1

1

sν

jλ(i) = 1

s+ 1

s

∑j

∑k �=1

νjλ(k)p(i|k).

This proves the claim.

Let J(λ) denote our objective as a function of λ, that is, wε1, evaluated along the line

segment {P(λ) : 0 ≤ λ ≤ 1}. From (2), we have

J(λ) = 1

s2

s∑i=1

Vλ(i).

Differentiating with respect to λ on both sides yields

J ′(λ) = 1

s2

s∑i=1

V ′λ(i) = 1

s2

s∑i=2

V ′λ(i),

because Vλ(1) = E1[τ1] = s ∀λ ∈ [0, 1] ⇒ V ′λ(1) ≡ 0. From (4) and the definition of P(λ)

we haveV ′

λ = (P1 − P0)Vλ + P(λ)V ′λ.

ThereforeV ′

λ = (I − P(λ))−1(P1 − P0)Vλ,



and with the help of (5) and (7) this leads to

J ′(λ) = 1

s21T (I − P(λ))−1(P1 − P0)Vλ

= 1

sνT

λ (P1 − P0)Vλ (8)

= 1

s21T (I − P(λ))−1(P1 − P0)(I − P(λ))−11. (9)

3.2. Hamiltonian Cycles are Both Local and Global Minima

Though the following, purely technical, lemma is a straightforward application of theCauchy-Schwarz inequality, we include its proof for the sake of completeness.

Lemma 3. If xm = m, 1 ≤ m ≤ s, and {ym}a permutation of {xj}, then∑

i xiyi is maximizedexactly when yi = xi for all i and minimized exactly when yi = s + 1 − xi for all i.

Proof. Maximization claim is immediate from the Cauchy-Schwarz inequality. Whileminimizing, for the permutation {zi = s + 1 − yi} of {xj}, we have

∑i xizi ≤ ∑

i x2i with

equality if and only if xi = zi∀i and hence∑i

xiyi =∑

i

xi(s + 1 − zi) ≥∑

i

xi(s + 1 − xi),

with equality if and only if yi = s + 1 − xi ∀i. Hence the minimization claim follows.

We now consider J ′(0) in the situations where the doubly stochastic matrix P0 ∈ Dεh

is induced by a deterministic policy tracing out a Hamiltonian cycle. We shall first showthat, on a straight line path from P0 toward any doubly stochastic P1 induced by the graph,J ′(0) > 0. This shows that deterministic policies inducing Hamiltonian cycles correspondto local minima.

Suppose then that P0 corresponds to a Hamiltonian cycle C0. Without loss of generality,we assume C0 is the simplest Hamiltonian cycle that corresponds to the permutation σ0 thatmaps {1, 2, · · · , s − 1, s} onto {2, 3, · · · , s, 1}.

To start with, we consider P1 ∈ Dεh ∪ Dε

d , other than P0. That is, P1 is induced byany deterministic policy that traces out in the graph either a union of disjoint cycles or aHamiltonian cycle other than C0.

For each i ∈ S, other than 1, let m(i) denote the number of steps required to reach node1 from i on C0, if ε = 0.

V0(i) = Ei[τ1] = m(i) + O(ε) = (s − i + 1) + O(ε) (10)

To verify that the (suppressed) ε−dependence above is, indeed, only of order O(ε) notethat by (1) we can write P0 = P0(ε) = P0(0) − εK0, for some fixed matrix K0. Now, itfollows from (5) that (10) is simply the ith equation in the system

V0 = (I − P0(ε))−11 = (I − P0(0) + εK0)

−11. (11)

Now, it is easy to check that I − P0(0) is invertible and hence that the above equationcorresponds to a regularly (rather than a singularly) perturbed system. By well-knownresults from perturbation theory of linear operators (e.g., see [18], [2]) it follows that

V0 = (I − P0(0))−11 + O(ε).



Also, if j = i, i + 1, · · · , s, then Ej[∑τ1 =0 I{X = i}] = O(ε). For j = 1, 2, . . . , i − 1,

Ej[∑τ1 =0 I{X = i}] = 1 + O(ε). Both these claims can be proved by arguments analogous

to those used for proving (10). Hence for i = 2, . . . , s,

ν0(i) = 1

s

∑1≤j≤i

Ej

[τ1∑

i=0

I{X = i}]

+ O(ε)

= i − 1

s+ O(ε).

Thus by (4)

νT0 P0V0 =

s−1∑i=2

ν0(i)V0(i + 1)

= 1

s{1(s − 2) + 2(s − 3) + · · · (s − 2)1} + O(ε)

= 1

s{1[(s − 1) − 1] + 2[(s − 1) − 2] + · · · (s − 1)[(s − 1) − (s − 1)]} + O(ε)

= 1

s

((s − 1)

s−1∑r=1

r −s−1∑r=1

r2

)+ O(ε)

= (s − 1)2

2− 1

s

s−1∑r=1

r2 + O(ε).

Now suppose that P1 is induced by either a Hamiltonian cycle distinct from C0 or a deter-ministic policy that traces out a union of disjoint cycles in the graph. Hence, for every rowindex i there is a unique column index ji such that P1(i, ji) = 1 and ji �= jk; for i �= k. Thus

νT0 P1V0 =

s∑i=2

ν0(i)V0(ji)

= 1

s

(s∑

i=2

(i − 1)(s − ji + 1)

)+ O(ε)

= 1

s

(s−1∑r=1

r[(s − 1) − (jr+1 − 2)])

+ O(ε)

= s − 1

s

s−1∑r=1

r − 1

s

s−1∑r=1

ryr + O(ε)

= (s − 1)2

2− 1

s

s−1∑r=1

ryr + O(ε)

where r := i − 1, yr := (jr+1 − 2) and yr ∈ {0, 1, 2, . . . , (s − 2)} with yr �= yk wheneverr �= k. Note that if yr were allowed to take values only in the set {1, 2, . . . , s − 1}, then byLemma 3 we would have that

s−1∑r=1

r2 >

s−1∑r=1

ryr ,



whenever (y1, . . . , ys−1) �= (1, . . . , s−1). However, the inclusion of 0 as one of the possiblevalues for yr can only lower the right hand side of the above inequality.

Hence we have proved that

νT0 P1V0 − νT

0 P0V0 > 0

whenever P1 ∈ Dεh ∪ Dε

d and P1 �= P0.Now consider an arbitrary doubly stochastic P1 other than P0. By Birkhoff-von Neumann

theorem (e.g., see [3]) we know that P1 = ∑Mi=1 γiP

†i , where γi > 0 for all i,

∑i γi = 1,

P†i ∈ Dε

h ∪Dεd correspond to permutation matrices and M ≥ 1 is the number of permutation

policies induced by the graph. Note also that for at least one i in the above summationP†

i �= P0. Then by the preceding strict inequalities and by (8) we have that

J ′(0) = 1

s

(νT

0 P1V0 − νT0 P0V0

) = 1

s

(∑i

γi(νT0 P†

i V0 − νT0 P0V0)

)> 0.

The following main result now follows rather easily.

Theorem 4.

a. If P0 is induced by a Hamiltonian cycle, it is a strict local minimum for the costfunctional wε

1.b. If P0 is induced by a Hamiltonian cycle, it is also a global minimum for the cost

functional wε1.

Proof. The first part was proved above for P0 corresponding to the Hamiltonian cycle C0:it is sufficient to observe that for a strict local minimum the quantity

ν0P1V0 − ν0P0V0

remains strictly bounded away from zero as ε ↓ 0 for all extremal P1 �= P0. The effect ofconsidering another Hamiltonian cycle would be only to permute the order of the terms invarious summations, without changing the conclusions.

To see the second part, first note that the above allows us to choose an η0 > 0 such that P0

is the strict local minimum of wε1 in the η0−neighborhood of P0. As in the proof of Theorem

2 (see pp. 385–387 of [6]), choose ε > 0 small enough so that the global minimum of wε1

is attained on the η0−neighborhood of P0. “Small enough” here is quantified by an upperbound that depends only on s and η0, see [6]. The claim follows.

4. HAMILTONICITY GAP

Recall from Lemma 1 (see also Lemma 6 of [6]) that for a doubly stochastic policy f ∈ Dε,we had proved that the functional consisting of the top left hand element of the fundamentalmatrix induced by f is given by

wε1 := wε

1(f ) = s + 1

2s+ 1

2s2E1[(τ1 − s)2]. (12)



Note that dependence on f on the right is suppressed, but the expectation term is a functionof f , since the policy determines the distribution of τ1.

It should now be clear that a consequence of Lemma 1 and Theorem 4 is that wheneverthe underlying graph � is Hamiltonian the minimum of the above functional over f ∈ Dε

is given by

wε1(fh) = min

f ∈Dε

[wε

1(f )] = s + 1

2s+ O(ε), (13)

where fh ∈ Dεh is a policy defining any Hamiltonian cycle in the graph.

This section is devoted to the proof of the fact that, for ε > 0 and sufficiently small,there exists �(s) > 0 such that whenever the graph � is non-Hamiltonian

minf ∈Dε

[wε

1(f )] − wε

1(fh) ≥ �(s) − O(ε).

We name the quantity �(s) the Hamiltonicity gap (of order s) because it distinguishes allnon-Hamiltonian graphs with s nodes from all Hamiltonian graphs with the same numberof nodes.

Before presenting the proof, we note that such a result is reasonable when we consider thepossible variability of τ1—as captured by its variance E1[(τ1 − s)2]—for both Hamiltonianand non-Hamiltonian graphs. In the former case, it is clear that this variance can be madenearly zero by following a Hamiltonian cycle because the latter policy would yield a varianceactually equal to zero were it not for the (small) perturbation ε. However, if the graph isnon-Hamiltonian, perhaps, we can not avail ourselves of such a variance annihilating policy.This intuitive reasoning is made rigorous in the remainder of this section.

4.1. A Lower Bound for the Non-Hamiltonian Case

The key step in what follows is the derivation of an upper bound on Prf ({τ1 = s}), the prob-ability that return to node 1 returns in exactly s steps, under an arbitrary doubly stochasticpolicy f in a non-Hamiltonian graph.

Lemma 4. Suppose that the graph � is non-Hamiltonian, that is, Dεh = φ and let f be

an arbitrary doubly stochastic policy.

a. If ε = 0, then

Prf ({τ1 = s}) ≤ 1

4.

b. If ε > 0 and small, then

Prf ({τ1 = s}) ≤ 1

4+ O(ε).

Proof. First consider the case ε = 0. Let f be an arbitrary doubly stochastic policy and let{Xt}∞

0 be the Markov chain induced by f and the starting state 1. Let γ1 = (X0, X1, . . . , Xs) bea path of s steps through the graph and let χ1 = {γ1|X0 = Xs = 1, Xk �= 1; k = 1, . . . , s−1}.That is, the event that the first return to 1 occurs after s steps is {τ1 = s}, which is simplythe event that γ1 traces a path within χ1 and hence

Prf ({τ1 = s}) =∑γ1∈χ1

pγ1 ,



where pγ1 denotes the probability (under f ) of observing the path γ1. However, because thegraph is assumed to be non-Hamiltonian, all the paths in χ1 that receive a positive probabilityhave the structure

γ1 = γ ′1 ∪ γ1,

where γ ′1 consists of a nonself-intersecting “reduced path” from 1 to itself of length m ≤ s−2

adjoined at some node(s) other than 1 by one or more loops of total length s−m, that togetherconstitute γ1. One can think of γ ′

1 ∪ γ1 as the first and second parts of a figure comprisingof basic loop with one or more side-lobes attached to it, each of which is either a loop or aconnected union of loops. The simplest instance of this is a figure of eight, with two loopsof length m and s − m respectively, attached at a node other than 1.

Let pγ1 denote the probability of the original path and p′γ1

that of the reduced path. Letq := pγ1/p′

γ1≤ 1, which is the contribution to p coming from the loops comprising γ1.

More generally, define

γ0 = γ ′1, γ1 = γ ′

1 ∪ γ1, γ2 = γ ′1 ∪ γ1 ∪ γ1, γ3 = γ ′

1 ∪ γ1 ∪ γ1 ∪ γ1, . . . .

The paths γn(n ≥ 2) from 1 to itself that begin with the same reduced path γ ′1 but may

repeat exactly the same loop traversals of γ1 path for n ≥ 2 times, all contribute to the event{τ1 �= s}, as does γ0 = γ ′

1.The paths γn, n ≥ 2 have probabilities pγ1qn−1. The total probability that these paths and

γ0 = γ ′1 (but excluding the original γ1) contribute to {τ1 �= s} is as follows:

pγ1

q+

∑n≥2

pγ1qn−1 = pγ1

(1

q+ q

1 − q

)

= pγ1

(−1 + 1

q(1 − q)

)≥ 3pγ1 .

From the above it follows that

Prf ({τ1 �= s}) ≥∑γ1∈χ1

3pγ1 = 3Prf ({τ1 = s}).

Hence,

1 = Prf (τ1 < ∞)

= Prf (τ1 = s) + Prf (τ1 �= s)

≥ 4Prf (τ1 = s),

implying Prf (τ1 = s) ≤ 14 , or, Prf (τ1 �= s) ≥ 3

4 .Returning to the case when ε > 0 and small, we note that in the Markov chain induced

by f there are now two types of transitions: strong that correspond to f assigning a positiveprobability to arcs that are actually in the graph and weak that are strictly the result of ourperturbation. The latter are of order ε. Thus the only impact that the perturbation makes onthe argument presented above is to introduce an adjustment of order ε. This completes theproof.



Theorem 5. Suppose that the graph � is non-Hamiltonian, that is, Dεh = φ. Let �(s) =

38s2 .

a. For any policy f , we have that E1[(τ1 − s)2] ≥ 34 − O(ε).

b. We also have thatminf ∈Dε

[wε

1(f )] − wε

1(fh) ≥ �(s) − O(ε).

Proof. Let f be an arbitrary doubly stochastic policy and E1[(τ1−s)2] be the correspondingvariance of the first return time to node 1 (starting from 1). Clearly,

E1[(τ1 − s)2] =∑k≥1

(k − s)2Prf (τ1 = k)

≥∑

k≥1,k �=s

Prf (τ1 = k) = Prf (τ1 �= s).

Hence by Lemma 4(b) we have obtained part (a), namely

E1[(τ1 − s)2] ≥ 3

4− O(ε). (14)

It now follows from (12) that

wε1 := wε

1(f ) ≥ s + 1

2s+ 1

2s2

(3

4− O(ε)

)= s + 1

2s+ �(s) − O(ε). (15)

Part (b) now follows immediately from (15) and (13).

5. SYMMETRIC DOUBLY STOCHASTIC POLICIES

In view of the results of the preceding sections, it is clear that the Hamiltonian cycle problemis equivalent to the following mathematical programming problem

min[wε

1(f )] = min

{[I − Pε(f ) + P∗

ε (f )]−1

1,1

}subject to:

(i)∑

a

f (i, a) = 1, i = 1, . . . , s, f (i, a) ≥ 0 ∀i, a,

(ii) 1T Pε(f ) = 1T .

Of course, constraints (i) in the above ensure that f is a proper stationary policy, whereasconstraints (ii) ensure that Pε(f ) is a doubly stochastic probability transition matrix. Fur-thermore, (12) ensures that this problem is equivalent to minimizing the variance of τ1, thefirst return time to node 1. That is

arg minf ∈Dε

[wε

1(f )] = arg min

f ∈Dε[Varf (τ1)] (16)

where for every doubly stochastic stationary policy the variance of the first return time to 1is denoted by Varf (τ1) := E1[(τ1 − s)2].Random Structures and Algorithms DOI 10.1002/rsa


Since, in some contexts, variance minimization is a convex optimization problem, it isimportant to stress that this is not the case here. However, there is an interesting convexsubset of symmetric doubly stochastic policies such that our objective function over thatsubset is strictly convex. At this stage, it is not clear whether the convex program consistingof minimization of wε

1(f ) over that set is related to the original Hamiltonicity problem.Nonetheless, for the sake of completeness, the remainder of this section is devoted toestablishing these restricted convexity properties and to showing that they do not extend toall of Dε.

Let P0, P1 be doubly stochastic matrices corresponding to a pair of policies f0, f1 andlet P0, P1 denote the corresponding matrices with the first row and column removed. Also,P(λ) = λP1 + (1 − λ)P0 for λ ∈ (0, 1) and, as in Section 3, let

J(λ) = 1

s2

s∑i=1

Vλ(i), (17)

that is, J(λ) is the short-hand notation for the corresponding objective function wε1(fλ) =

wε1(λf1 + (1 − λ)f0). We now derive a useful expression for the m-th derivative of J(λ).

Lemma 5. For m ≥ 1,

J (m)(λ) = dmJ(λ)

dλm=

(m!s2

)1T (I − P(λ))−1((P1 − P0)(I − P(λ))−1)m1. (18)

Proof. In Section 3 equation (9), the claim has already been proved for m = 1. We claim

that V (m)

λ

def= dmVλ/dλm is uniquely specified by the linear system

V (m)

λ = m(P1 − P0)V(m−1)

λ + P(λ)V (m)

λ .

This follows by simple induction. The claim follows from this and (17).

Define,

qdef= (I − P(λ))−1(P1 − P0)(I − P(λ))−11,

Adef= I − 1

2(P(λ) + P(λ)T ).

Lemma 6. A is positive definite.

Proof. Since P(λ) is irreducible doubly stochastic, 12 (P(λ)+ P(λ)T ) is strictly substochas-

tic with spectral radius <1, where the latter observation follows from Gershgorin’s “disctheorem” (e.g., see [16], p. 344). Thus A is symmetric with strictly positive eigenvalues.The claim follows.

Define Dεsym

def= {Q ∈ Dε : Q = QT }. Note that for every P ∈ Dε, PT ∈ Dε and hence12 (P + PT ) ∈ Dε

sym. Clearly Dεsym is a convex compact polytope. Let Dε

sym,e denote the set ofits extreme points.

Lemma 7. P ∈ Dεsym,e if and only if P = 1

2 (P + PT ) for some P ∈ Dεe .



Proof. If P ∈ Dεe as well, the claim trivially holds. Suppose not. If the claim is false,

P = ∑mi=1 aiPi for some m ≥ 2, ai > 0 ∀i with

∑i ai = 1, and distinct Pi ∈ De for

1 ≤ i ≤ m. Since P = PT , we also have P = ∑mi=1 aiPT

i and thus P = ∑mi=1 aiPi for

Pidef= 1

2 (Pi + PTi )∀i. This contradicts the fact P ∈ Dε

sym,e, proving the claim.

From now on, consider P0, P1 ∈ Dεsym. In particular, it follows from (18) and properties

of symmetric matrices that

J (2)(λ) = 2

s21T (I − P(λ))−1(P1 − P0)(I − P(λ))−1(P1 − P0)(I − P(λ))−11

= 2

s2qT Aq, (19)

Lemma 8. If J (2)(λ) = 0 for some λ ∈ (0, 1), then J (m)(λ) = 0 ∀m ≥ 1.

Proof. By Lemma 6, J (2)(λ) = 0 would imply q = 0. The claim is now immediate from(18).

Lemma 9. J (2)(λ) > 0 for any λ ∈ (0, 1).

Proof. By Lemma 6 and (19), J (2)(λ) ≥ 0. If it were zero, so would be all the higherderivatives by Lemma 8 and the rational function J(λ) would have to be a constant, whichit cannot be.

Corollary 1. The cost function wε1 restricted to Dε

sym is strictly convex and hence attains

its minimum at a unique point P in Dεsym,e.

Example. We consider Petersen graph with the adjacency matrix

0 1 0 0 1 1 0 0 0 01 0 1 0 0 0 1 0 0 00 1 0 1 0 0 0 1 0 00 0 1 0 1 0 0 0 1 01 0 0 1 0 0 0 0 0 11 0 0 0 0 0 0 1 1 00 1 0 0 0 0 0 0 1 10 0 1 0 0 1 0 0 0 10 0 0 1 0 1 1 0 0 00 0 0 0 1 0 1 1 0 0

. (20)

Next, we consider a deterministic policy f0 consisting of two subcycles of length 5. Inparticular, f0 is equivalent to the map

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10} → {2, 3, 4, 5, 1, 8, 9, 10, 6, 7}.Furthermore, let f1 be equivalent to the map

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10} → {5, 1, 2, 3, 4, 9, 10, 6, 7, 8},Random Structures and Algorithms DOI 10.1002/rsa


Fig. 1. Function J(λ).

Fig. 2. Function J (2)(λ).

namely, f1 consists of the “reverse sub-cycles” of f0.Clearly, for every ε, Pε(f1) = PT

ε (f1) and we can evaluate J(λ) = wε1(fλ) =

wε1(λf1 + (1 − λ)f0) and it’s second derivative on the interval λ ∈ [0, 1] in accor-

dance with (17) and (19), respectively. The plots of these functions are displayed inFigs. 1 and 2. It is clear from the figures that J(λ) = wε

1(fλ) is neither concave norconvex.

ACKNOWLEDGMENTS

The authors are indebted to Nelly Litvak, Ali Eshragh, and Giang Nguyen for their readingof the manuscript, comments, and corrections. We are also grateful to an anonymous refereefor very meticulous corrections.



REFERENCES

[1] M. Andramonov, J. A. Filar, A. Rubinov, and P. Pardalos, Hamiltonian cycle problem viaMarkov chains and min-type approaches, In P. Pardalos, editor, Approximation Complexity inNumerical Optimization, Kluwer, Dordrecht, 2000, pp. 31–47.

[2] K. E. Avrachenkov, M. Haviv, and P. G. Howlett, Inversion of analytic matrix functions thatare singular at the origin, SIAM J Matrix Anal Appl 22 (2001), 1175–1189.

[3] R. B. Bapat and T. E. S. Raghavan, Nonnegative matrices and applications, CambridgeUniversity Press, Cambridge, 1997.

[4] V. S. Borkar, Topics in controlled markov chains, Pitman Lecture Notes in Mathematics 240,Longman Scientific and Technical, Harlow, Essex, UK, 1991.

[5] V. S. Borkar, Probability theory: An advanced course, Springer-Verlag, New York, 1995.

[6] V. S. Borkar, V. Ejov, and J. A. Filar, Directed graphs, hamiltonicity and doubly stochasticmatrices, Random Struct Algorithm 25 (2004), 376–395.

[7] L. Breiman, Probability, SIAM, Philadelphia, 1992.

[8] M. Chen and J. A. Filar, Hamiltonian cycles, quadratic programming and ranking of extremepoints, In C. Floudas and P. Pardalos, editors, Recent Advances in global optimization,Princeton University Press, New Jersey, 1992, pp. 32–49.

[9] V. Ejov, J. Filar, and M. Nguyen, Hamiltonian cycles and singularly perturbed Markov chains,Math Oper Res 29 (2004), 114–131.

[10] V. Ejov, J. Filar, and J. Gondzio, An interior point heuristic algorithm for the HCP, J GlobalOptim 29 (2004), 315–334.

[11] E. Feinberg, Constrained discounted Markov decision process and Hamiltonian cycles, MathOper Res 25 (2000), 130–140.

[12] J. A. Filar, Hamiltonian cycles and Markov chains, Found Trends Stochastic Sys 1 (2006),77–162.

[13] J. A. Filar and D. Krass, Hamiltonian cycles and Markov chains, Mathem Oper Res 19 (1994),223–237.

[14] J. A. Filar and K. Liu, Hamiltonian cycle problem and singularly perturbed Markov decisionprocess, Statistics, Probability and Game Theory, IMS Lecture Notes—Monograph Ser 30(1996), 45–63.

[15] J. A. Filar and J.-B. Lasserre, A non-standard branch and bound method for the Hamiltoniancycle problem, ANZIAM J 42 (2000), C586–C607.

[16] R. A. Horn and C. R. Johnson, Matrix analysis, Cambridge University Press, Cambridge, 1985.

[17] R. Karp, Probabilistic analysis of partitioning algorithms for the traveling-salesman problemin the plane, Math Oper Res 2 (1977), 209–224.

[18] C. E. Langenhop, The Laurent expansion for a nearly singular matrix, Linear Algebra Appl 4(1971), 329–340.

[19] R. Robinson and N. Wormald, Almost all regular graphs are hamiltonian, Random StructAlgorith 5 (1994), 363–374.

[20] R. J. Wilson, Introduction to graph theory, Longman, New York, 1996.


Documents

On the Hamiltonicity Gap and doubly stochastic matrices