

Research Article

The Statistical Mechanics of Random Set Packing and a Generalization of the Karp-Sipser Algorithm

C. Lucibello¹ and F. Ricci-Tersenghi²

¹ Dipartimento di Fisica, Università “La Sapienza”, Piazzale Aldo Moro 2, 00185 Rome, Italy
² Dipartimento di Fisica, INFN-Sezione di Roma1, CNR-IPCF UOS Roma Kerberos, Università “La Sapienza”, Piazzale Aldo Moro 2, 00185 Rome, Italy

Correspondence should be addressed to F. Ricci-Tersenghi; federico.ricci@uniroma1.it

Received 19 November 2013; Accepted 8 January 2014; Published 10 March 2014

Academic Editor: Hyunggyu Park

Copyright © 2014 C. Lucibello and F. Ricci-Tersenghi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We analyse the asymptotic behaviour of random instances of the maximum set packing (MSP) optimization problem, also known as maximum matching or maximum strong independent set on hypergraphs. We give an analytic prediction of the MSP size using the 1RSB cavity method from the statistical mechanics of disordered systems. We also propose a heuristic algorithm, a generalization of the celebrated Karp-Sipser one, which allows us to rigorously prove that the replica symmetric cavity method prediction is exact for certain problem ensembles and breaks down when a core survives the leaf removal process. The $e$-phenomena threshold discovered by Karp and Sipser, marking the onset of core emergence and of replica symmetry breaking, is elegantly generalized to $C_s = e/(d-1)$ for one of the ensembles considered, where $d$ is the size of the sets.

1. Introduction

The maximum set packing is a much-studied problem in combinatorial optimization, one of Karp's twenty-one NP-complete problems. Given a set $F = \{1, \dots, M\}$ and a collection of its subsets $\mathcal{S} = \{S_i \mid S_i \subseteq F,\ i \in V\}$ labeled by $V = \{1, \dots, N\}$, a set packing (SP) is a collection of subsets $S_i$ that are pairwise disjoint. The size of an SP $\mathcal{S}' \subseteq \mathcal{S}$ is $|\mathcal{S}'|$. A maximum set packing (MSP) is an SP of maximum size. The integer programming formulation of the MSP problem reads

$$
\text{maximize} \quad \sum_{i \in V} n_i, \tag{1}
$$

$$
\text{subject to} \quad \sum_{i \,:\, r \in S_i} n_i \le 1 \quad \forall r \in F, \tag{2}
$$

$$
n_i \in \{0, 1\} \quad \forall i \in V. \tag{3}
$$
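To make the formulation (1)–(3) concrete, the following minimal Python sketch solves a toy instance by exhaustive search. It is only an illustration under stated assumptions: the function name `max_set_packing` and the five-subset instance are hypothetical and not taken from the paper, and the exponential-time search is usable only for very small $N$.

```python
from itertools import combinations

def max_set_packing(subsets):
    """Return the indices of one maximum set packing, found by exhaustive search."""
    n = len(subsets)
    for size in range(n, 0, -1):                  # try the largest packings first
        for chosen in combinations(range(n), size):
            union = set().union(*(subsets[i] for i in chosen))
            # pairwise disjoint  <=>  no element is counted twice in the union,
            # which is constraint (2) for every element r
            if len(union) == sum(len(subsets[i]) for i in chosen):
                return list(chosen)
    return []

# Hypothetical toy instance: F = {1, ..., 6} and N = 5 subsets.
subsets = [{1, 2}, {2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
packing = max_set_packing(subsets)
print(packing, len(packing) / len(subsets))       # chosen indices and relative size
```

The second printed number is the relative size, the quantity that the statistical physics treatment aims to predict for large random instances.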

The MSP problem, also known in the literature as the matching problem on hypergraphs or the strong independent set problem on hypergraphs, is an NP-Hard problem. This general formulation, however, can be specialized to obtain two other famous optimization problems: the restriction of the MSP problem to sets $S_i$ of size 2 corresponds to the problem of maximum matching on ordinary graphs and can be solved in polynomial time [1]; the restriction where each element of $F$ appears exactly 2 times in $\mathcal{S}$ is the maximum independent set problem and belongs to the NP-Hard class.

The formulation (1)–(3) of the MSP problem, therefore, encodes an ample class of packing problems and, as all packing problems, is related by duality to a covering problem, the minimum set covering problem. Another common specialization of the general MSP problem, known as $k$-set packing, is that in which all sets $S_i$ have size at most $k$. This is one of the most studied specializations in the computer science community, the efforts concentrating on minimal degree conditions to obtain a perfect matching [2], linear relaxations [3, 4], and approximability conditions [5–7]. Motivated by this interest, we choose a $k$-set packing problem ensemble as the principal application of the general analytical framework developed in the following sections. The asymptotic behaviour of random sparse instances of the MSP problem has not been investigated by mathematicians and computer scientists; only in the matching [8] and independent set [9] restrictions has some work been done. Extending some theorems of [8] (by which a part of this work is greatly inspired) to a greater class of problem ensembles is one of the main aims of the present work.

Hindawi Publishing Corporation, International Journal of Statistical Mechanics, Volume 2014, Article ID 136829, 13 pages, http://dx.doi.org/10.1155/2014/136829

On the other hand, the statistical physics literature also lacks an accurate study of the random MSP problem. One of its specializations, though, the matching problem, has been covered since the beginning of the physicists' interest in optimization problems, with the work of Parisi and Mézard on the weighted and fully connected version of the problem [10, 11]. More recently, the matching problem on sparse random graphs has also been accurately studied [12, 13] using the cavity method. The independent set problem on random graphs [14] and the dual problem to set packing, the set covering problem [15], have also received some attention from the disordered systems community. The SP problem was investigated with the cavity method formalism, in a disguised form, as a glass model on a generalized Bethe lattice in [16, 17]. This corresponds, as we will see in the next section, to a factor graph ensemble with fixed factor and variable degrees; thus we will not cover this case in Section 7.

The paper is organized as follows.

(i) In Section 2 we map the MSP problem (1)–(3) onto a statistical physics model defined on a factor graph and relate the MSP size to the density $\rho$ at infinite chemical potential.

(ii) We introduce the replica symmetric (RS) cavity method in Section 3 and give an estimate of the average MSP size on sparse factor graph ensembles in the thermodynamic limit.

(iii) In Section 4 we establish a criterion for the validity of the RS ansatz and introduce the 1RSB formalism.

(iv) In Section 5 we propose a generalization of the Karp-Sipser heuristic algorithm [8] to the MSP problem and prove the validity of the RS ansatz for certain ensembles of problems. Moreover, we find a relationship between a core emergence phenomenon and the breaking of replica symmetry.

(v) Section 6 describes the numerical simulations performed.

(vi) In Section 7 we apply the analytical tools developed to some problem ensembles. We compare the numerical results obtained from an exact algorithm with the analytical predictions, focusing to a greater extent on one ensemble modelling the $k$-set packing.

2. Statistical Physics Description

In order to turn the MSP combinatorial optimization problem into a useful statistical physics model, let us recast (1)–(3) into a graphical model using the factor graph formalism [18, 19]. We define our variable node set to be $V$, and to each $i \in V$ we associate a variable $n_i$ taking values in $\{0, 1\}$ as in (3). $F$ will be our factor node set, as its elements act as hard constraints on the variables $n_i$ through (2). The edge set $E$ is then naturally defined as $E = \{(i, r) \mid i \in V,\ r \in S_i \subseteq F\}$. We call $G = (V, F, E)$ the factor graph thus composed and, denoting by $\partial r$ the set of neighbours of a node $r$ in $G$, we can rewrite (2) as

$$
\sum_{i \in \partial r} n_i \le 1 \quad \forall r \in F. \tag{4}
$$

An SP is a configuration $\mathbf{n} = \{n_i\}$ satisfying (4), and its relative size is $\rho(\mathbf{n}) = (1/N) \sum_{i \in V} n_i$, that is, simply the fraction of occupied sites.

It is now easy to define an appropriate Gibbs measure for the MSP problem on $G$ through the grand canonical partition function

$$
\Xi_G(\mu) = \sum_{\{n_i\}} \prod_{i \in V} e^{\mu n_i} \prod_{r \in F} \mathbb{I}\Big( \sum_{i \in \partial r} n_i \le 1 \Big). \tag{5}
$$

Only SPs contribute to the partition function and, in the close packing limit, as we will call the limit $\mu \uparrow +\infty$, the measure is dominated by MSPs. Equation (5) is also a model for a particle gas with hard core repulsion and chemical potential $\mu$ located on a hypergraph and, as such, has been studied mainly on lattice structures and, more generally, on ordinary graphs. Model (5) has been studied on a generalized Bethe lattice (i.e., the ensemble $\mathbb{G}_{RR}(d, c)$ defined in Section 7, a $d$-uniform $c$-regular factor graph) in [16, 17] as a prototype of a system with finite connectivity showing a glassy behaviour. This has been the only approach, although disguised as a hard spheres model, from the statistical physics community to a general MSP problem.

The grand canonical potential is defined as

$$
\omega_G(\mu) = -\frac{1}{\mu N} \log \Xi_G(\mu), \tag{6}
$$

and the particle density as

$$
\rho_G(\mu) = \frac{1}{N} \Big\langle \sum_{i \in V} n_i \Big\rangle_{G, \mu} = -\omega_G(\mu) - \mu\, \partial_\mu \omega_G(\mu). \tag{7}
$$

Grand potential and density are related to entropy by the thermodynamic relation

$$
s_G(\mu) = -\mu \big( \omega_G(\mu) + \rho_G(\mu) \big) = \mu^2\, \partial_\mu \omega_G(\mu). \tag{8}
$$

In the close packing limit (i.e., $\mu \uparrow +\infty$) we recover the MSP problem, since in this limit the Gibbs measure is uniformly concentrated on MSPs and $\rho_G$ gives the MSP relative size. Since entropy remains finite in this limit, from (8) we obtain the MSP relative size

$$
\rho_G \equiv \lim_{\mu \to +\infty} \rho_G(\mu) = \lim_{\mu \to +\infty} -\omega_G(\mu). \tag{9}
$$
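The close packing limit (9) can be checked numerically on a very small instance by summing (5) explicitly. The sketch below is an illustration only: the function name `grand_potential_and_density` and the toy instance are hypothetical, and the enumeration over all $2^N$ configurations is feasible only for tiny $N$.

```python
from itertools import product
from math import exp, log

def grand_potential_and_density(subsets, mu):
    """Enumerate all configurations; return (omega_G(mu), rho_G(mu)) of (6)-(7)."""
    n = len(subsets)
    xi = 0.0          # partition function Xi_G(mu), equation (5)
    weighted = 0.0    # sum over packings of (occupied sets) * Gibbs weight
    for occ in product((0, 1), repeat=n):
        chosen = [subsets[i] for i in range(n) if occ[i]]
        union = set().union(*chosen)
        if len(union) != sum(len(s) for s in chosen):
            continue  # an element is covered twice: the indicator in (5) vanishes
        weight = exp(mu * sum(occ))
        xi += weight
        weighted += sum(occ) * weight
    omega = -log(xi) / (mu * n)
    rho = weighted / (xi * n)
    return omega, rho

subsets = [{1, 2}, {2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]   # hypothetical toy instance
for mu in (1.0, 5.0, 20.0, 50.0):
    omega, rho = grand_potential_and_density(subsets, mu)
    print(mu, -omega, rho)   # both columns approach the MSP relative size 2/5
```

On this instance the density converges exponentially fast, while $-\omega_G(\mu)$ approaches the same value with a residual of order $1/\mu$, as expected from (8) when the entropy stays finite.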

In this paper we focus on random instances of the MSP problem. As usual in statistical physics, we will assume the number of variables $N$ (the number of subsets $S_i$) and the number of constraints $M$ to diverge, keeping the ratio $N/M$ finite. We will refer to this limit as the thermodynamic limit. Instances of the MSP problem will be encoded in factor graph ensembles, which we assume to be locally tree-like in the thermodynamic limit.

The MSP relative size $\rho_G$ is a self-averaging quantity in the thermodynamic limit, and we want to compute its asymptotic value

$$
\rho = \lim_{N \to +\infty} \mathbb{E}_G[\rho_G], \tag{10}
$$

where we denoted with $\mathbb{E}_G[\cdot]$ the expectation over the factor graph ensemble. In the last equation the $N$ dependence is encoded in the graph ensemble considered. Computing (10) is not an easy task and some approximations have to be made. We will employ the cavity method from the statistical physics of disordered systems [19, 20], using both the replica symmetric (RS) and the one-step replica symmetry breaking (1RSB) ansatz. We will prove in Section 5 that the RS ansatz is exact in a certain region of the phase space, while in Section 7 we will give numerical evidence that the 1RSB approximation gives very good results outside the RS region.

3. Replica Symmetry

3.1. Bethe Approximation on a Single Instance. The RS cavity method has been known for many decades outside the statistical physics community as the Belief Propagation (BP) algorithm, and only in recent years have the two approaches been bridged [18, 19]. We start with a variational approximation to the grand potential (6) of an instance of the problem, the Bethe free energy approximation

$$
\omega^{RS}_G[\boldsymbol{\nu}] = \frac{1}{N} \Big[ \sum_{r \in F} \omega_r[\boldsymbol{\nu}] + \sum_{i \in V} (1 - |\partial i|)\, \omega_i[\boldsymbol{\nu}] \Big], \tag{11}
$$

with the factor and variable contributions given by

$$
\begin{aligned}
\omega_r[\boldsymbol{\nu}] &= -\frac{1}{\mu} \log \Big[ \sum_{\{n_i\}_{i \in \partial r}} \mathbb{I}\Big( \sum_{i \in \partial r} n_i \le 1 \Big) \prod_{j \in \partial r} e^{\mu n_j} \prod_{s \in \partial j \setminus r} \nu_{s \to j}(n_j) \Big], \\
\omega_i[\boldsymbol{\nu}] &= -\frac{1}{\mu} \log \Big[ \sum_{n_i} e^{\mu n_i} \prod_{r \in \partial i} \nu_{r \to i}(n_i) \Big].
\end{aligned} \tag{12}
$$

The grand canonical potential is expressed as a function of the factor-node-to-variable-node messages $\boldsymbol{\nu} = \{\nu_{r \to i}\}$. Minimization of $\omega^{RS}_G[\boldsymbol{\nu}]$ over the messages, constrained to be normalized to one, yields the fixed point BP equations for the set packing:

$$
\begin{aligned}
\nu_{r \to i}(1) &= \frac{1}{Z_{r \to i}} \prod_{j \in \partial r \setminus i}\ \prod_{s \in \partial j \setminus r} \nu_{s \to j}(0), \\
\nu_{r \to i}(0) &= \frac{1}{Z_{r \to i}} \Big[ \prod_{j \in \partial r \setminus i}\ \prod_{s \in \partial j \setminus r} \nu_{s \to j}(0) + e^{\mu} \sum_{j \in \partial r \setminus i}\ \prod_{s \in \partial j \setminus r} \nu_{s \to j}(1) \times \prod_{j' \in \partial r \setminus \{i, j\}}\ \prod_{s' \in \partial j' \setminus r} \nu_{s' \to j'}(0) \Big].
\end{aligned} \tag{13}
$$

The coefficients $Z_{r \to i}$ are normalization factors. Equations (13) can be simplified by introducing the fields $t_{r \to i}$, defined as

$$
\frac{\nu_{r \to i}(1)}{\nu_{r \to i}(0)} = e^{-\mu t_{r \to i}}, \tag{14}
$$

yielding

$$
t_{r \to i} = \frac{1}{\mu} \log \Big[ 1 + \sum_{j \in \partial r \setminus i} e^{\mu \left( 1 - \sum_{s \in \partial j \setminus r} t_{s \to j} \right)} \Big]. \tag{15}
$$
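For completeness, the step from (13) and (14) to (15) can be spelled out as follows (a short sketch that uses only the definitions above). Dividing the second line of (13) by the first, the normalization $Z_{r \to i}$ cancels and each product over $j' \neq j$ cancels against the corresponding factors of the common term, so that

$$
e^{\mu t_{r \to i}} = \frac{\nu_{r \to i}(0)}{\nu_{r \to i}(1)} = 1 + e^{\mu} \sum_{j \in \partial r \setminus i}\ \prod_{s \in \partial j \setminus r} \frac{\nu_{s \to j}(1)}{\nu_{s \to j}(0)} = 1 + \sum_{j \in \partial r \setminus i} e^{\mu \left( 1 - \sum_{s \in \partial j \setminus r} t_{s \to j} \right)},
$$

where the last equality applies (14) to every incoming message. Taking the logarithm and dividing by $\mu$ gives (15).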

Since we are interested in the close packing limit, to solve the problem we will straightforwardly apply the zero temperature cavity method [21]. The related BP equations, which can be found as the $\mu \uparrow \infty$ limit of (15), read

$$
t_{r \to i} = \max \Big[ \{0\} \cup \Big\{ 1 - \sum_{s \in \partial j \setminus r} t_{s \to j} \Big\}_{j \in \partial r \setminus i} \Big]. \tag{16}
$$

We note that the messages $t_{r \to i}$ are bounded to take values in the interval $[0, 1]$ and that, if we set the initial value of each $t_{r \to i}$ in the discrete set $\{0, 1\}$, at each BP iteration all messages will take value either 0 or 1. These values can be directly interpreted as the occupational loss occurring in the subtree $r \to i$ if the subtree is connected to the occupied node $i$. This loss cannot be negative (thus a gain), since we put an additional constraint on the subtree, demanding every neighbour $j$ of $r$ to be empty, and it cannot be greater than 1 either; in fact, $t_{r \to i} = 1$ corresponds to the worst case scenario, where an otherwise occupied node $j \in \partial r \setminus i$ has to be emptied.

The Bethe free energy for model (5) on the factor graph $G$ can be expressed as a function of the fixed point messages $\{t_{r \to i}\}$ as

$$
\omega^{RS}_G(\mu) = -\frac{1}{\mu N} \Big[ \sum_{r \in F} \log \Big( 1 + \sum_{j \in \partial r} e^{\mu \left( 1 - \sum_{s \in \partial j \setminus r} t_{s \to j} \right)} \Big) + \sum_{i \in V} (1 - |\partial i|) \log \Big( 1 + e^{\mu \left( 1 - \sum_{r \in \partial i} t_{r \to i} \right)} \Big) \Big]. \tag{17}
$$


We finally arrive at the Bethe estimate of the MSP relative size by taking the close packing limit of (17) and using $\omega^{RS}_G = -\rho^{RS}_G$, which gives

$$
\rho^{RS}_G = \frac{1}{N} \Big[ \sum_{r \in F} \max \Big[ \{0\} \cup \Big\{ 1 - \sum_{s \in \partial j \setminus r} t_{s \to j} \Big\}_{j \in \partial r} \Big] + \sum_{i \in V} (1 - |\partial i|) \max \Big\{ 0,\ 1 - \sum_{r \in \partial i} t_{r \to i} \Big\} \Big]. \tag{18}
$$

Let us examine the various contributions to (18), since we want to convince ourselves that it exactly counts the MSP size, at least on tree factor graphs. The term $(1 - |\partial i|) \max\{0, 1 - \sum_{r \in \partial i} t_{r \to i}\}$ contributes $1 - |\partial i|$ to the sum only if all the incoming $t$ messages are zero. In this case $n_i$ is frozen to 1, that is, the variable $i$ takes part in all the MSPs of $G$. Obviously, all the neighbours of a variable frozen to 1 have to be frozen to 0. To each of its $|\partial i|$ neighbours $r$, the variable $i$ frozen to 1 sends a message $1 - \sum_{s \in \partial i \setminus r} t_{s \to i} = 1$, so that we have $|\partial i|$ contributions $\max[\{0\} \cup \{1 - \sum_{s \in \partial j \setminus r} t_{s \to j}\}_{j \in \partial r}] = 1$ in the first sum of (18), and the total contribution from $i$ correctly sums up to 1. If for a certain $i$ we have a total field $\tau_i \equiv 1 - \sum_{r \in \partial i} t_{r \to i} < 0$ (two or more incoming messages are equal to one), the variable is frozen to 0, that is, it does not take part in any MSP. It correctly does not contribute to $\omega^{RS}_G$, since it sends a message $1 - \sum_{s \in \partial i \setminus r} t_{s \to i} \le 0$ to each neighbour $r$; thus it is not counted in $\max[\{0\} \cup \{1 - \sum_{s \in \partial j \setminus r} t_{s \to j}\}_{j \in \partial r}]$.

The third case is the most interesting. It concerns variables $i$ which take part in a fraction of the MSPs. We will call them unfrozen variables. The total field on an unfrozen variable $i$ is $\tau_i = 0$ (thus we have no contribution from the second sum in (18)), and all incoming messages are 0 except for a single $t_{r \to i} = 1$. To this sole function node $r$ the node $i$ sends a message 1, so that the contribution of $r$ to the first sum is 1. Actually, the BP equations impose that $r$ has to have at least another unfrozen neighbour besides $i$. In other terms, the function node $r$ says that "whichever MSP we consider, one of my neighbours has to be occupied." The corresponding term $\max[\{0\} \cup \{1 - \sum_{s \in \partial j \setminus r} t_{s \to j}\}_{j \in \partial r}] = 1$ in (18) accounts for that.

The presence of unfrozen variables is the reason why we cannot express the density $\rho_G$ through the formula

$$
\rho_G = \frac{1}{N} \Big\langle \sum_{i \in V} n_i \Big\rangle. \tag{19}
$$

In fact, using the infinite chemical potential formalism we cannot compute

$$
\langle n_i \rangle = \lim_{\mu \to +\infty} \frac{e^{\mu \left( 1 - \sum_{r \in \partial i} t_{r \to i} \right)}}{1 + e^{\mu \left( 1 - \sum_{r \in \partial i} t_{r \to i} \right)}} \tag{20}
$$

when $\lim_{\mu \to +\infty} \big( 1 - \sum_{r \in \partial i} t_{r \to i} \big) = 0$, as we would have to use the $O(1/\mu)$ corrections to the fields $t_{r \to i}$. We bypass the problem by using the grand potential $\omega^{RS}_G$ to obtain $\rho^{RS}_G$, thereby also addressing a problem reported in [22] of extending an analysis suited for weighted matchings and independent sets to the unweighted case.
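The zero temperature equations (16) and the counting formula (18) are simple enough to run by hand on a small tree. The following Python sketch does so; it is an illustration under stated assumptions, not the authors' implementation: the function name `bp_packing_size`, the parallel update schedule, the all-zero initial messages, and the fixed number of sweeps are choices made here, and the result is only expected to be exact when the factor graph is a tree.

```python
def bp_packing_size(subsets, sweeps=50):
    """Iterate (16) from all-zero messages, then return N * rho of (18)."""
    n = len(subsets)
    elements = sorted(set().union(*subsets))
    # var[r]: indices of the subsets containing element r (neighbours of factor r)
    var = {r: [i for i in range(n) if r in subsets[i]] for r in elements}
    t = {(r, i): 0 for r in elements for i in var[r]}       # messages t_{r -> i}

    def cavity(j, r, msgs):
        # 1 - sum of the messages reaching subset j from all elements except r
        return 1 - sum(msgs[(s, j)] for s in subsets[j] if s != r)

    for _ in range(sweeps):                                  # parallel updates of (16)
        t = {(r, i): max([0] + [cavity(j, r, t) for j in var[r] if j != i])
             for (r, i) in t}

    # first sum of (18): one term per element (factor node)
    factor_term = sum(max([0] + [cavity(j, r, t) for j in var[r]])
                      for r in elements)
    # second sum of (18): one term per subset (variable node), |di| = len(subsets[i])
    variable_term = sum((1 - len(subsets[i]))
                        * max(0, 1 - sum(t[(r, i)] for r in subsets[i]))
                        for i in range(n))
    return factor_term + variable_term

# Hypothetical chain-like instances, whose factor graphs are trees.
print(bp_packing_size([{1, 2}, {2, 3}, {3, 4}]))   # two frozen sets at the ends
print(bp_packing_size([{1, 2}, {2, 3}]))           # two unfrozen sets sharing element 2
```

The second instance is the unfrozen case discussed above: each of the two sets belongs to half of the MSPs, both total fields vanish, and the whole count comes from the shared element in the first sum of (18).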

3.2. Ensemble Averages. To proceed further in the analysis, and since one is often concerned with the average properties of a class of related factor graphs, let us consider the case where the factor graph $G$ is sampled from a locally tree-like factor graph ensemble $\mathbb{G}(N)$. We will employ the following notation for the graph ensemble expectations: $\mathbb{E}_G[\cdot]$ for graph averages; $\mathbb{E}_{C_0}[\cdot]$ ($\mathbb{E}_{D_0}[\cdot]$) for expectations over the factor (variable) degree distribution, which we will sometimes call the root degree distribution; and $\mathbb{E}_C[\cdot]$ ($\mathbb{E}_D[\cdot]$) for expectations over the excess degree distribution of factor (variable) nodes conditioned to have at least one adjacent edge, which we will sometimes call the residual degree distribution. The quantities $c$ and $d$ (which the normalization of the relations below fixes to the mean root degrees, $c = \mathbb{E}_{C_0}[C_0]$ and $d = \mathbb{E}_{D_0}[D_0]$) and the random variables $C$, $C_0$, $D$, and $D_0$ are related by

$$
P[C = k] = \frac{(k + 1)\, P[C_0 = k + 1]}{c}, \qquad P[D = k] = \frac{(k + 1)\, P[D_0 = k + 1]}{d}. \tag{21}
$$

In Section 7 we discuss some specific factor graph ensembles where $C$ and $D$ are fixed to a deterministic value or Poisson distributed.
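The relation (21) between root and residual degrees is easy to evaluate numerically. The sketch below is a small illustration; the helper name `residual_distribution` and the truncation of the Poisson law at $k = 60$ are assumptions made here. It recovers the two standard facts used for the ensembles mentioned above: a deterministic root degree $k$ gives residual degree $k - 1$, and a Poisson root degree gives a Poisson residual degree with the same mean.

```python
from math import exp, factorial

def residual_distribution(root):
    """Map a root degree law {k: P[C0 = k]} to the residual law of (21)."""
    mean = sum(k * p for k, p in root.items())            # c (or d) in (21)
    return {k - 1: k * p / mean for k, p in root.items() if k > 0}

# Deterministic root degree 3: the residual degree is 2 with probability one.
regular = residual_distribution({3: 1.0})

# Poisson root degree of mean 1.5, truncated at k = 60 (truncation is an assumption).
poisson = {k: exp(-1.5) * 1.5 ** k / factorial(k) for k in range(61)}
residual = residual_distribution(poisson)
print(regular, residual[0], poisson[0])
```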

With these definitions, the distributional equation corresponding to the Belief Propagation formula (16) reads

$$
T' \overset{d}{=} \max \Big[ \{0\} \cup \Big\{ 1 - \sum_{s=1}^{D_j} T_{sj} \Big\}_{j \in \{1, \dots, C\}} \Big], \tag{22}
$$

where $\{D_j\}$ are i.i.d. random residual variable degrees, $C$ is a random residual factor degree, and $\{T_{sj}\}$ are i.i.d. random incoming messages. Since on a given graph each message $t_{r \to i}$ takes values only in $\{0, 1\}$, we can take the distribution of messages $t$ to be of the form

$$
P(t) = p\, \delta(t) + (1 - p)\, \delta(t - 1). \tag{23}
$$

For this to be a fixed point of (22), the parameter $p$ has to satisfy the self-consistent equation

$$
p_* = \mathbb{E}_C \big( 1 - \mathbb{E}_D\, p_*^{D} \big)^{C}. \tag{24}
$$
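The passage from (22) and (23) to (24) takes one line, sketched here for completeness. By (22), $T' = 0$ exactly when, for every $j \in \{1, \dots, C\}$, at least one of the $D_j$ incoming messages equals 1. Under (23) the messages are independent and each equals 0 with probability $p$, so that

$$
P\big[ T' = 0 \,\big|\, C, \{D_j\} \big] = \prod_{j=1}^{C} \big( 1 - p^{D_j} \big), \qquad P[T' = 0] = \mathbb{E}_C \big( 1 - \mathbb{E}_D\, p^{D} \big)^{C},
$$

where the second expression follows by averaging over the i.i.d. degrees $D_j$ and then over $C$. Requiring $P[T' = 0] = p = p_*$ gives (24).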

Using (18) and (24) we obtain the replica symmetric approximation to the asymptotic MSP relative size:

$$
\rho^{RS} = \frac{d}{c} \Big( 1 - \mathbb{E}_{C_0} \big( 1 - \mathbb{E}_D\, p_*^{D} \big)^{C_0} \Big) + \mathbb{E}_{D_0} \big[ (1 - D_0)\, p_*^{D_0} \big]. \tag{25}
$$

It turns out that the RS approximation is exact only when the ratio $N/M$ is sufficiently low. In the SP language, (25) holds true only when the number of subsets among which we choose our MSP is not too big compared to the number of elements of which they are composed.

In the next section we will quantitatively establish the limits of validity of the RS ansatz and introduce the one-step replica symmetry breaking formalism, which provides a better approximation to the exact results in the regime of large $N/M$ ratio. In Section 5 we prove that (25) is exact for certain choices of the factor graph ensembles.
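As a worked example of (24) and (25), consider an ensemble chosen here purely for illustration (it is an assumption of this sketch, not a definition quoted from the text): every set has fixed size $d$, so that $D_0 = d$ and $D = d - 1$, and factor degrees are Poisson with mean $c$, so that $C$ and $C_0$ are both Poisson with mean $c$. The expectations then close into $f(p) = e^{-c\, p^{d-1}}$, and the fixed point can be found by plain iteration as long as the iteration is stable. The function name `rs_density` is hypothetical.

```python
from math import exp

def rs_density(d, c, iters=10_000):
    """RS fixed point p* of (24) and density (25) for the assumed ensemble."""
    p = 0.5
    for _ in range(iters):
        p = exp(-c * p ** (d - 1))        # E_C (1 - p^(d-1))^C for Poisson C
    # E_{C0} (1 - p^(d-1))^{C0} = exp(-c p^(d-1)) for Poisson C0 of mean c
    rho = (d / c) * (1.0 - exp(-c * p ** (d - 1))) + (1 - d) * p ** d
    return p, rho

p_star, rho_rs = rs_density(d=2, c=1.0)
print(p_star, rho_rs)
```

For $d = 2$ the instance is a maximum matching problem on a sparse random graph whose edges play the role of the sets, and the fixed point equation reduces to $p_* = e^{-c\, p_*}$.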


4. Replica Symmetry Breaking

4.1. RS Consistency and Bugs Propagation. Here we propose two criteria to check the consistency of the RS cavity method; if either of them fails, the RS ansatz is not consistent.

The first criterion is the assumption of uniqueness of the fixed point of (24) and of its dynamical stability under iteration. We restrict ourselves to the subspace of distributions with support on $\{0, 1\}$, although it is possible to extend the following analysis to the whole space of distributions over $[0, 1]$ with an argument based on stochastic dominance, following [22]. Characterizing the distributions over $\{0, 1\}$ as in (23), with a real parameter $p \in [0, 1]$, from (22) we obtain the dynamical system

$$
p' = \mathbb{E}_C \big( 1 - \mathbb{E}_D\, p^{D} \big)^{C} \equiv f(p). \tag{26}
$$

The stability criterion

$$
\big| f'(p_*) \big| < 1 \tag{27}
$$

suggests that the RS approximation to the MSP size (25) is exact as long as (27) is satisfied. This statement will be made rigorous in Section 5.
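The criterion (27) can be tested numerically once an ensemble is fixed. The sketch below again assumes, for illustration only, sets of fixed size $d$ and Poisson factor degrees of mean $c$, for which $f(p) = e^{-c\, p^{d-1}}$; the function name `rs_is_stable` and the damped iteration used to locate $p_*$ are choices made here. For this assumed ensemble a short calculation gives $|f'(p_*)| = (d - 1)\, c\, p_*^{d-1}$, which reaches 1 at $c = e/(d-1)$, consistent with the threshold quoted in the abstract.

```python
from math import e, exp

def rs_is_stable(d, c, iters=20_000):
    """Check |f'(p*)| < 1, criterion (27), for f(p) = exp(-c p^(d-1))."""
    p = 0.5
    for _ in range(iters):
        # damped iteration: it still reaches p* for the moderate c used here,
        # even when the undamped map (26) is unstable
        p = 0.5 * (p + exp(-c * p ** (d - 1)))
    slope = -c * (d - 1) * p ** (d - 2) * exp(-c * p ** (d - 1))
    return abs(slope) < 1.0

for d in (2, 3):
    threshold = e / (d - 1)
    print(d, threshold, rs_is_stable(d, 0.9 * threshold), rs_is_stable(d, 1.1 * threshold))
```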

The second method we use to check the RS stability, called bugs proliferation, is the zero temperature analogue of the spin glass susceptibility. We will compute the average number of $t$ messages that change as a consequence of a change in a single message $t_{r \to i}$ ($1 \to 0$ or $0 \to 1$). This is given by

$$
N_{ch} = \mathbb{E} \Big[ \sum_{(s, j)} \sum_{\substack{a_0, a_1 \\ b_0, b_1}} \mathbb{I} \big( t_{s \to j} = b_0 \to b_1 \mid t_{r \to i} = a_0 \to a_1 \big) \Big], \tag{28}
$$

where $a_0$, $a_1$, $b_0$, and $b_1$ take values in $\{0, 1\}$. Since a random factor graph is locally a tree, and assuming that correlations decay fast enough, the last equation can be expressed as

$$
N_{ch} = \sum_{s=0}^{+\infty} \big( \bar{C} \bar{D} \big)^{s} \sum_{\substack{a_0, a_1 \\ b_0, b_1}} P \big( t_s = b_0 \to b_1 \mid t_o = a_0 \to a_1 \big), \tag{29}
$$

where $t_s$ is a message at distance $s$ from the tree root $o$ and we defined the average residual degrees $\bar{C} = \mathbb{E}_C[C]$ and $\bar{D} = \mathbb{E}_D[D]$. The stability condition $N_{ch} < +\infty$ yields a constraint on the greatest eigenvalue $\lambda_M$ of the transfer matrix $P(b_0 \to b_1 \mid a_0 \to a_1)$:

$$
\bar{C} \bar{D} \lambda_M < 1. \tag{30}
$$

The two methods presented above give equivalent conditions for the RS ansatz to hold true, and they simply express the independence of a finite subgraph from the tail boundary conditions.

4.2. The 1RSB Formalism. We are going to develop the 1RSB formalism for the MSP problem and then apply it in Section 7.1 to the ensemble $\mathbb{G}_{RP}$. We will not check the coherence of the 1RSB ansatz through the interstate and intrastate susceptibilities [23]; we are therefore not guaranteed against the need for further steps of replica symmetry breaking in order to recover the exact solution. Even in the worst case scenario, though, when the 1RSB solution is trivially exact only in the RS region and a full RSB ansatz is needed otherwise, the 1RSB prediction for the MSP relative size $\rho$ should be everywhere more accurate than the RS one and possibly very close to the real value. We refer to the textbook of Montanari and Mézard [19] for a detailed exposition of the 1RSB cavity method.

Let us fix a factor graph $G$ from a locally tree-like ensemble $\mathbb{G}$. We call $Q_{r \to i}(t_{r \to i})$ the distribution of messages on the directed edge $r \to i$ over the states of the system. We still expect the messages $t_{r \to i}$ to take values 0 or 1, so that $Q_{r \to i}$ can be parametrized as

$$
Q_{r \to i}(t_{r \to i}) = q_{r \to i}\, \delta(t_{r \to i}) + (1 - q_{r \to i})\, \delta(t_{r \to i} - 1). \tag{31}
$$

The 1RSB Parisi parameter $x \in [0, 1]$ has to be properly rescaled in order to correctly take the limit $\mu \uparrow \infty$. Therefore, we introduce the new 1RSB parameter $y = \mu x$, which stays finite in the close packing limit and takes values in $[0, +\infty)$. The reweighting factor $e^{-y \omega^{r \to i}_{\mathrm{iter}}}$ is defined as

$$
e^{-y \omega^{r \to i}_{\mathrm{iter}}} = \frac{Z_{r \to i}}{\prod_{j \in \partial r \setminus i} \prod_{s \in \partial j \setminus r} Z_{s \to j}}. \tag{32}
$$

The last equation, combined with (13) and (14), gives $\omega^{r \to i}_{\mathrm{iter}} = -t_{r \to i}$. Averaging over the whole ensemble, we can then write the zero temperature 1RSB message passing rules (also called Survey Propagation equations):

$$
q' \overset{d}{=} \frac{\prod_{j=1}^{C} \big( 1 - \prod_{s=1}^{D_j} q_{sj} \big)}{e^{y} + (1 - e^{y}) \prod_{j=1}^{C} \big( 1 - \prod_{s=1}^{D_j} q_{sj} \big)}. \tag{33}
$$

In the preceding equation, $\{q_{sj}\}$ are i.i.d. random variables on $[0, 1]$ and, as in (22), $C$ is a random residual factor degree and $\{D_j\}$ are independent random residual variable degrees. Fixed points of (33) take the form

$$
P(q) = p_0\, \delta(q) + p_1\, \delta(q - 1) + p_2\, P_2(q), \tag{34}
$$

where $P_2(q)$ is a continuous distribution on $[0, 1]$ and $p_2 = 1 - p_0 - p_1$. The parameters $p_0$ and $p_1$ have to satisfy the closed equations

$$
\begin{aligned}
p_1 &= \mathbb{E}_C \big( 1 - \mathbb{E}_D (1 - p_0)^{D} \big)^{C}, \\
p_0 &= 1 - \mathbb{E}_C \big( 1 - \mathbb{E}_D\, p_1^{D} \big)^{C}.
\end{aligned} \tag{35}
$$

Solutions of (35) with $p_2 = 0$ correspond to replica symmetric solutions, and their instability marks the onset of a spin glass phase. In this new phase the MSPs are clustered according to the general scenario displayed by constraint satisfaction problems [24].
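The closed equations (35) can be iterated directly. The sketch below does so for an ensemble assumed here for illustration only (sets of fixed size $d$, Poisson factor degrees of mean $c$), for which the expectations reduce to exponentials; the function name `frozen_weights` and the initial condition $p_0 = 0$ are also choices made here. When the iteration ends with $p_2 = 0$ it has landed on a replica symmetric solution with $p_1 = p_*$; a strictly positive $p_2$ signals that part of the weight sits on the continuous component $P_2$ of (34).

```python
from math import exp

def frozen_weights(d, c, iters=5_000):
    """Iterate the closed equations (35) from p0 = 0; return (p0, p1, p2)."""
    p0 = 0.0
    p1 = 0.0
    for _ in range(iters):
        p1 = exp(-c * (1.0 - p0) ** (d - 1))     # first line of (35), Poisson C
        p0 = 1.0 - exp(-c * p1 ** (d - 1))       # second line of (35)
    return p0, p1, 1.0 - p0 - p1

for c in (1.0, 4.0):
    print(c, frozen_weights(d=2, c=c))           # p2 vanishes only for the smaller c
```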


From the stable fixed point of (33) we can calculate the 1RSB free energy functional $\phi(y)$ as

$$
\begin{aligned}
-y\, \phi(y) ={}& \frac{d}{c}\, \mathbb{E} \log \Big[ (1 - e^{y}) \prod_{j=1}^{C_0} \Big( 1 - \prod_{s=1}^{D_j} q_{sj} \Big) + e^{y} \Big] \\
&+ \mathbb{E}\, (1 - D_0) \log \Big[ (1 - e^{y}) \Big( 1 - \prod_{r=1}^{D_0} q_r \Big) + e^{y} \Big],
\end{aligned} \tag{36}
$$

and the 1RSB density $\rho_{1RSB}(y) = -\partial \big( y \phi(y) \big) / \partial y$ as

$$
\begin{aligned}
\rho_{1RSB}(y) ={}& \frac{d}{c}\, \mathbb{E}\, \frac{e^{y} \Big( 1 - \prod_{j=1}^{C_0} \big( 1 - \prod_{s=1}^{D_j} q_{sj} \big) \Big)}{(1 - e^{y}) \prod_{j=1}^{C_0} \big( 1 - \prod_{s=1}^{D_j} q_{sj} \big) + e^{y}} \\
&+ \mathbb{E}\, (1 - D_0)\, \frac{e^{y} \prod_{r=1}^{D_0} q_r}{(1 - e^{y}) \big( 1 - \prod_{r=1}^{D_0} q_r \big) + e^{y}},
\end{aligned} \tag{37}
$$

with expectations intended over $\mathbb{G}$ and over the fixed point messages $\{q_r\}$ and $\{q_{sj}\}$. Since the free energy functional $\phi(y)$ and the complexity $\Sigma(\rho)$ are related by the Legendre transform

$$
\Sigma(\rho) = -y \rho - y \phi(y), \tag{38}
$$

with $\partial \Sigma / \partial \rho = -y$, through (36) we can compute the complexity by taking the inverse transform. Equilibrium states, that is, MSPs, are selected by

$$
\rho_{1RSB} = \operatorname*{arg\,max}_{\rho} \{ \rho : \Sigma(\rho) \ge 0 \}, \tag{39}
$$

or, equivalently, by taking the 1RSB parameter $y$ to be

$$
y_s = \operatorname*{arg\,max}_{y \in [0, +\infty]} \phi(y). \tag{40}
$$

In the static 1RSB phase we expect $\Sigma(\rho_s) = 0$, so that from (38) we have $\phi(y_s) = -\rho_s$. We will see that this is generally true, except for the ensemble $\mathbb{G}_{RP}(2, c)$ of Section 7, corresponding to maximum matchings on ordinary graphs, where the equilibrium states have maximal complexity and $y_s = +\infty$. The relation

$$
\rho_{1RSB} = -\phi(y_s) \tag{41}
$$

is always valid, though, since for $y_s \uparrow \infty$ the complexity stays finite.

5. A Heuristic Algorithm and Exact Results

In this section we propose a heuristic greedy algorithm to address the MSP problem. It is a natural generalization of the algorithm that Karp and Sipser proposed to solve the maximum matching problem on Erdős-Rényi random graphs [8]; therefore we will call it generalized Karp-Sipser (GKS). Extending their derivation concerning the leaf removal part of the algorithm, we are able to prove that the RS prediction for the MSP density is exact as long as the stability criterion (27) is satisfied. We will not give the proofs of the following theorems, as they are lengthy but straightforward extensions of those given in [8]. In order to find the maximum matching on a graph, Karp and Sipser noticed that, as long as the graph contains a node of degree one (a leaf), its unique edge has to belong to one of the maximum matchings.

They considered the simplest randomized algorithm one can imagine: as long as there is any leaf, remove it from the graph; otherwise remove a random edge; then iterate until the graph is depleted. They studied the behaviour of this leaf-removal algorithm on random graphs and were able to prove that it yields, w.h.p., a maximum matching (within an $o(n)$ error).

To generalize some of their results, we need to extend the definition of leaf to that of pendant. We call pendant a variable node whose factor neighbours all have degree one, except for one at most. Stating the same concept in different words, all of the neighbours of a pendant have the pendant itself as their sole neighbour, except for one of them at most. See Figure 1 for a pictorial representation of a pendant (in red). The GKS algorithm is articulated in two phases: a pendant removal phase and a random occupation phase. We give the pseudocode for the generalized Karp-Sipser algorithm (see Algorithm 1).

At each step the algorithm prioritizes the removal of pendants over that of random variable nodes. We notice that the removal of a pendant is always an optimal choice in order to achieve an MSP; we have no guarantees, though, on the effect of the occupation of a random node. We call phase 1 the execution of the algorithm up to the point where the first nonpendant variable is added to $V'$. It is trivial to show that phase 1 is enough to find an MSP on a tree factor graph. The interesting thing, though, is that phase 1 is also able to deplete nontree factor graphs and find an MSP, as long as the factor graphs are sufficiently sparse and large enough.

We call core the subset of the variable nodes which has not been assigned to the MSP in phase 1. We will show how the emergence of a core is directly related to replica symmetry breaking. Let us define, as usual,

$$f(x) \equiv \mathbb{E}_C \left(1 - \mathbb{E}_D\, x^D\right)^C. \tag{42}$$

The function $f$ is continuous, nonincreasing, and satisfies the relation $1 \ge f(0) \ge f(1) = \mathbb{P}[C = 0]$. It turns out that, in the large graph limit, phase 1 of GKS is characterized by the solutions of the system of equations

$$p = f(r), \qquad r = f(p). \tag{43}$$

We notice that system (43) is equivalent to the 1RSB equation (35) once the substitutions $p \to p_1$ and $r \to 1 - p_0$ are made.

Lemma 1. The system (43) always admits a (unique) solution $p_* = r_*$.

Proof. By monotonicity and continuity, the function $g(p) = f(p) - p$ has a single zero in the interval $[0, 1]$.


Algorithm 1: Generalized Karp-Sipser (GKS).

Require: a factor graph $G = (V, F, E)$
Ensure: a set packing $V'$
  $V' = \emptyset$
  add to $V'$ all isolated variable nodes and remove them from $G$
  remove from $G$ any isolated factor node
  while $V$ is not empty do
    if $G$ has any pendant then
      choose a pendant $i$ uniformly at random
      add $i$ to $V'$
      remove $i$ from $G$, then remove its factor neighbours and their variable neighbours
    else
      pick uniformly at random a variable node and add it to $V'$
      remove it from $G$, then remove its factor neighbours and their variable neighbours
    end if
    remove from $G$ any isolated factor node
  end while
  return $V'$
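The pseudocode of Algorithm 1 can be turned into a short runnable sketch. The version below is only an illustration, not the authors' implementation: the function name `generalized_karp_sipser`, the dictionary-based input format, and the returned phase 1 counter are assumptions made here. Isolated variables are treated as (trivial) pendants, so the initial step of Algorithm 1 is absorbed into the main loop.

```python
import random


def generalized_karp_sipser(var_to_factors, seed=None):
    """Hypothetical sketch of GKS on a factor graph.

    var_to_factors maps each variable node to the factor nodes it touches
    (isolated variables map to an empty collection).
    Returns (packing, phase1_size): the chosen variables in order, and how
    many of them were chosen before the first non-pendant pick.
    """
    rng = random.Random(seed)
    var_nb = {v: set(fs) for v, fs in var_to_factors.items()}
    fac_nb = {}
    for v, fs in var_nb.items():
        for f in fs:
            fac_nb.setdefault(f, set()).add(v)

    def is_pendant(v):
        # At most one factor neighbour of v has another variable attached.
        return sum(1 for f in var_nb[v] if len(fac_nb[f]) > 1) <= 1

    def occupy(v):
        # Remove v, its factor neighbours, and their variable neighbours.
        doomed = {v}
        for f in var_nb[v]:
            doomed |= fac_nb[f]
        touched = set()
        for u in doomed:
            for f in var_nb.pop(u):
                fac_nb[f].discard(u)
                touched.add(f)
        # Factors left without variables (including those of v) are dropped.
        for f in touched:
            if not fac_nb[f]:
                del fac_nb[f]

    packing, phase1_size = [], None
    while var_nb:
        pendants = [v for v in var_nb if is_pendant(v)]
        if pendants:
            v = rng.choice(pendants)
        else:
            if phase1_size is None:
                phase1_size = len(packing)  # phase 1 ends at this pick
            v = rng.choice(list(var_nb))
        packing.append(v)
        occupy(v)
    if phase1_size is None:
        phase1_size = len(packing)  # depleted by pendant removal alone
    return packing, phase1_size


# Tree factor graph: variables 0-1 share factor "a", variables 1-2 share "b".
packing, phase1 = generalized_karp_sipser({0: ["a"], 1: ["a", "b"], 2: ["b"]}, seed=0)
```

On this tree example phase 1 alone depletes the graph, in line with the remark that pendant removal suffices on tree factor graphs. The pendant list is recomputed at every step for clarity; a faster version would maintain it incrementally.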

Figure 1: (left) A representation of a factor graph containing a pendant, depicted in red. (right) The factor graph after the removal of the pendant, as occurs in the inner part of the while loop in Algorithm 1.

If other solutions are present, the relevant one is the one with smallest $p$, as the following theorems certify.

Theorem 2. Let $(p, r)$ be the solution with smallest $p$ of (43). Then the density of factor nodes surviving phase 1 in the large graphs limit is given by

$$\psi_1 = 1 + p - r + c\,p\,\mathbb{E}_D\left[p^D - r^D\right], \tag{44}$$

with

$$r = \mathbb{E}_{C_0} \left(1 - \mathbb{E}_D\, p^D\right)^{C_0}, \qquad p = \mathbb{E}_{C_0} \left(1 - \mathbb{E}_D\, r^D\right)^{C_0}. \tag{45}$$

In particular, if the smallest solution is $p_* = r_*$, the graph is depleted with high probability in phase 1.

As already said, we will not give the proof of this and of the following theorem, as they are lengthy and can be obtained from the derivation given in Karp and Sipser's article [8] with little effort, even if not in a completely trivial way. Theorem 2 affirms that, as soon as (43) develops another solution, a core survives phase 1. This phenomenon coincides with the need for replica symmetry breaking in the cavity formalism. Let us now establish the exactness of the cavity prediction for $\rho$ in the RS phase.

Theorem 3. Let $(p, r)$ be the solution with smallest $p$ of (43). Then the density of variables assigned in phase 1 in the large graphs limit is

$$\rho_1 = \frac{d}{c}\,(1 - r) + \mathbb{E}_{D_0}\,(1 - D_0)\,p^{D_0}, \tag{46}$$

where $p$ has been defined in the previous theorem. In particular, if the smallest solution is $p_* = r_*$, the replica symmetric cavity method prediction (25) is exact.

These two theorems imply that the RS prediction for $\rho$, (25), holds true as long as phase 1 manages to deplete, w.h.p., the whole factor graph. A similar behaviour has been observed in other combinatorial optimization problems, for example, the random XORSAT [25]. Conversely, it is easy to prove, given the equivalence between (43) and (35), that if a solution with $p \neq p_*$ exists, the RS fixed point is unstable in the 1RSB distributional space. Therefore the system is in the RS phase if and only if (35) admits a unique solution.

We notice that we could have also looked for the solutions of the single equation $p = f^2(p)$, with $f^2(p) \equiv f(f(p))$, instead of the two of (43), since the value of $r$ is uniquely determined by the value of $p$. The previous theorems then state that the RS results hold as long as $f^2$ has a single fixed point. In [22] it has been proven that the RS solution of the weighted maximum independent set and maximum weighted matching holds true as long as the corresponding squared cavity operator $f^2$ has a unique fixed point. Since those are special cases of the weighted MSP problem, we conjecture that the $f^2$ condition holds for the general case too, as it does for the unweighted case. Probably it would be possible to further generalize the Karp-Sipser algorithm to cover analytically the weighted case as well.

The scenario arising from the analysis of the algorithm and from the 1RSB cavity method is the following: in the RS phase, maximum set packings form a single cluster, and it is possible to connect any two of them with a path involving MSPs separated by a rearrangement of a finite number of variables [26]. In the RSB phase, instead, MSPs can be grouped into many connected (in the sense we mentioned before) clusters, each one of them defined by the assignment of the variables in the core. Two MSPs which differ on the core are always separated by a global rearrangement of the variables (i.e., $O(N)$). In the presence of a core, the GKS algorithm makes some suboptimal random choices (after phase 1).

We did not make an analytical study of the random variable removal part of the GKS algorithm. This has been done by Wormald for the matching problem, associating to the graph process a set of differential equations amenable to analysis [9] (something similar has been done for random XORSAT as well [25]). In that case it turns out that the algorithm still achieves optimality and yields an almost perfect matching on the core. An appropriate analysis of the second phase of the GKS algorithm, and the reason for its failure on most of the SP ensembles we covered, deserve further study.

6. Numerical Simulations

While the RS cavity equations (24) and (25) can be easily computed, the numerical solution of the 1RSB density evolution (33) is much more involved (although simplified by the zero temperature limit) and has been obtained with a standard population dynamics algorithm [27].

The implementation of the GKS algorithm is straightforward. While it is very fast during phase 1, we noticed a huge slowing down in the random removal part. We were able to find set packings for factor graphs with hundreds of thousands of nodes.

In order to test the cavity and GKS predictions, we also computed the exact MSP size on factor graphs of small size. First we notice that a set packing problem coded in a factor graph structure $F$ is equivalent to an independent set problem on an appropriate ordinary graph $G$. The node set of $G$ is the variable node set of $F$, and we add an edge to $G$ between each pair of nodes having a common factor node neighbour in $F$. Therefore each neighbourhood of a factor node in $F$ forms a clique in $G$. We then solve the maximum independent set problem for $G$ using an exact algorithm [28] implemented in the igraph library [29]. Since the time complexity is exponential in the size of the graph, we performed our simulations on graphs containing up to only one hundred nodes.
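The mapping from a factor graph to an ordinary graph described above can be sketched in a few lines. The code below is a plain-Python stand-in written for illustration: it is not the algorithm of [28] nor the igraph routine used for the simulations, and the names `conflict_graph` and `max_independent_set_size` are hypothetical. The exact solver is a naive branching procedure, usable only on small graphs.

```python
from itertools import combinations


def conflict_graph(var_to_factors):
    """Ordinary graph G: one node per variable, an edge between two
    variables whenever they share a factor node."""
    members = {}
    for v, fs in var_to_factors.items():
        for f in fs:
            members.setdefault(f, set()).add(v)
    adj = {v: set() for v in var_to_factors}
    for vs in members.values():
        for u, w in combinations(vs, 2):  # each factor becomes a clique
            adj[u].add(w)
            adj[w].add(u)
    return adj


def max_independent_set_size(adj):
    """Exact size by plain branching (exponential time, small graphs only)."""
    def solve(alive):
        if not alive:
            return 0
        # Branch on a node of maximum degree in the surviving subgraph.
        v = max(alive, key=lambda u: len(adj[u] & alive))
        if not adj[v] & alive:
            return len(alive)  # no edges left: every node can be taken
        without_v = alive - {v}
        # Either v is excluded, or v is taken and its neighbours are dropped.
        return max(solve(without_v), 1 + solve(without_v - adj[v]))
    return solve(frozenset(adj))


# Variables 0-1 share factor "a", variables 1-2 share factor "b".
adj = conflict_graph({0: ["a"], 1: ["a", "b"], 2: ["b"]})
msp_size = max_independent_set_size(adj)
```

The value returned is the MSP size of the original factor graph, since an independent set in $G$ is exactly a set of variables no two of which share a factor.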

7. Applications to Problem Ensembles

We will now apply the methods developed in the previous sections to some factor graph ensembles, each modelling a class of MSP problem instances. We consider graph ensembles containing nodes with Poissonian random degree or regular degree: $\mathcal{G}_{PR}(N, d, c)$, $\mathcal{G}_{RP}(N, d, c)$, $\mathcal{G}_{PP}(N, d, c)$, and $\mathcal{G}_{RR}(N, d, c)$. Subscript $R$ or $P$ indicates whether the type of nodes to which it refers (variable nodes for the first subscript, factor nodes for the second) has regular or Poissonian random degree, respectively. We parametrize these ensembles by their average variable and factor degrees, that is, $d = \mathbb{E}_{D_0} D_0$ and $c = \mathbb{E}_{C_0} C_0$. They are constituted by factor graphs having $N$ variable nodes and $M = \lfloor N d / c \rfloor$ factor nodes, but they differ both in their elements and in their probability law. We define the ensembles by giving the probability of sampling one of their elements.

(i) $\mathcal{G}_{RP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. Every variable node has fixed degree $d$. $G$ is obtained by linking each variable with $d$ factors chosen uniformly and independently at random. The factor node degree distribution obtained is Poissonian of mean $c$ with high probability. This ensemble is a model for the $k$-set packing and will be the main focus of our attention.

(ii) $\mathcal{G}_{PR}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. Every factor node has fixed degree $c$. $G$ is built by linking each factor with $c$ variables chosen uniformly and independently at random. The variable node degree distribution obtained is Poissonian of mean $d$ with high probability.

(iii) $\mathcal{G}_{PP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. $G$ is built by adding an edge $(i, r)$ with probability $c/N$, independently for each choice of a variable $i$ and a factor $r$. The factor graph obtained has, w.h.p., Poissonian variable node degree distribution of mean $d$ and Poissonian factor node degree distribution of mean $c$.

(iv) $\mathcal{G}_{RR}(N, d, c)$: it is constituted by all factor graphs with $N$ variable nodes of degree $d$ and $M = N d / c$ factor nodes of degree $c$ ($N d$ has to be a multiple of $c$). Every factor graph of the ensemble is equiprobable and can be sampled using a generalization of the configuration model for random regular graphs [30]. This ensemble is a model for the $k$-set packing.

We will omit the argument $N$ when we refer to an ensemble in the $N \uparrow \infty$ limit.
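As an illustration of definition (i), the following sketch samples one element of $\mathcal{G}_{RP}(N, d, c)$. The function name `sample_g_rp` and the output format are hypothetical. One assumption is made loudly here: the $d$ factors attached to a variable are drawn without repetition, which avoids double edges and differs from fully independent draws only by corrections that vanish for large $M$.

```python
import random


def sample_g_rp(n_vars, d, c, seed=None):
    """Sample a factor graph with n_vars variables of degree d and
    M = floor(n_vars * d / c) factors; returns {variable: [factors]}."""
    rng = random.Random(seed)
    n_factors = int(n_vars * d // c)
    if n_factors < d:
        raise ValueError("need at least d factor nodes")
    # Assumption: the d factors of a variable are drawn without repetition.
    return {v: rng.sample(range(n_factors), d) for v in range(n_vars)}


graph = sample_g_rp(n_vars=1000, d=3, c=1.5, seed=0)
```

The returned dictionary can be fed directly to any routine that expects a variable-to-factors adjacency map; the empirical factor degrees should then be approximately Poissonian with mean $c$.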

7.1. $\mathcal{G}_{RP}(d, c)$. This is the ensemble with variable node degrees fixed to $d$ and Poissonian factor node degree, that is, $C \sim C_0 \sim \mathrm{Poisson}(c)$. The case $d = 2$ corresponds to the maximum matching problem on Erdős-Rényi random graphs [8, 12].

Density evolution (22) for $\mathcal{G}_{RP}$ reduces to

$$T' \overset{d}{=} \max\left[\{0\} \cup \left\{1 - \sum_{s=1}^{d-1} T_{sj}\right\}_{j \in \{1, \dots, C\}}\right]. \tag{47}$$

Considering distributions of the form $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$, fixed points of (47) read

$$p_* = e^{-c\,p_*^{d-1}} \equiv f(p_*). \tag{48}$$


The last equation admits only one solution for each value of $d$ and $c$, as is easily seen through a monotonicity argument considering the left- and right-hand sides of the equation. The values of $c$ as a function of $d$ satisfying $|f'(p_*)| = 1$ are the critical points $c_s(d)$ delimiting the RS phase ($c < c_s(d)$) and are given by

$$c_s(d) = \frac{e}{d - 1}. \tag{49}$$
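The threshold (49) can be checked with a short calculation, given here as a reader's verification based on the fixed point equation (48) rather than as the derivation used in the text. With $f(p) = e^{-c\,p^{d-1}}$,

$$f'(p) = -c\,(d-1)\,p^{d-2}\,f(p) \;\Longrightarrow\; |f'(p_*)| = c\,(d-1)\,p_*^{d-1},$$

where $f(p_*) = p_*$ was used. Imposing $|f'(p_*)| = 1$ gives $p_*^{d-1} = 1/(c\,(d-1))$; substituting into (48) yields $p_* = e^{-1/(d-1)}$, hence $p_*^{d-1} = e^{-1}$ and

$$c_s(d)\,(d-1)\,e^{-1} = 1 \;\Longrightarrow\; c_s(d) = \frac{e}{d-1}.$$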

For $c > e/(d-1)$ a core survives phase 1 of the GKS algorithm, as stated by Theorem 2 and shown in Figure 2. We notice that the same threshold value $e/(d-1)$ is found in the minimum set covering problem on the dual factor graph ensemble [31], where the application of a leaf removal algorithm, that is, phase 1 of GKS, unveils an analogous core emergence phenomenon. In the RS phase the MSP density is equal to the minimum set covering density, but in the RSB phase we cannot compare the cavity predictions for these two dual problems, since in [31] the 1RSB solution to the minimum set covering is not provided.

In the matching case, that is, $d = 2$, equation (49) expresses the notorious $e$-phenomenon discovered by Karp and Sipser, while for higher values of $d$ it provides an extension of the critical threshold.

We can recover the same critical condition, equation (49), through the bug propagation method, as the only nonzero elements of the transfer matrix $P(b_0 \to b_1 \mid a_0 \to a_1)$ are

$$P(0 \to 1 \mid 1 \to 0) = p^{d-1}, \qquad P(1 \to 0 \mid 0 \to 1) = p^{d-1}, \tag{50}$$

which give $\lambda_M = p^{d-1}$. The average branching factor is $\overline{C D} = c\,(d - 1)$, so that (30) and (48) yield (49). The analytical value for the relative size of MSPs, that is, the particle density $\rho$, is

$$\rho(d, c) = \frac{d}{c}\,(1 - p_*) + (1 - d)\,p_*^{d} \quad \text{for } c < c_s(d). \tag{51}$$
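The RS prediction is straightforward to evaluate numerically. The sketch below solves the fixed point equation (48) by bisection and plugs the result into (51); the function names `rs_fixed_point`, `rs_density`, and `critical_c` are hypothetical, and bisection is just one convenient choice made possible by the monotonicity of $f$.

```python
import math


def rs_fixed_point(d, c, tol=1e-14):
    """Unique root of p = exp(-c * p**(d-1)) on [0, 1], by bisection.

    f(p) - p is positive at p = 0 and negative at p = 1 for c > 0,
    and it is strictly decreasing, so bisection always converges.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.exp(-c * mid ** (d - 1)) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rs_density(d, c):
    """RS estimate (51) of the relative MSP size; exact only for c < c_s(d)."""
    p = rs_fixed_point(d, c)
    return d / c * (1.0 - p) + (1 - d) * p ** d


def critical_c(d):
    """Threshold (49) delimiting the RS phase."""
    return math.e / (d - 1)


p_star = rs_fixed_point(3, critical_c(3))  # equals exp(-1/2) at the threshold
rho = rs_density(3, 1.0)                   # a point inside the RS phase
```

For $c > c_s(d)$ the same function still returns a number, but it should be read only as the RS approximation.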

In Figure 3 we compare $\rho$ from (51), as a function of $c$ for some values of $d$, with an exact algorithm applied to finite factor graphs (as explained in Section 6), both above and below $c_s$. Clearly, for $c > c_s$ the RS approximation becomes increasingly inaccurate.

We continue our analysis of $\mathcal{G}_{RP}$ above the critical value $c_s$ through the 1RSB cavity method, as outlined in Section 4.2.

Fixed point messages of (33) are distributed as

$$P(q) = p_0\,\delta(q) + p_1\,\delta(q - 1) + p_2\,P_2(q), \tag{52}$$

where

$$p_1 = e^{-c\,(1 - p_0)^{d-1}}, \qquad p_0 = 1 - e^{-c\,p_1^{d-1}}, \qquad p_2 = 1 - p_0 - p_1, \tag{53}$$

and $P_2(q)$ has to be determined through (33). Equation (53) always admits an RS solution $p_1 = 1 - p_0 = p_*$, which is stable up to $c_s$, as already noticed. Above $c_s$ a new stable fixed point with $p_2 > 0$ continuously arises, and we study it numerically with a population dynamics algorithm.

Figure 2: Results from the GKS algorithm applied to $\mathcal{G}_{RP}(3, c)$. The set packing size after phase 1 (blue line) and after the algorithm stops (red line) are reported. The vertical line is $c_s = e/2$ and is drawn as a visual aid to determine the point where the RS assumptions stop holding true. Above $c_s$ a nonempty core emerges continuously (green line). (Axes: mean factor degree $c$ versus $\rho$; legend: core, phase 1.)

Figure 3: RS cavity method analytical prediction, equation (51), for MSP size in $\mathcal{G}_{RP}(d, c)$, compared with the average size of MSPs obtained from 1000 samples of factor graphs with 100 variable nodes (dots). Dashed parts of the lines are the RS estimations in the RSB phase, that is, for $c > e/(d-1)$, where it is no longer exact. (Axes: mean factor degree $c$ versus set packing size $\rho$; curves for $d = 2, 3, 4, 5$.)


Figure 4: Maximum set packing size for $\mathcal{G}_{RP}(d, c)$ as a function of $c$ for $d = 2$. The blue line corresponds to the RS analytical value (51), the red line is from the Karp-Sipser algorithm, and the dotted black line is the 1RSB solution (54) with $y \gg 1$. Karp-Sipser and the 1RSB cavity method yield the same, and exact, result. (Axes: mean factor degree $c$, with $e$ marked, versus set packing size $\rho$; legend: RS, 1RSB, GKS.)

The 1RSB free energy as a function of the Parisi parameter $y$ takes the form

$$\phi(y) = -\frac{d}{y\,c}\,\mathbb{E} \log\left[(1 - e^{y}) \prod_{j=1}^{C} \left(1 - \prod_{s=1}^{d-1} q_{sj}\right) + e^{y}\right] + \frac{d - 1}{y}\,\mathbb{E} \log\left[(1 - e^{y}) \left(1 - \prod_{r=1}^{d} q_r\right) + e^{y}\right]. \tag{54}$$

As prescribed by the cavity method, the value $y_s$ which maximizes $\phi(y)$ over $[0, +\infty]$ yields the correct free energy; therefore we have $\phi(y_s) = -\rho_{\mathrm{1RSB}}$.

Unsurprisingly, as they belong to different computational classes, the cases $d = 2$ and $d \ge 3$ show qualitatively different pictures. In the case of maximum matching on the Poissonian graph ensemble, numerical estimates suggest that complexity is an increasing function of $y$ on the whole real positive axis. The correct choice for parameter $y$ is then $y_s = +\infty$, as already conjectured in [12], and we find that the maximum matching size prediction from the 1RSB cavity method fully agrees with rigorous results from Karp and Sipser [8] and with the size of the matchings given by their algorithm (see Figure 4). The 1RSB ansatz is therefore exact for $d = 2$.

The analysis of the $d \ge 3$ case does not yield such a definite result. The complexity of states $\Sigma$ is no longer a strictly increasing function of $y$. It reaches its maximum at $y_d$, the choice of $y$ that selects the most numerous states, which could be those where local search greedy algorithms are more likely to be trapped. Then it decreases up to the finite value $y_s$, where complexity changes sign and takes negative values. Therefore $y_s$ is the correct choice for the Parisi parameter, which maximizes $\phi(y)$. Plotted as a function of $\rho$, the complexity $\Sigma$ has a convex nonphysical part, with extrema at the RS solution (on the right) and at the point corresponding to the dynamic 1RSB solution (on the left), and a concave, physically relevant part for $y \in [y_d, y_s]$ (see Figure 5). The 1RSB prediction seems to be in very good agreement with the exact algorithm, and we are inclined to believe that no further steps of replica symmetry breaking are needed in this ensemble. The GKS algorithm instead falls short of the exact value (see Figure 6); therefore it constitutes a lower bound, which is not tight but could probably be made rigorous by carrying the analysis of the GKS algorithm beyond phase 1.

Figure 5: Complexity in $\mathcal{G}_{RP}(d, c)$ with $d = 3$ and $c = 4.0$ as a function of the relative MSP size $\rho$. (Axes: set packing size $\rho$ versus complexity $\Sigma$.)

Figure 6: MSP size in $\mathcal{G}_{RP}(3, c)$ as a function of $c$. The RS and 1RSB cavity methods are compared with the GKS algorithm and with an exact algorithm on factor graphs of 100 variable nodes. (Axes: mean factor degree $c$, with $e/2$ marked, versus set packing size $\rho$; legend: RS, 1RSB, GKS, $N = 100$.)

7.2. $\mathcal{G}_{PR}(d, c)$. The ensemble $\mathcal{G}_{PR}(d, c)$ is constituted by factor graphs containing factor nodes of degree fixed to $c$ and variable nodes of Poissonian random degree of mean $d$. Its statistical properties with respect to the MSP problem are quite different from those encountered in $\mathcal{G}_{RP}$, as we will readily show. The MSP problem on $\mathcal{G}_{PR}(d, c)$ with $c = 2$ is equivalent to the well-known problem of maximum independent sets on Poissonian graphs [14, 32]. The real parameter $p$ characterizing discrete support distributions of messages has to satisfy the fixed point equation (24), which reads

$$p_* = \left(1 - e^{d\,(p_* - 1)}\right)^{c-1}. \tag{55}$$

At a fixed value of $c$, the RS ansatz holds up to the critical value $d_s(c)$, which is implicitly given by the first derivative condition (27):

$$(c - 1)\,p_*^{(c-2)/(c-1)}\,e^{(p_* - 1)\,d_s}\,d_s = 1. \tag{56}$$

Although the critical condition (56) is not as elegant as the one we obtained for the ensemble $\mathcal{G}_{RP}$, it can be easily solved numerically for $d_s$ as a function of $c$. For $c = 2$ the threshold value is exactly $d_s(2) = e$. For $c > 2$, instead, $d_s$ is an increasing function of the factor degree $c$. Thanks to (25) we readily compute the MSP size in the RS phase:

$$\rho = \frac{d}{c}\left(1 - p_*^{c/(c-1)}\right) + (1 - p_*\,d)\,e^{d\,(p_* - 1)} \quad \text{for } d < d_s(c). \tag{57}$$
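A possible way to solve the critical condition (56) numerically is sketched below. The function names are hypothetical, and the outer bisection assumes that the slope $|f'(p_*)|$ crosses 1 exactly once for $d$ in $(0, d_{\max})$; this holds for $c = 2$, where the known value $d_s(2) = e$ is recovered, but it is only an assumption for larger $c$.

```python
import math


def fixed_point_pr(d, c, tol=1e-14):
    """Root of p = (1 - exp(d*(p-1)))**(c-1) on [0, 1], by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (1.0 - math.exp(d * (mid - 1.0))) ** (c - 1) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def slope_pr(d, c):
    """|f'(p*)|, i.e. the left-hand side of the critical condition (56)."""
    p = fixed_point_pr(d, c)
    return (c - 1) * p ** ((c - 2) / (c - 1)) * d * math.exp(d * (p - 1.0))


def critical_d_pr(c, d_max=50.0, tol=1e-10):
    """Bisect on d for slope_pr(d, c) = 1.

    Assumption: the slope crosses 1 exactly once on (0, d_max).
    """
    lo, hi = 1e-9, d_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope_pr(mid, c) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rs_density_pr(d, c):
    """RS estimate (57) of the relative MSP size."""
    p = fixed_point_pr(d, c)
    return d / c * (1.0 - p ** (c / (c - 1))) + (1.0 - p * d) * math.exp(d * (p - 1.0))


d_s_2 = critical_d_pr(2)  # should reproduce d_s(2) = e
```

The inner bisection is safe because the right-hand side of (55) is decreasing in $p$, so the fixed point is unique for every $d$ and $c$.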

We can see in Figure 7 that the MSP size $\rho(d, c)$ is a decreasing function in both arguments, as expected. Equation (57) can be taken as the RS estimate for the MSP size for $d > d_s(c)$. The RS estimate is strictly greater than the average size of SPs given by the GKS algorithm at all values of $d > d_s(c)$ (see Figure 7).

7.3. $\mathcal{G}_{PP}(d, c)$. We will now briefly examine our MSP model on the ensemble $\mathcal{G}_{PP}(d, c)$, where both factor nodes and variable nodes have Poissonian random degrees of mean $c$ and $d$, respectively. From (23) and (24) we obtain the fixed point condition for the probability distribution of messages:

$$p_* = e^{-c\,e^{d\,(p_* - 1)}}. \tag{58}$$

As usual, $p$ is the parameter characterizing the distribution of messages $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$. Equation (58) admits one and only one fixed point solution $p_*$ for each value of $d$ and $c$. In fact, $f$ is continuous, strictly decreasing, and $f(0) > 0$, $f(1) < 1$. The first derivative condition $|f'(p_*)| = 1$ defines the critical line $d_s(c)$ through

$$d_s(c) = -\frac{1}{p_* \log(p_*)}. \tag{59}$$
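Since $p_*$ in (59) itself depends on $d$ and $c$ through (58), the critical line has to be found self-consistently. The sketch below does so under stated assumptions: the function names are hypothetical, $|f'(p_*)| = c\,d\,e^{d(p_*-1)}\,p_*$ is obtained by differentiating the right-hand side of (58), and the smallest crossing of the slope through 1 is located on a grid in $d$ before being refined by bisection.

```python
import math


def fixed_point_pp(d, c, tol=1e-14):
    """Root of p = exp(-c * exp(d*(p-1))) on [0, 1], by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.exp(-c * math.exp(d * (mid - 1.0))) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def slope_pp(d, c):
    """|f'(p*)| for f(p) = exp(-c * exp(d*(p-1)))."""
    p = fixed_point_pp(d, c)
    return c * d * math.exp(d * (p - 1.0)) * p


def critical_d_pp(c, d_step=0.01, d_max=100.0, tol=1e-10):
    """Smallest d where the slope reaches 1 (grid scan, then bisection).

    Returns None if no crossing is found below d_max.
    """
    lo = 0.0
    while lo < d_max and slope_pp(lo + d_step, c) < 1.0:
        lo += d_step
    if lo >= d_max:
        return None
    hi = lo + d_step
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope_pp(mid, c) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


d_s = critical_d_pp(1.0)
p_star = fixed_point_pp(d_s, 1.0)
residual = d_s + 1.0 / (p_star * math.log(p_star))  # vanishes if (59) holds
```

The last line checks the computed point against (59): at the fixed point $c\,e^{d(p_*-1)} = -\log p_*$, so the slope condition and (59) coincide.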

The curve $d_s(c)$ separates the RS phase from the RSB phase in the $c$-$d$ parametric space (see Figure 8). The unbounded RS region shares some resemblance with the corresponding (although $d$-discretized) region of $\mathcal{G}_{PR}(d, c)$ and is at variance with the compact area of the RS phase in $\mathcal{G}_{RP}$.

We can compute the MSP relative size $\rho$ through (25) and obtain

$$\rho = \frac{d}{c}\,(1 - p_*) + (1 - d\,p_*)\,e^{d\,(p_* - 1)} \quad \text{for } d < d_s(c), \tag{60}$$

which holds only as an approximation in the RSB phase. We can see from Figure 8 that $\rho$ is decreasing both in $c$ and $d$, as was observed in the other ensembles as well.

Figure 7: RS cavity method analytical prediction (57) for MSP size in $\mathcal{G}_{PR}(d, c)$ (dot-dashed) compared with the set packing size from phase 1 of the GKS algorithm (dashed lines) and with a complete run of the algorithm (continuous lines). (Axes: mean variable degree $d$ versus set packing size $\rho$; curves for $c = 2, 3, 4, 5, 6$.)

Figure 8: MSP relative size for the ensemble $\mathcal{G}_{PP}(d, c)$ as given by (60). Above the boundary separating the RS and RSB phases, values are only an approximation to the exact value of $\rho$. (Axes: mean factor degree $c$ versus mean variable degree $d$; isolines and phase boundary shown.)

7.4. $\mathcal{G}_{RR}(d, c)$. The MSP problem on the ensemble $\mathcal{G}_{RR}(d, c)$, the straightforward generalization to factor graphs of the random regular graphs ensemble, allows some simplification of the cavity formalism thanks to its homogeneity. It has already been the object of a preliminary study by Weigt and Hartmann [16] and then of a much deeper work by Hansen-Goos and Weigt [17], who recast it as a hard spheres model on a generalized Bethe lattice. The authors studied this hard spheres model on $\mathcal{G}_{RR}(d, c)$ through the cavity method, both at finite chemical potential $\mu$ and in the close packing limit, and found that the 1RSB solution is unstable in the close packing limit, therefore suggesting the need for a full RSB treatment of the problem.

8. Conclusions

We studied the average asymptotic behaviour of random instances of the maximum set packing problem, both from a mathematical and a physical viewpoint. We contributed to the known list of models where the replica symmetric cavity method can be proven to give exact results, thanks to the generalization of an algorithm (and of its analysis) first proposed by Karp and Sipser [8]. Moreover, our analysis addresses a problem reported in recent work on weighted maximum matchings and independent sets on random graphs [22], where the authors could not extend their results to the unweighted cases. We achieve here the desired result by making use of the grand canonical potential instead of the direct computation of single variable expectations. We also extend their condition for the system to be in what physicists call a replica symmetric phase, namely, the uniqueness of the fixed point of the square of a certain operator (which is the analogue of the one defined in (42)), to the more general setting of maximum set packing (although without weights). On some problem ensembles, where the assumptions of Theorems 2 and 3 no longer hold and the RS cavity method fails, we used the 1RSB cavity method machinery to obtain an analytical estimation of the MSP size. Numerical simulations show very good agreement of the 1RSB estimation with the exact values, although comparisons have been done only with small random problems, due to the exact algorithm being of exponential time complexity. The GKS algorithm instead generally fails to find any MSPs, except for some special instances of the problem.

Some questions remain open for further investigation. To validate the 1RSB approach, the stability of the 1RSB solution has to be checked against more steps of replica symmetry breaking. Moreover, a thorough analysis of the second phase of the GKS algorithm could shed some light on the mechanism of replica symmetry breaking and give a rigorous lower bound to the average maximum packing size.

Regarding the problem of looking for optimal solutions in single samples, we believe that an efficient heuristic algorithm, able to obtain near-to-optimal configurations also in the RSB phase, could be constructed following the ideas of [33].

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This research has received financial support from the European Research Council (ERC) through Grant agreement no. 247328 and from the Italian Research Minister through the FIRB project no. RBFR086NN1.

References

[1] S. Micali and V. V. Vazirani, “An $O(\sqrt{|V|}\,|E|)$ algorithm for finding maximum matching in general graphs,” in Proceedings of the 21st Annual IEEE Symposium on Foundations of Computer Science, pp. 17–27, 1980.

[2] N. Alon, J.-H. Kim, and J. Spencer, “Nearly perfect matchings in regular simple hypergraphs,” Israel Journal of Mathematics, vol. 100, pp. 171–187, 1997.

[3] Z. Füredi, J. Kahn, and P. D. Seymour, “On the fractional matching polytope of a hypergraph,” Combinatorica, vol. 13, no. 2, pp. 167–180, 1993.

[4] Y. H. Chan and L. C. Lau, “On linear and semidefinite programming relaxations for hypergraph matching,” Mathematical Programming, vol. 135, pp. 123–148, 2011.

[5] V. Rödl and L. Thoma, “Asymptotic packing and the random greedy algorithm,” Random Structures & Algorithms, vol. 8, no. 3, pp. 161–177, 1996.

[6] M. M. Halldórsson, “Approximations of weighted independent set and hereditary subset problems,” Journal of Graph Algorithms and Applications, vol. 4, no. 1, pp. 1–6, 2000.

[7] E. Hazan, S. Safra, and O. Schwartz, “On the complexity of approximating K-set packing,” Computational Complexity, vol. 15, no. 1, pp. 20–39, 2006.

[8] R. M. Karp and M. Sipser, “Maximum matchings in sparse random graphs,” in Proceedings of the 22nd Annual Symposium on Foundations of Computer Science, vol. 12, pp. 364–375, IEEE Computer Society, 1981.

[9] N. C. Wormald, “The differential equation method for random graph processes and greedy algorithms,” in Lectures on Approximation and Randomized Algorithms, pp. 73–155, 1999.

[10] G. Parisi and M. Mézard, “Replicas and optimization,” Journal de Physique Lettres, vol. 46, no. 17, pp. 771–778, 1985.

[11] G. Parisi and M. Mézard, “A replica analysis of the travelling salesman problem,” Journal de Physique, vol. 47, pp. 1285–1296, 1986.

[12] H. Zhou and Z.-C. Ou-Yang, “Maximum matching on random graphs,” In press, http://arxiv.org/abs/cond-mat/0309348.

[13] L. Zdeborová and M. Mézard, “The number of matchings in random graphs,” Journal of Statistical Mechanics, vol. 2006, Article ID P05003, 2006.

[14] L. Dall'Asta, A. Ramezanpour, and P. Pin, “Statistical mechanics of maximal independent sets,” Physical Review E, vol. 80, Article ID 061136, 2009.

[15] M. Mézard and M. Tarzia, “Statistical mechanics of the hitting set problem,” Physical Review E, vol. 76, no. 4, Article ID 041124, 10 pages, 2007.

[16] M. Weigt and A. K. Hartmann, “Glassy behavior induced by geometrical frustration in a hard-core lattice-gas model,” Europhysics Letters (EPL), vol. 62, Article ID 021005, p. 533, 2003.

[17] H. Hansen-Goos and M. Weigt, “A hard-sphere model on generalized Bethe lattices: statics,” Journal of Statistical Mechanics, vol. 2005, Article ID P04006, 2005.

[18] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.

[19] A. Montanari and M. Mézard, Information, Physics, and Computation, Oxford University Press, 2009.

[20] G. Parisi, M. Mézard, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, Singapore, 1987.


[21] M. Mézard and G. Parisi, “The cavity method at zero temperature,” Journal of Statistical Physics, vol. 22, no. 1-2, pp. 1–34, 2003.

[22] D. Gamarnik, T. Nowicki, and G. Swirszcz, “Maximum weight independent sets and matchings in sparse random graphs. Exact results using the local weak convergence method,” Random Structures and Algorithms, vol. 28, no. 1, pp. 76–106, 2006.

[23] A. Montanari, G. Parisi, and F. Ricci-Tersenghi, “Instability of one-step replica-symmetry-broken phase in satisfiability problems,” Journal of Physics A, vol. 37, no. 6, p. 2073, 2004.

[24] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, “Gibbs states and the set of solutions of random constraint satisfaction problems,” Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 25, pp. 10318–10323, 2007.

[25] M. Mézard, F. Ricci-Tersenghi, and R. Zecchina, “Two solutions to diluted p-spin models and XORSAT problems,” Journal of Statistical Physics, vol. 111, no. 3-4, pp. 505–533, 2002.

[26] G. Semerjian, “On the freezing of variables in random constraint satisfaction problems,” Journal of Statistical Physics, vol. 130, no. 2, pp. 251–293, 2008.

[27] M. Mézard and G. Parisi, “The Bethe lattice spin glass revisited,” The European Physical Journal B, vol. 20, no. 2, pp. 217–233, 2001.

[28] S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, “A new algorithm for generating all the maximal independent sets,” SIAM Journal on Computing, vol. 6, no. 3, pp. 505–517, 1977.

[29] G. Csárdi and T. Nepusz, “The igraph software package for complex network research,” InterJournal Complex Systems, p. 1695, 2006.

[30] N. C. Wormald, “Models of random regular graphs,” in Surveys in Combinatorics, London Mathematical Society Lecture Note, pp. 239–298, 1999.

[31] S. Takabe and K. Hukushima, “Minimum vertex cover problems on random hypergraphs: replica symmetric solution and a leaf removal algorithm,” In press, http://arxiv.org/abs/1301.5769.

[32] S. Sanghavi, D. Shah, and A. S. Willsky, “Message passing for maximum weight independent set,” IEEE Transactions on Information Theory, vol. 55, no. 11, pp. 4822–4834, 2009.

[33] M. Weigt and H. Zhou, “Message passing for vertex covers,” Physical Review E, vol. 74, Article ID 046110, 19 pages, 2006.


and computer scientists; only in the matching [8] and independent set [9] restrictions has some work been done. Extending some theorems of [8] (by which part of this work is greatly inspired) to a larger class of problem ensembles is one of the main aims of the present work.

On the other hand, the statistical physics literature also lacks an accurate study of the random MSP problem. One of its specializations, though, the matching problem, has been covered since the beginning of the physicists' interest in optimization problems, with the work of Parisi and Mézard on the weighted and fully connected version of the problem [10, 11]. More recently, the matching problem on sparse random graphs has also been accurately studied [12, 13] using the cavity method. The independent set problem on random graphs [14] and the dual problem to set packing, the set covering problem [15], have also received some attention from the disordered systems physics community. The SP problem was investigated with the cavity method formalism in a disguised form, as a glass model on a generalized Bethe lattice, in [16, 17]. This corresponds, as we will see in the next section, to a factor graph ensemble with fixed factor and variable degrees; thus we will not cover this case in Section 7.

The paper is organized as follows.

(i) In Section 2 we map the MSP problem ((1)–(3)) onto a statistical physics model defined on a factor graph and relate the MSP size to the density $\rho$ at infinite chemical potential.

(ii) We introduce the replica symmetric (RS) cavity method in Section 3 and give an estimate for the average MSP size on sparse factor graph ensembles in the thermodynamic limit.

(iii) In Section 4 we establish a criterion for the validity of the RS ansatz and introduce the 1RSB formalism.

(iv) In Section 5 we propose a generalization of the Karp-Sipser heuristic algorithm [8] to the MSP problem and prove the validity of the RS ansatz for certain ensembles of problems. Moreover, we find a relationship between the emergence of a core and the breaking of replica symmetry.

(v) Section 6 describes the numerical simulations performed.

(vi) In Section 7 we apply the analytical tools developed to some problem ensembles. We compare the numerical results obtained from an exact algorithm with the analytical predictions, focusing to a greater extent on one ensemble modelling the $k$-set packing.

2. Statistical Physics Description

In order to turn the MSP combinatorial optimization problem into a useful statistical physics model, let us recast ((1)–(3)) into a graphical model using the factor graph formalism [18, 19]. We define our variable node set to be $V$, and to each $i \in V$ we associate a variable $n_i$ taking values in $\{0, 1\}$, as in (3). $F$ will be our factor node set, as its elements act as hard constraints on the variables $n_i$ through (2). The edge set $E$ is then naturally defined as $E = \{(i, r) \mid i \in V,\ r \in S_i \subseteq F\}$. We call $G = (V, F, E)$ the factor graph thus composed and can then rewrite (2) as

$$\sum_{i \in \partial r} n_i \le 1 \quad \forall r \in F. \tag{4}$$

An SP is a configuration $\{n_i\}$ satisfying (4), and its relative size is $\rho(\{n_i\}) = (1/N)\sum_{i \in V} n_i$, that is, simply the fraction of occupied sites.

It is now easy to define an appropriate Gibbs measure for the MSP problem on $G$ through the grand canonical partition function

$$\Xi_G(\mu) = \sum_{\{n_i\}} \prod_{i \in V} e^{\mu n_i} \prod_{r \in F} \mathbb{I}\Bigl(\sum_{i \in \partial r} n_i \le 1\Bigr). \tag{5}$$

Only SPs contribute to the partition function, and in the close packing limit, as we will call the limit $\mu \uparrow +\infty$, the measure is dominated by MSPs. Equation (5) is also a model for a particle gas with hard core repulsion and chemical potential $\mu$ located on a hypergraph, and as such it has been studied mainly on lattice structures and, more generally, on ordinary graphs. Model (5) has been studied on a generalized Bethe lattice (i.e., the ensemble $\mathbb{G}_{RR}(d, c)$ defined in Section 7, a $d$-uniform $c$-regular factor graph) in [16, 17] as a prototype of a system with finite connectivity showing a glassy behaviour. This has been the only approach from the statistical physics community, although disguised as a hard spheres model, to a general MSP problem.

The grand canonical potential is defined as

$$\omega_G(\mu) = -\frac{1}{\mu N} \log \Xi_G(\mu) \tag{6}$$

and the particle density as

$$\rho_G(\mu) = \frac{1}{N} \Bigl\langle \sum_{i \in V} n_i \Bigr\rangle_{G,\mu} = -\omega_G(\mu) - \mu\,\partial_\mu \omega_G(\mu). \tag{7}$$

Grand potential and density are related to entropy by the thermodynamic relation

$$s_G(\mu) = -\mu\bigl(\omega_G(\mu) + \rho_G(\mu)\bigr) = \mu^2\,\partial_\mu \omega_G(\mu). \tag{8}$$
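For completeness, the second equalities in (7) and (8) follow from (6) in two lines. The sketch below only uses $\log \Xi_G = -\mu N \omega_G$ and the standard identity $\langle \sum_i n_i \rangle = \partial_\mu \log \Xi_G$ for a grand canonical measure with weight $e^{\mu \sum_i n_i}$; it is a reading aid, not part of the original derivation.

```latex
\begin{align*}
\rho_G(\mu) &= \frac{1}{N}\,\partial_\mu \log \Xi_G(\mu)
             = \frac{1}{N}\,\partial_\mu\bigl(-\mu N\,\omega_G(\mu)\bigr)
             = -\omega_G(\mu) - \mu\,\partial_\mu \omega_G(\mu),\\
s_G(\mu) &= -\mu\bigl(\omega_G(\mu) + \rho_G(\mu)\bigr)
          = -\mu\bigl(-\mu\,\partial_\mu \omega_G(\mu)\bigr)
          = \mu^2\,\partial_\mu \omega_G(\mu).
\end{align*}
```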

In the close packing limit (i.e., $\mu \uparrow +\infty$) we recover the MSP problem, since in this limit the Gibbs measure is uniformly concentrated on MSPs and $\rho_G$ gives the MSP relative size. Since entropy remains finite in this limit, from (8) we obtain the MSP relative size

$$\rho_G \equiv \lim_{\mu \to +\infty} \rho_G(\mu) = \lim_{\mu \to +\infty} -\omega_G(\mu). \tag{9}$$
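The limit (9) can be checked by brute force on a tiny instance. The following is a minimal sketch, assuming a hypothetical toy factor graph (four subsets, three elements, arranged as a chain); the function names are ours and exhaustive enumeration is only feasible for a handful of variables.

```python
import math
from itertools import product

def packings(num_vars, factors):
    """Yield every 0/1 configuration that satisfies constraint (4)."""
    for n in product((0, 1), repeat=num_vars):
        if all(sum(n[i] for i in r) <= 1 for r in factors):
            yield n

def density(num_vars, factors, mu):
    """Particle density rho_G(mu) under the Gibbs measure defined by (5)."""
    sizes = [sum(n) for n in packings(num_vars, factors)]
    xi = sum(math.exp(mu * k) for k in sizes)  # partition function Xi_G(mu)
    return sum(k * math.exp(mu * k) for k in sizes) / (xi * num_vars)

def grand_potential(num_vars, factors, mu):
    """Grand canonical potential omega_G(mu) as in (6)."""
    xi = sum(math.exp(mu * sum(n)) for n in packings(num_vars, factors))
    return -math.log(xi) / (mu * num_vars)

# Toy instance (an assumption made for illustration): each factor lists the
# variables (subsets) that contain the corresponding element.
factors = [(0, 1), (1, 2), (2, 3)]
msp_size = max(sum(n) for n in packings(4, factors))
```

On this instance the density approaches `msp_size / 4` exponentially fast in $\mu$, while $-\omega_G(\mu)$ approaches the same value only with a $1/\mu$ correction, in line with (8), which gives $-\omega_G = \rho_G + s_G/\mu$.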

In this paper we focus on random instances of the MSP problem. As usual in statistical physics, we will assume the number of variables $N$ (the number of subsets $S_i$) and the number of constraints $M$ to diverge, keeping the ratio $N/M$ finite. We will refer to this limit as the thermodynamic limit. Instances of the MSP problem will be encoded in factor graph ensembles, which we assume to be locally tree-like in the thermodynamic limit.

The MSP relative size $\rho_G$ is a self-averaging quantity in the thermodynamic limit, and we want to compute its asymptotic value

$$\rho = \lim_{N \to +\infty} \mathbb{E}_G[\rho_G], \tag{10}$$

where we denote with $\mathbb{E}_G[\cdot]$ the expectation over the factor graph ensemble. In the last equation the $N$ dependence is encoded in the graph ensemble considered. Computing (10) is not an easy task, and some approximations have to be made. We will employ the cavity method from the statistical physics of disordered systems [19, 20], using both the replica symmetric (RS) and the one-step replica symmetry breaking (1RSB) ansatz. We will prove in Section 5 that the RS ansatz is exact in a certain region of the phase space, while in Section 7 we will give numerical evidence that the 1RSB approximation gives very good results outside the RS region.

3. Replica Symmetry

3.1. Bethe Approximation on a Single Instance. The RS cavity method has been known for many decades outside the statistical physics community as the Belief Propagation (BP) algorithm, and only in recent years have the two approaches been bridged [18, 19]. We start with a variational approximation to the grand potential (6) of an instance of the problem, the Bethe free energy approximation:

$$\omega^{\mathrm{RS}}_G[\boldsymbol{\nu}] = \frac{1}{N}\Bigl[\sum_{r \in F} \omega_r[\boldsymbol{\nu}] + \sum_{i \in V} (1 - |\partial i|)\,\omega_i[\boldsymbol{\nu}]\Bigr], \tag{11}$$

with the factor and variable contributions given by

$$\begin{aligned}
\omega_r[\boldsymbol{\nu}] &= -\frac{1}{\mu} \log\Biggl[\sum_{\{n_i\}_{i \in \partial r}} \mathbb{I}\Bigl(\sum_{i \in \partial r} n_i \le 1\Bigr) \prod_{j \in \partial r}\ \prod_{s \in \partial j \setminus r} \nu_{s \to j}(n_j)\Biggr],\\
\omega_i[\boldsymbol{\nu}] &= -\frac{1}{\mu} \log\Biggl[\sum_{n_i} e^{\mu n_i} \prod_{r \in \partial i} \nu_{r \to i}(n_i)\Biggr].
\end{aligned} \tag{12}$$

The grand canonical potential is expressed as a function of the factor node to variable node messages $\boldsymbol{\nu} = \{\nu_{r \to i}\}$. Minimization of $\omega^{\mathrm{RS}}_G[\boldsymbol{\nu}]$ over the messages, constrained to be normalized to one, yields the fixed point BP equations for the set packing:

$$\begin{aligned}
\nu_{r \to i}(1) &= \frac{1}{Z_{r \to i}} \prod_{j \in \partial r \setminus i}\ \prod_{s \in \partial j \setminus r} \nu_{s \to j}(0),\\
\nu_{r \to i}(0) &= \frac{1}{Z_{r \to i}} \Biggl[\prod_{j \in \partial r \setminus i}\ \prod_{s \in \partial j \setminus r} \nu_{s \to j}(0) + e^{\mu} \sum_{j \in \partial r \setminus i}\ \prod_{s \in \partial j \setminus r} \nu_{s \to j}(1) \times \prod_{j' \in \partial r \setminus \{i, j\}}\ \prod_{s' \in \partial j' \setminus r} \nu_{s' \to j'}(0)\Biggr].
\end{aligned} \tag{13}$$

The coefficients $Z_{r \to i}$ are normalization factors. Equations (13) can be simplified by introducing the fields $t_{r \to i}$, defined as

$$\frac{\nu_{r \to i}(1)}{\nu_{r \to i}(0)} = e^{-\mu t_{r \to i}}, \tag{14}$$

yielding

$$t_{r \to i} = \frac{1}{\mu} \log\Biggl[1 + \sum_{j \in \partial r \setminus i} e^{\mu\bigl(1 - \sum_{s \in \partial j \setminus r} t_{s \to j}\bigr)}\Biggr]. \tag{15}$$

Since we are interested in the close packing limit, to solve the problem we will straightforwardly apply the zero temperature cavity method [21]. The related BP equations, which can be found as the $\mu \uparrow \infty$ limit of (15), read

$$t_{r \to i} = \max\Bigl[\{0\} \cup \Bigl\{1 - \sum_{s \in \partial j \setminus r} t_{s \to j}\Bigr\}_{j \in \partial r \setminus i}\Bigr]. \tag{16}$$
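The passage from (15) to (16) is the usual statement that a scaled log-sum-exp tends to a maximum. The sketch below, with function names of our own choosing, evaluates both updates for a given list of cavity fields $h_j = 1 - \sum_{s \in \partial j \setminus r} t_{s \to j}$ so that the two can be compared at large $\mu$.

```python
import math

def t_finite_mu(fields, mu):
    """Update (15); fields[j] is 1 minus the sum of messages entering neighbour j."""
    # log(1 + sum_j exp(mu * h_j)) / mu, computed stably by factoring out
    # the largest exponent before exponentiating.
    exps = [0.0] + [mu * h for h in fields]
    m = max(exps)
    return (m + math.log(sum(math.exp(e - m) for e in exps))) / mu

def t_zero_temperature(fields):
    """Update (16): the mu -> infinity limit of (15)."""
    return max([0.0] + list(fields))
```

For $k$ neighbours the two updates differ by at most $\log(k + 1)/\mu$, which is why the zero temperature rule can be used directly in the close packing limit.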

We note that the messages $t_{r \to i}$ are bounded to take values in the interval $[0, 1]$ and that, if we set the initial value of each $t_{r \to i}$ in the discrete set $\{0, 1\}$, at each BP iteration all messages will take value either 0 or 1. These values can be directly interpreted as the occupational loss occurring in the subtree $r \to i$ if the subtree is connected to the occupied node $i$. This loss cannot be negative (thus a gain), since we put an additional constraint on the subtree, demanding every neighbour $j$ of $r$ to be empty, and it cannot be greater than 1 either: in fact, $t_{r \to i} = 1$ corresponds to the worst case scenario, where an otherwise occupied node $j \in \partial r \setminus i$ has to be emptied.

The Bethe free energy for model (5) on the factor graph $G$ can be expressed as a function of the fixed point messages $\{t_{r \to i}\}$ as

$$\omega^{\mathrm{RS}}_G(\mu) = -\frac{1}{\mu N} \Biggl[\sum_{r \in F} \log\Bigl(1 + \sum_{j \in \partial r} e^{\mu\bigl(1 - \sum_{s \in \partial j \setminus r} t_{s \to j}\bigr)}\Bigr) + \sum_{i \in V} (1 - |\partial i|) \log\Bigl(1 + e^{\mu\bigl(1 - \sum_{r \in \partial i} t_{r \to i}\bigr)}\Bigr)\Biggr]. \tag{17}$$

We finally arrive at the Bethe estimate of the MSP relative size by taking the close packing limit of (17) and using $\omega^{\mathrm{RS}}_G = -\rho^{\mathrm{RS}}_G$, which gives

$$\rho^{\mathrm{RS}}_G = \frac{1}{N} \Biggl[\sum_{r \in F} \max\Bigl[\{0\} \cup \Bigl\{1 - \sum_{s \in \partial j \setminus r} t_{s \to j}\Bigr\}_{j \in \partial r}\Bigr] + \sum_{i \in V} (1 - |\partial i|) \max\Bigl\{0,\ 1 - \sum_{r \in \partial i} t_{r \to i}\Bigr\}\Biggr]. \tag{18}$$
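Equations (16) and (18) can be tried out directly on small tree factor graphs, where the Bethe estimate is expected to be exact. The following sketch is illustrative only: the representation of a factor as a tuple of variable indices, the synchronous update schedule, and all function names are our own assumptions, and the brute-force routine is included purely as a cross-check.

```python
from itertools import product

def bp_fixed_point(num_vars, factors, sweeps=100):
    """Iterate the zero-temperature update (16) on every directed edge (r, i)."""
    nbrs = [[r for r, f in enumerate(factors) if i in f] for i in range(num_vars)]
    t = {(r, i): 0 for r, f in enumerate(factors) for i in f}
    for _ in range(sweeps):
        new = {}
        for (r, i) in t:
            # Field sent by each neighbour j of r other than i.
            fields = [1 - sum(t[(s, j)] for s in nbrs[j] if s != r)
                      for j in factors[r] if j != i]
            new[(r, i)] = max([0] + fields)
        if new == t:  # fixed point reached
            break
        t = new
    return t, nbrs

def bethe_size(num_vars, factors):
    """N times the right-hand side of (18), i.e. the estimated MSP size."""
    t, nbrs = bp_fixed_point(num_vars, factors)
    factor_part = sum(
        max([0] + [1 - sum(t[(s, j)] for s in nbrs[j] if s != r) for j in f])
        for r, f in enumerate(factors))
    variable_part = sum(
        (1 - len(nbrs[i])) * max(0, 1 - sum(t[(r, i)] for r in nbrs[i]))
        for i in range(num_vars))
    return factor_part + variable_part

def brute_force_size(num_vars, factors):
    """Exact MSP size by exhaustive enumeration (tiny instances only)."""
    return max(sum(n) for n in product((0, 1), repeat=num_vars)
               if all(sum(n[i] for i in f) <= 1 for f in factors))
```

On a tree the messages are determined from the leaves inward, so the synchronous iteration stops after a number of sweeps of the order of the tree depth. On graphs with loops neither convergence nor exactness is guaranteed by this sketch.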

Let us examine the various contributions to (18), since we want to convince ourselves that it exactly counts the MSP size, at least on tree factor graphs. The term $(1 - |\partial i|)\max\{0, 1 - \sum_{r \in \partial i} t_{r \to i}\}$ contributes with $1 - |\partial i|$ to the sum only if all the incoming $t$ messages are zero. In this case $n_i$ is frozen to 1, that is, the variable $i$ takes part in all the MSPs in $G$. Obviously, all the neighbours of a variable frozen to 1 have to be frozen to 0. To all its $|\partial i|$ neighbours $r$, the frozen-to-1 variable $i$ sends a message $1 - \sum_{s \in \partial i \setminus r} t_{s \to i} = 1$, so that we have $|\partial i|$ contributions $\max[\{0\} \cup \{1 - \sum_{s \in \partial j \setminus r} t_{s \to j}\}_{j \in \partial r}] = 1$ in the first sum of (18), and the total contribution from $i$ correctly sums up to 1. If for a certain $i$ we have a total field $\tau_i \equiv 1 - \sum_{r \in \partial i} t_{r \to i} < 0$ (two or more incoming messages are equal to one), the variable is frozen to 0, that is, it does not take part in any MSP. It correctly does not contribute to $\omega^{\mathrm{RS}}_G$, since it sends a message $1 - \sum_{s \in \partial i \setminus r} t_{s \to i} \le 0$ to each neighbour $r$; thus it is not counted in $\max[\{0\} \cup \{1 - \sum_{s \in \partial j \setminus r} t_{s \to j}\}_{j \in \partial r}]$.

The third case is the most interesting. It concerns variables $i$ which take part in a fraction of the MSPs. We will call them unfrozen variables. The total field on an unfrozen variable $i$ is $\tau_i = 0$ (thus we have no contribution from the second sum in (18)), and all incoming messages are 0 except for a single $t_{r \to i} = 1$. To this sole function node $r$ the node $i$ sends a message 1, so that the contribution of $r$ to the first sum is 1. Actually, the BP equations impose that $r$ has to have at least another unfrozen neighbour besides $i$. In other terms, the function node $r$ says that, whatever MSP we consider, one of its neighbours has to be occupied. The corresponding term $\max[\{0\} \cup \{1 - \sum_{s \in \partial j \setminus r} t_{s \to j}\}_{j \in \partial r}] = 1$ in (18) accounts for that.

The presence of unfrozen variables is the reason why we cannot express the density $\rho_G$ through the formula

$$\rho_G = \frac{1}{N} \sum_{i \in V} \langle n_i \rangle. \tag{19}$$

In fact, using the infinite chemical potential formalism, we cannot compute

$$\langle n_i \rangle = \lim_{\mu \to +\infty} \frac{e^{\mu\bigl(1 - \sum_{r \in \partial i} t_{r \to i}\bigr)}}{1 + e^{\mu\bigl(1 - \sum_{r \in \partial i} t_{r \to i}\bigr)}} \tag{20}$$

when $\lim_{\mu \to +\infty} 1 - \sum_{r \in \partial i} t_{r \to i} = 0$, and we would have to use the $O(1/\mu)$ corrections to the fields $t_{r \to i}$. We bypass the problem by using the grand potential $\omega^{\mathrm{RS}}_G$ to obtain $\rho^{\mathrm{RS}}_G$, thereby also addressing a problem, reported in [22], of extending an analysis suited for weighted matchings and independent sets to the unweighted case.

3.2. Ensemble Averages. To proceed further in the analysis, and since one is often concerned with the average properties of a class of related factor graphs, let us consider the case where the factor graph $G$ is sampled from a locally tree-like factor graph ensemble $\mathbb{G}(N)$. We will employ the following notation for the graph ensemble expectations: $\mathbb{E}_G[\cdot]$ for graph averages; $\mathbb{E}_{C_0}[\cdot]$ ($\mathbb{E}_{D_0}[\cdot]$) for expectations over the factor (variable) degree distribution, which we will sometimes call root degree distribution; and $\mathbb{E}_C[\cdot]$ ($\mathbb{E}_D[\cdot]$) for expectations over the excess degree distribution of factor (variable) nodes conditioned to have at least one adjacent edge, which we will sometimes call residual degree distribution. The quantities $c$ and $d$ (the mean factor and variable degrees) and the random variables $C$, $C_0$, $D$, and $D_0$ are related by

$$\mathbb{P}[C = k] = \frac{(k + 1)\,\mathbb{P}[C_0 = k + 1]}{c}, \qquad \mathbb{P}[D = k] = \frac{(k + 1)\,\mathbb{P}[D_0 = k + 1]}{d}. \tag{21}$$

In Section 7 we discuss some specific factor graph ensembles where $C$ and $D$ are fixed to a deterministic value or are Poisson distributed.
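Relation (21) is the usual size-biasing of a degree distribution. As a small sketch (the dictionary representation of a probability mass function and the truncation of the Poisson law are our own choices), the residual distribution can be computed from the root one as follows.

```python
import math

def residual_pmf(root_pmf):
    """Apply (21): P[C = k] = (k + 1) P[C0 = k + 1] / mean(C0)."""
    mean = sum(k * p for k, p in root_pmf.items())
    return {k - 1: k * p / mean for k, p in root_pmf.items() if k >= 1}

def poisson_pmf(c, kmax=60):
    """Poisson(c) truncated at kmax (the truncation is an approximation of this sketch)."""
    return {k: math.exp(-c) * c ** k / math.factorial(k) for k in range(kmax + 1)}
```

Two standard consequences are easy to verify with it: a deterministic root degree $d$ gives a deterministic residual degree $d - 1$, and a Poisson root degree gives a residual degree with the same Poisson law.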

With these definitions, the distributional equation corresponding to the Belief Propagation formula (16) reads

$$T' \overset{d}{=} \max\Bigl[\{0\} \cup \Bigl\{1 - \sum_{s=1}^{D_j} T_{sj}\Bigr\}_{j \in \{1, \ldots, C\}}\Bigr], \tag{22}$$

where $\{D_j\}$ are i.i.d. random residual variable degrees, $C$ is a random residual factor degree, and $\{T_{sj}\}$ are i.i.d. random incoming messages. Since on a given graph each message $t_{r \to i}$ takes values only in $\{0, 1\}$, we can take the distribution of messages $t$ to be of the form

$$P(t) = p\,\delta(t) + (1 - p)\,\delta(t - 1). \tag{23}$$

For this to be a fixed point of (22), the parameter $p$ has to satisfy the self-consistent equation

$$p_* = \mathbb{E}_C \bigl(1 - \mathbb{E}_D\, p_*^{D}\bigr)^{C}. \tag{24}$$

Using (18) and (24), we obtain the replica symmetric approximation to the asymptotic MSP relative size:

$$\rho^{\mathrm{RS}} = \frac{d}{c}\Bigl(1 - \mathbb{E}_{C_0}\bigl(1 - \mathbb{E}_D\, p_*^{D}\bigr)^{C_0}\Bigr) + \mathbb{E}_{D_0} (1 - D_0)\, p_*^{D_0}. \tag{25}$$

It turns out that the RS approximation is exact only when the ratio $N/M$ is sufficiently low. In the SP language, (25) holds true only when the number of subsets among which we choose our MSP is not too big compared to the number of elements of which they are composed.

In the next section we will quantitatively establish the limits of validity of the RS ansatz and introduce the one-step replica symmetry breaking formalism, which provides a better approximation to the exact results in the regime of large $N/M$ ratio. In Section 5 we prove that (25) is exact for certain choices of the factor graph ensembles.
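To make (24) and (25) concrete, here is a minimal numerical sketch for one illustrative ensemble, chosen by us and not defined in this section: every variable node has degree $d$ (all sets have size $d$) and factor degrees are Poisson with mean $c$. Under this assumption $D = d - 1$, both $C_0$ and $C$ are Poisson with mean $c$, and the expectations in (24) and (25) reduce to $\exp(-c\,p^{d-1})$ and $(1 - d)\,p^{d}$.

```python
import math

def f(p, c, d):
    """f(p) = E_C (1 - E_D p^D)^C for C ~ Poisson(c), D = d - 1 (assumed ensemble)."""
    return math.exp(-c * p ** (d - 1))

def fixed_point(c, d, tol=1e-14):
    """Solve (24), p = f(p), by bisection; f is nonincreasing, so the root is unique."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid, c, d) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rho_rs(c, d):
    """Equation (25) specialised to the assumed ensemble."""
    p = fixed_point(c, d)
    return (d / c) * (1.0 - f(p, c, d)) + (1 - d) * p ** d
```

For $d = 2$ the variables are the edges of a sparse random graph and the factors its vertices, so `rho_rs(c, 2) * c / 2` is the RS estimate of the fraction of matched pairs per vertex. Whether the estimate is exact at a given $c$ is the subject of the next sections.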


4. Replica Symmetry Breaking

4.1. RS Consistency and Bugs Propagation. Here we propose two criteria to check the consistency of the RS cavity method. If either of them fails, the RS ansatz cannot be considered consistent.

The first criterion is the assumption of uniqueness of the fixed point of (24) and its dynamical stability under iteration. We restrict ourselves to the subspace of distributions with support on $\{0, 1\}$, although it is possible to extend the following analysis to the whole space of distributions over $[0, 1]$ with an argument based on stochastic dominance, following [22]. Characterizing the distributions over $\{0, 1\}$ as in (23) with a real parameter $p \in [0, 1]$, from (22) we obtain the dynamical system

$$p' = \mathbb{E}_C \bigl(1 - \mathbb{E}_D\, p^{D}\bigr)^{C} \equiv f(p). \tag{26}$$

The stability criterion

$$\bigl|f'(p_*)\bigr| < 1 \tag{27}$$

suggests that the RS approximation to the MSP size (25) is exact as long as (27) is satisfied. This statement will be made rigorous in Section 5.
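As an illustration of criterion (27), consider again an ensemble that we assume for the sake of the example: all variable nodes have degree $d$ and factor degrees are Poisson with mean $c$, so that $f(p) = \exp(-c\,p^{d-1})$. At the fixed point $f(p_*) = p_*$, hence $|f'(p_*)| = c\,(d - 1)\,p_*^{\,d-1}$, and the sketch below checks on which side of 1 this quantity falls.

```python
import math

def fixed_point(c, d, tol=1e-14):
    """Unique root of p = exp(-c p^(d-1)) in [0, 1], found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.exp(-c * mid ** (d - 1)) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def stability_slope(c, d):
    """|f'(p*)| = c (d - 1) p*^(d - 1) for the assumed ensemble."""
    p = fixed_point(c, d)
    return c * (d - 1) * p ** (d - 1)

def is_rs_stable(c, d):
    """Criterion (27)."""
    return stability_slope(c, d) < 1.0
```

Writing $u = |f'(p_*)|$, the fixed point condition becomes $u\,e^{u} = c\,(d - 1)$, so for this ensemble (27) holds exactly when $c < e/(d - 1)$, which is the threshold quoted in the abstract.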

The second method we use to check the RS stability, called bugs proliferation, is the zero temperature analogue of the spin glass susceptibility. We will compute the average number of changing $t$ messages induced by a change in a single message $t_{r \to i}$ ($1 \to 0$ or $0 \to 1$). This is given by

$$N_{\mathrm{ch}} = \mathbb{E}\Biggl[\sum_{(s, j)}\ \sum_{\substack{a_0, a_1 \\ b_0, b_1}} \mathbb{I}\bigl(t_{s \to j} = b_0 \to b_1 \mid t_{r \to i} = a_0 \to a_1\bigr)\Biggr], \tag{28}$$

where $a_0$, $a_1$, $b_0$, and $b_1$ take values in $\{0, 1\}$. Since a random factor graph is locally a tree, and assuming correlations decay fast enough, the last equation can be expressed as

$$N_{\mathrm{ch}} = \sum_{s=0}^{+\infty} \bigl(\overline{C}\,\overline{D}\bigr)^{s} \sum_{\substack{a_0, a_1 \\ b_0, b_1}} P\bigl(t_s = b_0 \to b_1 \mid t_o = a_0 \to a_1\bigr), \tag{29}$$

where $t_s$ is a message at distance $s$ from the tree root $o$ and we defined the average residual degrees $\overline{C} = \mathbb{E}_C[C]$ and $\overline{D} = \mathbb{E}_D[D]$. The stability condition $N_{\mathrm{ch}} < +\infty$ yields a constraint on the greatest eigenvalue $\lambda_M$ of the transfer matrix $P(b_0 \to b_1 \mid a_0 \to a_1)$:

$$\overline{C}\,\overline{D}\,\lambda_M < 1. \tag{30}$$

The two methods presented above give equivalent conditions for the RS ansatz to hold true, and they simply express the independence of a finite subgraph from the tail boundary conditions.

4.2. The 1RSB Formalism. We are going to develop the 1RSB formalism for the MSP problem and then apply it in Section 7.1 to the ensemble $\mathbb{G}_{RP}$. We will not check the coherence of the 1RSB ansatz through the interstate and intrastate susceptibilities [23]; we are therefore not guaranteed against the need for further steps of replica symmetry breaking in order to recover the exact solution. Even in the worst case scenario, though, when the 1RSB solution is trivially exact only in the RS region and a full RSB ansatz is needed otherwise, the 1RSB prediction for the MSP relative size $\rho$ should be everywhere more accurate than the RS one and possibly very close to the real value. We refer to the textbook of Montanari and Mézard [19] for a detailed exposition of the 1RSB cavity method.

Let us fix a factor graph $G$ from a locally tree-like ensemble $\mathbb{G}$. We call $Q_{r \to i}(t_{r \to i})$ the distribution of messages on the directed edge $r \to i$ over the states of the system. We still expect the messages $t_{r \to i}$ to take values 0 or 1, so that $Q_{r \to i}$ can be parametrized as

$$Q_{r \to i}(t_{r \to i}) = q_{r \to i}\,\delta(t_{r \to i}) + (1 - q_{r \to i})\,\delta(t_{r \to i} - 1). \tag{31}$$

The 1RSB Parisi parameter $x \in [0, 1]$ has to be properly rescaled in order to correctly take the limit $\mu \uparrow \infty$. Therefore, we introduce the new 1RSB parameter $y = \mu x$, which stays finite in the close packing limit and takes values in $[0, +\infty)$. The reweighting factor $e^{-y\,\omega^{\mathrm{iter}}_{r \to i}}$ is defined as

$$e^{-y\,\omega^{\mathrm{iter}}_{r \to i}} = \frac{Z_{r \to i}}{\prod_{j \in \partial r \setminus i} \prod_{s \in \partial j \setminus r} Z_{s \to j}}. \tag{32}$$

The last equation, combined with (13) and (14), gives $\omega^{\mathrm{iter}}_{r \to i} = -t_{r \to i}$. Averaging over the whole ensemble, we can then write the zero temperature 1RSB message passing rules (also called Survey Propagation equations):

$$q' \overset{d}{=} \frac{\prod_{j=1}^{C} \bigl(1 - \prod_{s=1}^{D_j} q_{sj}\bigr)}{e^{y} + (1 - e^{y}) \prod_{j=1}^{C} \bigl(1 - \prod_{s=1}^{D_j} q_{sj}\bigr)}. \tag{33}$$

In the preceding equation $\{q_{sj}\}$ are i.i.d. random variables on $[0, 1]$ and, as usual, $C$ is the random residual factor degree and $\{D_j\}$ are independent random residual variable degrees, as in (22). Fixed points of (33) take the form

$$P(q) = p_0\,\delta(q) + p_1\,\delta(q - 1) + p_2\,P_2(q), \tag{34}$$

where $P_2(q)$ is a continuous distribution on $[0, 1]$ and $p_2 = 1 - p_0 - p_1$. The parameters $p_0$ and $p_1$ have to satisfy the closed equations

$$\begin{aligned}
p_1 &= \mathbb{E}_C \bigl(1 - \mathbb{E}_D (1 - p_0)^{D}\bigr)^{C},\\
p_0 &= 1 - \mathbb{E}_C \bigl(1 - \mathbb{E}_D\, p_1^{D}\bigr)^{C}.
\end{aligned} \tag{35}$$

Solutions of (35) with $p_2 = 0$ correspond to replica symmetric solutions, and their instability marks the onset of a spin glass phase. In this new phase the MSPs are clustered according to the general scenario displayed by constraint satisfaction problems [24].

From the stable fixed point of (33) we can calculate the 1RSB free energy functional $\phi(y)$ as

$$-y\,\phi(y) = \frac{d}{c}\,\mathbb{E} \log\Biggl[(1 - e^{y}) \prod_{j=1}^{C_0} \Bigl(1 - \prod_{s=1}^{D_j} q_{sj}\Bigr) + e^{y}\Biggr] + \mathbb{E}\,(1 - D_0) \log\Biggl[(1 - e^{y}) \Bigl(1 - \prod_{r=1}^{D_0} q_r\Bigr) + e^{y}\Biggr] \tag{36}$$

and the 1RSB density $\rho_{\mathrm{1RSB}}(y) = -\partial\bigl(y\,\phi(y)\bigr)/\partial y$ as

$$\rho_{\mathrm{1RSB}}(y) = \frac{d}{c}\,\mathbb{E}\,\frac{e^{y} \Bigl(1 - \prod_{j=1}^{C_0} \bigl(1 - \prod_{s=1}^{D_j} q_{sj}\bigr)\Bigr)}{(1 - e^{y}) \prod_{j=1}^{C_0} \bigl(1 - \prod_{s=1}^{D_j} q_{sj}\bigr) + e^{y}} + \mathbb{E}\,(1 - D_0)\,\frac{e^{y} \prod_{r=1}^{D_0} q_r}{(1 - e^{y}) \bigl(1 - \prod_{r=1}^{D_0} q_r\bigr) + e^{y}}, \tag{37}$$

with expectations intended over $\mathbb{G}$ and over the fixed point messages $\{q_r\}$ and $\{q_{sj}\}$. Since the free energy functional $\phi(y)$ and the complexity $\Sigma(\rho)$ are related by the Legendre transform

$$\Sigma(\rho) = -y\rho - y\,\phi(y), \tag{38}$$

with $\partial \Sigma / \partial \rho = -y$, through (36) we can compute the complexity by taking the inverse transform. Equilibrium states, that is, MSPs, are selected by

$$\rho_{\mathrm{1RSB}} = \arg\max_{\rho}\,\{\rho : \Sigma(\rho) \ge 0\} \tag{39}$$

or, equivalently, by taking the 1RSB parameter $y$ to be

$$y_s = \arg\max_{y \in [0, +\infty]} \phi(y). \tag{40}$$

In the static 1RSB phase we expect $\Sigma(\rho_s) = 0$, so that from (38) we have $\phi(y_s) = -\rho_s$. We will see that this is generally true, except for the ensemble $\mathbb{G}_{RP}(2, c)$ of Section 7, corresponding to maximum matchings on ordinary graphs, where the equilibrium states have maximal complexity and $y_s = +\infty$. The relation

$$\rho_{\mathrm{1RSB}} = -\phi(y_s) \tag{41}$$

is always valid, though, since for $y_s \uparrow \infty$ the complexity stays finite.

5. A Heuristic Algorithm and Exact Results

In this section we propose a heuristic greedy algorithm to address the MSP problem. It is a natural generalization of the algorithm that Karp and Sipser proposed to solve the maximum matching problem on Erdős-Rényi random graphs [8]; therefore we will call it generalized Karp-Sipser (GKS). Extending their derivation concerning the leaf removal part of the algorithm, we are able to prove that the RS prediction for the MSP density is exact as long as the stability criterion (27) is satisfied. We will not give the proofs of the following theorems, as they are lengthy but effortless extensions of those given in [8]. In order to find the maximum matching on a graph, Karp and Sipser noticed that, as long as the graph contains a node of degree one (a leaf), its unique edge has to belong to one of the maximum matchings.

They considered the simplest randomized algorithm one can imagine: as long as there is any leaf, remove it from the graph; otherwise remove a random edge; then iterate until the graph is depleted. They studied the behaviour of this leaf-removal algorithm on random graphs and were able to prove that it grants w.h.p. a maximum matching (within an $o(n)$ error).

To generalize some of their results, we need to extend the definition of leaf to that of pendant. We call pendant a variable node whose factor neighbours all have degree one, except for one at most. Stating the same concept in different words, all of the neighbours of a pendant have the pendant itself as their sole neighbour, except for one of them at most. See Figure 1 for a pictorial representation of a pendant (in red). The GKS algorithm is articulated in two phases, a pendant removal phase and a random occupation phase. We give the pseudocode for the generalized Karp-Sipser algorithm (see Algorithm 1).
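The definition of a pendant translates directly into a test on the adjacency structure of the factor graph. The sketch below is only illustrative: the two-dictionary representation and the function name are our own, and it covers the detection of pendants, not the full GKS algorithm.

```python
def find_pendants(var_factors, factor_vars):
    """Return the variable nodes that are pendants.

    var_factors[i] is the set of factors adjacent to variable i;
    factor_vars[r] is the set of variables adjacent to factor r.
    A variable is a pendant if at most one of its factor neighbours
    has degree larger than one. With this reading an isolated
    variable (no factor neighbours) also qualifies.
    """
    pendants = []
    for i, factors in var_factors.items():
        shared = [r for r in factors if len(factor_vars[r]) > 1]
        if len(shared) <= 1:
            pendants.append(i)
    return pendants
```

For ordinary graphs, where every variable (edge) touches exactly two factors (vertices), a pendant is an edge with at least one endpoint of degree one, so the definition reduces to the leaf used by Karp and Sipser.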

At each step the algorithm prioritizes the removal of pendants over that of random variable nodes. We notice that the removal of a pendant is always an optimal choice in order to achieve an MSP; we have no guarantees, though, on the effect of the occupation of a random node. We call phase 1 the execution of the algorithm up to the point where the first nonpendant variable is added to $V'$. It is trivial to show that phase 1 is enough to find an MSP on a tree factor graph. The interesting thing, though, is that phase 1 is also able to deplete nontree factor graphs and find an MSP, as long as the factor graphs are sufficiently sparse and large enough.
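The two phases described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used for the simulations reported here: the dictionary-based graph representation, the function name `gks`, and the returned `core` (taken here as the set of variable nodes still present in the graph when the first nonpendant choice is forced) are assumptions made for the example, and the quadratic-time pendant search is chosen for readability only.

```python
import random


def gks(var_to_factors, seed=0):
    """Sketch of generalized Karp-Sipser on a factor graph.

    var_to_factors maps each variable node to the factor nodes it touches
    (hypothetical input format). Returns (packing, core).
    """
    rng = random.Random(seed)
    var_nbrs = {i: set(fs) for i, fs in var_to_factors.items()}
    fac_nbrs = {}
    for i, fs in var_nbrs.items():
        for r in fs:
            fac_nbrs.setdefault(r, set()).add(i)

    def is_pendant(i):
        # At most one factor neighbour is shared with another variable.
        return sum(1 for r in var_nbrs[i] if len(fac_nbrs[r]) > 1) <= 1

    def remove_variable(i):
        for r in var_nbrs.pop(i):
            fac_nbrs[r].discard(i)

    def occupy(i):
        # Remove i, its factor neighbours, and their variable neighbours.
        factors = list(var_nbrs[i])
        blocked = set().union(*(fac_nbrs[r] for r in factors)) - {i}
        for j in blocked:
            remove_variable(j)
        remove_variable(i)
        for r in factors:
            del fac_nbrs[r]

    packing, core = [], None
    while var_nbrs:
        pendants = [i for i in var_nbrs if is_pendant(i)]
        if pendants:
            chosen = rng.choice(pendants)
        else:
            if core is None:
                core = set(var_nbrs)  # phase 1 ends here
            chosen = rng.choice(list(var_nbrs))
        packing.append(chosen)
        occupy(chosen)
    return packing, (core or set())
```

On a tree factor graph the sketch never leaves phase 1, so the returned core is empty; on a graph without pendants (for instance three variables pairwise sharing a factor) the first choice is already random and the core is the whole variable set.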

We call core the subset of the variable nodes which has not been assigned to the MSP in phase 1. We will show how the emergence of a core is directly related to replica symmetry breaking. Let us define, as usual,

$$f(x) \equiv \mathbb{E}_C \left(1 - \mathbb{E}_D\, x^D\right)^C. \tag{42}$$

The function $f$ is continuous, nonincreasing, and satisfies the relation $1 \ge f(0) \ge f(1) = \mathbb{P}[C = 0]$. It turns out that, in the large graph limit, phase 1 of GKS is characterized by the solutions of the system of equations

$$p = f(r), \qquad r = f(p). \tag{43}$$

We notice that system (43) is equivalent to the 1RSB equation (35) once the substitutions $p \to p_0$ and $r \to 1 - p_1$ are made.

Lemma 1. The system (43) always admits a (unique) solution of the form $p_* = r_*$.

Proof. By monotonicity and continuity, the function $g(p) = f(p) - p$ has a single zero in the interval $[0, 1]$.


Algorithm 1: Generalized Karp-Sipser (GKS).

Require: a factor graph $G = (V, F, E)$
Ensure: a set packing $V'$
  $V' = \emptyset$
  add to $V'$ all isolated variable nodes and remove them from $G$
  remove from $G$ any isolated factor node
  while $V$ is not empty do
    if $G$ has any pendant then
      choose a pendant $i$ uniformly at random
      add $i$ to $V'$
      remove $i$ from $G$, then remove its factor neighbours and their variable neighbours
    else
      pick uniformly at random a variable node and add it to $V'$
      remove it from $G$, then remove its factor neighbours and their variable neighbours
    end if
    remove from $G$ any isolated factor node
  end while
  return $V'$

Figure 1: (left) A representation of a factor graph containing a pendant, depicted in red. (right) The factor graph after the removal of the pendant, as occurs in the inner part of the while loop in Algorithm 1.

If other solutions are present, the relevant one is the one with the smallest $p$, as the following theorems certify.

Theorem 2. Let $(p, r)$ be the solution of (43) with the smallest $p$. Then the density of factor nodes surviving phase 1 in the large graph limit is given by

$$\psi_1 = 1 + \hat{p} - \hat{r} + c\,p\,\mathbb{E}_D\left[p^D - r^D\right], \tag{44}$$

with

$$\hat{r} = \mathbb{E}_{C_0}\left(1 - \mathbb{E}_D\, p^D\right)^{C_0}, \qquad \hat{p} = \mathbb{E}_{C_0}\left(1 - \mathbb{E}_D\, r^D\right)^{C_0}. \tag{45}$$

In particular, if the smallest solution is $p_* = r_*$, the graph is depleted with high probability in phase 1.

As already said, we will not give the proof of this and of the following theorem, as they are lengthy and can be obtained from the derivation given in Karp and Sipser's article [8] with little effort, even if not in a completely trivial way. Theorem 2 affirms that, as soon as (43) develops another solution, a core survives phase 1. This phenomenon coincides with the need for replica symmetry breaking in the cavity formalism. Let us now establish the exactness of the cavity prediction for $\rho$ in the RS phase.

Theorem 3. Let $(p, r)$ be the solution of (43) with the smallest $p$. Then the density of variables assigned in phase 1 in the large graph limit is

$$\rho_1 = \frac{d}{c}\left(1 - \hat{r}\right) + \mathbb{E}_{D_0}\left(1 - D_0\right) p^{D_0}, \tag{46}$$

where $\hat{r} = \mathbb{E}_{C_0}\left(1 - \mathbb{E}_D\, p^D\right)^{C_0}$ has been defined in the previous theorem. In particular, if the smallest solution is $p_* = r_*$, the replica symmetric cavity method prediction (25) is exact.

These two theorems imply that the RS prediction (25) for $\rho$ holds true as long as phase 1 manages to deplete w.h.p. the whole factor graph. A similar behaviour has been observed in other combinatorial optimization problems, for example, random XORSAT [25]. Conversely, it is easy to prove, given the equivalence between (43) and (35), that if a solution with $p \neq p_*$ exists, the RS fixed point is unstable in the 1RSB distributional space. Therefore the system is in the RS phase if and only if (35) admits a unique solution.

We notice that we could have also looked for the solutions of the single equation $p = f^2(p)$, with $f^2 \equiv f \circ f$, instead of the two of (43), since the value of $r$ is uniquely determined by the value of $p$. Then the previous theorems state that the RS results hold as long as $f^2$ has a single fixed point. In [22] it has been proven that the RS solution of the maximum weighted independent set and maximum weighted matching problems holds true as long as the corresponding squared cavity operator $f^2$ has a unique fixed point. Since those are special cases of the weighted MSP problem, we conjecture that the $f^2$ condition holds for the general case too, as it does for the unweighted case. Probably it would be possible to further generalize the Karp-Sipser algorithm to cover analytically the weighted case as well.

The scenario arising from the analysis of the algorithm and from the 1RSB cavity method is the following: in the RS phase, maximum set packings form a single cluster, and it is possible to connect any two of them with a path involving MSPs separated by a rearrangement of a finite number of variables [26]. In the RSB phase, instead, MSPs can be grouped into many connected (in the sense mentioned above) clusters, each one of them defined by the assignment of the variables in the core. Two MSPs that differ on the core are always separated by a global rearrangement of the variables (i.e., of $O(N)$ variables). In the presence of a core, the GKS algorithm makes some suboptimal random choices (after phase 1).

We did not make an analytical study of the random variable removal part of the GKS algorithm. This has been done by Wormald for the matching problem, associating to the graph process a set of differential equations amenable to analysis [9] (something similar has been done for random XORSAT as well [25]). In that case it turns out that the algorithm still achieves optimality and yields an almost perfect matching on the core. An appropriate analysis of the second phase of the GKS algorithm, and the reason for its failure on most of the SP ensembles we covered, deserve further study.

6. Numerical Simulations

While the RS cavity equations (24) and (25) can be easily computed, the numerical solution of the 1RSB density evolution (33) is much more involved (although simplified by the zero temperature limit) and has been obtained with a standard population dynamics algorithm [27].

The implementation of the GKS algorithm is straightforward. While it is very fast during phase 1, we noticed a huge slowing down in the random removal part. We were able to find set packings for factor graphs with hundreds of thousands of nodes.

In order to test the cavity and GKS predictions, we also computed the exact MSP size on factor graphs of small size. First we notice that a set packing problem coded in a factor graph structure $F$ is equivalent to an independent set problem on an appropriate ordinary graph $G$. The node set of $G$ is the variable node set of $F$, and we add an edge to $G$ between each pair of nodes having a common factor node neighbour in $F$. Therefore each neighbourhood of a factor node in $F$ forms a clique in $G$. We then solve the maximum independent set problem for $G$ using an exact algorithm [28] implemented in the igraph library [29]. Since the time complexity is exponential in the size of the graph, we performed our simulations on graphs containing up to only one hundred nodes.
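The reduction just described can be illustrated with a short Python sketch. It is not the igraph-based procedure used for the results in this paper: the exact solver below is a plain branching recursion swapped in for readability, and the function names and the dictionary input format are assumptions of the example. It is only meant for very small instances.

```python
from itertools import combinations


def conflict_graph(var_to_factors):
    """Adjacency of the ordinary graph G: variables sharing a factor are linked."""
    fac_to_vars = {}
    for i, factors in var_to_factors.items():
        for r in factors:
            fac_to_vars.setdefault(r, set()).add(i)
    adj = {i: set() for i in var_to_factors}
    for members in fac_to_vars.values():
        # The neighbourhood of each factor node becomes a clique.
        for u, v in combinations(members, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def max_independent_set_size(adj):
    """Exact maximum independent set size by exhaustive branching."""
    def solve(candidates):
        if not candidates:
            return 0
        v = max(candidates, key=lambda u: len(adj[u] & candidates))
        if not adj[v] & candidates:
            # No conflicts left: every remaining node can be taken.
            return len(candidates)
        skip_v = solve(candidates - {v})
        take_v = 1 + solve(candidates - {v} - adj[v])
        return max(skip_v, take_v)

    return solve(frozenset(adj))
```

For example, `max_independent_set_size(conflict_graph({0: ["a"], 1: ["a", "b"], 2: ["b"]}))` evaluates the exact MSP size of a three-variable chain, where the two end variables can be packed together.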

7. Applications to Problem Ensembles

We will now apply the methods developed in the previous sections to some factor graph ensembles, each modelling a class of MSP problem instances. We consider graph ensembles containing nodes with Poissonian random degree or regular degree: $\mathbb{G}_{PR}(N, d, c)$, $\mathbb{G}_{RP}(N, d, c)$, $\mathbb{G}_{PP}(N, d, c)$, and $\mathbb{G}_{RR}(N, d, c)$. Subscript $R$ or $P$ indicates whether the type of nodes to which it refers (variable nodes for the first subscript, factor nodes for the second) has regular or Poissonian random degree, respectively. We parametrize these ensembles by their average variable and factor degrees, that is, $d = \mathbb{E}_{D_0} D_0$ and $c = \mathbb{E}_{C_0} C_0$. They are constituted by factor graphs having $N$ variable nodes and $M = \lfloor N d / c \rfloor$ factor nodes, but they differ both in their elements and in their probability law. We define the ensembles by giving the probability of sampling one of their elements.

(i) $\mathbb{G}_{RP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. Every variable node has fixed degree $d$. $G$ is obtained by linking each variable with $d$ factors chosen uniformly and independently at random. The factor node degree distribution obtained is Poissonian of mean $c$ with high probability. This ensemble is a model for the $k$-set packing and will be the main focus of our attention.

(ii) $\mathbb{G}_{PR}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. Every factor node has fixed degree $c$. $G$ is built by linking each factor with $c$ variables chosen uniformly and independently at random. The variable node degree distribution obtained is Poissonian of mean $d$ with high probability.

(iii) $\mathbb{G}_{PP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. $G$ is built by adding an edge $(i, r)$ with probability $c/N$, independently for each choice of a variable $i$ and a factor $r$. The factor graph obtained has w.h.p. a Poissonian variable node degree distribution of mean $d$ and a Poissonian factor node degree distribution of mean $c$.

(iv) $\mathbb{G}_{RR}(N, d, c)$: it is constituted by all factor graphs with $N$ variable nodes of degree $d$ and $M = N d / c$ factor nodes of degree $c$ ($N d$ has to be a multiple of $c$). Every factor graph of the ensemble is equiprobable and can be sampled using a generalization of the configuration model for random regular graphs [30]. This ensemble is a model for the $k$-set packing.

We will omit the argument $N$ when we refer to an ensemble in the $N \uparrow \infty$ limit.
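As an illustration of how an element of the first ensemble can be drawn, the following Python sketch samples a factor graph in the spirit of definition (i). The function name `sample_g_rp` and the returned format are hypothetical, and one detail is an assumption of the sketch rather than something stated above: the $d$ factors of each variable are drawn without repetition, so that every variable has exactly $d$ distinct neighbours.

```python
import math
import random


def sample_g_rp(n_vars, d, c, seed=0):
    """Draw a factor graph with n_vars variables of degree d.

    Returns (var_to_factors, n_factors), with n_factors = floor(n_vars * d / c).
    """
    rng = random.Random(seed)
    n_factors = math.floor(n_vars * d / c)
    if n_factors < d:
        raise ValueError("not enough factor nodes for degree d")
    # Assumption: each variable picks d distinct factors uniformly at random.
    var_to_factors = {i: rng.sample(range(n_factors), d) for i in range(n_vars)}
    return var_to_factors, n_factors
```

With this construction the mean factor degree is $N d / M$, which equals $c$ whenever $N d / c$ is an integer, and for large $N$ the fraction of factor nodes left without neighbours is close to the Poisson value $e^{-c}$.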

7.1. $\mathbb{G}_{RP}(d, c)$. This is the ensemble with variable node degrees fixed to $d$ and Poissonian factor node degree, that is, $C \sim C_0 \sim \mathrm{Poisson}(c)$. The case $d = 2$ corresponds to the maximum matching problem on Erdős-Rényi random graphs [8, 12].

Density evolution (22) for $\mathbb{G}_{RP}$ reduces to

$$T' \overset{d}{=} \max\left(\{0\} \cup \left\{1 - \sum_{s=1}^{d-1} T_{sj}\right\}_{j \in \{1, \dots, C\}}\right). \tag{47}$$

Considering distributions of the form $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$, fixed points of (47) read

$$p_* = e^{-c\,p_*^{d-1}} \equiv f(p_*). \tag{48}$$
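Equation (48) can be checked numerically by running the density evolution (47) on a finite population of messages. The following Python sketch is an illustration under stated assumptions: messages are restricted to the values 0 and 1, the population size, the number of sweeps, and the function names are arbitrary choices, and the Poisson sampler is Knuth's elementary multiplication method rather than anything prescribed by the text. The fraction of messages equal to 0 should settle close to $p_*$.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for moderate lam."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def zero_fraction(d, c, pop_size=5000, sweeps=30, seed=0):
    """Population dynamics for (47) with messages in {0, 1}."""
    rng = random.Random(seed)
    pop = [rng.randint(0, 1) for _ in range(pop_size)]
    for _ in range(sweeps * pop_size):
        n_branches = sample_poisson(c, rng)  # C ~ Poisson(c)
        candidates = [0]
        for _ in range(n_branches):
            total = sum(pop[rng.randrange(pop_size)] for _ in range(d - 1))
            candidates.append(1 - total)
        # Replace a random member of the population with the new message.
        pop[rng.randrange(pop_size)] = max(candidates)
    return pop.count(0) / pop_size
```

For $d = 3$ and $c = 1$ the root of (48) is $p_* \approx 0.653$, so `zero_fraction(3, 1.0)` is expected to fluctuate around that value, up to finite-population noise.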


The last equation admits only one solution for each value of $d$ and $c$, as is easily seen through a monotonicity argument considering the left- and right-hand sides of the equation. The values of $c$, as a function of $d$, satisfying $|f'(p_*)| = 1$ are the critical points $c_s(d)$ delimiting the RS phase ($c < c_s(d)$) and are given by

$$c_s(d) = \frac{e}{d - 1}. \tag{49}$$
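For completeness, the step from the condition $|f'(p_*)| = 1$ to (49) can be written out explicitly, using only the definition of $f$ in (48):

$$\begin{aligned}
f'(p) &= -c\,(d-1)\,p^{d-2}\,e^{-c\,p^{d-1}}, \\
|f'(p_*)| &= c\,(d-1)\,p_*^{d-2}\,p_* = c\,(d-1)\,p_*^{d-1} = 1 \;\Longrightarrow\; c\,p_*^{d-1} = \frac{1}{d-1}, \\
p_* &= e^{-c\,p_*^{d-1}} = e^{-1/(d-1)} \;\Longrightarrow\; p_*^{d-1} = e^{-1}, \\
c_s(d) &= \frac{1}{(d-1)\,p_*^{d-1}} = \frac{e}{d-1}.
\end{aligned}$$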

For $c > e/(d - 1)$ a core survives phase 1 of the GKS algorithm, as stated by Theorem 2 and shown in Figure 2. We notice that the same threshold value $e/(d - 1)$ is found in the minimum set covering problem on the dual factor graph ensemble [31], where the application of a leaf removal algorithm, that is, phase 1 of GKS, unveils an analogous core emergence phenomenon. In the RS phase the MSP density is equal to the minimum set covering density, but in the RSB phase we cannot compare the cavity predictions for these two dual problems, since in [31] the 1RSB solution to the minimum set covering is not provided.

In the matching case, that is, $d = 2$, equation (49) expresses the well-known $e$-phenomenon discovered by Karp and Sipser, while for higher values of $d$ it provides an extension of the critical threshold.
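The threshold can also be observed numerically by looking for the smallest solution of system (43). Since $f$ is nonincreasing, the composed map $f \circ f$ is nondecreasing, and iterating it from $p = 0$ climbs monotonically to its smallest fixed point. The Python sketch below does this for $f(x) = e^{-c x^{d-1}}$; the function names, tolerance, and iteration cap are assumptions of the example, and convergence becomes slow close to $c_s(d)$.

```python
import math


def f(x, d, c):
    """Cavity map of (48): f(x) = exp(-c x^(d-1))."""
    return math.exp(-c * x ** (d - 1))


def smallest_solution(d, c, tol=1e-13, max_iter=1_000_000):
    """Return (p, r) solving p = f(r), r = f(p) with the smallest p."""
    p = 0.0
    for _ in range(max_iter):
        p_next = f(f(p, d, c), d, c)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    return p, f(p, d, c)


def has_core(d, c, gap=1e-6):
    """A gap between r and p signals a second solution of (43), hence a core."""
    p, r = smallest_solution(d, c)
    return r - p > gap
```

For $d = 3$ the threshold is $c_s = e/2 \approx 1.36$: `has_core(3, 1.0)` is false, while at $c = 2$ the iteration settles on a pair with $p$ well below $r$.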

We can recover the same critical condition, equation (49), through the bug propagation method, as the only nonzero elements of the transfer matrix $P(b_0 \to b_1 \mid a_0 \to a_1)$ are

$$P(0 \to 1 \mid 1 \to 0) = p^{d-1}, \qquad P(1 \to 0 \mid 0 \to 1) = p^{d-1}, \tag{50}$$

which give $\lambda_M = p^{d-1}$. The average branching factor is $\overline{CD} = c(d - 1)$, so that (30) and (48) yield (49). The analytical value for the relative size of MSPs, that is, the particle density $\rho$, is

$$\rho(d, c) = \frac{d}{c}\left(1 - p_*\right) + (1 - d)\,p_*^{d} \quad \text{for } c < c_s(d). \tag{51}$$
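Formula (51) is straightforward to evaluate once $p_*$ is known. The Python sketch below solves (48) by bisection, which is safe because $e^{-c p^{d-1}} - p$ is strictly decreasing on $[0, 1]$, and then applies (51); the function names and the fixed number of bisection steps are choices made for this example only.

```python
import math


def rs_fixed_point(d, c, steps=200):
    """Unique root of p = exp(-c p^(d-1)) in [0, 1], found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if math.exp(-c * mid ** (d - 1)) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rs_density(d, c):
    """RS prediction (51); exact only for c < e / (d - 1)."""
    p = rs_fixed_point(d, c)
    return d / c * (1 - p) + (1 - d) * p ** d
```

As a sanity check, the density tends to 1 when $c \to 0$ (almost all variables are then isolated and can be packed), and it decreases as $c$ grows at fixed $d$.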

In Figure 3 we compare $\rho$ from (51), as a function of $c$ for some values of $d$, with an exact algorithm applied to finite factor graphs (as explained in Section 6), both above and below $c_s$. Clearly, for $c > c_s$ the RS approximation is increasingly inaccurate.

We continue our analysis of $\mathbb{G}_{RP}$ above the critical value $c_s$ through the 1RSB cavity method, as outlined in Section 4.2. Fixed point messages of (33) are distributed as

$$P(q) = p_0\,\delta(q) + p_1\,\delta(q - 1) + p_2\,P_2(q), \tag{52}$$

where

$$p_1 = e^{-c\,(1 - p_0)^{d-1}}, \qquad p_0 = 1 - e^{-c\,p_1^{d-1}}, \qquad p_2 = 1 - p_0 - p_1, \tag{53}$$

and $P_2(q)$ has to be determined through (33). Equation (53) always admits an RS solution $p_1 = 1 - p_0 = p_*$,

Figure 2: Results from the GKS algorithm applied to $\mathbb{G}_{RP}(3, c)$, plotted against the mean factor degree $c$. The set packing size $\rho$ after phase 1 (blue line) and after the algorithm stops (red line) are reported. The vertical line is $c_s = e/2$ and is drawn as a visual aid to determine the point where the RS assumptions stop holding true. Above $c_s$ a nonempty core emerges continuously (green line).

Figure 3: RS cavity method analytical prediction, equation (51), for the MSP size $\rho$ in $\mathbb{G}_{RP}(d, c)$ as a function of the mean factor degree $c$ for $d = 2, 3, 4, 5$ (lines), compared with the average size of MSPs obtained from 1000 samples of factor graphs with 100 variable nodes (dots). Dashed parts of the lines are the RS estimations in the RSB phase, that is, for $c > e/(d - 1)$, where they are no longer exact.

which is stable up to $c_s$, as already noticed. Above $c_s$ a new stable fixed point with $p_2 > 0$ continuously arises, and we study it numerically with a population dynamics algorithm.

Figure 4: Maximum set packing size $\rho$ for $\mathbb{G}_{RP}(d, c)$ as a function of the mean factor degree $c$ for $d = 2$; the value $c = e$ is marked on the horizontal axis. The blue line corresponds to the RS analytical value (51), the red line is from the Karp-Sipser algorithm, and the dotted black line is the 1RSB solution (54) with $y \gg 1$. Karp-Sipser and the 1RSB cavity method yield the same, exact, result.

The 1RSB free energy as a function of the Parisi parameter $y$ takes the form

$$\phi(y) = -\frac{d}{y c}\,\mathbb{E}\log\left[\left(1 - e^{y}\right)\prod_{j=1}^{C}\left(1 - \prod_{s=1}^{d-1} q_{sj}\right) + e^{y}\right] + \frac{d - 1}{y}\,\mathbb{E}\log\left[\left(1 - e^{y}\right)\left(1 - \prod_{r=1}^{d} q_r\right) + e^{y}\right]. \tag{54}$$

As prescribed by the cavity method, the value $y_s$ which maximizes $\phi(y)$ over $[0, +\infty]$ yields the correct free energy; therefore we have $\phi(y_s) = -\rho_{\mathrm{1RSB}}$.

Unsurprisingly, as they belong to different computational classes, the cases $d = 2$ and $d \ge 3$ show qualitatively different pictures. In the case of maximum matching on the Poissonian graph ensemble, numerical estimates suggest that complexity is an increasing function of $y$ on the whole real positive axis. The correct choice for the parameter $y$ is then $y_s = +\infty$, as already conjectured in [12], and we find that the maximum matching size prediction from the 1RSB cavity method fully agrees with rigorous results from Karp and Sipser [8] and with the size of the matchings given by their algorithm (see Figure 4). The 1RSB ansatz is therefore exact for $d = 2$.

The analysis of the $d \ge 3$ case does not yield such a definite result. The complexity of states $\Sigma$ is no longer a strictly increasing function of $y$. It reaches its maximum at $y_d$, the choice of $y$ that selects the most numerous states, which could be those where local search greedy algorithms are more likely to be trapped. Then it decreases up to the finite value $y_s$, where complexity changes sign and takes negative values. Therefore $y_s$ is the correct choice for the Parisi parameter, which maximizes $\phi(y)$. Plotted as a function of $\rho$, complexity

Figure 5: Complexity $\Sigma$ in $\mathbb{G}_{RP}(d, c)$ with $d = 3$ and $c = 4.0$ as a function of the relative MSP size $\rho$.

Figure 6: MSP size $\rho$ in $\mathbb{G}_{RP}(3, c)$ as a function of the mean factor degree $c$; the value $c = e/2$ is marked on the horizontal axis. The RS and 1RSB cavity methods are compared with the GKS algorithm and with an exact algorithm on factor graphs of $N = 100$ variable nodes.

$\Sigma$ has a convex nonphysical part, with extrema at the RS solution (on the right) and at the point corresponding to the dynamic 1RSB solution (on the left), and a concave physically relevant part for $y \in [y_d, y_s]$ (see Figure 5). The 1RSB prediction seems to be in very good agreement with the exact algorithm, and we are inclined to believe that no further steps of replica symmetry breaking are needed in this ensemble. The GKS algorithm instead falls short of the exact value (see Figure 6); therefore it constitutes a lower bound which is not tight, but at least it could probably be made rigorous by carrying the analysis of the GKS algorithm beyond phase 1.

7.2. $\mathbb{G}_{PR}(d, c)$. The ensemble $\mathbb{G}_{PR}(d, c)$ is constituted by factor graphs containing factor nodes of degree fixed to $c$ and variable nodes of Poissonian random degree of mean $d$. Its statistical properties with respect to the MSP problem are quite different from those encountered in $\mathbb{G}_{RP}$, as we will readily show. The MSP problem on $\mathbb{G}_{PR}(d, c)$ with $c = 2$ is equivalent to the well-known problem of maximum independent sets on Poissonian graphs [14, 32]. The real parameter $p$ characterizing discrete support distributions of messages has to satisfy the fixed point equation (24), which reads

$$p_* = \left(1 - e^{d(p_* - 1)}\right)^{c-1}. \tag{55}$$

At a fixed value of $c$, the RS ansatz holds up to the critical value $d_s(c)$, which is implicitly given by the first derivative condition (27):

$$(c - 1)\,p_*^{(c-2)/(c-1)}\,e^{(p_* - 1)\,d_s}\,d_s = 1. \tag{56}$$

Although the critical condition (56) is not as elegant as the one we obtained for the ensemble $\mathbb{G}_{RP}$, it can be easily solved numerically for $d_s$ as a function of $c$. For $c = 2$ the threshold value is exactly $d_s(2) = e$. For $c > 2$, instead, $d_s$ is an increasing function of the factor degree $c$. Thanks to (25) we readily compute the MSP size in the RS phase:

$$\rho = \frac{d}{c}\left(1 - p_*^{c/(c-1)}\right) + \left(1 - p_*\,d\right) e^{d(p_* - 1)} \quad \text{for } d < d_s(c). \tag{57}$$
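A numerical solution of (55) and (56), together with the evaluation of (57), can be sketched in Python as follows. The function names, the bisection bracket for $d$, and the assumption that the left-hand side of (56) crosses 1 exactly once inside that bracket are choices made for this illustration; the sketch is checked here only against the case $c = 2$, where the text gives $d_s(2) = e$.

```python
import math


def fixed_point(d, c, steps=200):
    """Root of p = (1 - exp(d (p - 1)))^(c - 1) in [0, 1], by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if (1 - math.exp(d * (mid - 1))) ** (c - 1) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def stability(d, c):
    """Left-hand side of (56), that is |f'(p_*)|."""
    p = fixed_point(d, c)
    return (c - 1) * p ** ((c - 2) / (c - 1)) * d * math.exp(d * (p - 1))


def critical_degree(c, d_lo=0.1, d_hi=20.0, steps=200):
    """Solve stability(d, c) = 1 for d (single crossing assumed in bracket)."""
    for _ in range(steps):
        mid = 0.5 * (d_lo + d_hi)
        if stability(mid, c) < 1:
            d_lo = mid
        else:
            d_hi = mid
    return 0.5 * (d_lo + d_hi)


def rs_density(d, c):
    """RS prediction (57) for the MSP size."""
    p = fixed_point(d, c)
    return d / c * (1 - p ** (c / (c - 1))) + (1 - p * d) * math.exp(d * (p - 1))
```

Calling `critical_degree(2)` reproduces $d_s(2) = e$ to numerical precision.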

We can see in Figure 7 that the MSP size $\rho(d, c)$ is a decreasing function of both arguments, as expected. Equation (57) can be taken as the RS estimate for the MSP size for $d > d_s(c)$. The RS estimate is strictly greater than the average size of SPs given by the GKS algorithm at all values of $d > d_s(c)$ (see Figure 7).

7.3. $\mathbb{G}_{PP}(d, c)$. We will now briefly examine our MSP model on the ensemble $\mathbb{G}_{PP}(d, c)$, where both factor nodes and variable nodes have Poissonian random degrees, of mean $c$ and $d$, respectively. From (23) and (24) we obtain the fixed point condition for the probability distribution of messages:

$$p_* = e^{-c\,e^{d(p_* - 1)}}. \tag{58}$$

As usual, $p$ is the parameter characterizing the distribution of messages $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$. Equation (58) admits one and only one fixed point solution $p_*$ for each value of $d$ and $c$. In fact, $f$ is continuous and strictly decreasing, with $f(0) > 0$ and $f(1) < 1$. The first derivative condition $|f'(p_*)| = 1$ defines the critical line $d_s(c)$ through

$$d_s(c) = -\frac{1}{p_* \log(p_*)}. \tag{59}$$
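The passage from the derivative condition to (59) takes only a few lines, starting from $f(p) = e^{-c\,e^{d(p-1)}}$ as read off from (58):

$$\begin{aligned}
f'(p) &= -c\,d\,e^{d(p-1)}\,f(p), \\
\log p_* &= -c\,e^{d(p_* - 1)} \quad \text{(from (58))}, \\
f'(p_*) &= d\,p_* \log p_* \;\Longrightarrow\; |f'(p_*)| = -d\,p_* \log p_* , \\
|f'(p_*)| = 1 \;&\Longrightarrow\; d_s = -\frac{1}{p_* \log p_*}.
\end{aligned}$$

Note that (59) remains an implicit condition, since $p_*$ itself depends on $d$ and $c$ through (58).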

The curve $d_s(c)$ separates the RS phase from the RSB phase in the $c$-$d$ parameter space (see Figure 8). The unbounded RS region shares some resemblance with the corresponding (although $d$-discretized) region of $\mathbb{G}_{PR}(d, c)$ and is at variance with the compact area of the RS phase in $\mathbb{G}_{RP}$.

We can compute the MSP relative size $\rho$ through (25) and obtain

$$\rho = \frac{d}{c}\left(1 - p_*\right) + \left(1 - d\,p_*\right) e^{d(p_* - 1)} \quad \text{for } d < d_s(c), \tag{60}$$

which holds only as an approximation in the RSB phase. We can see from Figure 8 that $\rho$ is decreasing in both $c$ and $d$, as was observed in the other ensembles as well.

Figure 7: RS cavity method analytical prediction (57) for the MSP size $\rho$ in $\mathbb{G}_{PR}(d, c)$ as a function of the mean variable degree $d$, for $c = 2, 3, 4, 5, 6$ (dot-dashed lines), compared with the set packing size from phase 1 of the GKS algorithm (dashed lines) and with a complete run of the algorithm (continuous lines).

Figure 8: MSP relative size $\rho$ for the ensemble $\mathbb{G}_{PP}(d, c)$ as given by (60), shown through isolines in the plane of mean factor degree $c$ (horizontal axis) and mean variable degree $d$ (vertical axis), together with the phase boundary. Above the boundary separating the RS and RSB phases, values are only an approximation to the exact value of $\rho$.

7.4. $\mathbb{G}_{RR}(d, c)$. The MSP problem on the ensemble $\mathbb{G}_{RR}(d, c)$, the straightforward generalization to factor graphs of the random regular graph ensemble, allows some simplifications of the cavity formalism thanks to its homogeneity. It has already been the object of preliminary studies by Weigt and Hartmann [16] and then of a much deeper work by Hansen-Goos and Weigt [17], who formulated it as a hard spheres model on a generalized Bethe lattice. The authors studied this hard spheres model on $\mathbb{G}_{RR}(d, c)$ through the cavity method, both at finite chemical potential $\mu$ and in the close packing limit, and found that the 1RSB solution is unstable in the close packing limit, therefore suggesting the need for a full RSB treatment of the problem.

8. Conclusions

We studied the average asymptotic behaviour of random instances of the maximum set packing problem, both from a mathematical and a physical viewpoint. We contributed to the known list of models where the replica symmetric cavity method can be proven to give exact results, thanks to the generalization of an algorithm (and of its analysis) first proposed by Karp and Sipser [8]. Moreover, our analysis addresses a problem reported in recent work on weighted maximum matchings and independent sets on random graphs [22], where the authors could not extend their results to the unweighted cases. We achieve here the desired result by making use of the grand canonical potential instead of the direct computation of single variable expectations. We also extend their condition for the system to be in what physicists call a replica symmetric phase, namely, the uniqueness of the fixed point of the square of a certain operator (which is the analogue of the one defined in (42)), to the more general setting of maximum set packing (although without weights). On some problem ensembles, where the assumptions of Theorems 2 and 3 no longer hold and the RS cavity method fails, we used the 1RSB cavity method machinery to obtain an analytical estimate of the MSP size. Numerical simulations show very good agreement of the 1RSB estimate with the exact values, although comparisons have been done only with small random problems, due to the exact algorithm being of exponential time complexity. The GKS algorithm, instead, generally fails to find any MSPs, except for some special instances of the problem.

Some questions remain open for further investigation. To validate the 1RSB approach, the stability of the 1RSB solution has to be checked against more steps of replica symmetry breaking. Moreover, a thorough analysis of the second phase of the GKS algorithm could shed some light on the mechanism of replica symmetry breaking and give a rigorous lower bound to the average maximum packing size.

Regarding the problem of looking for optimal solutions in single samples, we believe that an efficient heuristic algorithm, able to obtain near-optimal configurations also in the RSB phase, could be constructed following the ideas of [33].

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research has received financial support from the European Research Council (ERC) through Grant agreement no. 247328 and from the Italian Research Minister through the FIRB project no. RBFR086NN1.

References

[1] S. Micali and V. V. Vazirani, “An $O(\sqrt{|V|}\,|E|)$ algorithm for finding maximum matching in general graphs,” in Proceedings of the 21st Annual IEEE Symposium on Foundations of Computer Science, pp. 17–27, 1980.

[2] N. Alon, J.-H. Kim, and J. Spencer, “Nearly perfect matchings in regular simple hypergraphs,” Israel Journal of Mathematics, vol. 100, pp. 171–187, 1997.

[3] Z. Füredi, J. Kahn, and P. D. Seymour, “On the fractional matching polytope of a hypergraph,” Combinatorica, vol. 13, no. 2, pp. 167–180, 1993.

[4] Y. H. Chan and L. C. Lau, “On linear and semidefinite programming relaxations for hypergraph matching,” Mathematical Programming, vol. 135, pp. 123–148, 2011.

[5] V. Rödl and L. Thoma, “Asymptotic packing and the random greedy algorithm,” Random Structures & Algorithms, vol. 8, no. 3, pp. 161–177, 1996.

[6] M. M. Halldórsson, “Approximations of weighted independent set and hereditary subset problems,” Journal of Graph Algorithms and Applications, vol. 4, no. 1, pp. 1–6, 2000.

[7] E. Hazan, S. Safra, and O. Schwartz, “On the complexity of approximating k-set packing,” Computational Complexity, vol. 15, no. 1, pp. 20–39, 2006.

[8] R. M. Karp and M. Sipser, “Maximum matchings in sparse random graphs,” in Proceedings of the 22nd Annual Symposium on Foundations of Computer Science, vol. 12, pp. 364–375, IEEE Computer Society, 1981.

[9] N. C. Wormald, “The differential equation method for random graph processes and greedy algorithms,” in Lectures on Approximation and Randomized Algorithms, pp. 73–155, 1999.

[10] G. Parisi and M. Mézard, “Replicas and optimization,” Journal de Physique Lettres, vol. 46, no. 17, pp. 771–778, 1985.

[11] G. Parisi and M. Mézard, “A replica analysis of the travelling salesman problem,” Journal de Physique, vol. 47, pp. 1285–1296, 1986.

[12] H. Zhou and Z.-C. Ou-Yang, “Maximum matching on random graphs,” In press, http://arxiv.org/abs/cond-mat/0309348.

[13] L. Zdeborová and M. Mézard, “The number of matchings in random graphs,” Journal of Statistical Mechanics, vol. 2006, Article ID P05003, 2006.

[14] L. Dall'Asta, A. Ramezanpour, and P. Pin, “Statistical mechanics of maximal independent sets,” Physical Review E, vol. 80, Article ID 061136, 2009.

[15] M. Mézard and M. Tarzia, “Statistical mechanics of the hitting set problem,” Physical Review E, vol. 76, no. 4, Article ID 041124, 10 pages, 2007.

[16] M. Weigt and A. K. Hartmann, “Glassy behavior induced by geometrical frustration in a hard-core lattice-gas model,” Europhysics Letters (EPL), vol. 62, Article ID 021005, p. 533, 2003.

[17] H. Hansen-Goos and M. Weigt, “A hard-sphere model on generalized Bethe lattices: statics,” Journal of Statistical Mechanics, vol. 2005, Article ID P04006, 2005.

[18] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.

[19] A. Montanari and M. Mézard, Information, Physics and Computation, Oxford University Press, 2009.

[20] G. Parisi, M. Mézard, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, Singapore, 1987.

[21] M. Mézard and G. Parisi, “The cavity method at zero temperature,” Journal of Statistical Physics, vol. 22, no. 1-2, pp. 1–34, 2003.

[22] D. Gamarnik, T. Nowicki, and G. Swirszcz, “Maximum weight independent sets and matchings in sparse random graphs. Exact results using the local weak convergence method,” Random Structures and Algorithms, vol. 28, no. 1, pp. 76–106, 2006.

[23] A. Montanari, G. Parisi, and F. Ricci-Tersenghi, “Instability of one-step replica-symmetry-broken phase in satisfiability problems,” Journal of Physics A, vol. 37, no. 6, p. 2073, 2004.

[24] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, “Gibbs states and the set of solutions of random constraint satisfaction problems,” Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 25, pp. 10318–10323, 2007.

[25] M. Mézard, F. Ricci-Tersenghi, and R. Zecchina, “Two solutions to diluted p-spin models and XORSAT problems,” Journal of Statistical Physics, vol. 111, no. 3-4, pp. 505–533, 2002.

[26] G. Semerjian, “On the freezing of variables in random constraint satisfaction problems,” Journal of Statistical Physics, vol. 130, no. 2, pp. 251–293, 2008.

[27] M. Mézard and G. Parisi, “The Bethe lattice spin glass revisited,” The European Physical Journal B, vol. 20, no. 2, pp. 217–233, 2001.

[28] S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, “A new algorithm for generating all the maximal independent sets,” SIAM Journal on Computing, vol. 6, no. 3, pp. 505–517, 1977.

[29] G. Csárdi and T. Nepusz, “The igraph software package for complex network research,” InterJournal Complex Systems, p. 1695, 2006.

[30] N. C. Wormald, “Models of random regular graphs,” in Surveys in Combinatorics, London Mathematical Society Lecture Note, pp. 239–298, 1999.

[31] S. Takabe and K. Hukushima, “Minimum vertex cover problems on random hypergraphs: replica symmetric solution and a leaf removal algorithm,” In press, http://arxiv.org/abs/1301.5769.

[32] S. Sanghavi, D. Shah, and A. S. Willsky, “Message passing for maximum weight independent set,” IEEE Transactions on Information Theory, vol. 55, no. 11, pp. 4822–4834, 2009.

[33] M. Weigt and H. Zhou, “Message passing for vertex covers,” Physical Review E, vol. 74, Article ID 046110, 19 pages, 2006.


ensembles, which we assume to be locally tree-like in the thermodynamic limit.

The MSP relative size $\rho_G$ is a self-averaging quantity in the thermodynamic limit, and we want to compute its asymptotic value

$$\rho = \lim_{N\to+\infty} \mathbb{E}_G[\rho_G], \tag{10}$$

where $\mathbb{E}_G[\cdot]$ denotes the expectation over the factor graph ensemble. In the last equation the $N$ dependence is encoded in the graph ensemble considered. Computing (10) is not an easy task, and some approximations have to be made. We will employ the cavity method from the statistical physics of disordered systems [19, 20], using both the replica symmetric (RS) and the one-step replica symmetry breaking (1RSB) ansatz. We will prove in Section 5 that the RS ansatz is exact in a certain region of the phase space, while in Section 7 we will give numerical evidence that the 1RSB approximation gives very good results outside the RS region.

3. Replica Symmetry

3.1. Bethe Approximation on a Single Instance. The RS cavity method has been known for many decades outside the statistical physics community as the Belief Propagation (BP) algorithm, and only in recent years have the two approaches been bridged [18, 19]. We start with a variational approximation to the grand potential, equation (6), of an instance of the problem, the Bethe free energy approximation:

$$\omega^{\mathrm{RS}}_G[\boldsymbol{\nu}] = \frac{1}{N}\left[\sum_{r\in F}\omega_r[\boldsymbol{\nu}] + \sum_{i\in V}\bigl(1-|\partial i|\bigr)\,\omega_i[\boldsymbol{\nu}]\right], \tag{11}$$

with the factor and variable contributions given by

$$\begin{aligned}
\omega_r[\boldsymbol{\nu}] &= -\frac{1}{\mu}\log\left[\sum_{\{n_i\}_{i\in\partial r}} \mathbb{I}\Bigl(\sum_{i\in\partial r} n_i \le 1\Bigr)\prod_{j\in\partial r}\,\prod_{s\in\partial j\setminus r}\nu_{s\to j}(n_j)\right],\\
\omega_i[\boldsymbol{\nu}] &= -\frac{1}{\mu}\log\left[\sum_{n_i} e^{\mu n_i}\prod_{r\in\partial i}\nu_{r\to i}(n_i)\right].
\end{aligned} \tag{12}$$

The grand canonical potential is expressed as a function of the factor node to variable node messages $\boldsymbol{\nu} = \{\nu_{r\to i}\}$. Minimization of $\omega^{\mathrm{RS}}_G[\boldsymbol{\nu}]$ over the messages, constrained to be normalized to one, yields the fixed point BP equations for the set packing:

$$\begin{aligned}
\nu_{r\to i}(1) &= \frac{1}{Z_{r\to i}}\prod_{j\in\partial r\setminus i}\,\prod_{s\in\partial j\setminus r}\nu_{s\to j}(0),\\
\nu_{r\to i}(0) &= \frac{1}{Z_{r\to i}}\left[\prod_{j\in\partial r\setminus i}\,\prod_{s\in\partial j\setminus r}\nu_{s\to j}(0) + e^{\mu}\sum_{j\in\partial r\setminus i}\,\prod_{s\in\partial j\setminus r}\nu_{s\to j}(1)\times\prod_{j'\in\partial r\setminus\{i,j\}}\,\prod_{s'\in\partial j'\setminus r}\nu_{s'\to j'}(0)\right].
\end{aligned} \tag{13}$$

The coefficients $Z_{r\to i}$ are normalization factors. Equations (13) can be simplified by introducing the fields $t_{r\to i}$, defined as

$$\frac{\nu_{r\to i}(1)}{\nu_{r\to i}(0)} = e^{-\mu t_{r\to i}}, \tag{14}$$

yielding

$$t_{r\to i} = \frac{1}{\mu}\log\left[1 + \sum_{j\in\partial r\setminus i} e^{\mu\left(1-\sum_{s\in\partial j\setminus r} t_{s\to j}\right)}\right]. \tag{15}$$

Since we are interested in the close packing limit, to solve the problem we will straightforwardly apply the zero temperature cavity method [21]. The related BP equations, which can be found as the $\mu\uparrow\infty$ limit of (15), read

$$t_{r\to i} = \max\left[\{0\}\cup\Bigl\{1-\sum_{s\in\partial j\setminus r} t_{s\to j}\Bigr\}_{j\in\partial r\setminus i}\right]. \tag{16}$$
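The passage from (15) to (16) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the nested-list layout of the incoming messages are our own illustrative choices.

```python
import math

def t_finite_mu(incoming, mu):
    """Eq. (15). incoming[j] lists the messages t_{s->j} reaching the j-th
    neighbour of the factor (excluding the factor itself)."""
    total = sum(math.exp(mu * (1.0 - sum(ts))) for ts in incoming)
    return math.log1p(total) / mu

def t_zero_temperature(incoming):
    """Eq. (16): maximum over {0} and the cavity fields 1 - sum_s t_{s->j}."""
    return max([0.0] + [1.0 - sum(ts) for ts in incoming])

# Hypothetical example: the factor has two other neighbours; the first one
# receives the messages (0, 0), the second one receives (1, 0).
incoming = [[0.0, 0.0], [1.0, 0.0]]
for mu in (1.0, 10.0, 100.0):
    print(mu, t_finite_mu(incoming, mu))
print("zero temperature:", t_zero_temperature(incoming))
```

As $\mu$ grows, the finite-$\mu$ value approaches the zero temperature one, with a correction of order $1/\mu$.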

We note that the messages $t_{r\to i}$ are bounded to take values in the interval $[0, 1]$ and that, if we set the initial value of each $t_{r\to i}$ in the discrete set $\{0, 1\}$, at each BP iteration all messages will take value either 0 or 1. These values can be directly interpreted as the occupational loss occurring in the subtree $r\to i$ if the subtree is connected to the occupied node $i$. This loss cannot be negative (thus a gain), since we put an additional constraint on the subtree, demanding every neighbour $j$ of $r$ to be empty, and cannot be greater than 1 either; in fact $t_{r\to i} = 1$ corresponds to the worst case scenario, where an otherwise occupied node $j\in\partial r\setminus i$ has to be emptied.

The Bethe free energy for model (5) on the factor graph $G$ can be expressed as a function of the fixed point messages $t_{r\to i}$ as

$$\omega^{\mathrm{RS}}_G(\mu) = -\frac{1}{\mu N}\left[\sum_{r\in F}\log\Bigl(1+\sum_{j\in\partial r} e^{\mu\left(1-\sum_{s\in\partial j\setminus r} t_{s\to j}\right)}\Bigr) + \sum_{i\in V}\bigl(1-|\partial i|\bigr)\log\Bigl(1+e^{\mu\left(1-\sum_{r\in\partial i} t_{r\to i}\right)}\Bigr)\right]. \tag{17}$$

We finally arrive at the Bethe estimate of the MSP relative size by taking the close packing limit of (17) and using $\omega^{\mathrm{RS}}_G = -\rho^{\mathrm{RS}}_G$, which gives

$$\rho^{\mathrm{RS}}_G = \frac{1}{N}\left[\sum_{r\in F}\max\Bigl[\{0\}\cup\Bigl\{1-\sum_{s\in\partial j\setminus r} t_{s\to j}\Bigr\}_{j\in\partial r}\Bigr] + \sum_{i\in V}\bigl(1-|\partial i|\bigr)\max\Bigl\{0,\,1-\sum_{r\in\partial i} t_{r\to i}\Bigr\}\right]. \tag{18}$$
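To make (16) and (18) concrete, the sketch below iterates the zero temperature BP equations on a small tree factor graph and compares the Bethe estimate with an exhaustive search. It is an illustration under our own assumptions: the representation of the factor graph as a list of variable lists, the function names, and the example instance are hypothetical.

```python
from itertools import combinations

def bp_msp_size(factors, n_vars, sweeps=50):
    """Iterate (16), then evaluate the bracket of (18), i.e. N times the density.

    factors[r] lists the variables adjacent to factor r.
    """
    var_nbrs = {i: [] for i in range(n_vars)}
    for r, members in enumerate(factors):
        for i in members:
            var_nbrs[i].append(r)
    t = {(r, i): 0 for r, members in enumerate(factors) for i in members}

    def cavity(j, r):
        # 1 minus the messages entering j from every factor except r
        return 1 - sum(t[(s, j)] for s in var_nbrs[j] if s != r)

    for _ in range(sweeps):
        for (r, i) in t:
            t[(r, i)] = max([0] + [cavity(j, r) for j in factors[r] if j != i])

    factor_term = sum(max([0] + [cavity(j, r) for j in members])
                      for r, members in enumerate(factors))
    variable_term = sum(
        (1 - len(var_nbrs[i])) * max(0, 1 - sum(t[(r, i)] for r in var_nbrs[i]))
        for i in range(n_vars))
    return factor_term + variable_term

def brute_force_msp_size(factors, n_vars):
    """Largest set of variables with at most one member in every factor."""
    for size in range(n_vars, -1, -1):
        for chosen in combinations(range(n_vars), size):
            picked = set(chosen)
            if all(len(picked.intersection(m)) <= 1 for m in factors):
                return size

# Hypothetical tree: factors A = {0, 1}, B = {1, 2}, C = {2, 3, 4}.
tree = [[0, 1], [1, 2], [2, 3, 4]]
print(bp_msp_size(tree, 5), brute_force_msp_size(tree, 5))  # prints: 2 2
```

On this instance the two numbers coincide, as expected on trees; on graphs with loops the sweep count and the initial condition would matter, and no agreement is guaranteed.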

Let us examine the various contributions to (18), since we want to convince ourselves that it exactly counts the MSP size, at least on tree factor graphs. The term $(1-|\partial i|)\max\{0,\,1-\sum_{r\in\partial i} t_{r\to i}\}$ contributes $1-|\partial i|$ to the sum only if all the incoming $t$ messages are zero. In this case $n_i$ is frozen to 1; that is, the variable $i$ takes part in all the MSPs in $G$. Obviously, all the neighbours of a variable frozen to 1 have to be frozen to 0. To all its $|\partial i|$ neighbours $r$ the frozen-to-1 variable $i$ sends a message $1-\sum_{s\in\partial i\setminus r} t_{s\to i} = 1$, so that we have $|\partial i|$ contributions in the first sum of (18), $\max[\{0\}\cup\{1-\sum_{s\in\partial j\setminus r} t_{s\to j}\}_{j\in\partial r}] = 1$, and the total contribution from $i$ correctly sums up to 1. If for a certain $i$ we have a total field $\tau_i \equiv 1-\sum_{r\in\partial i} t_{r\to i} < 0$ (two or more incoming messages are equal to one), the variable is frozen to 0; that is, it does not take part in any MSP. It correctly does not contribute to $\omega^{\mathrm{RS}}_G$: since it sends a message $1-\sum_{s\in\partial i\setminus r} t_{s\to i} \le 0$ to each neighbour $r$, it is not counted in $\max[\{0\}\cup\{1-\sum_{s\in\partial j\setminus r} t_{s\to j}\}_{j\in\partial r}]$.

The third case is the most interesting. It concerns variables $i$ which take part in a fraction of the MSPs. We will call them unfrozen variables. The total field on an unfrozen variable $i$ is $\tau_i = 0$ (thus we have no contribution from the second sum in (18)), and all incoming messages are 0 except for a single $t_{r\to i} = 1$. To this sole function node $r$ the node $i$ sends a message 1, so that the contribution of $r$ to the first sum is 1. Actually, the BP equations impose that $r$ has to have at least another unfrozen neighbour besides $i$. In other terms, the function node $r$ says that, whichever MSP we consider, one of its neighbours has to be occupied. The corresponding term $\max[\{0\}\cup\{1-\sum_{s\in\partial j\setminus r} t_{s\to j}\}_{j\in\partial r}] = 1$ in (18) accounts for that.

The presence of unfrozen variables is the reason why we cannot express the density $\rho_G$ through the formula

$$\rho_G = \Bigl\langle \sum_{i\in V} n_i \Bigr\rangle. \tag{19}$$

In fact, using the infinite chemical potential formalism, we cannot compute

$$\langle n_i\rangle = \lim_{\mu\to+\infty}\frac{e^{\mu\left(1-\sum_{r\in\partial i} t_{r\to i}\right)}}{1+e^{\mu\left(1-\sum_{r\in\partial i} t_{r\to i}\right)}} \tag{20}$$

when $\lim_{\mu\to+\infty}\bigl(1-\sum_{r\in\partial i} t_{r\to i}\bigr) = 0$, and we would have to use the $O(1/\mu)$ corrections to the fields $t_{r\to i}$. We bypass the problem using the grand potential $\omega^{\mathrm{RS}}_G$ to obtain $\rho^{\mathrm{RS}}_G$, also addressing a problem reported in [22] of extending an analysis suited for weighted matchings and independent sets to the unweighted case.

3.2. Ensemble Averages. To proceed further in the analysis, and since one is often concerned with the average properties of a class of related factor graphs, let us consider the case where the factor graph $G$ is sampled from a locally tree-like factor graph ensemble $\mathcal{G}(N)$. We will employ the following notation for the graph ensemble expectations: $\mathbb{E}_G[\cdot]$ for graph averages; $\mathbb{E}_{C_0}[\cdot]$ ($\mathbb{E}_{D_0}[\cdot]$) for expectations over the factor (variable) degree distribution, which we will sometimes call root degree distribution; and $\mathbb{E}_C[\cdot]$ ($\mathbb{E}_D[\cdot]$) for expectations over the excess degree distribution of factor (variable) nodes conditioned to have at least one adjacent edge, which we will sometimes call residual degree distribution. The quantities $c$ and $d$ and the random variables $C$, $C_0$, $D$, and $D_0$ are related by

$$\mathbb{P}[C = k] = \frac{(k+1)\,\mathbb{P}[C_0 = k+1]}{c}, \qquad \mathbb{P}[D = k] = \frac{(k+1)\,\mathbb{P}[D_0 = k+1]}{d}. \tag{21}$$

In Section 7 we discuss some specific factor graph ensembles where $C$ and $D$ are fixed to a deterministic value or Poissonian distributed.
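Relation (21) is easy to evaluate for the two cases mentioned above. The sketch below is illustrative only; the dictionary representation of a degree distribution and the truncation of the Poisson law at degree 60 are our own assumptions.

```python
import math

def excess_distribution(root_pmf):
    """Eq. (21): P[C = k] = (k + 1) P[C0 = k + 1] / c, with c the mean of C0."""
    mean = sum(k * p for k, p in root_pmf.items())
    return {k - 1: k * p / mean for k, p in root_pmf.items() if k >= 1}

# Regular root degree 3: the excess degree is deterministically 2.
print(excess_distribution({3: 1.0}))  # {2: 1.0}

# Poissonian root degree of mean c: the excess degree is again Poissonian.
c = 1.5
poisson = {k: math.exp(-c) * c**k / math.factorial(k) for k in range(61)}
excess = excess_distribution(poisson)
print(max(abs(excess[k] - poisson[k]) for k in range(20)))
```

The last printed number is at the level of rounding error, reflecting the fact that a Poissonian root degree has a Poissonian excess degree with the same mean.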

With these definitions, the distributional equation corresponding to the Belief Propagation formula (16) reads

$$T' \overset{d}{=} \max\left[\{0\}\cup\Bigl\{1-\sum_{s=1}^{D_j} T_{sj}\Bigr\}_{j\in\{1,\dots,C\}}\right], \tag{22}$$

where $\{D_j\}$ are i.i.d. random residual variable degrees, $C$ is a random residual factor degree, and $\{T_{sj}\}$ are i.i.d. random incoming messages. Since on a given graph each message $t_{r\to i}$ takes values only in $\{0, 1\}$, we can take the distribution of messages $t$ to be of the form

$$P(t) = p\,\delta(t) + (1-p)\,\delta(t-1). \tag{23}$$

For this to be a fixed point of (22), the parameter $p$ has to satisfy the self-consistent equation

$$p_* = \mathbb{E}_C\bigl(1-\mathbb{E}_D\,p_*^{D}\bigr)^{C}. \tag{24}$$

Using (18) and (24) we obtain the replica symmetric approximation to the asymptotic MSP relative size:

$$\rho^{\mathrm{RS}} = \frac{d}{c}\Bigl(1-\mathbb{E}_{C_0}\bigl(1-\mathbb{E}_D\,p_*^{D}\bigr)^{C_0}\Bigr) + \mathbb{E}_{D_0}\,(1-D_0)\,p_*^{D_0}. \tag{25}$$

It turns out that the RS approximation is exact only when the ratio $N/M$ is sufficiently low. In the SP language, (25) holds true only when the number of subsets among which we choose our MSP is not too big compared to the number of elements of which they are composed.

In the next section we will quantitatively establish the limits of validity of the RS ansatz and introduce the one-step replica symmetry breaking formalism, which provides a better approximation to the exact results in the regime of large $N/M$ ratio. In Section 5 we prove that (25) is exact for certain choices of the factor graph ensembles.
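Equations (24) and (25) can be evaluated numerically for any pair of root degree distributions. The following is a minimal sketch under our own assumptions: degree distributions are passed as dictionaries, the Poisson law is truncated at degree 60, and the fixed point is located by bisection, which applies because $f(p)-p$ is decreasing. The example uses variables of fixed degree 2 and Poissonian factor degrees of mean 1.

```python
import math

def excess(pmf):
    """Mean of a root degree pmf and the excess distribution of Eq. (21)."""
    mean = sum(k * p for k, p in pmf.items())
    return mean, {k - 1: k * p / mean for k, p in pmf.items() if k >= 1}

def rs_density(root_C, root_D):
    """Solve (24) by bisection and evaluate (25)."""
    c, pmf_C = excess(root_C)   # factor degrees
    d, pmf_D = excess(root_D)   # variable degrees

    def inner(p):               # E_D p^D
        return sum(w * p**k for k, w in pmf_D.items())

    def f(p):                   # right-hand side of (24)
        return sum(w * (1.0 - inner(p))**k for k, w in pmf_C.items())

    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(mid) > mid:
            lo = mid
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    first = (d / c) * (1.0 - sum(w * (1.0 - inner(p))**k for k, w in root_C.items()))
    second = sum(w * (1 - k) * p**k for k, w in root_D.items())
    return p, first + second

c = 1.0
root_C = {k: math.exp(-c) * c**k / math.factorial(k) for k in range(61)}
root_D = {2: 1.0}
p_star, rho_rs = rs_density(root_C, root_D)
print(p_star, rho_rs)
```

For this choice (24) reduces to $p_* = e^{-p_*}$, so the printed $p_*$ is close to 0.5671, and (25) reduces to $2(1-p_*) - p_*^2$.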

4. Replica Symmetry Breaking

4.1. RS Consistency and Bugs Propagation. Here we propose two criteria to check the consistency of the RS cavity method; if either of them fails, the RS ansatz is not self-consistent.

The first criterion is the assumption of unicity of the fixed point of (24) and its dynamical stability under iteration. We restrict ourselves to the subspace of distributions with support on $\{0, 1\}$, although it is possible to extend the following analysis to the whole space of distributions over $[0, 1]$ with an argument based on stochastic dominance, following [22]. Characterizing the distributions over $\{0, 1\}$ as in (23), with a real parameter $p\in[0, 1]$, from (22) we obtain the dynamical system

$$p' = \mathbb{E}_C\bigl(1-\mathbb{E}_D\,p^{D}\bigr)^{C} \equiv f(p). \tag{26}$$

The stability criterion

$$\bigl|f'(p_*)\bigr| < 1 \tag{27}$$

suggests that the RS approximation to the MSP size (25) is exact as long as (27) is satisfied. This statement will be made rigorous in Section 5.
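Criterion (27) can be tested numerically. The sketch below assumes variables of fixed degree $d$ and Poissonian factor degrees of mean $c$, for which (26) becomes $f(p) = e^{-c\,p^{d-1}}$ (the excess variable degree is $d-1$, and the Poisson generating function is $\mathbb{E}_C x^C = e^{-c(1-x)}$). Under this assumption $|f'(p_*)| = c(d-1)p_*^{d-1}$, which reaches 1 at $c = e/(d-1)$. Function names and the finite difference step are our own choices.

```python
import math

def f(p, d, c):
    """Eq. (26) for regular variable degree d and Poissonian factor degree of mean c."""
    return math.exp(-c * p ** (d - 1))

def fixed_point(d, c):
    """Unique solution of p = f(p), found by bisection since f(p) - p is decreasing."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid, d, c) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def is_rs_stable(d, c, h=1e-6):
    """Criterion (27), with f'(p*) estimated by a central difference."""
    p = fixed_point(d, c)
    slope = (f(p + h, d, c) - f(p - h, d, c)) / (2 * h)
    return abs(slope) < 1.0

d = 3
threshold = math.e / (d - 1)
print(is_rs_stable(d, 0.9 * threshold), is_rs_stable(d, 1.1 * threshold))  # True False
```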

The second method we use to check the RS stability, called bugs proliferation, is the zero temperature analogue of the spin glass susceptibility. We will compute the average number of changing $t$ messages induced by a change in a single message $t_{r\to i}$ ($1\to 0$ or $0\to 1$). This is given by

$$N_{\mathrm{ch}} = \mathbb{E}\left[\sum_{(s,j)}\,\sum_{\substack{a_0,\,a_1\\ b_0,\,b_1}} \mathbb{I}\bigl(t_{s\to j} = b_0\to b_1 \mid t_{r\to i} = a_0\to a_1\bigr)\right], \tag{28}$$

where $a_0$, $a_1$, $b_0$, and $b_1$ take values in $\{0, 1\}$. Since a random factor graph is locally a tree, and assuming correlations decay fast enough, the last equation can be expressed as

$$N_{\mathrm{ch}} = \sum_{s=0}^{+\infty}\bigl(\bar{C}\bar{D}\bigr)^{s}\sum_{\substack{a_0,\,a_1\\ b_0,\,b_1}} P\bigl(t_s = b_0\to b_1 \mid t_o = a_0\to a_1\bigr), \tag{29}$$

where $t_s$ is a message at distance $s$ from the tree root $o$, and we defined the average residual degrees $\bar{C} = \mathbb{E}_C[C]$ and $\bar{D} = \mathbb{E}_D[D]$. The stability condition $N_{\mathrm{ch}} < +\infty$ yields a constraint on the greatest eigenvalue $\lambda_M$ of the transfer matrix $P(b_0\to b_1 \mid a_0\to a_1)$:

$$\bar{C}\bar{D}\,\lambda_M < 1. \tag{30}$$

The two methods presented above give equivalent conditions for the RS ansatz to hold true, and they simply express the independence of a finite subgraph from the tail boundary conditions.

4.2. The 1RSB Formalism. We are going to develop the 1RSB formalism for the MSP problem and then apply it in Section 7.1 to the ensemble $\mathcal{G}_{RP}$. We will not check the coherence of the 1RSB ansatz through the interstate and intrastate susceptibilities [23]; we are then not guaranteed against the need for further steps of replica symmetry breaking in order to recover the exact solution. Even in the worst case scenario, though, when the 1RSB solution is trivially exact only in the RS region and a full RSB ansatz is needed otherwise, the 1RSB prediction for the MSP relative size $\rho$ should be everywhere more accurate than the RS one and possibly very close to the real value. We refer to the textbook of Montanari and Mézard [19] for a detailed exposition of the 1RSB cavity method.

Let us fix a factor graph $G$ from a locally tree-like ensemble $\mathcal{G}$. We call $Q_{r\to i}(t_{r\to i})$ the distribution of messages on the directed edge $r\to i$ over the states of the system. We still expect the messages $t_{r\to i}$ to take values 0 or 1, so that $Q_{r\to i}$ can be parametrized as

$$Q_{r\to i}(t_{r\to i}) = q_{r\to i}\,\delta(t_{r\to i}) + (1-q_{r\to i})\,\delta(t_{r\to i}-1). \tag{31}$$

The 1RSB Parisi parameter $x\in[0, 1]$ has to be properly rescaled in order to correctly take the limit $\mu\uparrow\infty$. Therefore we introduce the new 1RSB parameter $y = \mu x$, which stays finite in the close packing limit and takes a value in $[0, +\infty)$. The reweighting factor $e^{-y\,\omega^{\mathrm{iter}}_{r\to i}}$ is defined as

$$e^{-y\,\omega^{\mathrm{iter}}_{r\to i}} = \frac{Z_{r\to i}}{\prod_{j\in\partial r\setminus i}\prod_{s\in\partial j\setminus r} Z_{s\to j}}. \tag{32}$$

The last equation combined with (13) and (14) gives $\omega^{\mathrm{iter}}_{r\to i} = -t_{r\to i}$. Averaging over the whole ensemble, we can then write the zero temperature 1RSB message passing rules (also called Survey Propagation equations):

$$q' \overset{d}{=} \frac{\prod_{j=1}^{C}\bigl(1-\prod_{s=1}^{D_j} q_{sj}\bigr)}{e^{y} + (1-e^{y})\prod_{j=1}^{C}\bigl(1-\prod_{s=1}^{D_j} q_{sj}\bigr)}. \tag{33}$$

In the preceding equation, $\{q_{sj}\}$ are i.i.d. random variables on $[0, 1]$ and, as in (22), $C$ is the random residual factor degree and $\{D_j\}$ are independent random residual variable degrees. Fixed points of (33) take the form

$$P(q) = p_0\,\delta(q) + p_1\,\delta(q-1) + p_2\,P_2(q), \tag{34}$$

where $P_2(q)$ is a continuous distribution on $[0, 1]$ and $p_2 = 1-p_0-p_1$. The parameters $p_0$ and $p_1$ have to satisfy the closed equations

$$\begin{aligned}
p_1 &= \mathbb{E}_C\bigl(1-\mathbb{E}_D\,(1-p_0)^{D}\bigr)^{C},\\
p_0 &= 1-\mathbb{E}_C\bigl(1-\mathbb{E}_D\,p_1^{D}\bigr)^{C}.
\end{aligned} \tag{35}$$

Solutions of (35) with $p_2 = 0$ correspond to replica symmetric solutions, and their instability marks the onset of a spin glass phase. In this new phase the MSPs are clustered, according to the general scenario displayed by constraint satisfaction problems [24].
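The update rule (33) and the weights $p_0$, $p_1$, $p_2$ of (34) can be explored with a small population dynamics sketch. This is not the implementation used for the results of this paper: it assumes variables of fixed degree $d$ and Poissonian factor degrees of mean $c$, and the population size, the number of sweeps, the Poisson sampler, and the tolerance used to detect $q = 0$ or $q = 1$ are all our own illustrative choices.

```python
import math
import random

def sp_update(incoming, y):
    """Eq. (33). incoming[j] lists the surveys q_{sj} entering the j-th neighbour."""
    prod = 1.0
    for qs in incoming:
        prod *= 1.0 - math.prod(qs)
    return prod / (math.exp(y) + (1.0 - math.exp(y)) * prod)

def sample_poisson(mean, rng):
    """Knuth's multiplication method, adequate for small means."""
    limit, k, acc = math.exp(-mean), 0, rng.random()
    while acc > limit:
        k += 1
        acc *= rng.random()
    return k

def population_dynamics(d, c, y, size=2000, sweeps=50, seed=0):
    rng = random.Random(seed)
    pop = [rng.random() for _ in range(size)]
    for _ in range(sweeps * size):
        n_nbrs = sample_poisson(c, rng)  # residual factor degree C
        incoming = [[rng.choice(pop) for _ in range(d - 1)] for _ in range(n_nbrs)]
        pop[rng.randrange(size)] = sp_update(incoming, y)
    return pop

pop = population_dynamics(d=3, c=0.5, y=1.0)
p0 = sum(q < 1e-9 for q in pop) / len(pop)
p1 = sum(q > 1 - 1e-9 for q in pop) / len(pop)
print(p0, p1, 1.0 - p0 - p1)
```

The three printed numbers estimate $p_0$, $p_1$, and $p_2$; the estimates fluctuate with the seed and the population size.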

From the stable fixed point of (33) we can calculate the 1RSB free energy functional $\phi(y)$ as

$$-y\,\phi(y) = \frac{d}{c}\,\mathbb{E}\log\left[(1-e^{y})\prod_{j=1}^{C_0}\Bigl(1-\prod_{s=1}^{D_j} q_{sj}\Bigr) + e^{y}\right] + \mathbb{E}\,(1-D_0)\log\left[(1-e^{y})\Bigl(1-\prod_{r=1}^{D_0} q_r\Bigr) + e^{y}\right] \tag{36}$$

and the 1RSB density $\rho_{\mathrm{1RSB}}(y) = -\partial\bigl(y\,\phi(y)\bigr)/\partial y$ as

$$\rho_{\mathrm{1RSB}}(y) = \frac{d}{c}\,\mathbb{E}\,\frac{e^{y}\Bigl(1-\prod_{j=1}^{C_0}\bigl(1-\prod_{s=1}^{D_j} q_{sj}\bigr)\Bigr)}{(1-e^{y})\prod_{j=1}^{C_0}\bigl(1-\prod_{s=1}^{D_j} q_{sj}\bigr) + e^{y}} + \mathbb{E}\,(1-D_0)\,\frac{e^{y}\prod_{r=1}^{D_0} q_r}{(1-e^{y})\bigl(1-\prod_{r=1}^{D_0} q_r\bigr) + e^{y}}, \tag{37}$$

with expectations intended over $\mathcal{G}$ and over the fixed point messages $\{q_r\}$ and $\{q_{sj}\}$. Since the free energy functional $\phi(y)$ and the complexity $\Sigma(\rho)$ are related by the Legendre transform

$$\Sigma(\rho) = -y\rho - y\,\phi(y), \tag{38}$$

with $\partial\Sigma/\partial\rho = -y$, through (36) we can compute the complexity by taking the inverse transform. Equilibrium states, that is, MSPs, are selected by

$$\rho_{\mathrm{1RSB}} = \operatorname*{argmax}_{\rho}\,\{\rho : \Sigma(\rho)\ge 0\} \tag{39}$$

or, equivalently, by taking the 1RSB parameter $y$ to be

$$y_s = \operatorname*{argmax}_{y\in[0,+\infty]}\,\phi(y). \tag{40}$$

In the static 1RSB phase we expect $\Sigma(\rho_s) = 0$, so that from (38) we have $\phi(y_s) = -\rho_s$. We will see that this is generally true except for the ensemble $\mathcal{G}_{RP}(2, c)$ of Section 7, corresponding to maximum matchings on ordinary graphs, where the equilibrium states have maximal complexity and $y_s = +\infty$. The relation

$$\rho_{\mathrm{1RSB}} = -\phi(y_s) \tag{41}$$

is always valid, though, since for $y_s\uparrow\infty$ the complexity stays finite.

5. A Heuristic Algorithm and Exact Results

In this section we propose a heuristic greedy algorithm to address the MSP problem. It is a natural generalization of the algorithm that Karp and Sipser proposed to solve the maximum matching problem on Erdős-Rényi random graphs [8]; therefore we will call it generalized Karp-Sipser (GKS). Extending their derivation concerning the leaf removal part of the algorithm, we are able to prove that the RS prediction for the MSP density is exact as long as the stability criterion (27) is satisfied. We will not give the proofs of the following theorems, as they are lengthy but effortless extensions of those given in [8]. In order to find the maximum matching on a graph, Karp and Sipser noticed that, as long as the graph contains a node of degree one (a leaf), its unique edge has to belong to one of the maximum matchings.

They considered the simplest randomized algorithm one can imagine: as long as there is any leaf, remove it from the graph; otherwise remove a random edge; then iterate until the graph is depleted. They studied the behaviour of this leaf-removal algorithm on random graphs and were able to prove that it grants w.h.p. a maximum matching (within an $o(n)$ error).

To generalize some of their results we need to extend the definition of leaf to that of pendant. We call pendant a variable node whose factor neighbours all have degree one, except for one at most. Stating the same concept in different words, all of the neighbours of a pendant have the pendant itself as their sole neighbour, except for one of them at most. See Figure 1 for a pictorial representation of a pendant (in red). The GKS algorithm is articulated in two phases: a pendant removal phase and a random occupation phase. We give the pseudocode for the generalized Karp-Sipser algorithm (see Algorithm 1).

At each step the algorithm prioritizes the removal of pendants over that of random variable nodes. We notice that the removal of a pendant is always an optimal choice in order to achieve an MSP; we have no guarantees, though, on the effect of the occupation of a random node. We call phase 1 the execution of the algorithm up to the point where the first nonpendant variable is added to $V'$. It is trivial to show that phase 1 is enough to find an MSP on a tree factor graph. The interesting thing, though, is that phase 1 is also able to deplete nontree factor graphs and find an MSP, as long as the factor graphs are sufficiently sparse and large enough.

We call core the subset of the variable nodes which has not been assigned to the MSP in phase 1. We will show how the emergence of a core is directly related to replica symmetry breaking. Let us define, as usual,

$$f(x) \equiv \mathbb{E}_C\bigl(1-\mathbb{E}_D\,x^{D}\bigr)^{C}. \tag{42}$$

The function $f$ is continuous, nonincreasing, and satisfies the relation $1\ge f(0)\ge f(1) = \mathbb{P}[C = 0]$. It turns out that in the large graph limit phase 1 of GKS is characterized by the solutions of the system of equations

$$p = f(r), \qquad r = f(p). \tag{43}$$

We notice that system (43) is equivalent to the 1RSB equations (35) once the substitutions $p\to p_1$ and $r\to 1-p_0$ are made.

Lemma 1. The system (43) always admits a (unique) solution with $p_* = r_*$.

Proof. By monotonicity and continuity, the function $g(p) = f(p)-p$ has a single zero in the interval $[0, 1]$.

Require: a factor graph $G = (V, F, E)$
Ensure: a set packing $V'$
  $V' = \emptyset$
  add to $V'$ all isolated variable nodes and remove them from $G$
  remove from $G$ any isolated factor node
  while $V$ is not empty do
    if $G$ has any pendant then
      choose a pendant $i$ uniformly at random
      add $i$ to $V'$
      remove $i$ from $G$, then remove its factor neighbours and their variable neighbours
    else
      pick uniformly at random a variable node and add it to $V'$
      remove it from $G$, then remove its factor neighbours and their variable neighbours
    end if
    remove from $G$ any isolated factor node
  end while
  return $V'$

Algorithm 1: Generalized Karp-Sipser (GKS).
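Algorithm 1 translates almost line by line into code. The following is a minimal sketch rather than the implementation used for the simulations of Section 6: the list-of-lists representation of the factor graph, the function and variable names, and the quadratic-time search for pendants are our own assumptions, chosen for readability over speed.

```python
import random

def gks(factors, n_vars, seed=0):
    """Generalized Karp-Sipser. factors[r] lists the variables adjacent to factor r.

    Returns the packing and the number of variables assigned during phase 1.
    """
    rng = random.Random(seed)
    fac = {r: set(m) for r, m in enumerate(factors) if m}  # isolated factors dropped
    var = {i: set() for i in range(n_vars)}
    for r, members in fac.items():
        for i in members:
            var[i].add(r)

    def is_pendant(i):
        # at most one adjacent factor has a neighbour other than i
        return sum(len(fac[r]) > 1 for r in var[i]) <= 1

    def occupy(i):
        # remove i, its factor neighbours, and their variable neighbours
        doomed = {i}
        for r in var[i]:
            doomed |= fac[r]
        for j in doomed:
            for r in var.pop(j):
                fac[r].discard(j)
        for r in [r for r, m in fac.items() if not m]:
            del fac[r]  # factors left without neighbours

    packing, phase1 = [], None
    while var:
        pendants = [i for i in var if is_pendant(i)]
        if pendants:
            choice = rng.choice(pendants)
        else:
            if phase1 is None:
                phase1 = len(packing)  # first nonpendant occupation ends phase 1
            choice = rng.choice(list(var))
        packing.append(choice)
        occupy(choice)
    return packing, (len(packing) if phase1 is None else phase1)

# Hypothetical tree: factors A = {0, 1}, B = {1, 2}, C = {2, 3, 4}.
packing, phase1 = gks([[0, 1], [1, 2], [2, 3, 4]], 5)
print(len(packing), phase1)
```

Isolated variable nodes need no special treatment here, since they satisfy the pendant condition vacuously and are picked up by the first branch of the loop.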

Figure 1: (left) A representation of a factor graph containing a pendant, depicted in red. (right) The factor graph after the removal of the pendant, as occurs in the inner part of the while loop in Algorithm 1.

If other solutions are present, the relevant one is the one with smallest $p$, as the following theorems certify.

Theorem 2. Let $(p, r)$ be the solution with smallest $p$ of (43). Then the density of factor nodes surviving phase 1 in the large graphs limit is given by

$$\psi_1 = 1 + p - r + c\,p\,\mathbb{E}_D\bigl[p^{D} - r^{D}\bigr], \tag{44}$$

with

$$r = \mathbb{E}_{C_0}\bigl(1-\mathbb{E}_D\,p^{D}\bigr)^{C_0}, \qquad p = \mathbb{E}_{C_0}\bigl(1-\mathbb{E}_D\,r^{D}\bigr)^{C_0}. \tag{45}$$

In particular, if the smallest solution is $p_* = r_*$, the graph is depleted with high probability in phase 1.

As already said, we will not give the proof of this and of the following theorem, as they are lengthy and can be obtained from the derivation given in Karp and Sipser's article [8] with little effort, even if not in a completely trivial way. Theorem 2 affirms that as soon as (43) develops another solution, a core survives phase 1. This phenomenon coincides with the need for replica symmetry breaking in the cavity formalism. Let us now establish the exactness of the cavity prediction for $\rho$ in the RS phase.

Theorem 3. Let $(p, r)$ be the solution with smallest $p$ of (43). Then the density of variables assigned in phase 1 in the large graphs limit is

$$\rho_1 = \frac{d}{c}\,(1-r) + \mathbb{E}_{D_0}\,(1-D_0)\,p^{D_0}, \tag{46}$$

where $p$ has been defined in the previous theorem. In particular, if the smallest solution is $p_* = r_*$, the replica symmetric cavity method prediction (25) is exact.

These two theorems imply that the RS prediction (25) for $\rho$ holds true as long as phase 1 manages to deplete w.h.p. the whole factor graph. A similar behaviour has been observed in other combinatorial optimization problems, for example, the random XORSAT [25]. Conversely, it is easy to prove, given the equivalence between (43) and (35), that if a solution with $p\ne p_*$ exists, the RS fixed point is unstable in the 1RSB distributional space. Therefore the system is in the RS phase if and only if (35) admits a unique solution.
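The solution of (43) with smallest $p$ can be found by iterating $p\mapsto f(f(p))$ from $p = 0$: since $f$ is nonincreasing, the composed map is nondecreasing and the iterates grow monotonically towards the smallest fixed point. The sketch below assumes, for illustration only, variables of fixed degree $d$ and Poissonian factor degrees of mean $c$, so that $f(x) = e^{-c\,x^{d-1}}$; the names and the stopping rule are our own.

```python
import math

def f(x, d, c):
    """Eq. (42) for regular variable degree d and Poissonian factor degree of mean c."""
    return math.exp(-c * x ** (d - 1))

def smallest_solution(d, c, iters=100_000, tol=1e-12):
    """Solution (p, r) of (43) with smallest p, obtained by iterating f(f(p)) from 0."""
    p = 0.0
    for _ in range(iters):
        new_p = f(f(p, d, c), d, c)
        if abs(new_p - p) < tol:
            break
        p = new_p
    return p, f(p, d, c)

for c in (2.0, 3.0):
    p, r = smallest_solution(2, c)
    print(c, p, r, "core" if r - p > 1e-6 else "no core")
```

With $d = 2$ the two values of $c$ lie on opposite sides of $c = e$: for $c = 2$ the iteration ends at $p = r$, while for $c = 3$ it stops at a solution with $p < r$, signalling a core.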

We notice that we could have also looked for the solutions of the single equation $p = f^2(p)$, with $f^2(p)\equiv f(f(p))$, instead of the two of (43), since the value of $r$ is uniquely determined by the value of $p$. The previous theorems then state that the RS results hold as long as $f^2$ has a single fixed point. In [22] it has been proven that the RS solution of the weighted maximum independent set and maximum weighted matching holds true as long as the corresponding squared cavity operator $f^2$ has a unique fixed point. Since those are special cases of the weighted MSP problem, we conjecture that the $f^2$ condition holds for the general case too, as it does for the unweighted case. Probably it would be possible to further generalize the Karp-Sipser algorithm to cover analytically the weighted case as well.

The scenario arising from the analysis of the algorithm and from the 1RSB cavity method is the following: in the RS phase, maximum set packings form a single cluster, and it is possible to connect any two of them with a path involving MSPs separated by a rearrangement of a finite number of variables [26]. In the RSB phase, instead, MSPs can be grouped into many clusters, each connected in the sense just mentioned and defined by the assignment of the variables in the core. Two MSPs that differ on the core are always separated by a global rearrangement of the variables (i.e., $O(N)$). In the presence of a core the GKS algorithm makes some suboptimal random choices (after phase 1).

We did not make an analytical study of the random variable removal part of the GKS algorithm. This has been done by Wormald for the matching problem, associating to the graph process a set of differential equations amenable to analysis [9] (something similar has been done for random XORSAT as well [25]). In that case it turns out that the algorithm still achieves optimality and yields an almost perfect matching on the core. An appropriate analysis of the second phase of the GKS algorithm and the reason for its failure on most of the SP ensembles we covered deserve further study.

6. Numerical Simulations

While the RS cavity equations (24) and (25) can be easily computed, the numerical solution of the 1RSB density evolution (33) is much more involved (although simplified by the zero temperature limit) and has been obtained with a standard population dynamics algorithm [27].

The implementation of the GKS algorithm is straightforward. While it is very fast during phase 1, we noticed a huge slowing down in the random removal part. We were able to find set packings for factor graphs with hundreds of thousands of nodes.

In order to test the cavity and GKS predictions we also computed the exact MSP size on factor graphs of small size. First we notice that a set packing problem coded in a factor graph structure $F$ is equivalent to an independent set problem on an appropriate ordinary graph $G$. The node set of $G$ will be the variable node set of $F$, and we add an edge to $G$ between each pair of nodes having a common factor node neighbour in $F$. Therefore each neighbourhood of a factor node in $F$ forms a clique in $G$. We then solve the maximum independent set problem for $G$ using an exact algorithm [28] implemented in the igraph library [29]. Since the time complexity is exponential in the size of the graph, we performed our simulations on graphs containing up to only one hundred nodes.
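The mapping to an independent set problem can be sketched as follows. Note that this illustration does not use igraph or the algorithm of [28]: to stay self-contained it swaps in a plain branching search, and the graph representation and function names are our own assumptions.

```python
from itertools import combinations

def conflict_graph(factors, n_vars):
    """Ordinary graph whose edges join variables sharing a factor node."""
    adj = {i: set() for i in range(n_vars)}
    for members in factors:
        for a, b in combinations(set(members), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def max_independent_set_size(adj):
    """Exact branching: either discard a vertex, or take it and drop its neighbours."""
    if not adj:
        return 0
    v = max(adj, key=lambda u: len(adj[u]))
    if not adj[v]:
        return len(adj)  # no edges left: every remaining vertex can be taken

    def without(removed):
        return {u: nb - removed for u, nb in adj.items() if u not in removed}

    return max(max_independent_set_size(without({v})),
               1 + max_independent_set_size(without({v} | adj[v])))

# Hypothetical tree: factors A = {0, 1}, B = {1, 2}, C = {2, 3, 4}.
graph = conflict_graph([[0, 1], [1, 2], [2, 3, 4]], 5)
print(max_independent_set_size(graph))  # prints: 2
```

Like any exact method for this problem, the search takes exponential time in the worst case, so it is only practical on small instances.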

7. Applications to Problem Ensembles

We will now apply the methods developed in the previous sections to some factor graph ensembles, each modelling a class of MSP problem instances. We consider graph ensembles containing nodes with Poissonian random degree or regular degree: $\mathcal{G}_{PR}(N, d, c)$, $\mathcal{G}_{RP}(N, d, c)$, $\mathcal{G}_{PP}(N, d, c)$, and $\mathcal{G}_{RR}(N, d, c)$. Subscript $R$ or $P$ indicates whether the type of nodes to which they refer (variable nodes for the first subscript, factor nodes for the second) have regular or Poissonian random degree, respectively. We parametrize these ensembles by their average variable and factor degrees, that is, $d = \mathbb{E}_{D_0} D_0$ and $c = \mathbb{E}_{C_0} C_0$. They are constituted by factor graphs having $N$ variable nodes and $M = \lfloor Nd/c\rfloor$ factor nodes, but they differ both in their elements and in their probability law. We define the ensembles by giving the probability of sampling one of their elements.

(i) $\mathbb{G}_{RP}(N,d,c)$: each element $G$ has $N$ variables and $M = \lfloor Nd/c \rfloor$ factors. Every variable node has fixed degree $d$. $G$ is obtained by linking each variable with $d$ factors chosen uniformly and independently at random. The factor node degree distribution obtained is Poissonian of mean $c$ with high probability. This ensemble is a model for the $k$-set packing and will be the main focus of our attention.

(ii) $\mathbb{G}_{PR}(N,d,c)$: each element $G$ has $N$ variables and $M = \lfloor Nd/c \rfloor$ factors. Every factor node has fixed degree $c$. $G$ is built by linking each factor with $c$ variables chosen uniformly and independently at random. The variable node degree distribution obtained is Poissonian of mean $d$ with high probability.

(iii) $\mathbb{G}_{PP}(N,d,c)$: each element $G$ has $N$ variables and $M = \lfloor Nd/c \rfloor$ factors. $G$ is built by adding an edge $(i,r)$ with probability $c/N$, independently for each choice of a variable $i$ and a factor $r$. The factor graph obtained has, with high probability, Poissonian variable node degree distribution of mean $d$ and Poissonian factor node degree distribution of mean $c$.

(iv) $\mathbb{G}_{RR}(N,d,c)$: it is constituted of all factor graphs with $N$ variable nodes of degree $d$ and $M = Nd/c$ factor nodes of degree $c$ ($Nd$ has to be a multiple of $c$). Every factor graph of the ensemble is equiprobable and can be sampled using a generalization of the configuration model for random regular graphs [30]. This ensemble is a model for the $k$-set packing.
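As an illustration of how such an ensemble can be sampled, the following Python sketch draws one element of the ensemble with regular variable degree and Poissonian factor degree. The function name `sample_g_rp` and its arguments are hypothetical, and we assume that the $d$ factors attached to a variable are distinct (so no multi-edges appear), a detail the definition above leaves open.

```python
import math
import random


def sample_g_rp(n_vars, d, c, seed=None):
    """Return a list of factor neighbourhoods: factors[r] lists the variables
    attached to factor r. Every variable picks d distinct factors at random."""
    rng = random.Random(seed)
    n_factors = math.floor(n_vars * d / c)  # M = floor(N d / c)
    factors = [[] for _ in range(n_factors)]
    for i in range(n_vars):
        # assumption: sampling without replacement, which requires d <= M
        for r in rng.sample(range(n_factors), d):
            factors[r].append(i)
    return factors


factors = sample_g_rp(n_vars=1000, d=3, c=2.0, seed=1)
mean_factor_degree = sum(len(members) for members in factors) / len(factors)
print(len(factors), mean_factor_degree)
```

By construction the number of edges is $Nd$, so the empirical mean factor degree equals $Nd/M$, which is $c$ up to the rounding in $M$.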

We will omit the argument $N$ when we refer to an ensemble in the $N \uparrow \infty$ limit.

7.1. $\mathbb{G}_{RP}(d,c)$. This is the ensemble with variable node degrees fixed to $d$ and Poissonian factor node degree, that is, $C \sim C_0 \sim \mathrm{Poisson}(c)$. The case $d = 2$ corresponds to the maximum matching problem on Erdős-Rényi random graphs [8, 12].

Density evolution (22) for $\mathbb{G}_{RP}$ reduces to

$$T' \overset{d}{=} \max\left(\{0\} \cup \left\{1 - \sum_{s=1}^{d-1} T_{sj}\right\}_{j\in\{1,\dots,C\}}\right). \tag{47}$$

Considering distributions of the form $P(t) = p\,\delta(t) + (1-p)\,\delta(t-1)$, fixed points of (47) read

$$p_* = e^{-c\,p_*^{d-1}} \equiv f(p_*). \tag{48}$$


The last equation admits only one solution for each value of $d$ and $c$, as is easily seen through a monotonicity argument considering the left and right hand sides of the equation. The values of $c$ as a function of $d$ satisfying $|f'(p_*)| = 1$ are the critical points $c_s(d)$ delimiting the RS phase ($c < c_s(d)$) and are given by

$$c_s(d) = \frac{e}{d-1}. \tag{49}$$
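The threshold (49) can be checked numerically. The sketch below is not part of the original analysis: it solves (48) by bisection (the helper names `fixed_point` and `slope` are ours) and evaluates $|f'(p_*)| = c\,(d-1)\,p_*^{d-1}$, an expression obtained from $f'(p) = -c\,(d-1)\,p^{d-2} f(p)$ together with $f(p_*) = p_*$.

```python
import math


def f(p, d, c):
    # right hand side of (48)
    return math.exp(-c * p ** (d - 1))


def fixed_point(d, c, iters=200):
    """Unique root of p - f(p) on [0, 1]; the difference is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - f(mid, d, c) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def slope(d, c):
    """|f'(p*)| evaluated at the fixed point of (48)."""
    p = fixed_point(d, c)
    return c * (d - 1) * p ** (d - 1)


for d in (2, 3, 4, 5):
    print(d, slope(d, math.e / (d - 1)))
```

At $c = e/(d-1)$ the fixed point is $p_* = e^{-1/(d-1)}$, so the printed slopes should all be equal to 1 up to rounding.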

For $c > e/(d-1)$ a core survives phase 1 of the GKS algorithm, as stated by Theorem 2 and shown in Figure 2. We notice that the same threshold value $e/(d-1)$ is found in the minimum set covering problem on the dual factor graph ensemble [31], where the application of a leaf removal algorithm, that is, phase 1 of GKS, unveils an analogous core emergence phenomenon. In the RS phase the MSP density is equal to the minimum set covering density, but in the RSB phase we cannot compare the cavity predictions for these two dual problems, since in [31] the 1RSB solution to the minimum set covering is not provided.

In the matching case, that is, $d = 2$, equation (49) expresses the notorious $e$-phenomenon discovered by Karp and Sipser, while for higher values of $d$ it provides an extension of the critical threshold.

We can recover the same critical condition, equation (49), through the bug propagation method, as the only nonzero elements of the transfer matrix $P(b_0 \to b_1 \mid a_0 \to a_1)$ are

$$P(0 \to 1 \mid 1 \to 0) = p^{d-1}, \qquad P(1 \to 0 \mid 0 \to 1) = p^{d-1}, \tag{50}$$

which give $\lambda_M = p^{d-1}$. The average branching factor is $\overline{CD} = c\,(d-1)$, so that (30) and (48) yield (49). The analytical value for the relative size of MSPs, that is, the particle density $\rho$, is

$$\rho(d,c) = \frac{d}{c}\,(1 - p_*) + (1-d)\,p_*^{d} \quad \text{for } c < c_s(d). \tag{51}$$
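For concreteness, the RS prediction (51) can be evaluated with a short Python sketch. The function names `p_star` and `rho_rs` are ours, and the fixed point of (48) is found by plain bisection; this is one simple way to compute the curve, not a description of the code used for the figures.

```python
import math


def p_star(d, c, iters=200):
    """Bisection for the unique solution of p = exp(-c p^(d-1)) in [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - math.exp(-c * mid ** (d - 1)) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rho_rs(d, c):
    """RS estimate (51) of the relative MSP size."""
    p = p_star(d, c)
    return d / c * (1 - p) + (1 - d) * p ** d


for c in (0.5, 1.0, 1.3):
    print(c, rho_rs(3, c))
```

As a sanity check, for $c \to 0$ the sets almost never overlap and `rho_rs` tends to 1, while for $d = 2$ the value $\rho c/2$ is the matching size per vertex of the corresponding random graph.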

In Figure 3 we compare $\rho$ from (51), as a function of $c$ for some values of $d$, with an exact algorithm applied to finite factor graphs (as explained in Section 6), both above and below $c_s$. Clearly, for $c > c_s$ the RS approximation is increasingly inaccurate.

We continue our analysis of $\mathbb{G}_{RP}$ above the critical value $c_s$ through the 1RSB cavity method, as outlined in Section 4.2.

Fixed point messages of (33) are distributed as

$$P(q) = p_0\,\delta(q) + p_1\,\delta(q-1) + p_2\,P_2(q), \tag{52}$$

where

$$p_1 = e^{-c\,(1-p_0)^{d-1}}, \qquad p_0 = 1 - e^{-c\,p_1^{d-1}}, \qquad p_2 = 1 - p_0 - p_1, \tag{53}$$

and $P_2(q)$ has to be determined through (33). Equation (53) always admits an RS solution $p_1 = 1 - p_0 = p_*$,

Figure 2: Results from the GKS algorithm applied to $\mathbb{G}_{RP}(3,c)$. The set packing size after phase 1 (blue line) and after the algorithm stops (red line) are reported. The vertical line is $c_s = e/2$ and is drawn as a visual aid to determine the point where the RS assumptions stop holding true. Above $c_s$ a nonempty core emerges continuously (green line). (Axes: $\rho$ versus mean factor degree $c$; legend: Core, Phase 1.)

Figure 3: RS cavity method analytical prediction, equation (51), for MSP size in $\mathbb{G}_{RP}(d,c)$, compared with the average size of MSPs obtained from 1000 samples of factor graphs with 100 variable nodes (dots). Dashed parts of the lines are the RS estimations in the RSB phase, that is, for $c > e/(d-1)$, where it is no longer exact. (Axes: set packing size $\rho$ versus mean factor degree $c$; legend: RS and Exact for $d = 2, 3, 4, 5$.)

which is stable up to $c_s$, as already noticed. Above $c_s$ a new stable fixed point with $p_2 > 0$ continuously arises, and we study it numerically with a population dynamics algorithm.
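The weights $p_0$, $p_1$, and $p_2$ alone can be explored by iterating (53) directly, without resolving the continuous part $P_2(q)$. The following Python sketch is only an illustration: the name `rsb_weights` and the starting point $p_0 = 0$ are our choices, and we assume the exponent $d-1$ in both relations of (53), which is what the RS solution $p_1 = 1 - p_0 = p_*$ requires.

```python
import math


def rsb_weights(d, c, steps=5000):
    """Iterate the two coupled relations of (53) starting from p0 = 0."""
    p0 = 0.0
    for _ in range(steps):
        p1 = math.exp(-c * (1 - p0) ** (d - 1))
        p0 = 1 - math.exp(-c * p1 ** (d - 1))
    # recompute p1 so that the returned triple is mutually consistent
    p1 = math.exp(-c * (1 - p0) ** (d - 1))
    return p0, p1, 1 - p0 - p1


print(rsb_weights(3, 1.0))  # below c_s = e/2
print(rsb_weights(3, 3.0))  # above c_s = e/2
```

In this sketch the weight $p_2$ stays at zero (up to rounding) below $c_s$ and becomes strictly positive above it; the shape of $P_2(q)$ still requires the population dynamics mentioned in the text.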

Figure 4: Maximum set packing size for $\mathbb{G}_{RP}(d,c)$ as a function of $c$ for $d = 2$. The blue line corresponds to the RS analytical value (51), the red line is from the Karp-Sipser algorithm, and the dotted black line is the 1RSB solution (54) with $y \gg 1$. Karp-Sipser and the 1RSB cavity method yield the same and exact result. (Axes: set packing size $\rho$ versus mean factor degree $c$, with a mark at $c = e$; legend: RS, 1RSB, GKS.)

The 1RSB free energy as a function of the Parisi parameter $y$ takes the form

$$\phi(y) = -\frac{d}{yc}\,\mathbb{E}\log\left[(1-e^{y})\prod_{j=1}^{C}\left(1-\prod_{s=1}^{d-1} q_{sj}\right) + e^{y}\right] + \frac{d-1}{y}\,\mathbb{E}\log\left[(1-e^{y})\left(1-\prod_{r=1}^{d} q_{r}\right) + e^{y}\right]. \tag{54}$$

As prescribed by the cavity method, the value $y_s$ which maximizes $\phi(y)$ over $[0,+\infty]$ yields the correct free energy; therefore we have $\phi(y_s) = -\rho^{\mathrm{1RSB}}$.

Unsurprisingly, as they belong to different computational classes, the cases $d = 2$ and $d \ge 3$ show qualitatively different pictures. In the case of maximum matching on the Poissonian graph ensemble, numerical estimates suggest that complexity is an increasing function of $y$ on the whole real positive axis. The correct choice for the parameter $y$ is then $y_s = +\infty$, as already conjectured in [12], and we find that the maximum matching size prediction from the 1RSB cavity method fully agrees with rigorous results from Karp and Sipser [8] and with the size of the matchings given by their algorithm (see Figure 4). The 1RSB ansatz is therefore exact for $d = 2$.

The analysis of the $d \ge 3$ case does not yield such a definite result. The complexity of states $\Sigma$ is no longer a strictly increasing function of $y$. It reaches its maximum at $y_d$, the choice of $y$ that selects the most numerous states, which could be those where local search greedy algorithms are more likely to be trapped. Then it decreases up to the finite value $y_s$, where complexity changes sign and takes negative values. Therefore $y_s$ is the correct choice for the Parisi parameter, which maximizes $\phi(y)$. Plotted as a function of $\rho$, complexity

Figure 5: Complexity in $\mathbb{G}_{RP}(d,c)$ with $d = 3$ and $c = 4.0$ as a function of the relative MSP size $\rho$. (Axes: complexity $\Sigma$ versus set packing size $\rho$.)

Figure 6: MSP size in $\mathbb{G}_{RP}(3,c)$ as a function of $c$. The RS and 1RSB cavity methods are compared with the GKS algorithm and with an exact algorithm on factor graphs of 100 variable nodes. (Axes: set packing size $\rho$ versus mean factor degree $c$, with a mark at $c = e/2$; legend: RS, 1RSB, GKS, $N = 100$.)

$\Sigma$ has a convex nonphysical part, with extrema the RS solution (on the right) and the point corresponding to the dynamic 1RSB solution (on the left), and a concave physically relevant part for $y \in [y_d, y_s]$ (see Figure 5). The 1RSB solution seems to be in very good agreement with the exact algorithm, and we are inclined to believe that no further steps of replica symmetry breaking are needed in this ensemble. The GKS algorithm instead falls short of the exact value (see Figure 6); therefore it constitutes a lower bound, which is not tight but could probably be made rigorous by carrying the analysis of the GKS algorithm beyond phase 1.

7.2. $\mathbb{G}_{PR}(d,c)$. The ensemble $\mathbb{G}_{PR}(d,c)$ is constituted by factor graphs containing factor nodes of degree fixed to $c$ and variable nodes of Poissonian random degree of mean $d$. With respect to the MSP problem, it has statistical properties quite different from those encountered in $\mathbb{G}_{RP}$, as we will readily show. The MSP problem on $\mathbb{G}_{PR}(d,c)$ with $c = 2$ is equivalent to the well known problem of maximum independent sets on Poissonian graphs [14, 32]. The real parameter $p$ characterizing discrete support distributions of messages has to satisfy the fixed point equation (24), which reads

$$p_* = \left(1 - e^{d\,(p_*-1)}\right)^{c-1}. \tag{55}$$

At fixed value of $c$ the RS ansatz holds up to the critical value $d_s(c)$, which is implicitly given by the first derivative condition (27):

$$(c-1)\,p_*^{(c-2)/(c-1)}\,e^{(p_*-1)\,d_s}\,d_s = 1. \tag{56}$$

Although the critical condition, equation (56), is not as elegant as the one we obtained for the ensemble $\mathbb{G}_{RP}$, it can be easily solved numerically for $d_s$ as a function of $c$. For $c = 2$ the threshold value is exactly $d_s(2) = e$. For $c > 2$, instead, $d_s$ is an increasing function of the factor degree $c$. Thanks to (25) we readily compute the MSP size in the RS phase:

$$\rho = \frac{d}{c}\left(1 - p_*^{c/(c-1)}\right) + (1 - p_*\,d)\,e^{d\,(p_*-1)} \quad \text{for } d < d_s(c). \tag{57}$$

We can see in Figure 7 that the MSP size $\rho(d,c)$ is a decreasing function in both arguments, as expected. Equation (57) can be taken as the RS estimate for MSP size for $d > d_s(c)$. The RS estimate is strictly greater than the average size of SPs given by the GKS algorithm at all values of $d > d_s(c)$ (see Figure 7).
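The numerical solution of (55) and (56) for $d_s(c)$ can be sketched as follows. This is one possible procedure, not necessarily the one used for the paper: the helper names are ours, and the outer bisection assumes that $|f'(p_*)|$ crosses 1 exactly once inside the bracket $[0.1, 20]$, which we have not proven.

```python
import math


def p_star(d, c, iters=200):
    """Bisection for the fixed point of (55); p - f(p) is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - (1 - math.exp(d * (mid - 1))) ** (c - 1) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def slope(d, c):
    """Left hand side of (56), i.e. |f'(p*)| at mean variable degree d."""
    p = p_star(d, c)
    return (c - 1) * p ** ((c - 2) / (c - 1)) * math.exp((p - 1) * d) * d


def d_s(c, lo=0.1, hi=20.0, iters=100):
    """Bisection on d for slope(d, c) = 1 (single crossing assumed)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope(mid, c) < 1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


print(d_s(2), math.e)
print(d_s(3), slope(d_s(3), 3))
```

For $c = 2$ the sketch should reproduce $d_s(2) = e$; for other values of $c$ the second printed number shows how well the returned $d_s$ satisfies (56).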

7.3. $\mathbb{G}_{PP}(d,c)$. We will now briefly examine our MSP model on the ensemble $\mathbb{G}_{PP}(d,c)$, where both factor nodes and variable nodes have Poissonian random degrees, of mean $c$ and $d$, respectively. From (23) and (24) we obtain the fixed point condition for the probability distribution of messages:

$$p_* = e^{-c\,e^{d\,(p_*-1)}}. \tag{58}$$

As usual, $p$ is the parameter characterizing the distribution of messages $P(t) = p\,\delta(t) + (1-p)\,\delta(t-1)$. Equation (58) admits one and only one fixed point solution $p_*$ for each value of $d$ and $c$. In fact, $f$ is continuous and strictly decreasing, with $f(0) > 0$ and $f(1) < 1$. The first derivative condition $|f'(p_*)| = 1$ defines the critical line $d_s(c)$ through

$$d_s(c) = -\frac{1}{p_*\log(p_*)}. \tag{59}$$

The curve $d_s(c)$ separates the RS phase from the RSB phase in the $c$-$d$ parametric space (see Figure 8). The unbounded RS region shares some resemblance with the corresponding (although $c$-discretized) region of $\mathbb{G}_{PR}(d,c)$ and is at variance with the compact area of the RS phase in $\mathbb{G}_{RP}$.

We can compute the MSP relative size $\rho$ through (25) and obtain

$$\rho = \frac{d}{c}\,(1 - p_*) + (1 - d\,p_*)\,e^{d\,(p_*-1)} \quad \text{for } d < d_s(c), \tag{60}$$

which holds only as an approximation in the RSB phase. We can see from Figure 8 that $\rho$ is decreasing both in $c$ and $d$, as was observed in the other ensembles as well.
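The quantities entering (58), (59), and (60) are easy to evaluate numerically. The Python sketch below is only illustrative; the names `p_star`, `is_rs`, and `rho_rs` are ours, and the stability test uses $|f'(p_*)| = -d\,p_*\log p_*$, which follows from (58).

```python
import math


def p_star(d, c, iters=200):
    """Bisection for the unique fixed point of (58) in [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - math.exp(-c * math.exp(d * (mid - 1))) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def is_rs(d, c):
    """True when |f'(p*)| < 1, that is, when d < d_s(c) according to (59)."""
    p = p_star(d, c)
    return -d * p * math.log(p) < 1


def rho_rs(d, c):
    """RS estimate (60) of the relative MSP size."""
    p = p_star(d, c)
    return d / c * (1 - p) + (1 - d * p) * math.exp(d * (p - 1))


for d, c in [(0.5, 1.0), (2.0, 2.0), (10.0, 10.0)]:
    print(d, c, is_rs(d, c), rho_rs(d, c))
```

Since $-p\log p \le 1/e$ on $(0,1)$, the quantity tested in `is_rs` never exceeds $d/e$, so in this sketch every $d < e$ is reported as RS whatever the value of $c$.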

Figure 7: RS cavity method analytical prediction (57) for MSP size in $\mathbb{G}_{PR}(d,c)$ (dot-dashed) compared with the set packing size from phase 1 of the GKS algorithm (dashed lines) and with a complete run of the algorithm (continuous lines). (Axes: set packing size $\rho$ versus mean variable degree $d$; legend: $c = 2, 3, 4, 5, 6$.)

Figure 8: MSP relative size for the ensemble $\mathbb{G}_{PP}(d,c)$ as given by (60). Above the boundary separating the RS and RSB phases, values are only an approximation to the exact value of $\rho$. (Axes: mean variable degree $d$ versus mean function degree $c$; legend: Isolines, Phase boundary.)

7.4. $\mathbb{G}_{RR}(d,c)$. The MSP problem on the ensemble $\mathbb{G}_{RR}(d,c)$, the straightforward generalization to factor graphs of the random regular graphs ensemble, allows some simplification of the cavity formalism thanks to its homogeneity. It has already been the object of preliminary studies by Weigt and Hartmann [16] and then of a much deeper work by Hansen-Goos and Weigt [17], who disguised it as a hard spheres model on a generalized Bethe lattice. The authors studied this hard spheres model on $\mathbb{G}_{RR}(d,c)$ through the cavity method, both at finite chemical potential $\mu$ and in the close packing limit, and found that the 1RSB solution is unstable in the close packing limit, therefore suggesting the need for a full RSB treatment of the problem.

8. Conclusions

We studied the average asymptotic behaviour of random instances of the maximum set packing problem, both from a mathematical and a physical viewpoint. We contributed to the known list of models where the replica symmetric cavity method can be proven to give exact results, thanks to the generalization of an algorithm (and of its analysis) first proposed by Karp and Sipser [8]. Moreover, our analysis addresses a problem reported in recent work on weighted maximum matchings and independent sets on random graphs [22], where the authors could not extend their results to the unweighted cases. We achieve here the desired result by making use of the grand canonical potential instead of the direct computation of single variable expectations. We also extend their condition for the system to be in what physicists call a replica symmetric phase, namely, the uniqueness of the fixed point of the square of a certain operator (which is the analogue of the one defined in (42)), to the more general setting of maximum set packing (although without weights). On some problem ensembles, where the assumptions of Theorems 2 and 3 no longer hold and the RS cavity method fails, we used the 1RSB cavity method machinery to obtain an analytical estimation of the MSP size. Numerical simulations show very good agreement of the 1RSB estimation with the exact values, although comparisons have been done only with small random problems, due to the exact algorithm being of exponential time complexity. The GKS algorithm instead generally fails to find any MSPs, except for some special instances of the problem.

Some questions remain open for further investigation. To validate the 1RSB approach, the stability of the 1RSB solution has to be checked against more steps of replica symmetry breaking. Moreover, a thorough analysis of the second phase of the GKS algorithm could shed some light on the mechanism of replica symmetry breaking and give a rigorous lower bound to the average maximum packing size.

Regarding the problem of looking for optimal solutions in single samples, we believe that an efficient heuristic algorithm, able to obtain near-optimal configurations also in the RSB phase, could be constructed following the ideas of [33].

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research has received financial support from the European Research Council (ERC) through Grant agreement no. 247328 and from the Italian Research Minister through the FIRB project no. RBFR086NN1.

References

[1] S. Micali and V. V. Vazirani, “An $O(\sqrt{|V|}\,|E|)$ algoithm for finding maximum matching in general graphs,” in Proceedings of the 21st Annual IEEE Symposium on Foundations of Computer Science, pp. 17–27, 1980.

[2] N. Alon, J.-H. Kim, and J. Spencer, “Nearly perfect matchings in regular simple hypergraphs,” Israel Journal of Mathematics, vol. 100, pp. 171–187, 1997.

[3] Z. Füredi, J. Kahn, and P. D. Seymour, “On the fractional matching polytope of a hypergraph,” Combinatorica, vol. 13, no. 2, pp. 167–180, 1993.

[4] Y. H. Chan and L. C. Lau, “On linear and semidefinite programming relaxations for hypergraph matching,” Mathematical Programming, vol. 135, pp. 123–148, 2011.

[5] V. Rödl and L. Thoma, “Asymptotic packing and the random greedy algorithm,” Random Structures & Algorithms, vol. 8, no. 3, pp. 161–177, 1996.

[6] M. M. Halldórsson, “Approximations of weighted independent set and hereditary subset problems,” Journal of Graph Algorithms and Applications, vol. 4, no. 1, pp. 1–6, 2000.

[7] E. Hazan, S. Safra, and O. Schwartz, “On the complexity of approximating K-set packing,” Computational Complexity, vol. 15, no. 1, pp. 20–39, 2006.

[8] R. M. Karp and M. Sipser, “Maximum matchings in sparse random graphs,” in Proceedings of the 22nd Annual Symposium on Foundations of Computer Science, vol. 12, pp. 364–375, IEEE Computer Society, 1981.

[9] N. C. Wormald, “The differential equation method for random graph processes and greedy algorithms,” in Lectures on Approximation and Randomized Algorithms, pp. 73–155, 1999.

[10] G. Parisi and M. Mézard, “Replicas and optimization,” Journal de Physique Lettres, vol. 46, no. 17, pp. 771–778, 1985.

[11] G. Parisi and M. Mézard, “A replica analysis of the travelling salesman problem,” Journal de Physique, vol. 47, pp. 1285–1296, 1986.

[12] H. Zhou and Z.-C. Ou-Yang, “Maximum matching on random graphs,” In press, http://arxiv.org/abs/cond-mat/0309348.

[13] L. Zdeborová and M. Mézard, “The number of matchings in random graphs,” Journal of Statistical Mechanics, vol. 2006, Article ID P05003, 2006.

[14] L. Dall'Asta, A. Ramezanpour, and P. Pin, “Statistical mechanics of maximal independent sets,” Physical Review E, vol. 80, Article ID 061136, 2009.

[15] M. Mézard and M. Tarzia, “Statistical mechanics of the hitting set problem,” Physical Review E, vol. 76, no. 4, Article ID 041124, 10 pages, 2007.

[16] M. Weigt and A. K. Hartmann, “Glassy behavior induced by geometrical frustration in a hard-core lattice-gas model,” Europhysics Letters (EPL), vol. 62, Article ID 021005, p. 533, 2003.

[17] H. Hansen-Goos and M. Weigt, “A hard-sphere model on generalized Bethe lattices: statics,” Journal of Statistical Mechanics, vol. 2005, Article ID P04006, 2005.

[18] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.

[19] A. Montanari and M. Mézard, Information, Physics and Computation, Oxford University Press, 2009.

[20] G. Parisi, M. Mézard, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, Singapore, 1987.

[21] M. Mézard and G. Parisi, “The cavity method at zero temperature,” Journal of Statistical Physics, vol. 22, no. 1-2, pp. 1–34, 2003.

[22] D. Gamarnik, T. Nowicki, and G. Swirszcz, “Maximum weight independent sets and matchings in sparse random graphs. Exact results using the local weak convergence method,” Random Structures and Algorithms, vol. 28, no. 1, pp. 76–106, 2006.

[23] A. Montanari, G. Parisi, and F. Ricci-Tersenghi, “Instability of one-step replica-symmetry-broken phase in satisfiability problems,” Journal of Physics A, vol. 37, no. 6, p. 2073, 2004.

[24] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, “Gibbs states and the set of solutions of random constraint satisfaction problems,” Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 25, pp. 10318–10323, 2007.

[25] M. Mézard, F. Ricci-Tersenghi, and R. Zecchina, “Two solutions to diluted p-spin models and XORSAT problems,” Journal of Statistical Physics, vol. 111, no. 3-4, pp. 505–533, 2002.

[26] G. Semerjian, “On the freezing of variables in random constraint satisfaction problems,” Journal of Statistical Physics, vol. 130, no. 2, pp. 251–293, 2008.

[27] M. Mézard and G. Parisi, “The Bethe lattice spin glass revisited,” The European Physical Journal B, vol. 20, no. 2, pp. 217–233, 2001.

[28] S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, “A new algorithm for generating all the maximal independent sets,” SIAM Journal on Computing, vol. 6, no. 3, pp. 505–517, 1977.

[29] G. Csárdi and T. Nepusz, “The igraph software package for complex network research,” InterJournal, Complex Systems, p. 1695, 2006.

[30] N. C. Wormald, “Models of random regular graphs,” in Surveys in Combinatorics, London Mathematical Society Lecture Note, pp. 239–298, 1999.

[31] S. Takabe and K. Hukushima, “Minimum vertex cover problems on random hypergraphs: replica symmetric solution and a leaf removal algorithm,” In press, http://arxiv.org/abs/1301.5769.

[32] S. Sanghavi, D. Shah, and A. S. Willsky, “Message passing for maximum weight independent set,” IEEE Transactions on Information Theory, vol. 55, no. 11, pp. 4822–4834, 2009.

[33] M. Weigt and H. Zhou, “Message passing for vertex covers,” Physical Review E, vol. 74, Article ID 046110, 19 pages, 2006.


We finally arrive at the Bethe estimation of the MSPs relative size by taking the close packing limit of (17) and using $\omega^{\mathrm{RS}}_G = -\rho^{\mathrm{RS}}_G$, which is given by

$$\rho^{\mathrm{RS}}_G = \frac{1}{N}\left[\sum_{r\in F}\max\left(\{0\}\cup\left\{1-\sum_{s\in\partial j\setminus r} t_{s\to j}\right\}_{j\in\partial r}\right) + \sum_{i\in V}(1-|\partial i|)\max\left(0,\,1-\sum_{r\in\partial i} t_{r\to i}\right)\right]. \tag{18}$$

Let us examine the various contributions to (18), since we want to convince ourselves that it exactly counts the MSP size, at least on tree factor graphs. The term $(1-|\partial i|)\max(0,\,1-\sum_{r\in\partial i} t_{r\to i})$ contributes with $1-|\partial i|$ to the sum only if all the incoming $t$ messages are zero. In this case $n_i$ is frozen to 1, that is, the variable $i$ takes part in all the MSPs in $G$. Obviously all the neighbours of a variable frozen to 1 have to be frozen to 0. To all its $|\partial i|$ neighbours $r$, the frozen-to-1 variable $i$ sends a message $1-\sum_{s\in\partial i\setminus r} t_{s\to i} = 1$, so that we have $|\partial i|$ contributions in the first sum of (18), $\max(\{0\}\cup\{1-\sum_{s\in\partial j\setminus r} t_{s\to j}\}_{j\in\partial r}) = 1$, and the total contribution from $i$ correctly sums up to 1. If for a certain $i$ we have a total field $\tau_i \equiv 1-\sum_{r\in\partial i} t_{r\to i} < 0$ (two or more incoming messages are equal to one), the variable is frozen to 0, that is, it does not take part in any MSP. It correctly does not contribute to $\omega^{\mathrm{RS}}_G$, since it sends a message $1-\sum_{s\in\partial i\setminus r} t_{s\to i} \le 0$ to each neighbour $r$; thus it is not counted in $\max(\{0\}\cup\{1-\sum_{s\in\partial j\setminus r} t_{s\to j}\}_{j\in\partial r})$.

The third case is the most interesting. It concerns variables $i$ which take part in a fraction of the MSPs. We will call them unfrozen variables. The total field on an unfrozen variable $i$ is $\tau_i = 0$ (thus we have no contribution from the second sum in (18)), and all incoming messages are 0 except for a single $t_{r\to i} = 1$. To this sole function node $r$ the node $i$ sends a message 1, so that the contribution of $r$ to the first sum is 1. Actually, BP equations impose that $r$ has to have at least another unfrozen neighbour beside $i$. In other terms, the function node $r$ says that, whatever MSP we consider, one of its neighbours has to be occupied. The corresponding term $\max(\{0\}\cup\{1-\sum_{s\in\partial j\setminus r} t_{s\to j}\}_{j\in\partial r}) = 1$ in (18) accounts for that.

The presence of unfrozen variables is the reason why we cannot express the density $\rho_G$ through the formula

$$\rho_G = \left\langle \sum_{i\in V} n_i \right\rangle. \tag{19}$$

In fact, using the infinite chemical potential formalism, we cannot compute

$$\langle n_i \rangle = \lim_{\mu\to+\infty}\frac{e^{\mu\,(1-\sum_{r\in\partial i} t_{r\to i})}}{1+e^{\mu\,(1-\sum_{r\in\partial i} t_{r\to i})}} \tag{20}$$

when $\lim_{\mu\to+\infty} 1-\sum_{r\in\partial i} t_{r\to i} = 0$, and we would have to use the $O(1/\mu)$ corrections to the fields $t_{r\to i}$. We bypass the problem by using the grand potential $\omega^{\mathrm{RS}}_G$ to obtain $\rho^{\mathrm{RS}}_G$, also addressing a problem reported in [22] of extending an analysis suited for weighted matchings and independent sets to the unweighted case.
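The counting argument behind (18) can be checked on a small tree factor graph. The Python sketch below is an illustration under our own assumptions: the zero temperature update $t_{r\to i} = \max(\{0\}\cup\{1-\sum_{s\in\partial j\setminus r} t_{s\to j}\}_{j\in\partial r\setminus i})$ is the form suggested by (18), messages are updated in parallel starting from zero, and the function names and the toy tree are hypothetical.

```python
def bp_packing_size(factors, n_vars, max_sweeps=100):
    """Evaluate N * rho of (18) at the zero temperature BP fixed point."""
    var_nbrs = [[] for _ in range(n_vars)]  # factors adjacent to each variable
    for r, members in enumerate(factors):
        for i in members:
            var_nbrs[i].append(r)

    # t[(r, i)]: message from factor r to variable i, with values in {0, 1}
    t = {(r, i): 0 for r, members in enumerate(factors) for i in members}

    def cavity_field(j, r, msgs):
        # 1 minus the messages entering variable j from all factors except r
        return 1 - sum(msgs[(s, j)] for s in var_nbrs[j] if s != r)

    for _ in range(max_sweeps):
        new_t = {
            (r, i): max([0] + [cavity_field(j, r, t) for j in factors[r] if j != i])
            for (r, i) in t
        }
        if new_t == t:  # fixed point reached
            break
        t = new_t

    factor_part = sum(
        max([0] + [cavity_field(j, r, t) for j in members])
        for r, members in enumerate(factors)
    )
    variable_part = sum(
        (1 - len(var_nbrs[i])) * max(0, 1 - sum(t[(r, i)] for r in var_nbrs[i]))
        for i in range(n_vars)
    )
    return factor_part + variable_part


def brute_force_packing_size(factors, n_vars):
    """Largest set of variables such that no factor touches two of them."""
    best = 0
    for mask in range(1 << n_vars):
        if all(sum((mask >> i) & 1 for i in members) <= 1 for members in factors):
            best = max(best, bin(mask).count("1"))
    return best


tree = [[0, 1, 2], [2, 3], [3, 4], [1, 5]]  # a tree with 6 variables, 4 factors
print(bp_packing_size(tree, 6), brute_force_packing_size(tree, 6))
```

On this tree both numbers coincide; on graphs with loops the parallel iteration may not converge and the two values need not agree.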

3.2. Ensemble Averages. To proceed further in the analysis, and since one is often concerned with the average properties of a class of related factor graphs, let us consider the case where the factor graph $G$ is sampled from a locally tree-like factor graph ensemble $\mathbb{G}(N)$. We will employ the following notation for the graph ensemble expectations: $\mathbb{E}_G[\cdot]$ for graph averages; $\mathbb{E}_{C_0}[\cdot]$ ($\mathbb{E}_{D_0}[\cdot]$) for expectations over the factor (variable) degree distribution, which we will sometimes call root degree distribution; and $\mathbb{E}_C[\cdot]$ ($\mathbb{E}_D[\cdot]$) for expectations over the excess degree distribution of factor (variable) nodes conditioned to have at least one adjacent edge, which we will sometimes call residual degree distribution. The quantities $c$ and $d$ and the random variables $C$, $C_0$, $D$, and $D_0$ are related by

$$\mathbb{P}[C = k] = \frac{(k+1)\,\mathbb{P}[C_0 = k+1]}{c}, \qquad \mathbb{P}[D = k] = \frac{(k+1)\,\mathbb{P}[D_0 = k+1]}{d}. \tag{21}$$

In Section 7 we discuss some specific factor graph ensembles, where $C$ and $D$ are fixed to a deterministic value or Poissonian distributed.
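Relation (21) can be made concrete with a small Python sketch (the function names are ours). It computes the residual distribution from a given root distribution and checks two standard cases: a Poissonian root degree, whose residual degree is again Poissonian with the same mean, and a regular root degree $c$, whose residual degree is $c-1$ with probability one.

```python
import math


def poisson_pmf(k, mean):
    return math.exp(-mean) * mean ** k / math.factorial(k)


def residual_pmf(root_pmf, mean, k):
    """P[C = k] from the root distribution P[C_0 = .], as in (21)."""
    return (k + 1) * root_pmf(k + 1) / mean


c = 2.5
for k in range(6):
    # Poissonian case: the two printed columns should coincide
    print(k, residual_pmf(lambda m: poisson_pmf(m, c), c, k), poisson_pmf(k, c))

# Regular case: root degree fixed to 3, so the residual degree is 2
regular = lambda m: 1.0 if m == 3 else 0.0
print(residual_pmf(regular, 3, 2))
```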

With these definitions, the distributional equation corresponding to the Belief Propagation formula (16) reads

$$T' \overset{d}{=} \max\left(\{0\}\cup\left\{1-\sum_{s=1}^{D_j} T_{sj}\right\}_{j\in\{1,\dots,C\}}\right), \tag{22}$$

where $\{D_j\}$ are i.i.d. random residual variable degrees, $C$ is a random residual factor degree, and $\{T_{sj}\}$ are i.i.d. random incoming messages. Since on a given graph each message $t_{r\to i}$ takes values only in $\{0,1\}$, we can take the distribution of messages $t$ to be of the form

$$P(t) = p\,\delta(t) + (1-p)\,\delta(t-1). \tag{23}$$

For this to be a fixed point of (22), the parameter $p$ has to satisfy the self-consistent equation

$$p_* = \mathbb{E}_C\left(1-\mathbb{E}_D\,p_*^{D}\right)^{C}. \tag{24}$$

Using (18) and (24) we obtain the replica symmetric approximation to the asymptotic MSP relative size:

$$\rho^{\mathrm{RS}} = \frac{d}{c}\left(1-\mathbb{E}_{C_0}\left(1-\mathbb{E}_D\,p_*^{D}\right)^{C_0}\right) + \mathbb{E}_{D_0}\,(1-D_0)\,p_*^{D_0}. \tag{25}$$

It turns out that the RS approximation is exact only when the ratio $N/M$ is sufficiently low. In the SP language, (25) holds true only when the number of the subsets among which we choose our MSP is not too big compared to the number of elements of which they are composed.

In the next section we will quantitatively establish thelimits of validity of the RS ansatz and introduce the one-step replica symmetry breaking formalism which provides abetter approximation to the exact results in the regime of large119873119872 ratio In Section 5 we prove that (25) is exact for certainchoices of the factor graph ensembles


4. Replica Symmetry Breaking

4.1. RS Consistency and Bugs Propagation. Here we propose two criteria to check the consistency of the RS cavity method. If either of them fails, the RS ansatz is inconsistent.

The first criterion is the assumption of unicity of the fixed point of (24) and its dynamical stability under iteration. We restrict ourselves to the subspace of distributions with support on $\{0, 1\}$, although it is possible to extend the following analysis to the whole space of distributions over $[0, 1]$ with an argument based on stochastic dominance, following [22]. Characterizing the distributions over $\{0, 1\}$ as in (23) with a real parameter $p \in [0, 1]$, from (22) we obtain the dynamical system

$$p' = \mathbb{E}_C\left(1 - \mathbb{E}_D\, p^{D}\right)^{C} \equiv f(p). \tag{26}$$

The stability criterion

$$\left|f'(p_*)\right| < 1 \tag{27}$$

suggests that the RS approximation to the MSP size (25) is exact as long as (27) is satisfied. This statement will be made rigorous in Section 5.

The second method we use to check the RS stability, called bugs proliferation, is the zero temperature analogue of the spin glass susceptibility. We will compute the average number of changing $t$ messages induced by a change in a single message $t_{r \to i}$ ($1 \to 0$ or $0 \to 1$). This is given by

$$N_{\mathrm{ch}} = \mathbb{E}\left[\sum_{(s,j)} \sum_{a_0, a_1, b_0, b_1} \mathbb{I}\left(t_{s \to j} = b_0 \to b_1 \mid t_{r \to i} = a_0 \to a_1\right)\right], \tag{28}$$

where $a_0$, $a_1$, $b_0$, and $b_1$ take values in $\{0, 1\}$. Since a random factor graph is locally a tree, and assuming that correlations decay fast enough, the last equation can be expressed as

$$N_{\mathrm{ch}} = \sum_{s=0}^{+\infty} \left(\overline{C}\,\overline{D}\right)^{s} \sum_{a_0, a_1, b_0, b_1} P\left(t_s = b_0 \to b_1 \mid t_o = a_0 \to a_1\right), \tag{29}$$

where $t_s$ is a message at distance $s$ from the tree root $o$, and we defined the average residual degrees $\overline{C} = \mathbb{E}_C[C]$ and $\overline{D} = \mathbb{E}_D[D]$. The stability condition $N_{\mathrm{ch}} < +\infty$ yields a constraint on the greatest eigenvalue $\lambda_M$ of the transfer matrix $P(b_0 \to b_1 \mid a_0 \to a_1)$:

$$\overline{C}\,\overline{D}\,\lambda_M < 1. \tag{30}$$

The two methods presented above give equivalent conditions for the RS ansatz to hold true, and they simply express the independence of a finite subgraph from the tail boundary conditions.

4.2. The 1RSB Formalism. We are going to develop the 1RSB formalism for the MSP problem and then apply it in Section 7.1 to the ensemble $\mathcal{G}_{RP}$. We will not check the coherence of the 1RSB ansatz through the interstate and intrastate susceptibilities [23]; we are then not guaranteed against the need for further steps of replica symmetry breaking in order to recover the exact solution. Even in the worst case scenario, though, when the 1RSB solution is trivially exact only in the RS region and a full RSB ansatz is needed otherwise, the 1RSB prediction for the MSP relative size $\rho$ should be everywhere more accurate than the RS one and possibly very close to the real value. We refer to the textbook of Montanari and Mézard [19] for a detailed exposition of the 1RSB cavity method.

Let us fix a factor graph $G$ from a locally tree-like ensemble $\mathcal{G}$. We call $Q_{r \to i}(t_{r \to i})$ the distribution of messages on the directed edge $r \to i$ over the states of the system. We still expect the messages $t_{r \to i}$ to take values 0 or 1, so that $Q_{r \to i}$ can be parametrized as

$$Q_{r \to i}(t_{r \to i}) = q_{r \to i}\,\delta(t_{r \to i}) + (1 - q_{r \to i})\,\delta(t_{r \to i} - 1). \tag{31}$$

The 1RSB Parisi parameter $x \in [0, 1]$ has to be properly rescaled in order to correctly take the limit $\mu \uparrow \infty$. Therefore we introduce the new 1RSB parameter $y = \mu x$, which stays finite in the close packing limit and takes a value in $[0, +\infty)$. The reweighting factor $e^{-y\,\omega^{\mathrm{iter}}_{r \to i}}$ is defined as

$$e^{-y\,\omega^{\mathrm{iter}}_{r \to i}} = \frac{Z_{r \to i}}{\prod_{j \in \partial r \setminus i} \prod_{s \in \partial j \setminus r} Z_{s \to j}}. \tag{32}$$

The last equation, combined with (13) and (14), gives $\omega^{\mathrm{iter}}_{r \to i} = -t_{r \to i}$. Averaging over the whole ensemble, we can then write the zero temperature 1RSB message passing rules (also called Survey Propagation equations):

$$q' \overset{d}{=} \frac{\prod_{j=1}^{C}\left(1 - \prod_{s=1}^{D_j} q_{sj}\right)}{e^{y} + (1 - e^{y})\prod_{j=1}^{C}\left(1 - \prod_{s=1}^{D_j} q_{sj}\right)}. \tag{33}$$

In the preceding equation $q_{sj}$ are i.i.d. random variables on $[0, 1]$ and, as usual, $C$ is a random residual factor degree and $D_j$ are independent random residual variable degrees. Fixed points of (33) take the form

$$P(q) = p_0\,\delta(q) + p_1\,\delta(q - 1) + p_2\,P_2(q), \tag{34}$$

where $P_2(q)$ is a continuous distribution on $[0, 1]$ and $p_2 = 1 - p_0 - p_1$. The parameters $p_0$ and $p_1$ have to satisfy the closed equations

$$p_1 = \mathbb{E}_C\left(1 - \mathbb{E}_D (1 - p_0)^{D}\right)^{C}, \qquad p_0 = 1 - \mathbb{E}_C\left(1 - \mathbb{E}_D\, p_1^{D}\right)^{C}. \tag{35}$$

Solutions of (35) with $p_2 = 0$ correspond to replica symmetric solutions, and their instability marks the onset of a spin glass phase. In this new phase the MSPs are clustered according to the general scenario displayed by constraint satisfaction problems [24].
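The closed equations (35) can be read off from (33) by asking when the update returns exactly 0 or exactly 1. The following sketch assumes finite $y$ (so that the denominator of (33) is positive) and independent incoming messages; the shorthand $\pi$ for the product appearing in (33) is introduced here only for convenience.

```latex
\begin{align*}
\pi &\equiv \prod_{j=1}^{C}\Bigl(1 - \prod_{s=1}^{D_j} q_{sj}\Bigr), \\
q' = 1 &\iff \pi = 1 \iff \text{every branch } j \text{ contains some } q_{sj} = 0
  &&\Longrightarrow\; p_1 = \mathbb{E}_C\bigl(1 - \mathbb{E}_D (1 - p_0)^{D}\bigr)^{C}, \\
q' = 0 &\iff \pi = 0 \iff \text{some branch } j \text{ has all } q_{sj} = 1
  &&\Longrightarrow\; p_0 = 1 - \mathbb{E}_C\bigl(1 - \mathbb{E}_D\, p_1^{D}\bigr)^{C}.
\end{align*}
```

Neither condition involves $y$, which is why the weights of the two delta peaks close on themselves while the continuous part $P_2(q)$ must be found numerically.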


From the stable fixed point of (33) we can calculate the 1RSB free energy functional $\phi(y)$ as

$$-y\,\phi(y) = \frac{d}{c}\,\mathbb{E}\log\left[(1 - e^{y})\prod_{j=1}^{C_0}\left(1 - \prod_{s=1}^{D_j} q_{sj}\right) + e^{y}\right] + \mathbb{E}\,(1 - D_0)\log\left[(1 - e^{y})\left(1 - \prod_{r=1}^{D_0} q_{r}\right) + e^{y}\right] \tag{36}$$

and the 1RSB density $\rho_{\mathrm{1RSB}}(y) = -\partial\left(y\,\phi(y)\right)/\partial y$ as

$$\rho_{\mathrm{1RSB}}(y) = \frac{d}{c}\,\mathbb{E}\,\frac{e^{y}\left(1 - \prod_{j=1}^{C_0}\left(1 - \prod_{s=1}^{D_j} q_{sj}\right)\right)}{(1 - e^{y})\prod_{j=1}^{C_0}\left(1 - \prod_{s=1}^{D_j} q_{sj}\right) + e^{y}} + \mathbb{E}\,(1 - D_0)\,\frac{e^{y}\prod_{r=1}^{D_0} q_{r}}{(1 - e^{y})\left(1 - \prod_{r=1}^{D_0} q_{r}\right) + e^{y}}, \tag{37}$$

with expectations intended over $\mathcal{G}$ and over the fixed point messages $q_r$ and $q_{sj}$. Since the free energy functional $\phi(y)$ and the complexity $\Sigma(\rho)$ are related by the Legendre transform

$$\Sigma(\rho) = -y\,\rho - y\,\phi(y), \tag{38}$$

with $\partial\Sigma/\partial\rho = -y$, through (36) we can compute the complexity by taking the inverse transform. Equilibrium states, that is, MSPs, are selected by

$$\rho_{\mathrm{1RSB}} = \arg\max_{\rho}\,\{\rho : \Sigma(\rho) \ge 0\}, \tag{39}$$

or, equivalently, by taking the 1RSB parameter $y$ to be

$$y_s = \arg\max_{y \in [0, +\infty]} \phi(y). \tag{40}$$

In the static 1RSB phase we expect $\Sigma(\rho_s) = 0$, so that from (38) we have $\phi(y_s) = -\rho_s$. We will see that this is generally true, except for the ensemble $\mathcal{G}_{RP}(2, c)$ of Section 7, corresponding to maximum matchings on ordinary graphs, where the equilibrium states have maximal complexity and $y_s = +\infty$. The relation

$$\rho_{\mathrm{1RSB}} = -\phi(y_s) \tag{41}$$

is always valid, though, since for $y_s \uparrow \infty$ the complexity stays finite.

5. A Heuristic Algorithm and Exact Results

In this section we propose a heuristic greedy algorithm to address the MSP problem. It is a natural generalization of the algorithm that Karp and Sipser proposed to solve the maximum matching problem on Erdős-Rényi random graphs [8]; therefore we will call it generalized Karp-Sipser (GKS). Extending their derivation concerning the leaf removal part of the algorithm, we are able to prove that the RS prediction for the MSP density is exact as long as the stability criterion (27) is satisfied. We will not give the proofs of the following theorems, as they are lengthy but straightforward extensions of those given in [8]. In order to find the maximum matching on a graph, Karp and Sipser noticed that, as long as the graph contains a node of degree one (a leaf), its unique edge has to belong to one of the maximum matchings.

They considered the simplest randomized algorithm one can imagine: as long as there is any leaf, remove it from the graph; otherwise remove a random edge; then iterate until the graph is depleted. They studied the behaviour of this leaf-removal algorithm on random graphs and were able to prove that it grants w.h.p. a maximum matching (within an $o(n)$ error).

To generalize some of their results we need to extend the definition of leaf to that of pendant. We call pendant a variable node whose factor neighbours all have degree one, except for one at most. Stating the same concept in different words, all of the neighbours of a pendant have the pendant itself as their sole neighbour, except for one of them at most. See Figure 1 for a pictorial representation of a pendant (in red). The GKS algorithm is articulated in two phases: a pendant removal phase and a random occupation phase. We give the pseudocode for the generalized Karp-Sipser algorithm (see Algorithm 1).

At each step the algorithm prioritizes the removal of pendants over that of random variable nodes. We notice that the removal of a pendant is always an optimal choice in order to achieve an MSP; we have no guarantees, though, on the effect of the occupation of a random node. We call phase 1 the execution of the algorithm up to the point where the first nonpendant variable is added to $V'$. It is trivial to show that phase 1 is enough to find an MSP on a tree factor graph. The interesting thing, though, is that phase 1 is also able to deplete nontree factor graphs and find an MSP, as long as the factor graphs are sufficiently sparse and large enough.

We call core the subset of the variable nodes which has not been assigned to the MSP in phase 1. We will show how the emergence of a core is directly related to replica symmetry breaking. Let us define, as usual,

$$f(x) \equiv \mathbb{E}_C\left(1 - \mathbb{E}_D\, x^{D}\right)^{C}. \tag{42}$$

The function $f$ is continuous and nonincreasing, and it satisfies the relation $1 \ge f(0) \ge f(1) = \mathbb{P}[C = 0]$. It turns out that, in the large graph limit, phase 1 of GKS is characterized by the solutions of the system of equations

$$p = f(r), \qquad r = f(p). \tag{43}$$

We notice that system (43) is equivalent to the 1RSB equations (35) once the substitutions $p \to p_1$ and $r \to 1 - p_0$ are made.

Lemma 1. The system (43) always admits a (unique) solution with $p_* = r_*$.

Proof. By monotonicity and continuity, the function $g(p) = f(p) - p$ has a single zero in the interval $[0, 1]$.


Require: a factor graph $G = (V, F, E)$
Ensure: a set packing $V'$
    $V' = \emptyset$
    add to $V'$ all isolated variable nodes and remove them from $G$
    remove from $G$ any isolated factor node
    while $V$ is not empty do
        if $G$ has any pendant then
            choose a pendant $i$ uniformly at random
            add $i$ to $V'$
            remove $i$ from $G$, then remove its factor neighbours and their variable neighbours
        else
            pick uniformly at random a variable node and add it to $V'$
            remove it from $G$, then remove its factor neighbours and their variable neighbours
        end if
        remove from $G$ any isolated factor node
    end while
    return $V'$

Algorithm 1: Generalized Karp-Sipser (GKS).
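Algorithm 1 can be sketched in a few lines of Python. This is an illustrative, unoptimized version (pendants are recomputed from scratch at every step), not the implementation used for the simulations in this article. The input format (a dictionary from variable nodes to sets of factor nodes), the function name, and the extra return value counting the variables occupied during phase 1 are all assumptions made here; node labels are assumed to be sortable so that runs are reproducible.

```python
import random

def generalized_karp_sipser(var_to_factors, seed=0):
    """Return (packing, phase1_size) for a factor graph {variable: set of factors}."""
    rng = random.Random(seed)
    var_adj = {v: set(fs) for v, fs in var_to_factors.items()}
    fac_adj = {}
    for v, fs in var_adj.items():
        for f in fs:
            fac_adj.setdefault(f, set()).add(v)

    def is_pendant(v):
        # At most one neighbouring factor may be shared with other variables.
        return sum(1 for f in var_adj[v] if len(fac_adj[f]) > 1) <= 1

    def occupy(v):
        # Remove v together with every variable sharing a factor with it;
        # factors left without variables are deleted on the fly.
        doomed = {v}
        for f in var_adj[v]:
            doomed |= fac_adj[f]
        for u in doomed:
            for f in var_adj.pop(u):
                fac_adj[f].discard(u)
                if not fac_adj[f]:
                    del fac_adj[f]

    packing, phase1_size, in_phase1 = set(), 0, True
    while var_adj:
        pendants = sorted(v for v in var_adj if is_pendant(v))
        if pendants:
            pool = pendants
        else:
            pool, in_phase1 = sorted(var_adj), False  # random occupation phase
        chosen = rng.choice(pool)
        packing.add(chosen)
        phase1_size += in_phase1
        occupy(chosen)
    return packing, phase1_size

# A chain of three sets: the two end sets are pendants and form the MSP.
chain = {0: {"a"}, 1: {"a", "b"}, 2: {"b"}}
packing, phase1_size = generalized_karp_sipser(chain)
```

Isolated variable nodes need no separate treatment in this sketch, because a variable without factor neighbours satisfies the pendant condition trivially.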

Figure 1: (left) A representation of a factor graph containing a pendant, depicted in red. (right) The factor graph after the removal of the pendant, as occurs in the inner part of the while loop in Algorithm 1.

If other solutions are present, the relevant one is the one with the smallest $p$, as the following theorems certify.

Theorem 2. Let $(p, r)$ be the solution with smallest $p$ of (43). Then the density of factor nodes surviving phase 1 in the large graphs limit is given by

$$\psi_1 = 1 + p - r + c\,p\,\mathbb{E}_D\left[p^{D} - r^{D}\right], \tag{44}$$

with

$$r = \mathbb{E}_{C_0}\left(1 - \mathbb{E}_D\, p^{D}\right)^{C_0}, \qquad p = \mathbb{E}_{C_0}\left(1 - \mathbb{E}_D\, r^{D}\right)^{C_0}. \tag{45}$$

In particular, if the smallest solution is $p_* = r_*$, the graph is depleted with high probability in phase 1.

As already said, we will not give the proof of this and of the following theorem, as they are lengthy and can be obtained from the derivation given in Karp and Sipser's article [8] with little effort, even if not in a completely trivial way. Theorem 2 affirms that, as soon as (43) develops another solution, a core survives phase 1. This phenomenon coincides with the need for replica symmetry breaking in the cavity formalism. Let us now establish the exactness of the cavity prediction for $\rho$ in the RS phase.

Theorem 3. Let $(p, r)$ be the solution with smallest $p$ of (43). Then the density of variables assigned in phase 1 in the large graphs limit is

$$\rho_1 = \frac{d}{c}\,(1 - r) + \mathbb{E}_{D_0}\left[(1 - D_0)\, p^{D_0}\right], \tag{46}$$

where $p$ has been defined in the previous theorem. In particular, if the smallest solution is $p_* = r_*$, the replica symmetric cavity method prediction (25) is exact.

These two theorems imply that the RS prediction for $\rho$, (25), holds true as long as phase 1 manages to deplete w.h.p. the whole factor graph. A similar behaviour has been observed in other combinatorial optimization problems, for example, random XORSAT [25]. Conversely, it is easy to prove, given the equivalence between (43) and (35), that if a solution with $p \neq p_*$ exists, the RS fixed point is unstable in the 1RSB distributional space. Therefore the system is in the RS phase if and only if (35) admits a unique solution.

We notice that we could have also looked for the solutions of the single equation $p = f^2(p)$ instead of the two of (43), since the value of $r$ is uniquely determined by the value of $p$. Then the previous theorems state that the RS results hold as long as $f^2$ has a single fixed point. In [22] it has been proven that the RS solution of the weighted maximum independent set and maximum weighted matching holds true as long as the corresponding squared cavity operator $f^2$ has a unique fixed point. Since those are special cases of the weighted MSP


problem, we conjecture that the $f^2$ condition holds for the general case too, as it does for the unweighted case. It would probably be possible to further generalize the Karp-Sipser algorithm to cover analytically the weighted case as well.

The scenario arising from the analysis of the algorithm and from the 1RSB cavity method is the following: in the RS phase, maximum set packings form a single cluster, and it is possible to connect any two of them with a path involving MSPs separated by a rearrangement of a finite number of variables [26]. In the RSB phase, instead, MSPs can be grouped into many connected (in the sense we mentioned before) clusters, each one of them defined by the assignment of the variables in the core. Two MSPs that differ on the core are always separated by a global rearrangement of the variables (i.e., $O(N)$). In the presence of a core, the GKS algorithm makes some suboptimal random choices (after phase 1).

We did not make an analytical study of the random variable removal part of the GKS algorithm. This has been done by Wormald for the matching problem, associating to the graph process a set of differential equations amenable to analysis [9] (something similar has been done for random XORSAT as well [25]). In that case it turns out that the algorithm still achieves optimality and yields an almost perfect matching on the core. An appropriate analysis of the second phase of the GKS algorithm, and the reason for its failure for most of the SP ensembles we covered, deserve further studies.

6. Numerical Simulations

While the RS cavity equations (24) and (25) can be easily computed, the numerical solution of the 1RSB density evolution (33) is much more involved (although simplified by the zero temperature limit) and has been obtained with a standard population dynamics algorithm [27].
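To make the population dynamics idea concrete, here is a minimal sketch for update (33), specialized to fixed residual variable degree $d - 1$ and Poissonian residual factor degree of mean $c$. It is not the code used for the results reported here: the population size, number of sweeps, uniform initialization, and all function names are arbitrary choices made for illustration.

```python
import math
import random

def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for small means."""
    threshold, k, prod = math.exp(-mean), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sp_update(incoming, y):
    """Right-hand side of (33); incoming[j] lists the q_{sj} of branch j."""
    prod = 1.0
    for branch in incoming:
        prod *= 1.0 - math.prod(branch)
    return prod / (math.exp(y) + (1.0 - math.exp(y)) * prod)

def population_dynamics(d, c, y, size=2000, sweeps=50, seed=0):
    """Approximate the fixed point distribution P(q) of (33) by a population."""
    rng = random.Random(seed)
    population = [rng.random() for _ in range(size)]
    for _ in range(sweeps * size):
        # Draw C ~ Poisson(c) branches, each carrying d - 1 messages
        # sampled uniformly from the current population.
        branches = [[rng.choice(population) for _ in range(d - 1)]
                    for _ in range(sample_poisson(rng, c))]
        population[rng.randrange(size)] = sp_update(branches, y)
    return population
```

For $y \ge 0$ the denominator in `sp_update` lies between 1 and $e^{y}$, so every updated entry stays in $[0, 1]$. Observables such as (36) and (37) would then be estimated as averages over messages drawn from the returned population.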

The implementation of the GKS algorithm is straightforward. While it is very fast during phase 1, we noticed a huge slowing down in the random removal part. We were able to find set packings for factor graphs with hundreds of thousands of nodes.

In order to test the cavity and GKS predictions, we also computed the exact MSP size on factor graphs of small size. First we notice that a set packing problem coded in a factor graph structure $F$ is equivalent to an independent set problem on an appropriate ordinary graph $G$. The node set of $G$ is the variable node set of $F$, and we add an edge to $G$ between each pair of nodes having a common factor node neighbour in $F$. Therefore each neighbourhood of a factor node in $F$ forms a clique in $G$. We then solve the maximum independent set problem for $G$ using an exact algorithm [28] implemented in the igraph library [29]. Since the time complexity is exponential in the size of the graph, we performed our simulations on graphs containing up to only one hundred nodes.
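The reduction described above is easy to write down explicitly. The sketch below builds the conflict graph and then finds the maximum independent set size by exhaustive search; the exhaustive search is only a stand-in for the exact algorithm [28] and is usable on very small instances. Function names and the dictionary input format are assumptions made for this illustration.

```python
from itertools import combinations

def conflict_graph(var_to_factors):
    """Edges join variable nodes that share at least one factor node."""
    by_factor = {}
    for v, factors in var_to_factors.items():
        for f in factors:
            by_factor.setdefault(f, []).append(v)
    edges = set()
    for members in by_factor.values():
        # The neighbourhood of each factor node becomes a clique.
        edges.update(frozenset(pair) for pair in combinations(members, 2))
    return edges

def max_independent_set_size(nodes, edges):
    """Exhaustive search, exponential in len(nodes): tiny instances only."""
    nodes = list(nodes)
    for size in range(len(nodes), 0, -1):
        for subset in combinations(nodes, size):
            chosen = set(subset)
            if not any(edge <= chosen for edge in edges):
                return size
    return 0

factor_graph = {0: {"a"}, 1: {"a", "b"}, 2: {"b"}}
edges = conflict_graph(factor_graph)
msp_size = max_independent_set_size(factor_graph, edges)
```

On this three-set chain the conflict graph is a path, and the maximum independent set consists of its two end nodes.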

7. Applications to Problem Ensembles

We will now apply the methods developed in the previous sections to some factor graph ensembles, each modelling a class of MSP problem instances. We consider graph ensembles containing nodes with Poissonian random degree or regular degree: $\mathcal{G}_{PR}(N, d, c)$, $\mathcal{G}_{RP}(N, d, c)$, $\mathcal{G}_{PP}(N, d, c)$, and $\mathcal{G}_{RR}(N, d, c)$. The subscript $R$ or $P$ indicates whether the type of nodes to which it refers (variable nodes for the first subscript, factor nodes for the second) has regular or Poissonian random degree, respectively. We parametrize these ensembles by their average variable and factor degrees, that is, $d = \mathbb{E}_{D_0} D_0$ and $c = \mathbb{E}_{C_0} C_0$. They are constituted by factor graphs having $N$ variable nodes and $M = \lfloor Nd/c \rfloor$ factor nodes, but they differ both in their elements and in their probability law. We define the ensembles by giving the probability of sampling one of their elements.

(i) $\mathcal{G}_{RP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor Nd/c \rfloor$ factors. Every variable node has fixed degree $d$. $G$ is obtained by linking each variable with $d$ factors chosen uniformly and independently at random. The factor node degree distribution obtained is Poissonian of mean $c$ with high probability. This ensemble is a model for the $k$-set packing and will be the main focus of our attention.

(ii) $\mathcal{G}_{PR}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor Nd/c \rfloor$ factors. Every factor node has fixed degree $c$. $G$ is built by linking each factor with $c$ variables chosen uniformly and independently at random. The variable node degree distribution obtained is Poissonian of mean $d$ with high probability.

(iii) $\mathcal{G}_{PP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor Nd/c \rfloor$ factors. $G$ is built by adding an edge $(i, r)$ with probability $c/N$, independently for each choice of a variable $i$ and a factor $r$. The factor graph obtained has w.h.p. a Poissonian variable node degree distribution of mean $d$ and a Poissonian factor node degree distribution of mean $c$.

(iv) $\mathcal{G}_{RR}(N, d, c)$: it is constituted of all factor graphs with $N$ variable nodes of degree $d$ and $M = Nd/c$ factor nodes of degree $c$ ($Nd$ has to be a multiple of $c$). Every factor graph of the ensemble is equiprobable and can be sampled using a generalization of the configuration model for random regular graphs [30]. This ensemble is a model for the $k$-set packing.

We will omit the argument $N$ when we refer to an ensemble in the $N \uparrow \infty$ limit.
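As a concrete illustration of definition (i), the following sketch samples an element of $\mathcal{G}_{RP}(N, d, c)$. One detail is an assumption made here and not stated in the definition: the $d$ factors of each variable are drawn without replacement, so that every variable has degree exactly $d$. The function name and output format are also hypothetical.

```python
import math
import random

def sample_g_rp(n_vars, d, c, seed=0):
    """Return {variable: set of d factors} with M = floor(N d / c) factor nodes."""
    rng = random.Random(seed)
    n_factors = math.floor(n_vars * d / c)
    # Each variable picks d distinct factors uniformly at random.
    return {i: set(rng.sample(range(n_factors), d)) for i in range(n_vars)}

graph = sample_g_rp(50, 3, 2.0)
```

With $N = 50$, $d = 3$, and $c = 2$ this gives $M = 75$ factor nodes, whose degrees fluctuate around the mean $Nd/M = c$.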

7.1. $\mathcal{G}_{RP}(d, c)$. This is the ensemble with variable node degrees fixed to $d$ and Poissonian factor node degree, that is, $C \sim C_0 \sim \mathrm{Poisson}(c)$. The case $d = 2$ corresponds to the maximum matching problem on Erdős-Rényi random graphs [8, 12].

Density evolution (22) for $\mathcal{G}_{RP}$ reduces to

$$T' \overset{d}{=} \max\left[\{0\} \cup \left\{1 - \sum_{s=1}^{d-1} T_{sj}\right\}_{j \in \{1,\dots,C\}}\right]. \tag{47}$$

Considering distributions of the form $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$, the fixed point condition of (47) reads

$$p_* = e^{-c\,p_*^{d-1}} \equiv f(p_*). \tag{48}$$


The last equation admits only one solution for each value of $d$ and $c$, as is easily seen through a monotonicity argument considering the left and right hand sides of the equation. The values of $c$ as a function of $d$ satisfying $|f'(p_*)| = 1$ are the critical points $c_s(d)$ delimiting the RS phase ($c < c_s(d)$) and are given by

$$c_s(d) = \frac{e}{d - 1}. \tag{49}$$
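For completeness, (49) follows from (48) and the marginality condition $|f'(p_*)| = 1$ in a few lines; the steps below are a worked sketch of that computation.

```latex
\begin{align*}
f(p) &= e^{-c\,p^{d-1}}, \qquad f'(p) = -c\,(d-1)\,p^{d-2}\,f(p), \\
|f'(p_*)| &= c\,(d-1)\,p_*^{d-1} = 1
  \;\Longrightarrow\; p_*^{d-1} = \frac{1}{c\,(d-1)}, \\
p_* &= e^{-c\,p_*^{d-1}} = e^{-1/(d-1)}
  \;\Longrightarrow\; p_*^{d-1} = e^{-1}
  \;\Longrightarrow\; c_s(d) = \frac{e}{d-1}.
\end{align*}
```

The second line uses $f(p_*) = p_*$ to simplify the derivative at the fixed point.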

For $c > e/(d - 1)$ a core survives phase 1 of the GKS algorithm, as stated by Theorem 2 and shown in Figure 2. We notice that the same threshold value $e/(d - 1)$ is found in the minimum set covering problem on the dual factor graph ensemble [31], where the application of a leaf removal algorithm, that is, phase 1 of GKS, unveils an analogous core emergence phenomenon. In the RS phase the MSP density is equal to the minimum set covering density, but in the RSB phase we cannot compare the cavity predictions for these two dual problems, since in [31] the 1RSB solution to the minimum set covering is not provided.

In the matching case, that is, $d = 2$, (49) expresses the notorious $e$-phenomenon discovered by Karp and Sipser, while for higher values of $d$ it provides an extension of the critical threshold.

We can recover the same critical condition (49) through the bug propagation method, as the only non-zero elements of the transfer matrix $P(b_0 \to b_1 \mid a_0 \to a_1)$ are

$$P(0 \to 1 \mid 1 \to 0) = p^{d-1}, \qquad P(1 \to 0 \mid 0 \to 1) = p^{d-1}, \tag{50}$$

which give $\lambda_M = p^{d-1}$. The average branching factor is $\overline{C}\,\overline{D} = c(d - 1)$, so that (30) and (48) yield (49). The analytical value for the relative size of MSPs, that is, the particle density $\rho$, is

$$\rho(d, c) = \frac{d}{c}\,(1 - p_*) + (1 - d)\,p_*^{d} \quad \text{for } c < c_s(d). \tag{51}$$

In Figure 3 we compare $\rho$ from (51), as a function of $c$ for some values of $d$, with an exact algorithm applied to finite factor graphs (as explained in Section 6), both above and below $c_s$. Clearly, for $c > c_s$ the RS approximation becomes increasingly inaccurate.

We continue our analysis of $\mathcal{G}_{RP}$ above the critical value $c_s$ through the 1RSB cavity method, as outlined in Section 4.2. Fixed point messages of (33) are distributed as

$$P(q) = p_0\,\delta(q) + p_1\,\delta(q - 1) + p_2\,P_2(q), \tag{52}$$

where

$$p_1 = e^{-c\,(1 - p_0)^{d-1}}, \qquad p_0 = 1 - e^{-c\,p_1^{d-1}}, \qquad p_2 = 1 - p_0 - p_1, \tag{53}$$

and $P_2(q)$ has to be determined through (33). Equation (53) always admits an RS solution $p_1 = 1 - p_0 = p_*$

Figure 2: Results from the GKS algorithm applied to $\mathcal{G}_{RP}(3, c)$, shown as $\rho$ against the mean factor degree $c$. The set packing size after phase 1 (blue line) and after the algorithm stops (red line) are reported. The vertical line is $c_s = e/2$ and is drawn as a visual aid to determine the point where the RS assumptions stop holding true. Above $c_s$ a nonempty core emerges continuously (green line).

Figure 3: RS cavity method analytical prediction (51) for the MSP size $\rho$ in $\mathcal{G}_{RP}(d, c)$ as a function of the mean factor degree $c$ (lines, for $d = 2, 3, 4, 5$), compared with the average size of MSPs obtained from 1000 samples of factor graphs with 100 variable nodes (dots). Dashed parts of the lines are the RS estimations in the RSB phase, that is, for $c > e/(d - 1)$, where the prediction is no longer exact.

which is stable up to $c_s$, as already noticed. Above $c_s$ a new stable fixed point with $p_2 > 0$ continuously arises, and we study it numerically with a population dynamics algorithm.


Figure 4: Maximum set packing size $\rho$ for $\mathcal{G}_{RP}(d, c)$ as a function of the mean factor degree $c$ for $d = 2$ (the value $c = e$ is marked on the axis). The blue line corresponds to the RS analytical value (51), the red line is from the Karp-Sipser algorithm, and the dotted black line is the 1RSB solution (54) with $y \gg 1$. Karp-Sipser and the 1RSB cavity method yield the same, exact result.

The 1RSB free energy as a function of the Parisi parameter $y$ takes the form

$$\phi(y) = -\frac{d}{y\,c}\,\mathbb{E}\log\left[(1 - e^{y})\prod_{j=1}^{C}\left(1 - \prod_{s=1}^{d-1} q_{sj}\right) + e^{y}\right] + \frac{d - 1}{y}\,\mathbb{E}\log\left[(1 - e^{y})\left(1 - \prod_{r=1}^{d} q_{r}\right) + e^{y}\right]. \tag{54}$$

As prescribed by the cavity method, the value $y_s$ which maximizes $\phi(y)$ over $[0, +\infty]$ yields the correct free energy; therefore we have $\phi(y_s) = -\rho_{\mathrm{1RSB}}$.

Unsurprisingly, as they belong to different computational classes, the cases $d = 2$ and $d \ge 3$ show qualitatively different pictures. In the case of maximum matching on the Poissonian graph ensemble, numerical estimates suggest that the complexity is an increasing function of $y$ on the whole real positive axis. The correct choice for the parameter $y$ is then $y_s = +\infty$, as already conjectured in [12], and we find that the maximum matching size prediction from the 1RSB cavity method fully agrees with rigorous results from Karp and Sipser [8] and with the size of the matchings given by their algorithm (see Figure 4). The 1RSB ansatz is therefore exact for $d = 2$.

The analysis of the $d \ge 3$ case does not yield such a definite result. The complexity of states $\Sigma$ is no longer a strictly increasing function of $y$. It reaches its maximum at $y_d$, the choice of $y$ that selects the most numerous states, which could be those where local search greedy algorithms are more likely to be trapped. Then it decreases up to the finite value $y_s$, where the complexity changes sign and takes negative values. Therefore $y_s$ is the correct choice for the Parisi parameter, which maximizes $\phi(y)$. Plotted as a function of $\rho$, the complexity

Figure 5: Complexity $\Sigma$ in $\mathcal{G}_{RP}(d, c)$ with $d = 3$ and $c = 4.0$ as a function of the relative MSP size $\rho$.

Figure 6: MSP size $\rho$ in $\mathcal{G}_{RP}(3, c)$ as a function of the mean factor degree $c$ (the value $c = e/2$ is marked on the axis). The RS and 1RSB cavity method predictions are compared with the GKS algorithm and with an exact algorithm on factor graphs of $N = 100$ variable nodes.

$\Sigma$ has a convex nonphysical part, with extrema the RS solution (on the right) and the point corresponding to the dynamic 1RSB solution (on the left), and a concave physically relevant part for $y \in [y_d, y_s]$ (see Figure 5). The 1RSB prediction seems to be in very good agreement with the exact algorithm, and we are inclined to believe that no further steps of replica symmetry breaking are needed in this ensemble. The GKS algorithm instead falls short of the exact value (see Figure 6); therefore it constitutes a lower bound, which is not tight, but at least it could probably be made rigorous by carrying on the analysis of the GKS algorithm beyond phase 1.

7.2. $\mathcal{G}_{PR}(d, c)$. The ensemble $\mathcal{G}_{PR}(d, c)$ is constituted by factor graphs containing factor nodes of degree fixed to $c$ and variable nodes of Poissonian random degree of mean $d$. It has statistical properties with respect to the MSP problem quite different from those encountered in $\mathcal{G}_{RP}$, as we will readily show. The MSP problem on $\mathcal{G}_{PR}(d, c)$


with $c = 2$ is equivalent to the well known problem of maximum independent sets on Poissonian graphs [14, 32]. The real parameter $p$ characterizing discrete support distributions of messages has to satisfy the fixed point equation (24), which reads

$$p_* = \left(1 - e^{d\,(p_* - 1)}\right)^{c - 1}. \tag{55}$$

At a fixed value of $c$ the RS ansatz holds up to the critical value $d_s(c)$, which is implicitly given by the first derivative condition (27):

$$(c - 1)\,p_*^{(c-2)/(c-1)}\,e^{(p_* - 1)\,d_s}\,d_s = 1. \tag{56}$$

Although the critical condition (56) is not as elegant as the one we obtained for the ensemble $\mathcal{G}_{RP}$, it can be easily solved numerically for $d_s$ as a function of $c$. For $c = 2$ the threshold value is exactly $d_s(2) = e$. For $c > 2$, instead, $d_s$ is an increasing function of the factor degree $c$. Thanks to (25) we readily compute the MSP size in the RS phase:

$$\rho = \frac{d}{c}\left(1 - p_*^{c/(c-1)}\right) + (1 - p_*\,d)\,e^{d\,(p_* - 1)} \quad \text{for } d < d_s(c). \tag{57}$$
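The numerical solution of (55) and (56) mentioned above can be sketched with two nested bisections. This is an illustrative approach under stated assumptions, not the authors' procedure: function names are hypothetical, and the outer search assumes that the left-hand side of (56) crosses 1 exactly once for $d$ in the bracket $[0.01, 20]$.

```python
import math

def fixed_point(d, c, iters=200):
    """Solve (55), p = (1 - exp(d (p - 1)))**(c - 1), by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (1.0 - math.exp(d * (mid - 1.0))) ** (c - 1) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def slope(d, c):
    """Left-hand side of (56) evaluated at mean variable degree d."""
    p = fixed_point(d, c)
    return (c - 1) * p ** ((c - 2) / (c - 1)) * math.exp((p - 1.0) * d) * d

def critical_degree(c, lo=0.01, hi=20.0, iters=100):
    """Bisection on d for slope(d, c) = 1 (single crossing in [lo, hi] assumed)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope(mid, c) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

d_s_2 = critical_degree(2)
```

For $c = 2$ the routine reproduces the exact threshold $d_s(2) = e$ quoted above, which serves as a sanity check before applying it to larger values of $c$.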

We can see in Figure 7 that the MSP size $\rho(d, c)$ is a decreasing function in both arguments, as expected. Equation (57) can be taken as the RS estimate for the MSP size for $d > d_s(c)$. The RS estimate is strictly greater than the average size of SPs given by the GKS algorithm at all values of $d > d_s(c)$ (see Figure 7).

7.3. $\mathcal{G}_{PP}(d, c)$. We will now briefly examine our MSP model on the ensemble $\mathcal{G}_{PP}(d, c)$, where both factor nodes and variable nodes have Poissonian random degrees, of mean $c$ and $d$, respectively. From (23) and (24) we obtain the fixed point condition for the probability distribution of messages:

$$p_* = e^{-c\,e^{d\,(p_* - 1)}}. \tag{58}$$

As usual, $p$ is the parameter characterizing the distribution of messages $P(t) = p\,\delta(t) + (1 - p)\,\delta(t - 1)$. Equation (58) admits one and only one fixed point solution $p_*$ for each value of $d$ and $c$. In fact, $f$ is continuous and strictly decreasing, with $f(0) > 0$ and $f(1) < 1$. The first derivative condition $|f'(p)| = 1$ defines the critical line $d_s(c)$ through

$$d_s(c) = -\frac{1}{p_*\,\log(p_*)}. \tag{59}$$

The curve $d_s(c)$ separates the RS phase from the RSB phase in the $c$-$d$ parametric space (see Figure 8). The unbounded RS region shares some resemblance with the corresponding (although $d$-discretized) region of $\mathcal{G}_{PR}(d, c)$ and is at variance with the compact area of the RS phase in $\mathcal{G}_{RP}$.

We can compute the MSP relative size $\rho$ through (25) and obtain

$$\rho = \frac{d}{c}\,(1 - p_*) + (1 - d\, p_*)\, e^{d(p_* - 1)} \quad \text{for } d < d_s(c), \tag{60}$$

which holds only as an approximation in the RSB phase. We can see from Figure 8 that $\rho$ is decreasing both in $c$ and $d$, as was observed in the other ensembles as well.
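Since (58) and (59) both involve $p_*$, the critical line can be traced parametrically: for a chosen $p_* \in (0, 1)$, (59) gives $d_s$, and solving (58) for $c$ gives $c = -\log(p_*)\, e^{d_s(1 - p_*)}$. The sketch below illustrates this; the function names are hypothetical and the parametrization is our own rearrangement of (58) and (59), not a formula stated in the text.

```python
import math

def f_pp(p, d, c):
    """Right-hand side of (58)."""
    return math.exp(-c * math.exp(d * (p - 1.0)))

def critical_point_pp(p_star):
    """Point (c, d_s(c)) of the critical line having fixed point p_star in (0, 1)."""
    d = -1.0 / (p_star * math.log(p_star))                 # equation (59)
    c = -math.log(p_star) * math.exp(d * (1.0 - p_star))   # (58) solved for c
    return c, d

def rho_rs_pp(p_star, d, c):
    """RS estimate (60), given the fixed point p_star of (58)."""
    return d / c * (1.0 - p_star) + (1.0 - d * p_star) * math.exp(d * (p_star - 1.0))
```

Sweeping `p_star` over $(0, 1)$ and plotting the returned pairs reproduces a phase boundary of the kind shown in Figure 8, under the assumptions above.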

Figure 7: RS cavity method analytical prediction (57) for the MSP size in $\mathcal{G}_{PR}(d, c)$ (dot-dashed), compared with the set packing size from phase 1 of the GKS algorithm (dashed lines) and with a complete run of the algorithm (continuous lines). Horizontal axis: mean variable degree $d$, from 0 to 4; vertical axis: set packing size $\rho$, from 0 to 1; curves for $c = 2, 3, 4, 5, 6$.

Figure 8: MSP relative size for the ensemble $\mathcal{G}_{PP}(d, c)$ as given by (60). Above the boundary separating the RS and RSB phases, values are only an approximation to the exact value of $\rho$. Horizontal axis: mean function degree $c$, from 0 to 10; vertical axis: mean variable degree $d$, from 0 to 10; the plot shows isolines of $\rho$ (colour scale from 0 to 1) and the phase boundary.

7.4. $\mathcal{G}_{RR}(d, c)$. The MSP problem on the ensemble $\mathcal{G}_{RR}(d, c)$, the straightforward generalization to factor graphs of the random regular graphs ensemble, allows some simplification of the cavity formalism thanks to its homogeneity. It has already been the object of a preliminary study by Weigt and Hartmann [16] and then of a much deeper work by Hansen-Goos and Weigt [17], who recast it as a hard spheres model on a generalized Bethe lattice. The authors studied this hard spheres model on $\mathcal{G}_{RR}(d, c)$ through the cavity method, both at finite chemical potential $\mu$ and in the close packing limit, and found that the 1RSB solution is unstable in the close packing limit, therefore suggesting the need for a full RSB treatment of the problem.

8. Conclusions

We studied the average asymptotic behaviour of random instances of the maximum set packing problem, both from a mathematical and a physical viewpoint. We contributed to the known list of models where the replica symmetric cavity method can be proven to give exact results, thanks to the generalization of an algorithm (and of its analysis) first proposed by Karp and Sipser [8]. Moreover, our analysis addresses a problem reported in recent work on weighted maximum matchings and independent sets on random graphs [22], where the authors could not extend their results to the unweighted cases. We achieve here the desired result by making use of the grand canonical potential instead of the direct computation of single variable expectations. We also extend their condition for the system to be in what physicists call a replica symmetric phase, namely, the uniqueness of the fixed point of the square of a certain operator (which is the analogue of the one defined in (42)), to the more general setting of maximum set packing (although without weights). On some problem ensembles, where the assumptions of Theorems 2 and 3 no longer hold and the RS cavity method fails, we used the 1RSB cavity method machinery to obtain an analytical estimation of the MSP size. Numerical simulations show very good agreement of the 1RSB estimation with the exact values, although comparisons have been done only with small random problems, because the exact algorithm has exponential time complexity. The GKS algorithm, instead, generally fails to find any MSPs, except for some special instances of the problem.

Some questions remain open for further investigation. To validate the 1RSB approach, the stability of the 1RSB solution has to be checked against more steps of replica symmetry breaking. Moreover, a thorough analysis of the second phase of the GKS algorithm could shed some light on the mechanism of replica symmetry breaking and give a rigorous lower bound to the average maximum packing size.

Regarding the problem of looking for optimal solutions in single samples, we believe that an efficient heuristic algorithm, able to obtain near-to-optimal configurations also in the RSB phase, could be constructed following the ideas of [33].

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research has received financial support from the European Research Council (ERC) through Grant agreement no. 247328 and from the Italian Research Minister through the FIRB project no. RBFR086NN1.

References

[1] S. Micali and V. V. Vazirani, “An $O(\sqrt{|V|}\,|E|)$ algoithm for finding maximum matching in general graphs,” in Proceedings of the 21st Annual IEEE Symposium on Foundations of Computer Science, pp. 17–27, 1980.

[2] N. Alon, J.-H. Kim, and J. Spencer, “Nearly perfect matchings in regular simple hypergraphs,” Israel Journal of Mathematics, vol. 100, pp. 171–187, 1997.

[3] Z. Füredi, J. Kahn, and P. D. Seymour, “On the fractional matching polytope of a hypergraph,” Combinatorica, vol. 13, no. 2, pp. 167–180, 1993.

[4] Y. H. Chan and L. C. Lau, “On linear and semidefinite programming relaxations for hypergraph matching,” Mathematical Programming, vol. 135, pp. 123–148, 2011.

[5] V. Rödl and L. Thoma, “Asymptotic packing and the random greedy algorithm,” Random Structures & Algorithms, vol. 8, no. 3, pp. 161–177, 1996.

[6] M. M. Halldórsson, “Approximations of weighted independent set and hereditary subset problems,” Journal of Graph Algorithms and Applications, vol. 4, no. 1, pp. 1–6, 2000.

[7] E. Hazan, S. Safra, and O. Schwartz, “On the complexity of approximating K-set packing,” Computational Complexity, vol. 15, no. 1, pp. 20–39, 2006.

[8] R. M. Karp and M. Sipser, “Maximum matchings in sparse random graphs,” in Proceedings of the 22nd Annual Symposium on Foundations of Computer Science, vol. 12, pp. 364–375, IEEE Computer Society, 1981.

[9] N. C. Wormald, “The differential equation method for random graph processes and greedy algorithms,” in Lectures on Approximation and Randomized Algorithms, pp. 73–155, 1999.

[10] G. Parisi and M. Mézard, “Replicas and optimization,” Journal de Physique Lettres, vol. 46, no. 17, pp. 771–778, 1985.

[11] G. Parisi and M. Mézard, “A replica analysis of the travelling salesman problem,” Journal de Physique, vol. 47, pp. 1285–1296, 1986.

[12] H. Zhou and Z.-C. Ou-Yang, “Maximum matching on random graphs,” In press, http://arxiv.org/abs/cond-mat/0309348.

[13] L. Zdeborová and M. Mézard, “The number of matchings in random graphs,” Journal of Statistical Mechanics, vol. 2006, Article ID P05003, 2006.

[14] L. Dall'Asta, A. Ramezanpour, and P. Pin, “Statistical mechanics of maximal independent sets,” Physical Review E, vol. 80, Article ID 061136, 2009.

[15] M. Mézard and M. Tarzia, “Statistical mechanics of the hitting set problem,” Physical Review E, vol. 76, no. 4, Article ID 041124, 10 pages, 2007.

[16] M. Weigt and A. K. Hartmann, “Glassy behavior induced by geometrical frustration in a hard-core lattice-gas model,” Europhysics Letters (EPL), vol. 62, Article ID 021005, p. 533, 2003.

[17] H. Hansen-Goos and M. Weigt, “A hard-sphere model on generalized Bethe lattices: statics,” Journal of Statistical Mechanics, vol. 2005, Article ID P04006, 2005.

[18] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.

[19] A. Montanari and M. Mézard, Information, Physics, and Computation, Oxford University Press, 2009.

[20] G. Parisi, M. Mézard, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, Singapore, 1987.


[21] M. Mézard and G. Parisi, “The cavity method at zero temperature,” Journal of Statistical Physics, vol. 22, no. 1-2, pp. 1–34, 2003.

[22] D. Gamarnik, T. Nowicki, and G. Swirszcz, “Maximum weight independent sets and matchings in sparse random graphs. Exact results using the local weak convergence method,” Random Structures and Algorithms, vol. 28, no. 1, pp. 76–106, 2006.

[23] A. Montanari, G. Parisi, and F. Ricci-Tersenghi, “Instability of one-step replica-symmetry-broken phase in satisfiability problems,” Journal of Physics A, vol. 37, no. 6, p. 2073, 2004.

[24] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, “Gibbs states and the set of solutions of random constraint satisfaction problems,” Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 25, pp. 10318–10323, 2007.

[25] M. Mézard, F. Ricci-Tersenghi, and R. Zecchina, “Two solutions to diluted p-spin models and XORSAT problems,” Journal of Statistical Physics, vol. 111, no. 3-4, pp. 505–533, 2002.

[26] G. Semerjian, “On the freezing of variables in random constraint satisfaction problems,” Journal of Statistical Physics, vol. 130, no. 2, pp. 251–293, 2008.

[27] M. Mézard and G. Parisi, “The Bethe lattice spin glass revisited,” The European Physical Journal B, vol. 20, no. 2, pp. 217–233, 2001.

[28] S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, “A new algorithm for generating all the maximal independent sets,” SIAM Journal on Computing, vol. 6, no. 3, pp. 505–517, 1977.

[29] G. Csardi and T. Nepusz, “The igraph software package for complex network research,” InterJournal Complex Systems, p. 1695, 2006.

[30] N. C. Wormald, “Models of random regular graphs,” in Surveys in Combinatorics, London Mathematical Society Lecture Note, pp. 239–298, 1999.

[31] S. Takabe and K. Hukushima, “Minimum vertex cover problems on random hypergraphs: replica symmetric solution and a leaf removal algorithm,” In press, http://arxiv.org/abs/1301.5769.

[32] S. Sanghavi, D. Shah, and A. S. Willsky, “Message passing for maximum weight independent set,” IEEE Transactions on Information Theory, vol. 55, no. 11, pp. 4822–4834, 2009.

[33] M. Weigt and H. Zhou, “Message passing for vertex covers,” Physical Review E, vol. 74, Article ID 046110, 19 pages, 2006.


4. Replica Symmetry Breaking

4.1. RS Consistency and Bugs Propagation. Here we propose two criteria to check the consistency of the RS cavity method. If either of them fails, the RS ansatz has to be abandoned.

The first criterion is the assumption of unicity of the fixed point of (24) and of its dynamical stability under iteration. We restrict ourselves to the subspace of distributions with support on $\{0, 1\}$, although it is possible to extend the following analysis to the whole space of distributions over $[0, 1]$ with an argument based on stochastic dominance, following [22]. Characterizing the distributions over $\{0, 1\}$ as in (23) with a real parameter $p \in [0, 1]$, from (22) we obtain the dynamical system

$$p' = \mathbb{E}_C \left(1 - \mathbb{E}_D\, p^D\right)^C \equiv f(p). \tag{26}$$

The stability criterion

$$\left|f'(p_*)\right| < 1 \tag{27}$$

suggests that the RS approximation to the MSP size (25) is exact as long as (27) is satisfied. This statement will be made rigorous in Section 5.
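As an illustration, the map (26) and the criterion (27) can be evaluated numerically for arbitrary residual degree distributions. The sketch below is only a hedged example: the function names, the dictionary representation of the distributions, the truncated Poisson helper, and the finite-difference derivative are all our own assumptions.

```python
import math

def pgf(pmf, x):
    """E[x^K] for a degree distribution given as {k: probability}."""
    return sum(prob * x ** k for k, prob in pmf.items())

def f(p, c_pmf, d_pmf):
    """Map (26): E_C (1 - E_D p^D)^C, with residual degree distributions."""
    return pgf(c_pmf, 1.0 - pgf(d_pmf, p))

def rs_fixed_point(c_pmf, d_pmf, tol=1e-13):
    """Bisection on f(p) - p, which is nonincreasing on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid, c_pmf, d_pmf) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rs_slope(c_pmf, d_pmf, h=1e-6):
    """Central finite-difference estimate of |f'(p*)| appearing in (27)."""
    p = rs_fixed_point(c_pmf, d_pmf)
    return abs(f(p + h, c_pmf, d_pmf) - f(p - h, c_pmf, d_pmf)) / (2.0 * h)

def poisson_pmf(mean, kmax=80):
    """Poisson distribution truncated at kmax (helper for the example)."""
    return {k: math.exp(-mean) * mean ** k / math.factorial(k) for k in range(kmax)}
```

For instance, with residual variable degree fixed to 1 and Poissonian residual factor degree, `rs_slope(poisson_pmf(1.0), {1: 1.0})` is below 1 while `rs_slope(poisson_pmf(4.0), {1: 1.0})` exceeds 1, so (27) holds in the first case and fails in the second.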

The second method we use to check the RS stability, called bugs proliferation, is the zero temperature analogue of the spin glass susceptibility. We will compute the average number of changing $t$ messages induced by a change in a single message $t_{r \to i}$ ($1 \to 0$ or $0 \to 1$). This is given by

$$N_{\mathrm{ch}} = \mathbb{E}\left[\sum_{(s,j)} \sum_{a_0, a_1, b_0, b_1} \mathbb{I}\left(t_{s \to j} = b_0 \to b_1 \mid t_{r \to i} = a_0 \to a_1\right)\right], \tag{28}$$

where $a_0$, $a_1$, $b_0$, and $b_1$ take values in $\{0, 1\}$. Since a random factor graph is locally a tree, and assuming that correlations decay fast enough, the last equation can be expressed as

$$N_{\mathrm{ch}} = \sum_{s=0}^{+\infty} \left(\bar{C}\bar{D}\right)^s \sum_{a_0, a_1, b_0, b_1} P\left(t_s = b_0 \to b_1 \mid t_o = a_0 \to a_1\right), \tag{29}$$

where $t_s$ is a message at distance $s$ from the tree root $o$ and we defined the average residual degrees $\bar{C} = \mathbb{E}_C[C]$ and $\bar{D} = \mathbb{E}_D[D]$. The stability condition $N_{\mathrm{ch}} < +\infty$ yields a constraint on the greatest eigenvalue $\lambda_M$ of the transfer matrix $P(b_0 \to b_1 \mid a_0 \to a_1)$:

$$\bar{C}\bar{D}\,\lambda_M < 1. \tag{30}$$

The two methods presented above give equivalent conditions for the RS ansatz to hold true, and they simply express the independence of a finite subgraph from the tail boundary conditions.

4.2. The 1RSB Formalism. We are going to develop the 1RSB formalism for the MSP problem and then apply it in Section 7.1 to the ensemble $\mathcal{G}_{RP}$. We will not check the coherence of the 1RSB ansatz through the interstate and intrastate susceptibilities [23]; we are therefore not guaranteed against the need of further steps of replica symmetry breaking in order to recover the exact solution. Even in the worst case scenario, though, when the 1RSB solution is trivially exact only in the RS region and a full RSB ansatz is needed otherwise, the 1RSB prediction for the MSP relative size $\rho$ should be everywhere more accurate than the RS one and possibly very close to the real value. We refer to the textbook of Montanari and Mézard [19] for a detailed exposition of the 1RSB cavity method.

Let us fix a factor graph $G$ from a locally tree-like ensemble $\mathcal{G}$. We call $Q_{r \to i}(t_{r \to i})$ the distribution of messages on the directed edge $r \to i$ over the states of the system. We still expect the messages $t_{r \to i}$ to take values 0 or 1, so that $Q_{r \to i}$ can be parametrized as

$$Q_{r \to i}(t_{r \to i}) = q_{r \to i}\,\delta(t_{r \to i}) + (1 - q_{r \to i})\,\delta(t_{r \to i} - 1). \tag{31}$$

The 1RSB Parisi parameter $x \in [0, 1]$ has to be properly rescaled in order to correctly take the limit $\mu \uparrow \infty$. Therefore we introduce the new 1RSB parameter $y = \mu x$, which stays finite in the close packing limit and takes a value in $[0, +\infty)$. The reweighting factor $e^{-y\,\omega^{\mathrm{iter}}_{r \to i}}$ is defined as

$$e^{-y\,\omega^{\mathrm{iter}}_{r \to i}} = \frac{Z_{r \to i}}{\prod_{j \in \partial r \setminus i} \prod_{s \in \partial j \setminus r} Z_{s \to j}}. \tag{32}$$

The last equation, combined with (13) and (14), gives $\omega^{\mathrm{iter}}_{r \to i} = -t_{r \to i}$. Averaging over the whole ensemble, we can then write the zero temperature 1RSB message passing rules (also called Survey Propagation equations):

$$q' \overset{d}{=} \frac{\prod_{j=1}^{C}\left(1 - \prod_{s=1}^{D_j} q_{sj}\right)}{e^{y} + (1 - e^{y}) \prod_{j=1}^{C}\left(1 - \prod_{s=1}^{D_j} q_{sj}\right)}. \tag{33}$$

In the preceding equation, the $\{q_{sj}\}$ are i.i.d. random variables on $[0, 1]$ and, as usual, $C$ is the random variable residual degree and the $\{D_j\}$ are random independent factor residual degrees. Fixed points of (33) take the form

$$P(q) = p_0\,\delta(q) + p_1\,\delta(q - 1) + p_2\,P_2(q), \tag{34}$$

where $P_2(q)$ is a continuous distribution on $[0, 1]$ and $p_2 = 1 - p_0 - p_1$. The parameters $p_0$ and $p_1$ have to satisfy the closed equations

$$p_1 = \mathbb{E}_C\left(1 - \mathbb{E}_D (1 - p_0)^D\right)^C, \qquad p_0 = 1 - \mathbb{E}_C\left(1 - \mathbb{E}_D\, p_1^D\right)^C. \tag{35}$$

Solutions of (35) with $p_2 = 0$ correspond to replica symmetric solutions, and their instability marks the onset of a spin glass phase. In this new phase the MSPs are clustered according to the general scenario displayed by constraint satisfaction problems [24].


From the stable fixed point of (33) we can calculate the 1RSB free energy functional $\phi(y)$ as

$$-y\,\phi(y) = \frac{d}{c}\,\mathbb{E}\log\left[(1 - e^{y}) \prod_{j=1}^{C_0}\left(1 - \prod_{s=1}^{D_j} q_{sj}\right) + e^{y}\right] + \mathbb{E}\,(1 - D_0)\log\left[(1 - e^{y})\left(1 - \prod_{r=1}^{D_0} q_r\right) + e^{y}\right] \tag{36}$$

and the 1RSB density $\rho_{\mathrm{1RSB}}(y) = -\partial\,(y\,\phi(y))/\partial y$ as

$$\rho_{\mathrm{1RSB}}(y) = \frac{d}{c}\,\mathbb{E}\,\frac{e^{y}\left(1 - \prod_{j=1}^{C_0}\left(1 - \prod_{s=1}^{D_j} q_{sj}\right)\right)}{(1 - e^{y}) \prod_{j=1}^{C_0}\left(1 - \prod_{s=1}^{D_j} q_{sj}\right) + e^{y}} + \mathbb{E}\,(1 - D_0)\,\frac{e^{y} \prod_{r=1}^{D_0} q_r}{(1 - e^{y})\left(1 - \prod_{r=1}^{D_0} q_r\right) + e^{y}}, \tag{37}$$

with expectations intended over $\mathcal{G}$ and over the fixed point messages $\{q_r\}$ and $\{q_{sj}\}$. Since the free energy functional $\phi(y)$ and the complexity $\Sigma(\rho)$ are related by the Legendre transform

$$\Sigma(\rho) = -y\rho - y\,\phi(y), \tag{38}$$

with $\partial\Sigma/\partial\rho = -y$, through (36) we can compute the complexity by taking the inverse transform. Equilibrium states, that is, MSPs, are selected by

$$\rho_{\mathrm{1RSB}} = \operatorname*{arg\,max}_{\rho}\,\{\rho : \Sigma(\rho) \ge 0\} \tag{39}$$

or, equivalently, by taking the 1RSB parameter $y$ to be

$$y_s = \operatorname*{arg\,max}_{y \in [0, +\infty]} \phi(y). \tag{40}$$

In the static 1RSB phase we expect $\Sigma(\rho_s) = 0$, so that from (38) we have $\phi(y_s) = -\rho_s$. We will see that this is generally true, except for the ensemble $\mathcal{G}_{RP}(2, c)$ of Section 7, corresponding to maximum matchings on ordinary graphs, where the equilibrium states have maximal complexity and $y_s = +\infty$. The relation

$$\rho_{\mathrm{1RSB}} = -\phi(y_s) \tag{41}$$

is always valid, though, since for $y_s \uparrow \infty$ the complexity stays finite.

5. A Heuristic Algorithm and Exact Results

In this section we propose a heuristic greedy algorithm to address the MSP problem. It is a natural generalization of the algorithm that Karp and Sipser proposed to solve the maximum matching problem on Erdős-Rényi random graphs [8]; therefore we will call it generalized Karp-Sipser (GKS). Extending their derivation concerning the leaf removal part of the algorithm, we are able to prove that the RS prediction for the MSP density is exact as long as the stability criterion (27) is satisfied. We will not give the proofs of the following theorems, as they are lengthy but effortless extensions of those given in [8]. In order to find the maximum matching on a graph, Karp and Sipser noticed that, as long as the graph contains a node of degree one (a leaf), its unique edge has to belong to one of the maximum matchings.

They considered the simplest randomized algorithm one can imagine: as long as there is any leaf, remove it from the graph; otherwise remove a random edge; then iterate until the graph is depleted. They studied the behaviour of this leaf-removal algorithm on random graphs and were able to prove that it grants w.h.p. a maximum matching (within an $o(n)$ error).

To generalize some of their results, we need to extend the definition of leaf to that of pendant. We call pendant a variable node whose factor neighbours all have degree one, except for one at most. Stating the same concept in different words, all of the neighbours of a pendant have the pendant itself as their sole neighbour, except for one of them at most. See Figure 1 for a pictorial representation of a pendant (in red). The GKS algorithm is articulated in two phases: a pendant removal phase and a random occupation phase. We give the pseudocode for the generalized Karp-Sipser algorithm (see Algorithm 1).

At each step the algorithm prioritizes the removal of pendants over that of random variable nodes. We notice that the removal of a pendant is always an optimal choice in order to achieve an MSP; we have no guarantees, though, on the effect of the occupation of a random node. We call phase 1 the execution of the algorithm up to the point where the first nonpendant variable is added to $V'$. It is trivial to show that phase 1 is enough to find an MSP on a tree factor graph. The interesting thing, though, is that phase 1 is also able to deplete nontree factor graphs and find an MSP, as long as the factor graphs are sufficiently sparse and large enough.

We call core the subset of the variable nodes which has not been assigned to the MSP in phase 1. We will show how the emergence of a core is directly related to replica symmetry breaking. Let us define, as usual,

$$f(x) \equiv \mathbb{E}_C\left(1 - \mathbb{E}_D\, x^D\right)^C. \tag{42}$$

The function $f$ is continuous and nonincreasing and satisfies the relation $1 \ge f(0) \ge f(1) = \mathbb{P}[C = 0]$. It turns out that, in the large graph limit, phase 1 of GKS is characterized by the solutions of the system of equations

$$p = f(r), \qquad r = f(p). \tag{43}$$

We notice that system (43) is equivalent to the 1RSB equation (35) once the substitutions $p \to p_0$ and $r \to 1 - p_1$ are made.

Lemma 1. The system (43) always admits a (unique) solution $p_* = r_*$.

Proof. By monotonicity and continuity, the function $g(p) = f(p) - p$ has a single zero in the interval $[0, 1]$.


Require: a factor graph $G = (V, F, E)$
Ensure: a set packing $V'$
  $V' = \emptyset$
  add to $V'$ all isolated variable nodes and remove them from $G$
  remove from $G$ any isolated factor node
  while $V$ is not empty do
    if $G$ has any pendant then
      choose a pendant $i$ uniformly at random
      add $i$ to $V'$
      remove $i$ from $G$, then remove its factor neighbours and their variable neighbours
    else
      pick uniformly at random a variable node and add it to $V'$
      remove it from $G$, then remove its factor neighbours and their variable neighbours
    end if
    remove from $G$ any isolated factor node
  end while
  return $V'$

Algorithm 1: Generalized Karp-Sipser (GKS).
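The pseudocode of Algorithm 1 can be turned into a short runnable sketch. The version below is not the authors' implementation: the function name `gks`, the input format (a dictionary mapping each variable node to the labels of its factor nodes), and the returned phase 1 size are our own choices, and pendants are found by a plain scan at every step, which is simple but quadratic in the number of variables.

```python
import random

def gks(sets, seed=0):
    """Generalized Karp-Sipser sketch.

    sets: dict mapping each variable node to an iterable of factor labels.
    Returns (packing, phase1_size), where phase1_size counts the variables
    added before the first nonpendant variable is occupied.
    """
    rng = random.Random(seed)
    var_nbrs = {v: set(fs) for v, fs in sets.items()}
    fac_nbrs = {}
    for v, fs in var_nbrs.items():
        for f in fs:
            fac_nbrs.setdefault(f, set()).add(v)

    def is_pendant(v):
        # All factor neighbours have degree one, except for one at most.
        return sum(1 for f in var_nbrs[v] if len(fac_nbrs[f]) > 1) <= 1

    def occupy(v):
        # Remove v, its factor neighbours and their variable neighbours;
        # factors left without variables are dropped as well.
        blocked = {v}
        for f in var_nbrs[v]:
            blocked |= fac_nbrs[f]
        for u in blocked:
            for f in var_nbrs.pop(u):
                fac_nbrs[f].discard(u)
                if not fac_nbrs[f]:
                    del fac_nbrs[f]

    packing, phase1_size = [], None
    while var_nbrs:
        pendants = [v for v in var_nbrs if is_pendant(v)]
        if pendants:
            choice = rng.choice(pendants)
        else:
            if phase1_size is None:
                phase1_size = len(packing)  # phase 1 ends here
            choice = rng.choice(list(var_nbrs))
        packing.append(choice)
        occupy(choice)
    if phase1_size is None:
        phase1_size = len(packing)  # the graph was depleted in phase 1
    return packing, phase1_size
```

Isolated variable nodes are pendants in this sketch (the condition on their factor neighbours holds vacuously), so the initial step of Algorithm 1 is covered by the main loop. For example, `gks({"a": [1, 2], "b": [2, 3], "c": [3, 4]})` occupies `a` and `c` entirely within phase 1.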

Figure 1: (left) A representation of a factor graph containing a pendant, depicted in red. (right) The factor graph after the removal of the pendant, as occurs in the inner part of the while loop in Algorithm 1.

If other solutions are present, the relevant one is the one with smallest $p$, as the following theorems certify.

Theorem 2. Let $(p, r)$ be the solution with smallest $p$ of (43). Then the density of factor nodes surviving phase 1 in the large graphs limit is given by

$$\psi_1 = 1 + p - r + c\,p\,\mathbb{E}_D\left[p^D - r^D\right], \tag{44}$$

with

$$r = \mathbb{E}_{C_0}\left(1 - \mathbb{E}_D\, p^D\right)^{C_0}, \qquad p = \mathbb{E}_{C_0}\left(1 - \mathbb{E}_D\, r^D\right)^{C_0}. \tag{45}$$

In particular, if the smallest solution is $p_* = r_*$, the graph is depleted with high probability in phase 1.

As already said, we will not give the proof of this and of the following theorem, as they are lengthy and can be obtained from the derivation given in Karp and Sipser's article [8] with little effort, even if not in a completely trivial way. Theorem 2 affirms that, as soon as (43) develops another solution, a core survives phase 1. This phenomenon coincides with the need for replica symmetry breaking in the cavity formalism. Let us now establish the exactness of the cavity prediction for $\rho$ in the RS phase.

Theorem 3. Let $(p, r)$ be the solution with smallest $p$ of (43). Then the density of variables assigned in phase 1 in the large graphs limit is

$$\rho_1 = \frac{d}{c}\,(1 - r) + \mathbb{E}_{D_0}\,(1 - D_0)\,p^{D_0}, \tag{46}$$

where $p$ has been defined in the previous theorem. In particular, if the smallest solution is $p_* = r_*$, the replica symmetric cavity method prediction (25) is exact.

These two theorems imply that the RS prediction for $\rho$ (25) holds true as long as phase 1 manages to deplete w.h.p. the whole factor graph. A similar behaviour has been observed in other combinatorial optimization problems, for example, the random XORSAT [25]. Conversely, it is easy to prove, given the equivalence between (43) and (35), that if a solution with $p \neq p_*$ exists, the RS fixed point is unstable in the 1RSB distributional space. Therefore the system is in the RS phase if and only if (35) admits a unique solution.

We notice that we could have also looked for the solutions of the single equation $p = f^2(p)$ instead of the two of (43), since the value of $r$ is uniquely determined by the value of $p$. The previous theorems then state that the RS results hold as long as $f^2$ has a single fixed point. In [22] it has been proven that the RS solution of the weighted maximum independent set and of the maximum weighted matching holds true as long as the corresponding squared cavity operator $f^2$ has a unique fixed point. Since those are special cases of the weighted MSP problem, we conjecture that the $f^2$ condition holds for the general case too, as it does for the unweighted case. Probably it would be possible to further generalize the Karp-Sipser algorithm to cover analytically the weighted case as well.

The scenario arising from the analysis of the algorithm and from the 1RSB cavity method is the following: in the RS phase, maximum set packings form a single cluster, and it is possible to connect any two of them with a path involving MSPs separated by a rearrangement of a finite number of variables [26]. In the RSB phase, instead, MSPs can be grouped into many clusters (connected in the sense mentioned before), each one of them defined by the assignment of the variables in the core. Two MSPs that differ on the core are always separated by a global rearrangement of the variables (i.e., $O(N)$). In the presence of a core, the GKS algorithm makes some suboptimal random choices (after phase 1).

We did not make an analytical study of the random variable removal part of the GKS algorithm. This has been done by Wormald for the matching problem, associating to the graph process a set of differential equations amenable to analysis [9] (something similar has been done for random XORSAT as well [25]). In that case it turns out that the algorithm still achieves optimality and yields an almost perfect matching on the core. An appropriate analysis of the second phase of the GKS algorithm, and the reason for its failure on most of the SP ensembles we covered, deserves further study.

6. Numerical Simulations

While the RS cavity equations (24) and (25) can be easily computed, the numerical solution of the 1RSB density evolution (33) is much more involved (although simplified by the zero temperature limit) and has been obtained with a standard population dynamics algorithm [27].
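To make the population dynamics idea concrete, here is a minimal sketch for the distributional equation (33), specialized to variable degree fixed to $d$ and Poissonian factor residual degree of mean $c$ (the ensemble of Section 7.1). It is not the authors' code: the function names, population size, number of sweeps, uniform initialization, and the simple Poisson sampler are all assumptions.

```python
import math
import random

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def population_dynamics_rp(d, c, y, size=2000, sweeps=50, seed=0):
    """Approximate the fixed point distribution P(q) of (33) by a population."""
    rng = random.Random(seed)
    pop = [rng.random() for _ in range(size)]
    ey = math.exp(y)
    for _ in range(sweeps * size):
        n = 1.0  # product over j of (1 - product over s of q_sj)
        for _ in range(sample_poisson(c, rng)):
            inner = 1.0
            for _ in range(d - 1):
                inner *= rng.choice(pop)
            n *= 1.0 - inner
        # Update rule (33); the denominator equals ey * (1 - n) + n > 0.
        pop[rng.randrange(size)] = n / (ey + (1.0 - ey) * n)
    return pop
```

At $y = 0$ the update reduces to the RS density evolution, which gives a simple sanity check of the sampler; averages such as (36) and (37) can then be estimated by drawing messages from the returned population.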

The implementation of the GKS algorithm is straightforward. While it is very fast during phase 1, we noticed a huge slowing down in the random removal part. We were able to find set packings for factor graphs with hundreds of thousands of nodes.

In order to test the cavity and GKS predictions, we also computed the exact MSP size on factor graphs of small size. First we notice that a set packing problem coded in a factor graph structure $F$ is equivalent to an independent set problem on an appropriate ordinary graph $G$. The node set of $G$ is the variable node set of $F$, and we add an edge to $G$ between each pair of nodes having a common factor node neighbour in $F$. Therefore each neighbourhood of a factor node in $F$ forms a clique in $G$. We then solve the maximum independent set problem for $G$ using an exact algorithm [28] implemented in the igraph library [29]. Since the time complexity is exponential in the size of the graph, we performed our simulations on graphs containing up to only one hundred nodes.
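The mapping from a set packing instance to an independent set problem can be sketched as follows. The exact solver used here is a plain branching recursion written for illustration only; it is neither the algorithm of [28] nor the igraph routine, and the function names and input format (variable node mapped to its factor labels) are our own assumptions.

```python
from itertools import combinations

def conflict_graph(sets):
    """Ordinary graph with one node per variable; variables sharing a factor are linked."""
    members = {}
    for v, factors in sets.items():
        for f in set(factors):
            members.setdefault(f, []).append(v)
    adj = {v: set() for v in sets}
    for variables in members.values():  # each factor neighbourhood becomes a clique
        for u, w in combinations(variables, 2):
            adj[u].add(w)
            adj[w].add(u)
    return adj

def max_independent_set_size(adj):
    """Exact size of a maximum independent set (exponential time branching)."""
    def solve(candidates):
        if not candidates:
            return 0
        # Branch on a node of largest remaining degree.
        v = max(candidates, key=lambda u: len(adj[u] & candidates))
        rest = candidates - {v}
        return max(solve(rest), 1 + solve(rest - adj[v]))
    return solve(frozenset(adj))
```

For instance, `max_independent_set_size(conflict_graph({"a": [1, 2], "b": [2, 3], "c": [3, 4]}))` gives the MSP size of a three-set chain.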

7. Applications to Problem Ensembles

We will now apply the methods developed in the previous sections to some factor graph ensembles, each modelling a class of MSP problem instances. We consider graph ensembles containing nodes with Poissonian random degree or regular degree: $\mathcal{G}_{PR}(N, d, c)$, $\mathcal{G}_{RP}(N, d, c)$, $\mathcal{G}_{PP}(N, d, c)$, and $\mathcal{G}_{RR}(N, d, c)$. The subscript $R$ or $P$ indicates whether the type of nodes to which it refers (variable nodes for the first subscript, factor nodes for the second) has regular or Poissonian random degree, respectively. We parametrize these ensembles by their average variable and factor degrees, that is, $d = \mathbb{E}_{D_0} D_0$ and $c = \mathbb{E}_{C_0} C_0$. They are constituted by factor graphs having $N$ variable nodes and $M = \lfloor Nd/c \rfloor$ factor nodes, but they differ both in their elements and in their probability law. We define the ensembles by giving the probability of sampling one of their elements.

(i) $\mathbb{G}_{RP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. Every variable node has fixed degree $d$. $G$ is obtained by linking each variable with $d$ factors chosen uniformly and independently at random. The factor node degree distribution obtained is Poissonian of mean $c$ with high probability. This ensemble is a model for the $k$-set packing and will be the main focus of our attention.

(ii) $\mathbb{G}_{PR}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. Every factor node has fixed degree $c$. $G$ is built by linking each factor with $c$ variables chosen uniformly and independently at random. The variable node degree distribution obtained is Poissonian of mean $d$ with high probability.

(iii) $\mathbb{G}_{PP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. $G$ is built by adding an edge $(i, r)$ with probability $c/N$, independently for each choice of a variable $i$ and a factor $r$. The factor graph obtained has whp a Poissonian variable node degree distribution of mean $d$ and a Poissonian factor node degree distribution of mean $c$.

(iv) $\mathbb{G}_{RR}(N, d, c)$: it is constituted by all factor graphs with $N$ variable nodes of degree $d$ and $M = N d / c$ factor nodes of degree $c$ ($N d$ has to be a multiple of $c$). Every factor graph of the ensemble is equiprobable and can be sampled using a generalization of the configuration model for random regular graphs [30]. This ensemble is a model for the $k$-set packing.

We will omit the argument $N$ when we refer to an ensemble in the $N \uparrow \infty$ limit.
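As an illustration of the sampling rule for the first ensemble, the following sketch draws one element of $\mathbb{G}_{RP}(N, d, c)$. The function name `sample_grp` and the dictionary representation are assumptions made for this example. The sketch also assumes that the $d$ factors attached to a variable are distinct (so that no multiple edges appear), which is one possible reading of "chosen uniformly and independently at random".

```python
import math
import random


def sample_grp(n_vars, d, c, seed=0):
    """Draw a factor graph with n_vars variables of fixed degree d.

    Returns (var_to_factors, n_factors), with n_factors = floor(n_vars*d/c).
    Assumption of this sketch: the d factors of a variable are distinct.
    """
    rng = random.Random(seed)
    n_factors = math.floor(n_vars * d / c)
    if n_factors < d:
        raise ValueError("need at least d factor nodes")
    var_to_factors = {
        v: rng.sample(range(n_factors), d) for v in range(n_vars)
    }
    return var_to_factors, n_factors
```

For large `n_vars` the factor degrees obtained this way are close to Poissonian with mean $c$, as stated for the ensemble.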

7.1. $\mathbb{G}_{RP}(d, c)$. This is the ensemble with variable node degrees fixed to $d$ and Poissonian factor node degree, that is, $C \sim C_0 \sim \mathrm{Poisson}(c)$. The case $d = 2$ corresponds to the maximum matching problem on Erdős-Rényi random graphs [8, 12].

Density evolution (22) for $\mathbb{G}_{RP}$ reduces to

$$T' \overset{d}{=} \max\left[\{0\} \cup \left\{1 - \sum_{s=1}^{d-1} T_{sj}\right\}_{j \in \{1, \dots, C\}}\right]. \tag{47}$$

Considering distributions of the form $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$, fixed points of (47) read

$$p_* = e^{-c\,p_*^{d-1}} \equiv f(p_*). \tag{48}$$


The last equation admits only one solution for each value of $d$ and $c$, as is easily seen through a monotonicity argument considering the left and right hand sides of the equation. The values of $c$ as a function of $d$ satisfying $|f'(p_*)| = 1$ are the critical points $c_s(d)$ delimiting the RS phase ($c < c_s(d)$) and are given by

$$c_s(d) = \frac{e}{d - 1}. \tag{49}$$
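The step from (48) to (49) can be spelled out as follows. This is only a short worked derivation using the definition of $f$ in (48) and the marginality condition $|f'(p_*)| = 1$.

```latex
\begin{align}
f(p) &= e^{-c\,p^{d-1}}, \qquad
f'(p) = -c\,(d-1)\,p^{d-2}\,f(p), \\
|f'(p_*)| &= c\,(d-1)\,p_*^{d-1} = 1
  \qquad \text{(using } f(p_*) = p_* \text{)}, \\
p_* &= e^{-c\,p_*^{d-1}} = e^{-1/(d-1)}
  \;\Longrightarrow\;
  c_s(d) = \frac{1}{(d-1)\,p_*^{d-1}} = \frac{e}{d-1}.
\end{align}
```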

For $c > e/(d - 1)$ a core survives phase 1 of the GKS algorithm, as stated by Theorem 2 and shown in Figure 2. We notice that the same threshold value $e/(d - 1)$ is found in the minimum set covering problem on the dual factor graph ensemble [31], where the application of a leaf removal algorithm, that is, phase 1 of GKS, unveils an analogous core emergence phenomenon. In the RS phase the MSP density is equal to the minimum set covering density, but in the RSB phase we cannot compare the cavity predictions for these two dual problems, since in [31] the 1RSB solution to the minimum set covering is not provided.

In the matching case, that is, $d = 2$, equation (49) expresses the notorious $e$-phenomenon discovered by Karp and Sipser, while for higher values of $d$ it provides an extension of the critical threshold.

We can recover the same critical condition, equation (49), through the bug propagation method, as the only nonzero elements of the transfer matrix $P(b_0 \to b_1 \mid a_0 \to a_1)$ are

$$P(0 \to 1 \mid 1 \to 0) = p^{d-1}, \qquad P(1 \to 0 \mid 0 \to 1) = p^{d-1}, \tag{50}$$

which give $\lambda_M = p^{d-1}$. The average branching factor is $\overline{C}\,\overline{D} = c(d - 1)$, so that (30) and (48) yield (49). The analytical value for the relative size of MSPs, that is, the particle density $\rho$, is

$$\rho(d, c) = \frac{d}{c}\,(1 - p_*) + (1 - d)\,p_*^{d} \qquad \text{for } c < c_s(d). \tag{51}$$
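Equations (48), (49), and (51) are easy to evaluate numerically. The sketch below is a minimal illustration, with hypothetical function names, that solves the fixed point equation by bisection (the map $p \mapsto e^{-c p^{d-1}} - p$ is strictly decreasing on $[0, 1]$) and then evaluates the RS density.

```python
import math


def rs_fixed_point(d, c, iters=200):
    """Unique solution p* of p = exp(-c * p**(d-1)) on [0, 1], eq. (48)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # exp(-c p^(d-1)) - p is positive below the root, negative above it.
        if math.exp(-c * mid ** (d - 1)) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rs_density(d, c):
    """RS prediction (51) for the relative MSP size."""
    p = rs_fixed_point(d, c)
    return (d / c) * (1.0 - p) + (1 - d) * p ** d


def critical_c(d):
    """Critical mean factor degree c_s(d) = e / (d - 1), eq. (49)."""
    return math.e / (d - 1)
```

At the critical point the fixed point takes the closed form $p_* = e^{-1/(d-1)}$, which gives a convenient check of the solver. Above `critical_c(d)` the value returned by `rs_density` is only the RS estimate.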

In Figure 3 we compare $\rho$ from (51), as a function of $c$ for some values of $d$, with an exact algorithm applied to finite factor graphs (as explained in Section 6), both above and below $c_s$. Clearly, for $c > c_s$ the RS approximation becomes increasingly inaccurate.

We continue our analysis of $\mathbb{G}_{RP}$ above the critical value $c_s$ through the 1RSB cavity method, as outlined in Section 4.2.

Fixed point messages of (33) are distributed as

$$P(q) = p_0\,\delta(q) + p_1\,\delta(q - 1) + p_2\,P_2(q), \tag{52}$$

where

$$p_1 = e^{-c\,(1 - p_0)^{d-1}}, \qquad p_0 = 1 - e^{-c\,p_1^{d-1}}, \qquad p_2 = 1 - p_0 - p_1, \tag{53}$$

and $P_2(q)$ has to be determined through (33). Equation (53) always admits an RS solution $p_1 = 1 - p_0 = p_*$, which is stable up to $c_s$, as already noticed. Above $c_s$ a new stable fixed point with $p_2 > 0$ continuously arises, and we study it numerically with a population dynamics algorithm.

Figure 2: Results from the GKS algorithm applied to $\mathbb{G}_{RP}(3, c)$. The set packing size after phase 1 (blue line) and after the algorithm stops (red line) are reported. The vertical line is $c_s = e/2$ and is drawn as a visual aid to determine the point where the RS assumptions stop holding true. Above $c_s$ a nonempty core emerges continuously (green line). [Plot: horizontal axis, mean factor degree ($c$); vertical axis, $\rho$; legend entries include Core and Phase 1.]

Figure 3: RS cavity method analytical prediction, equation (51), for the MSP size in $\mathbb{G}_{RP}(d, c)$, compared with the average size of MSPs obtained from 1000 samples of factor graphs with 100 variable nodes (dots). Dashed parts of the lines are the RS estimations in the RSB phase, that is, for $c > e/(d - 1)$, where the RS prediction is no longer exact. [Plot: horizontal axis, mean factor degree ($c$); vertical axis, set packing size ($\rho$); legend: RS and Exact for $d = 2, 3, 4, 5$.]


Figure 4: Maximum set packing size for $\mathbb{G}_{RP}(d, c)$ as a function of $c$ for $d = 2$. The blue line corresponds to the RS analytical value (51), the red line is from the Karp-Sipser algorithm, and the dotted black line is the 1RSB solution (54) with $y \gg 1$. The Karp-Sipser algorithm and the 1RSB cavity method yield the same, exact result. [Plot: horizontal axis, mean factor degree ($c$), with the value $e$ marked; vertical axis, set packing size ($\rho$); legend: RS, 1RSB, GKS.]

The 1RSB free energy as a function of the Parisi parameter $y$ takes the form

$$\phi(y) = -\frac{d}{y\,c}\,\mathbb{E} \log\left[(1 - e^{y}) \prod_{j=1}^{C} \left(1 - \prod_{s=1}^{d-1} q_{sj}\right) + e^{y}\right] + \frac{d - 1}{y}\,\mathbb{E} \log\left[(1 - e^{y}) \left(1 - \prod_{r=1}^{d} q_{r}\right) + e^{y}\right]. \tag{54}$$

As prescribed by the cavity method, the value $y_s$ which maximizes $\phi(y)$ over $[0, +\infty]$ yields the correct free energy; therefore we have $\phi(y_s) = -\rho_{1RSB}$.

Unsurprisingly, as they belong to different computational classes, the cases $d = 2$ and $d \ge 3$ show qualitatively different pictures. In the case of maximum matching on the Poissonian graph ensemble, numerical estimates suggest that the complexity is an increasing function of $y$ on the whole positive real axis. The correct choice for the parameter $y$ is then $y_s = +\infty$, as already conjectured in [12], and we find that the maximum matching size prediction from the 1RSB cavity method fully agrees with rigorous results from Karp and Sipser [8] and with the size of the matchings given by their algorithm (see Figure 4). The 1RSB ansatz is therefore exact for $d = 2$.

The analysis of the $d \ge 3$ case does not yield such a definite result. The complexity of states $\Sigma$ is no longer a strictly increasing function of $y$. It reaches its maximum at $y_d$, the choice of $y$ that selects the most numerous states, which could be those where local search greedy algorithms are more likely to be trapped. Then it decreases up to the finite value $y_s$, where the complexity changes sign and takes negative values. Therefore $y_s$ is the correct choice for the Parisi parameter, which maximizes $\phi(y)$. Plotted as a function of $\rho$, the complexity $\Sigma$ has a convex nonphysical part, whose extrema are the RS solution (on the right) and the point corresponding to the dynamic 1RSB solution (on the left), and a concave physically relevant part for $y \in [y_d, y_s]$ (see Figure 5). The 1RSB prediction seems to be in very good agreement with the exact algorithm, and we are inclined to believe that no further steps of replica symmetry breaking are needed in this ensemble. The GKS algorithm instead falls short of the exact value (see Figure 6); therefore it constitutes a lower bound which is not tight, but at least it could probably be made rigorous by carrying on the analysis of the GKS algorithm beyond phase 1.

Figure 5: Complexity in $\mathbb{G}_{RP}(d, c)$ with $d = 3$ and $c = 4.0$ as a function of the relative MSP size $\rho$. [Plot: horizontal axis, set packing size ($\rho$); vertical axis, complexity ($\Sigma$).]

Figure 6: MSP size in $\mathbb{G}_{RP}(3, c)$ as a function of $c$. The RS and 1RSB cavity methods are compared with the GKS algorithm and with an exact algorithm on factor graphs of 100 variable nodes. [Plot: horizontal axis, mean factor degree ($c$), with the value $e/2$ marked; vertical axis, set packing size ($\rho$); legend: RS, 1RSB, GKS, $N = 100$.]

7.2. $\mathbb{G}_{PR}(d, c)$. The ensemble $\mathbb{G}_{PR}(d, c)$ is constituted by factor graphs containing factor nodes of degree fixed to $c$ and variable nodes of Poissonian random degree of mean $d$. It has statistical properties with respect to the MSP problem quite different from those encountered in $\mathbb{G}_{RP}$, as we will readily show. The MSP problem on $\mathbb{G}_{PR}(d, c)$ with $c = 2$ is equivalent to the well known problem of maximum independent sets on Poissonian graphs [14, 32]. The real parameter $p$ characterizing discrete support distributions of messages has to satisfy the fixed point equation (24), which reads

$$p_* = \left(1 - e^{d\,(p_* - 1)}\right)^{c-1}. \tag{55}$$

At fixed value of $c$ the RS ansatz holds up to the critical value $d_s(c)$, which is implicitly given by the first derivative condition (27):

$$(c - 1)\,p_*^{(c-2)/(c-1)}\,e^{(p_* - 1)\,d_s}\,d_s = 1. \tag{56}$$

Although the critical condition, equation (56), is not as elegant as the one we obtained for the ensemble $\mathbb{G}_{RP}$, it can be easily solved numerically for $d_s$ as a function of $c$. For $c = 2$ the threshold value is exactly $d_s(2) = e$. For $c > 2$, instead, $d_s$ is an increasing function of the factor degree $c$. Thanks to (25) we readily compute the MSP size in the RS phase:

$$\rho = \frac{d}{c}\left(1 - p_*^{c/(c-1)}\right) + (1 - p_*\,d)\,e^{d\,(p_* - 1)} \qquad \text{for } d < d_s(c). \tag{57}$$
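A possible way to solve (55) and (56) numerically is sketched below. The function names and the search bracket for $d$ are assumptions of this example, and the outer bisection assumes that the stability parameter $|f'(p_*)|$ crosses 1 only once inside the bracket, which is not proven here. The code uses the identity $p_*^{(c-2)/(c-1)} = (1 - e^{d(p_* - 1)})^{c-2}$, valid at the fixed point of (55).

```python
import math


def fixed_point(d, c, iters=200):
    """Solution p* of p = (1 - exp(d (p - 1)))**(c - 1) on [0, 1], eq. (55)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # The right hand side is decreasing in p, so the root is unique.
        if (1.0 - math.exp(d * (mid - 1.0))) ** (c - 1) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def stability(d, c):
    """Left hand side of (56), that is |f'(p*)|, at mean variable degree d."""
    p = fixed_point(d, c)
    w = math.exp(d * (p - 1.0))
    return (c - 1) * (1.0 - w) ** (c - 2) * w * d


def critical_d(c, lo=0.1, hi=20.0, iters=100):
    """Bisection on d for stability(d, c) = 1 (single crossing assumed)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if stability(mid, c) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $c = 2$ the routine can be checked against the exact value $d_s(2) = e$ quoted in the text.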

We can see in Figure 7 that the MSP size $\rho(d, c)$ is a decreasing function of both arguments, as expected. Equation (57) can be taken as the RS estimate for the MSP size for $d > d_s(c)$. The RS estimate is strictly greater than the average size of SPs given by the GKS algorithm at all values of $d > d_s(c)$ (see Figure 7).

7.3. $\mathbb{G}_{PP}(d, c)$. We will now briefly examine our MSP model on the ensemble $\mathbb{G}_{PP}(d, c)$, where both factor nodes and variable nodes have Poissonian random degrees, of mean $c$ and $d$, respectively. From (23) and (24) we obtain the fixed point condition for the probability distribution of messages:

$$p_* = e^{-c\,e^{d\,(p_* - 1)}}. \tag{58}$$

As usual, $p$ is the parameter characterizing the distribution of messages $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$. Equation (58) admits one and only one fixed point solution $p_*$ for each value of $d$ and $c$. In fact, $f$ is continuous and strictly decreasing, with $f(0) > 0$ and $f(1) < 1$. The first derivative condition $|f'(p_*)| = 1$ defines the critical line $d_s(c)$ through

$$d_s(c) = -\frac{1}{p_* \log(p_*)}. \tag{59}$$
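Since $p_*$ in (59) depends on $c$ through (58), the critical line is conveniently traced parametrically in $p_*$. The sketch below relies on a rearrangement that is not in the text but follows directly from (58): taking logarithms gives $c = -\log(p_*)\,e^{d(1 - p_*)}$. Function names are hypothetical.

```python
import math


def critical_point(p_star):
    """Point (c, d) of the critical line with fixed point p_star in (0, 1).

    d comes from (59); c is obtained by inverting the fixed point
    equation (58) at that value of d.
    """
    d = -1.0 / (p_star * math.log(p_star))
    c = -math.log(p_star) * math.exp(d * (1.0 - p_star))
    return c, d


def f(p, d, c):
    """Right hand side of (58)."""
    return math.exp(-c * math.exp(d * (p - 1.0)))
```

Sweeping `p_star` over $(0, 1)$ yields the boundary between the RS and RSB regions in the $c$-$d$ plane.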

The curve $d_s(c)$ separates the RS phase from the RSB phase in the $c$-$d$ parametric space (see Figure 8). The unbounded RS region shares some resemblance with the corresponding (although $d$-discretized) region of $\mathbb{G}_{PR}(d, c)$ and is at variance with the compact area of the RS phase in $\mathbb{G}_{RP}$.

We can compute the MSP relative size $\rho$ through (25) and obtain

$$\rho = \frac{d}{c}\,(1 - p_*) + (1 - d\,p_*)\,e^{d\,(p_* - 1)} \qquad \text{for } d < d_s(c), \tag{60}$$

which holds only as an approximation in the RSB phase. We can see from Figure 8 that $\rho$ is decreasing both in $c$ and $d$, as was observed in the other ensembles as well.

Figure 7: RS cavity method analytical prediction (57) for the MSP size in $\mathbb{G}_{PR}(d, c)$ (dot-dashed lines), compared with the set packing size from phase 1 of the GKS algorithm (dashed lines) and with a complete run of the algorithm (continuous lines). [Plot: horizontal axis, mean variable degree ($d$); vertical axis, set packing size ($\rho$); curves for $c = 2, 3, 4, 5, 6$.]

Figure 8: MSP relative size for the ensemble $\mathbb{G}_{PP}(d, c)$ as given by (60). Above the boundary separating the RS and RSB phases, values are only an approximation to the exact value of $\rho$. [Plot: horizontal axis, mean factor node degree ($c$); vertical axis, mean variable node degree ($d$); colour scale from 0 to 1; legend: Isolines, Phase boundary.]

7.4. $\mathbb{G}_{RR}(d, c)$. The MSP problem on the ensemble $\mathbb{G}_{RR}(d, c)$, the straightforward generalization to factor graphs of the random regular graph ensemble, allows some simplification of the cavity formalism thanks to its homogeneity. It has already been the object of a preliminary study by Weigt and Hartmann [16] and then of a much deeper work by Hansen-Goos and Weigt [17], who cast it as a hard spheres model on a generalized Bethe lattice. The authors studied this hard spheres model on $\mathbb{G}_{RR}(d, c)$ through the cavity method, both at finite chemical potential $\mu$ and in the close packing limit, and found that the 1RSB solution is unstable in the close packing limit, therefore suggesting the need for a full RSB treatment of the problem.

8. Conclusions

We studied the average asymptotic behaviour of random instances of the maximum set packing problem, both from a mathematical and a physical viewpoint. We contributed to the known list of models where the replica symmetric cavity method can be proven to give exact results, thanks to the generalization of an algorithm (and of its analysis) first proposed by Karp and Sipser [8]. Moreover, our analysis addresses a problem reported in recent work on weighted maximum matchings and independent sets on random graphs [22], where the authors could not extend their results to the unweighted cases. We achieve here the desired result making use of the grand canonical potential instead of the direct computation of single variable expectations. We also extend their condition for the system to be in what physicists call a replica symmetric phase, namely, the uniqueness of the fixed point of the square of a certain operator (which is the analogue of the one defined in (42)), to the more general setting of maximum set packing (although without weights). On some problem ensembles, where the assumptions of Theorems 2 and 3 no longer hold and the RS cavity method fails, we used the 1RSB cavity method machinery to obtain an analytical estimation of the MSP size. Numerical simulations show very good agreement of the 1RSB estimation with the exact values, although comparisons have been done only with small random problems, due to the exact algorithm being of exponential time complexity. The GKS algorithm instead generally fails to find any MSPs, except for some special instances of the problem.

Some questions remain open for further investigation. To validate the 1RSB approach, the stability of the 1RSB solution has to be checked against more steps of replica symmetry breaking. Moreover, a thorough analysis of the second phase of the GKS algorithm could shed some light on the mechanism of replica symmetry breaking and give a rigorous lower bound to the average maximum packing size.

Regarding the problem of looking for optimal solutions in single samples, we believe that an efficient heuristic algorithm, able to obtain near-to-optimal configurations also in the RSB phase, could be constructed following the ideas of [33].

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research has received financial support from the European Research Council (ERC) through Grant agreement no. 247328 and from the Italian Research Minister through the FIRB project no. RBFR086NN1.

References

[1] S. Micali and V. V. Vazirani, "An $O(\sqrt{|V|}\,|E|)$ algorithm for finding maximum matching in general graphs," in Proceedings of the 21st Annual IEEE Symposium on Foundations of Computer Science, pp. 17–27, 1980.

[2] N. Alon, J.-H. Kim, and J. Spencer, "Nearly perfect matchings in regular simple hypergraphs," Israel Journal of Mathematics, vol. 100, pp. 171–187, 1997.

[3] Z. Füredi, J. Kahn, and P. D. Seymour, "On the fractional matching polytope of a hypergraph," Combinatorica, vol. 13, no. 2, pp. 167–180, 1993.

[4] Y. H. Chan and L. C. Lau, "On linear and semidefinite programming relaxations for hypergraph matching," Mathematical Programming, vol. 135, pp. 123–148, 2011.

[5] V. Rödl and L. Thoma, "Asymptotic packing and the random greedy algorithm," Random Structures & Algorithms, vol. 8, no. 3, pp. 161–177, 1996.

[6] M. M. Halldórsson, "Approximations of weighted independent set and hereditary subset problems," Journal of Graph Algorithms and Applications, vol. 4, no. 1, pp. 1–6, 2000.

[7] E. Hazan, S. Safra, and O. Schwartz, "On the complexity of approximating K-set packing," Computational Complexity, vol. 15, no. 1, pp. 20–39, 2006.

[8] R. M. Karp and M. Sipser, "Maximum matchings in sparse random graphs," in Proceedings of the 22nd Annual Symposium on Foundations of Computer Science, vol. 12, pp. 364–375, IEEE Computer Society, 1981.

[9] N. C. Wormald, "The differential equation method for random graph processes and greedy algorithms," in Lectures on Approximation and Randomized Algorithms, pp. 73–155, 1999.

[10] G. Parisi and M. Mézard, "Replicas and optimization," Journal de Physique Lettres, vol. 46, no. 17, pp. 771–778, 1985.

[11] G. Parisi and M. Mézard, "A replica analysis of the travelling salesman problem," Journal de Physique, vol. 47, pp. 1285–1296, 1986.

[12] H. Zhou and Z.-C. Ou-Yang, "Maximum matching on random graphs," In press, http://arxiv.org/abs/cond-mat/0309348.

[13] L. Zdeborová and M. Mézard, "The number of matchings in random graphs," Journal of Statistical Mechanics, vol. 2006, Article ID P05003, 2006.

[14] L. Dall'Asta, A. Ramezanpour, and P. Pin, "Statistical mechanics of maximal independent sets," Physical Review E, vol. 80, Article ID 061136, 2009.

[15] M. Mézard and M. Tarzia, "Statistical mechanics of the hitting set problem," Physical Review E, vol. 76, no. 4, Article ID 041124, 10 pages, 2007.

[16] M. Weigt and A. K. Hartmann, "Glassy behavior induced by geometrical frustration in a hard-core lattice-gas model," Europhysics Letters (EPL), vol. 62, Article ID 021005, p. 533, 2003.

[17] H. Hansen-Goos and M. Weigt, "A hard-sphere model on generalized Bethe lattices: statics," Journal of Statistical Mechanics, vol. 2005, Article ID P04006, 2005.

[18] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, "Factor graphs and the sum-product algorithm," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.

[19] A. Montanari and M. Mézard, Information, Physics, and Computation, Oxford University Press, 2009.

[20] G. Parisi, M. Mézard, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, Singapore, 1987.

[21] M. Mézard and G. Parisi, "The cavity method at zero temperature," Journal of Statistical Physics, vol. 22, no. 1-2, pp. 1–34, 2003.

[22] D. Gamarnik, T. Nowicki, and G. Swirszcz, "Maximum weight independent sets and matchings in sparse random graphs. Exact results using the local weak convergence method," Random Structures and Algorithms, vol. 28, no. 1, pp. 76–106, 2006.

[23] A. Montanari, G. Parisi, and F. Ricci-Tersenghi, "Instability of one-step replica-symmetry-broken phase in satisfiability problems," Journal of Physics A, vol. 37, no. 6, p. 2073, 2004.

[24] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, "Gibbs states and the set of solutions of random constraint satisfaction problems," Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 25, pp. 10318–10323, 2007.

[25] M. Mézard, F. Ricci-Tersenghi, and R. Zecchina, "Two solutions to diluted p-spin models and XORSAT problems," Journal of Statistical Physics, vol. 111, no. 3-4, pp. 505–533, 2002.

[26] G. Semerjian, "On the freezing of variables in random constraint satisfaction problems," Journal of Statistical Physics, vol. 130, no. 2, pp. 251–293, 2008.

[27] M. Mézard and G. Parisi, "The Bethe lattice spin glass revisited," The European Physical Journal B, vol. 20, no. 2, pp. 217–233, 2001.

[28] S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, "A new algorithm for generating all the maximal independent sets," SIAM Journal on Computing, vol. 6, no. 3, pp. 505–517, 1977.

[29] G. Csárdi and T. Nepusz, "The igraph software package for complex network research," InterJournal Complex Systems, p. 1695, 2006.

[30] N. C. Wormald, "Models of random regular graphs," in Surveys in Combinatorics, London Mathematical Society Lecture Note, pp. 239–298, 1999.

[31] S. Takabe and K. Hukushima, "Minimum vertex cover problems on random hypergraphs: replica symmetric solution and a leaf removal algorithm," In press, http://arxiv.org/abs/1301.5769.

[32] S. Sanghavi, D. Shah, and A. S. Willsky, "Message passing for maximum weight independent set," IEEE Transactions on Information Theory, vol. 55, no. 11, pp. 4822–4834, 2009.

[33] M. Weigt and H. Zhou, "Message passing for vertex covers," Physical Review E, vol. 74, Article ID 046110, 19 pages, 2006.


From the stable fixed point of (33) we can calculate the 1RSB free energy functional $\phi(y)$ as

$$-y\,\phi(y) = \frac{d}{c}\,\mathbb{E} \log\left[(1 - e^{y}) \prod_{j=1}^{C_0} \left(1 - \prod_{s=1}^{D_j} q_{sj}\right) + e^{y}\right] + \mathbb{E}\,(1 - D_0) \log\left[(1 - e^{y}) \left(1 - \prod_{r=1}^{D_0} q_{r}\right) + e^{y}\right] \tag{36}$$

and the 1RSB density $\rho_{1RSB}(y) = -\partial\,(y\,\phi(y))/\partial y$ as

$$\rho_{1RSB}(y) = \frac{d}{c}\,\mathbb{E}\,\frac{e^{y}\left(1 - \prod_{j=1}^{C_0}\left(1 - \prod_{s=1}^{D_j} q_{sj}\right)\right)}{(1 - e^{y}) \prod_{j=1}^{C_0}\left(1 - \prod_{s=1}^{D_j} q_{sj}\right) + e^{y}} + \mathbb{E}\,(1 - D_0)\,\frac{e^{y} \prod_{r=1}^{D_0} q_{r}}{(1 - e^{y})\left(1 - \prod_{r=1}^{D_0} q_{r}\right) + e^{y}}, \tag{37}$$

with expectations intended over $\mathbb{G}$ and over the fixed point messages $q_{r}$ and $q_{sj}$. Since the free energy functional $\phi(y)$ and the complexity $\Sigma(\rho)$ are related by the Legendre transform

$$\Sigma(\rho) = -y\,\rho - y\,\phi(y), \tag{38}$$

with $\partial \Sigma / \partial \rho = -y$, through (36) we can compute the complexity by taking the inverse transform. Equilibrium states, that is, MSPs, are selected by

$$\rho_{1RSB} = \arg\max_{\rho}\,\{\rho : \Sigma(\rho) \ge 0\} \tag{39}$$

or, equivalently, by taking the 1RSB parameter $y$ to be

$$y_s = \arg\max_{y \in [0, +\infty]} \phi(y). \tag{40}$$
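The equivalence between the two selection rules can be made explicit with a short computation. The following worked step uses only the Legendre relation (38) and the definition $\rho_{1RSB}(y) = -\partial\,(y\,\phi(y))/\partial y$; it is a sketch of the standard argument rather than a statement taken from the derivation of (33).

```latex
\begin{align}
\rho(y) &= -\frac{\partial\,(y\,\phi(y))}{\partial y}
         = -\phi(y) - y\,\phi'(y), \\
\Sigma  &= -y\,\rho(y) - y\,\phi(y) = y^{2}\,\phi'(y)
  \;\Longrightarrow\;
  \phi'(y) = \frac{\Sigma}{y^{2}}.
\end{align}
```

Hence $\phi(y)$ increases as long as the complexity is positive and is stationary where $\Sigma = 0$, so that maximizing $\phi$ selects the states with vanishing complexity. If $\Sigma$ stays positive for all finite $y$, the maximum is attained at $y = +\infty$.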

In the static 1RSB phase we expect $\Sigma(\rho_s) = 0$, so that from (38) we have $\phi(y_s) = -\rho_s$. We will see that this is generally true, except for the ensemble $\mathbb{G}_{RP}(2, c)$ of Section 7, corresponding to maximum matchings on ordinary graphs, where the equilibrium states have maximal complexity and $y_s = +\infty$. The relation

$$\rho_{1RSB} = -\phi(y_s) \tag{41}$$

is always valid, though, since for $y_s \uparrow \infty$ the complexity stays finite.

5. A Heuristic Algorithm and Exact Results

In this section we propose a heuristic greedy algorithm to address the MSP problem. It is a natural generalization of the algorithm that Karp and Sipser proposed to solve the maximum matching problem on Erdős-Rényi random graphs [8]; therefore we will call it generalized Karp-Sipser (GKS). Extending their derivation concerning the leaf removal part of the algorithm, we are able to prove that the RS prediction for the MSP density is exact as long as the stability criterion (27) is satisfied. We will not give the proofs of the following theorems, as they are lengthy but effortless extensions of those given in [8]. In order to find the maximum matching on a graph, Karp and Sipser noticed that, as long as the graph contains a node of degree one (a leaf), its unique edge has to belong to one of the maximum matchings.

They considered the simplest randomized algorithm one can imagine: as long as there is any leaf, remove it from the graph; otherwise remove a random edge; then iterate until the graph is depleted. They studied the behaviour of this leaf-removal algorithm on random graphs and were able to prove that it grants whp a maximum matching (within an $o(n)$ error).

To generalize some of their results we need to extend the definition of leaf to that of pendant. We call pendant a variable node whose factor neighbours all have degree one, except for one at most. Stating the same concept in different words, all of the neighbours of a pendant have the pendant itself as their sole neighbour, except for one of them at most. See Figure 1 for a pictorial representation of a pendant (in red). The GKS algorithm is articulated in two phases, a pendant removal phase and a random occupation phase. We give the pseudocode for the generalized Karp-Sipser algorithm (see Algorithm 1).

At each step the algorithm prioritizes the removal of pendants over that of random variable nodes. We notice that the removal of a pendant is always an optimal choice in order to achieve an MSP; we have no guarantees, though, on the effect of the occupation of a random node. We call phase 1 the execution of the algorithm up to the point where the first nonpendant variable is added to $V'$. It is trivial to show that phase 1 is enough to find an MSP on a tree factor graph. The interesting thing, though, is that phase 1 is also able to deplete nontree factor graphs and find an MSP, as long as the factor graphs are sufficiently sparse and large enough.

We call core the subset of the variable nodes which has not been assigned to the MSP in phase 1. We will show how the emergence of a core is directly related to replica symmetry breaking. Let us define, as usual,

$$f(x) \equiv \mathbb{E}_{C}\left(1 - \mathbb{E}_{D}\,x^{D}\right)^{C}. \tag{42}$$

The function $f$ is continuous and nonincreasing, and it satisfies the relation $1 \ge f(0) \ge f(1) = \mathbb{P}[C = 0]$. It turns out that, in the large graph limit, phase 1 of GKS is characterized by the solutions of the system of equations

$$p = f(r), \qquad r = f(p). \tag{43}$$

We notice that system (43) is equivalent to the 1RSB equation (35) once the substitutions $p \to p_0$ and $r \to 1 - p_1$ are made.

Lemma 1. The system of (43) always admits a (unique) solution $p_* = r_*$.

Proof. By monotonicity and continuity, the function $g(p) = f(p) - p$ has a single zero in the interval $[0, 1]$.


Require: a factor graph $G = (V, F, E)$
Ensure: a set packing $V'$
    $V' = \emptyset$
    add to $V'$ all isolated variable nodes and remove them from $G$
    remove from $G$ any isolated factor node
    while $V$ is not empty do
        if $G$ has any pendant then
            choose a pendant $i$ uniformly at random
            add $i$ to $V'$
            remove $i$ from $G$, then remove its factor neighbours and their variable neighbours
        else
            pick uniformly at random a variable node and add it to $V'$
            remove it from $G$, then remove its factor neighbours and their variable neighbours
        end if
        remove from $G$ any isolated factor node
    end while
    return $V'$

Algorithm 1: Generalized Karp-Sipser (GKS).
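A direct, unoptimized transcription of Algorithm 1 may help to fix ideas. In this sketch the factor graph is a hypothetical dictionary `var_to_factors` from variable nodes to their factor neighbours, and the function name `gks` and its return convention are assumptions of the example. Isolated variable nodes need no special treatment here, because a variable without factor neighbours satisfies the pendant definition and is therefore picked up by the pendant branch. The second returned value is the packing size reached when the first nonpendant variable is chosen, that is, at the end of phase 1.

```python
import random


def gks(var_to_factors, seed=0):
    """Generalized Karp-Sipser heuristic; returns (packing, phase1_size)."""
    rng = random.Random(seed)
    var_nbrs = {v: set(fs) for v, fs in var_to_factors.items()}
    fac_nbrs = {}
    for v, fs in var_nbrs.items():
        for a in fs:
            fac_nbrs.setdefault(a, set()).add(v)

    def is_pendant(v):
        # At most one factor neighbour of v has degree larger than one.
        return sum(len(fac_nbrs[a]) > 1 for a in var_nbrs[v]) <= 1

    def occupy(v):
        # Remove v, its factor neighbours and their variable neighbours.
        doomed = {v}
        for a in var_nbrs[v]:
            doomed |= fac_nbrs[a]
        for u in doomed:
            for a in var_nbrs.pop(u):
                fac_nbrs[a].discard(u)
        # Factors left without variables (including those of v) are dropped.
        for a in [a for a, vs in fac_nbrs.items() if not vs]:
            del fac_nbrs[a]

    packing, phase1_size = [], None
    while var_nbrs:
        pendants = [v for v in var_nbrs if is_pendant(v)]
        if pendants:
            v = rng.choice(pendants)
        else:
            if phase1_size is None:
                phase1_size = len(packing)  # phase 1 ends here
            v = rng.choice(list(var_nbrs))
        packing.append(v)
        occupy(v)
    if phase1_size is None:
        phase1_size = len(packing)  # graph depleted within phase 1
    return packing, phase1_size
```

The pendant list is recomputed from scratch at every step, which keeps the code short at the price of quadratic running time; an implementation meant for large graphs would update pendant status incrementally.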

Figure 1: (left) A representation of a factor graph containing a pendant, depicted in red. (right) The factor graph after the removal of the pendant, as occurs in the inner part of the while loop in Algorithm 1.

If other solutions are present, the relevant one is the one with smallest $p$, as the following theorems certify.

Theorem 2. Let $(p, r)$ be the solution with smallest $p$ of (43). Then the density of factor nodes surviving phase 1 in the large graphs limit is given by

$$\psi_1 = 1 + p - r + c\,p\,\mathbb{E}_{D}\left[p^{D} - r^{D}\right], \tag{44}$$

with

$$r = \mathbb{E}_{C_0}\left(1 - \mathbb{E}_{D}\,p^{D}\right)^{C_0}, \qquad p = \mathbb{E}_{C_0}\left(1 - \mathbb{E}_{D}\,r^{D}\right)^{C_0}. \tag{45}$$

In particular, if the smallest solution is $p_* = r_*$, the graph is depleted with high probability in phase 1.

As already said, we will not give the proof of this and of the following theorem, as they are lengthy and can be obtained from the derivation given in Karp and Sipser's article [8] with little effort, even if not in a completely trivial way. Theorem 2 affirms that, as soon as (43) develops another solution, a core survives phase 1. This phenomenon coincides with the need for replica symmetry breaking in the cavity formalism. Let us now establish the exactness of the cavity prediction for $\rho$ in the RS phase.

Theorem 3. Let $(p, r)$ be the solution with smallest $p$ of (43). Then the density of variables assigned in phase 1 in the large graphs limit is

$$\rho_1 = \frac{d}{c}\,(1 - r) + \mathbb{E}_{D_0}\left[(1 - D_0)\,p^{D_0}\right], \tag{46}$$

where $p$ has been defined in the previous theorem. In particular, if the smallest solution is $p_* = r_*$, the replica symmetric cavity method prediction (25) is exact.

These two theorems imply that the RS prediction (25) for $\rho$ holds true as long as phase 1 manages to deplete w.h.p. the whole factor graph. A similar behaviour has been observed in other combinatorial optimization problems, for example, random XORSAT [25]. Conversely, it is easy to prove, given the equivalence between (43) and (35), that if a solution with $p \neq p_*$ exists, the RS fixed point is unstable in the 1RSB distributional space. Therefore, the system is in the RS phase if and only if (35) admits a unique solution.

We notice that we could have also looked for the solutions of the single equation $p = f^2(p)$ instead of the two of (43), since the value of $r$ is uniquely determined by the value of $p$. The previous theorems then state that the RS results hold as long as $f^2$ has a single fixed point. In [22] it has been proven that the RS solution of the weighted maximum independent set and maximum weighted matching problems holds true as long as the corresponding squared cavity operator $f^2$ has a unique fixed point. Since those are special cases of the weighted MSP problem, we conjecture that the $f^2$ condition holds for the general case too, as it does for the unweighted case. It would probably be possible to further generalize the Karp-Sipser algorithm to cover analytically the weighted case as well.
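The uniqueness condition on $f^2$ is easy to probe numerically. The sketch below is an illustration under stated assumptions, not part of the paper: it takes the concrete map $f(p) = e^{-c p^{d-1}}$ that appears later in (48) for the ensemble with fixed variable degree $d$ and Poissonian factor degree of mean $c$, and all function names are hypothetical. Since $f$ is decreasing, $f^2$ is increasing, so iterating $f^2$ from $p = 0$ converges monotonically to its smallest fixed point, which can then be compared with the unique fixed point of $f$.

```python
import math


def f(p, d, c):
    """Cavity map of (48): f(p) = exp(-c * p**(d - 1))."""
    return math.exp(-c * p ** (d - 1))


def smallest_fixed_point_f2(d, c, tol=1e-12, max_iter=1_000_000):
    """Iterate f(f(.)) from p = 0; the sequence increases to the smallest root."""
    p = 0.0
    for _ in range(max_iter):
        p_new = f(f(p, d, c), d, c)
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p


def rs_fixed_point(d, c, tol=1e-14):
    """Unique root of p = f(p) on [0, 1], found by bisection (f is decreasing)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid < f(mid, d, c):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def f2_has_extra_fixed_point(d, c, gap=1e-6):
    """True when f^2 has a fixed point strictly below the fixed point of f."""
    return rs_fixed_point(d, c) - smallest_fixed_point_f2(d, c) > gap
```

For example, `f2_has_extra_fixed_point(3, 1.0)` is false while `f2_has_extra_fixed_point(3, 3.0)` is true, in line with the threshold $e/(d-1)$ derived for this ensemble in Section 7.1. Close to the threshold the plain iteration converges slowly and the `gap` tolerance becomes the limiting factor.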

The scenario arising from the analysis of the algorithm and from the 1RSB cavity method is the following: in the RS phase, maximum set packings form a single cluster, and it is possible to connect any two of them with a path involving MSPs separated by a rearrangement of a finite number of variables [26]. In the RSB phase, instead, MSPs can be grouped into many clusters, each connected in the sense just mentioned and each defined by the assignment of the variables in the core. Two MSPs that differ on the core are always separated by a global rearrangement of the variables (i.e., of $O(N)$ variables). In the presence of a core, the GKS algorithm makes some suboptimal random choices (after phase 1).

We did not make an analytical study of the random variable removal part of the GKS algorithm. This has been done by Wormald for the matching problem, associating to the graph process a set of differential equations amenable to analysis [9] (something similar has been done for random XORSAT as well [25]). In that case, it turns out that the algorithm still achieves optimality and yields an almost perfect matching on the core. An appropriate analysis of the second phase of the GKS algorithm, and of the reason for its failure on most of the SP ensembles we covered, deserves further study.

6. Numerical Simulations

While the RS cavity equations (24) and (25) can be easily computed, the numerical solution of the 1RSB density evolution (33) is much more involved (although simplified by the zero temperature limit) and has been obtained with a standard population dynamics algorithm [27].

The implementation of the GKS algorithm is straightforward. While it is very fast during phase 1, we noticed a huge slowing down in the random removal part. We were able to find set packings for factor graphs with hundreds of thousands of nodes.

In order to test the cavity and GKS predictions, we also computed the exact MSP size on factor graphs of small size. First, we notice that a set packing problem coded in a factor graph structure $F$ is equivalent to an independent set problem on an appropriate ordinary graph $G$. The node set of $G$ is the variable node set of $F$, and we add an edge to $G$ between each pair of nodes having a common factor node neighbour in $F$. Therefore, each neighbourhood of a factor node in $F$ forms a clique in $G$. We then solve the maximum independent set problem for $G$ using an exact algorithm [28] implemented in the igraph library [29]. Since the time complexity is exponential in the size of the graph, we performed our simulations on graphs containing at most one hundred nodes.
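The reduction described above can be illustrated with a small self-contained sketch. It is not the igraph routine used for the simulations: the function names are hypothetical, and the exact solver is a plain branching recursion chosen only for brevity, which is practical for small graphs only.

```python
from itertools import combinations


def conflict_graph(sets):
    """Adjacency of G: sets i and j are linked iff they share an element."""
    members = {}
    for i, s in enumerate(sets):
        for a in s:
            members.setdefault(a, []).append(i)
    adj = {i: set() for i in range(len(sets))}
    for group in members.values():  # each factor neighbourhood is a clique
        for i, j in combinations(group, 2):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def max_independent_set_size(adj):
    """Exact maximum independent set size by branching on a max-degree node."""
    if not adj:
        return 0
    v = max(adj, key=lambda u: len(adj[u]))
    if not adj[v]:
        return len(adj)  # no edges left: every remaining node can be taken

    def without(nodes):
        return {u: adj[u] - nodes for u in adj if u not in nodes}

    # Either v is left out, or v is taken and its neighbours are discarded.
    return max(max_independent_set_size(without({v})),
               1 + max_independent_set_size(without({v} | adj[v])))
```

With `sets = [{1, 2}, {2, 3}, {3, 4}]` the conflict graph is a path on three nodes and the MSP size is 2.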

7. Applications to Problem Ensembles

We will now apply the methods developed in the previous sections to some factor graph ensembles, each modelling a class of MSP problem instances. We consider graph ensembles containing nodes with Poissonian random degree or regular degree: $\mathbb{G}_{PR}(N, d, c)$, $\mathbb{G}_{RP}(N, d, c)$, $\mathbb{G}_{PP}(N, d, c)$, and $\mathbb{G}_{RR}(N, d, c)$. The subscript $R$ or $P$ indicates whether the type of nodes to which it refers (variable nodes for the first subscript, factor nodes for the second) has regular or Poissonian random degree, respectively. We parametrize these ensembles by their average variable and factor degrees, that is, $d = \mathbb{E}_{D_0} D_0$ and $c = \mathbb{E}_{C_0} C_0$. They are constituted by factor graphs having $N$ variable nodes and $M = \lfloor N d / c \rfloor$ factor nodes, but they differ both in their elements and in their probability law. We define the ensembles by giving the probability of sampling one of their elements.

(i) $\mathbb{G}_{RP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. Every variable node has fixed degree $d$. $G$ is obtained by linking each variable with $d$ factors chosen uniformly and independently at random. The factor node degree distribution obtained is Poissonian of mean $c$ with high probability. This ensemble is a model for the $k$-set packing and will be the main focus of our attention.

(ii) $\mathbb{G}_{PR}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. Every factor node has fixed degree $c$. $G$ is built by linking each factor with $c$ variables chosen uniformly and independently at random. The variable node degree distribution obtained is Poissonian of mean $d$ with high probability.

(iii) $\mathbb{G}_{PP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. $G$ is built by adding an edge $(i, r)$ with probability $c/N$, independently for each choice of a variable $i$ and a factor $r$. The factor graph obtained has w.h.p. a Poissonian variable node degree distribution of mean $d$ and a Poissonian factor node degree distribution of mean $c$.

(iv) $\mathbb{G}_{RR}(N, d, c)$: it is constituted by all factor graphs with $N$ variable nodes of degree $d$ and $M = N d / c$ factor nodes of degree $c$ ($N d$ has to be a multiple of $c$). Every factor graph of the ensemble is equiprobable and can be sampled using a generalization of the configuration model for random regular graphs [30]. This ensemble is a model for the $k$-set packing.

We will omit the argument $N$ when we refer to an ensemble in the $N \uparrow \infty$ limit.
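As an illustration of how such ensembles can be sampled, the following sketch draws an instance of the first ensemble, with fixed variable degree $d$ and $M = \lfloor N d / c \rfloor$ factor nodes. The function name and the output format (a list of sets of factor indices, one per variable node) are assumptions made here. The sketch also assumes that the $d$ factors attached to a variable are distinct, a detail the definition above leaves open.

```python
import math
import random


def sample_g_rp(n, d, c, seed=None):
    """Sample a factor graph with n variable nodes of fixed degree d.

    Returns (m, sets): m = floor(n * d / c) factor nodes, and sets[i] is the
    set of factor indices linked to variable i. Factor degrees are then
    approximately Poissonian with mean n * d / m, close to c.
    """
    rng = random.Random(seed)
    m = math.floor(n * d / c)
    if m < d:
        raise ValueError("need at least d factor nodes")
    # ASSUMPTION: the d neighbours of a variable are drawn without repetition.
    sets = [set(rng.sample(range(m), d)) for _ in range(n)]
    return m, sets
```

The other Poissonian ensembles can be sampled along the same lines by exchanging the roles of variables and factors, or by drawing each edge independently with probability $c/N$.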

7.1. $\mathbb{G}_{RP}(d, c)$. This is the ensemble with variable node degrees fixed to $d$ and Poissonian factor node degrees, that is, $C \sim C_0 \sim \mathrm{Poisson}(c)$. The case $d = 2$ corresponds to the maximum matching problem on Erdős-Rényi random graphs [8, 12].

Density evolution (22) for $\mathbb{G}_{RP}$ reduces to

$$T' \overset{d}{=} \max\left[\{0\} \cup \left\{1 - \sum_{s=1}^{d-1} T_{sj}\right\}_{j \in \{1, \dots, C\}}\right]. \tag{47}$$

Considering distributions of the form $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$, the fixed points of (47) read

$$p_* = e^{-c\,p_*^{d-1}} \equiv f(p_*). \tag{48}$$

The last equation admits only one solution for each value of $d$ and $c$, as is easily seen through a monotonicity argument on the left- and right-hand sides of the equation. The values of $c$, as a function of $d$, satisfying $|f'(p_*)| = 1$ are the critical points $c_s(d)$ delimiting the RS phase ($c < c_s(d)$) and are given by

$$c_s(d) = \frac{e}{d - 1}. \tag{49}$$
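For completeness, the short computation leading from the condition $|f'(p_*)| = 1$ to (49) can be written out explicitly, using only the map $f$ of (48):

```latex
\begin{align*}
f(p) &= e^{-c\,p^{d-1}}, \qquad f'(p) = -c\,(d-1)\,p^{d-2}\,f(p), \\
|f'(p_*)| &= c\,(d-1)\,p_*^{d-1} \qquad \text{(using } f(p_*) = p_* \text{)}, \\
|f'(p_*)| = 1 \;&\Longrightarrow\; c\,p_*^{d-1} = \frac{1}{d-1}
  \;\Longrightarrow\; p_* = e^{-c\,p_*^{d-1}} = e^{-1/(d-1)}, \\
c_s(d) &= \frac{1}{(d-1)\,p_*^{d-1}} = \frac{e}{d-1}.
\end{align*}
```

The last step uses $p_*^{d-1} = e^{-1}$, which follows from the expression of $p_*$ in the line above.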

For $c > e/(d - 1)$ a core survives phase 1 of the GKS algorithm, as stated by Theorem 2 and shown in Figure 2. We notice that the same threshold value $e/(d - 1)$ is found in the minimum set covering problem on the dual factor graph ensemble [31], where the application of a leaf removal algorithm, that is, phase 1 of GKS, unveils an analogous core emergence phenomenon. In the RS phase the MSP density is equal to the minimum set covering density, but in the RSB phase we cannot compare the cavity predictions for these two dual problems, since in [31] the 1RSB solution to the minimum set covering is not provided.

In the matching case, that is, $d = 2$, (49) expresses the notorious $e$-phenomenon discovered by Karp and Sipser, while for higher values of $d$ it provides an extension of the critical threshold.

We can recover the same critical condition (49) through the bug propagation method, as the only nonzero elements of the transfer matrix $P(b_0 \to b_1 \mid a_0 \to a_1)$ are

$$P(0 \to 1 \mid 1 \to 0) = p^{d-1}, \qquad P(1 \to 0 \mid 0 \to 1) = p^{d-1}, \tag{50}$$

which give $\lambda_M = p^{d-1}$. The average branching factor is $\overline{C}\,\overline{D} = c(d - 1)$, so that (30) and (48) yield (49). The analytical value for the relative size of MSPs, that is, the particle density $\rho$, is

$$\rho(d, c) = \frac{d}{c}\,(1 - p_*) + (1 - d)\,p_*^{d} \quad \text{for } c < c_s(d). \tag{51}$$
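Equations (48), (49), and (51) are straightforward to evaluate numerically. The following sketch (function names are hypothetical) finds $p_*$ by bisection, which is safe because $p - f(p)$ is increasing on $[0, 1]$, and then evaluates the RS density. Above $c_s(d)$ the returned value is only the RS estimate, not the exact MSP size.

```python
import math


def rs_fixed_point(d, c, tol=1e-14):
    """Unique root of p = exp(-c * p**(d - 1)) on [0, 1], equation (48)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid < math.exp(-c * mid ** (d - 1)):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def critical_c(d):
    """Critical mean factor degree of equation (49)."""
    return math.e / (d - 1)


def rs_density(d, c):
    """RS prediction (51) for the relative MSP size."""
    p = rs_fixed_point(d, c)
    return (d / c) * (1.0 - p) + (1 - d) * p ** d
```

For instance, `rs_density(2, 1.0)` evaluates to about 0.544, and the density tends to 1 as $c \to 0$, when almost all sets are disjoint.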

In Figure 3 we compare $\rho$ from (51), as a function of $c$ for some values of $d$, with an exact algorithm applied to finite factor graphs (as explained in Section 6), both above and below $c_s$. Clearly, for $c > c_s$ the RS approximation becomes increasingly inaccurate.

We continue our analysis of $\mathbb{G}_{RP}$ above the critical value $c_s$ through the 1RSB cavity method, as outlined in Section 4.2.

Fixed point messages of (33) are distributed as

$$P(q) = p_0\,\delta(q) + p_1\,\delta(q - 1) + p_2\,P_2(q), \tag{52}$$

where

$$p_1 = e^{-c\,(1 - p_0)^{d-1}}, \qquad p_0 = 1 - e^{-c\,p_1^{d-1}}, \qquad p_2 = 1 - p_0 - p_1, \tag{53}$$

and $P_2(q)$ has to be determined through (33). Equation (53) always admits an RS solution $p_1 = 1 - p_0 = p_*$, which is stable up to $c_s$, as already noticed. Above $c_s$ a new stable fixed point with $p_2 > 0$ continuously arises, and we study it numerically with a population dynamics algorithm.

Figure 2: Results from the GKS algorithm applied to $\mathbb{G}_{RP}(3, c)$, plotted against the mean factor degree $c$. The set packing size after phase 1 (blue line) and after the algorithm stops (red line) are reported. The vertical line is $c_s = e/2$ and is drawn as a visual aid to determine the point where the RS assumptions stop holding true. Above $c_s$ a nonempty core emerges continuously (green line).

Figure 3: RS cavity method analytical prediction (51) for the MSP size $\rho$ in $\mathbb{G}_{RP}(d, c)$ as a function of the mean factor degree $c$, for $d = 2, 3, 4, 5$ (lines), compared with the average size of MSPs obtained from 1000 samples of factor graphs with 100 variable nodes (dots). Dashed parts of the lines are the RS estimates in the RSB phase, that is, for $c > e/(d - 1)$, where the RS prediction is no longer exact.

Figure 4: Maximum set packing size $\rho$ for $\mathbb{G}_{RP}(d, c)$ as a function of the mean factor degree $c$ for $d = 2$. The blue line corresponds to the RS analytical value (51), the red line is from the Karp-Sipser algorithm, and the dotted black line is the 1RSB solution (54) with $y \gg 1$. The Karp-Sipser algorithm and the 1RSB cavity method yield the same, exact result.

The 1RSB free energy as a function of the Parisi parameter $y$ takes the form

$$\phi(y) = -\frac{d}{y\,c}\,\mathbb{E} \log\left[(1 - e^{y}) \prod_{j=1}^{C}\left(1 - \prod_{s=1}^{d-1} q_{sj}\right) + e^{y}\right] + \frac{d - 1}{y}\,\mathbb{E} \log\left[(1 - e^{y})\left(1 - \prod_{r=1}^{d} q_r\right) + e^{y}\right]. \tag{54}$$

As prescribed by the cavity method, the value $y_s$ which maximizes $\phi(y)$ over $[0, +\infty]$ yields the correct free energy; therefore we have $\phi(y_s) = -\rho_{\mathrm{1RSB}}$.

Unsurprisingly, as they belong to different computational classes, the cases $d = 2$ and $d \ge 3$ show qualitatively different pictures. In the case of maximum matching on the Poissonian graph ensemble, numerical estimates suggest that the complexity is an increasing function of $y$ on the whole real positive axis. The correct choice for the parameter $y$ is then $y_s = +\infty$, as already conjectured in [12], and we find that the maximum matching size prediction from the 1RSB cavity method fully agrees with rigorous results from Karp and Sipser [8] and with the size of the matchings given by their algorithm (see Figure 4). The 1RSB ansatz is therefore exact for $d = 2$.

The analysis of the $d \ge 3$ case does not yield such a definite result. The complexity of states $\Sigma$ is no longer a strictly increasing function of $y$. It reaches its maximum at $y_d$, the choice of $y$ that selects the most numerous states, which could be those where local search greedy algorithms are more likely to be trapped. Then it decreases up to the finite value $y_s$, where the complexity changes sign and takes negative values. Therefore $y_s$ is the correct choice for the Parisi parameter, which maximizes $\phi(y)$. Plotted as a function of $\rho$, the complexity $\Sigma$ has a convex nonphysical part, with extrema at the RS solution (on the right) and at the point corresponding to the dynamic 1RSB solution (on the left), and a concave physically relevant part for $y \in [y_d, y_s]$ (see Figure 5). The 1RSB prediction seems to be in very good agreement with the exact algorithm, and we are inclined to believe that no further steps of replica symmetry breaking are needed in this ensemble. The GKS algorithm, instead, falls short of the exact value (see Figure 6); it therefore provides a lower bound which is not tight, but which could probably be made rigorous by carrying the analysis of the GKS algorithm beyond phase 1.

Figure 5: Complexity $\Sigma$ in $\mathbb{G}_{RP}(d, c)$ with $d = 3$ and $c = 4.0$ as a function of the relative MSP size $\rho$.

Figure 6: MSP size $\rho$ in $\mathbb{G}_{RP}(3, c)$ as a function of the mean factor degree $c$. The RS and 1RSB cavity methods are compared with the GKS algorithm and with an exact algorithm on factor graphs of $N = 100$ variable nodes. The value $c = e/2$ is marked on the horizontal axis.

7.2. $\mathbb{G}_{PR}(d, c)$. The ensemble $\mathbb{G}_{PR}(d, c)$ is constituted by factor graphs containing factor nodes of degree fixed to $c$ and variable nodes of Poissonian random degree of mean $d$. Its statistical properties with respect to the MSP problem are quite different from those encountered in $\mathbb{G}_{RP}$, as we will readily show. The MSP problem on $\mathbb{G}_{PR}(d, c)$ with $c = 2$ is equivalent to the well-known problem of maximum independent sets on Poissonian graphs [14, 32]. The real parameter $p$ characterizing the discrete-support distributions of messages has to satisfy the fixed point condition (24), which reads

$$p_* = \left(1 - e^{d\,(p_* - 1)}\right)^{c-1}. \tag{55}$$

At fixed value of $c$, the RS ansatz holds up to the critical value $d_s(c)$, which is implicitly given by the first derivative condition (27):

$$(c - 1)\,p_*^{(c-2)/(c-1)}\,e^{(p_* - 1)\,d_s}\,d_s = 1. \tag{56}$$

Although the critical condition (56) is not as elegant as the one we obtained for the ensemble $\mathbb{G}_{RP}$, it can be easily solved numerically for $d_s$ as a function of $c$. For $c = 2$ the threshold value is exactly $d_s(2) = e$. For $c > 2$, instead, $d_s$ is an increasing function of the factor degree $c$. Thanks to (25), we readily compute the MSP size in the RS phase:

$$\rho = \frac{d}{c}\left(1 - p_*^{c/(c-1)}\right) + (1 - p_*\,d)\,e^{d\,(p_* - 1)} \quad \text{for } d < d_s(c). \tag{57}$$

We can see in Figure 7 that the MSP size $\rho(d, c)$ is a decreasing function of both arguments, as expected. Equation (57) can be taken as the RS estimate for the MSP size for $d > d_s(c)$. The RS estimate is strictly greater than the average size of SPs given by the GKS algorithm at all values of $d > d_s(c)$ (see Figure 7).
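The numerical solution of the critical condition (56) mentioned above can be sketched as follows. This is an illustration with hypothetical function names, not the authors' code. The fixed point of (55) is found by bisection, and the search for $d_s$ assumes that the left-hand side of (56), evaluated at the fixed point, crosses 1 exactly once inside the chosen bracket.

```python
import math


def fixed_point(d, c, tol=1e-14):
    """Unique root of p = (1 - exp(d * (p - 1)))**(c - 1), equation (55)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid < (1.0 - math.exp(d * (mid - 1.0))) ** (c - 1):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def stability(d, c):
    """Left-hand side of (56), i.e. |f'(p*)| at mean variable degree d."""
    p = fixed_point(d, c)
    return (c - 1) * p ** ((c - 2) / (c - 1)) * math.exp((p - 1.0) * d) * d


def critical_degree(c, d_lo=1e-3, d_hi=50.0, tol=1e-10):
    """Bisection for d_s(c); ASSUMES a single crossing of 1 in [d_lo, d_hi]."""
    while d_hi - d_lo > tol:
        mid = 0.5 * (d_lo + d_hi)
        if stability(mid, c) < 1.0:
            d_lo = mid
        else:
            d_hi = mid
    return 0.5 * (d_lo + d_hi)
```

As a consistency check, `critical_degree(2)` reproduces the value $d_s(2) = e$ quoted in the text.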

7.3. $\mathbb{G}_{PP}(d, c)$. We will now briefly examine our MSP model on the ensemble $\mathbb{G}_{PP}(d, c)$, where both factor nodes and variable nodes have Poissonian random degrees, of mean $c$ and $d$, respectively. From (23) and (24) we obtain the fixed point condition for the probability distribution of messages:

$$p_* = e^{-c\,e^{d\,(p_* - 1)}}. \tag{58}$$

As usual, $p$ is the parameter characterizing the distribution of messages $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$. Equation (58) admits one and only one fixed point solution $p_*$ for each value of $d$ and $c$. In fact, $f$ is continuous and strictly decreasing, with $f(0) > 0$ and $f(1) < 1$. The first derivative condition $|f'(p_*)| = 1$ defines the critical line $d_s(c)$ through

$$d_s(c) = -\frac{1}{p_* \log(p_*)}. \tag{59}$$
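The step from the derivative condition to (59) can be made explicit. Writing $f(p) = e^{-c\,e^{d(p-1)}}$ for the right-hand side of (58), one finds:

```latex
\begin{align*}
f'(p) &= -c\,d\,e^{d\,(p-1)}\,f(p), \\
\log p_* &= -c\,e^{d\,(p_* - 1)} \qquad \text{(from } p_* = f(p_*) \text{)}, \\
f'(p_*) &= d\,p_* \log p_*, \\
|f'(p_*)| = 1 \;&\Longrightarrow\; d_s = -\frac{1}{p_* \log p_*},
\end{align*}
```

where the sign in the last line follows from $0 < p_* < 1$, so that $\log p_* < 0$.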

The curve $d_s(c)$ separates the RS phase from the RSB phase in the $c$-$d$ parametric space (see Figure 8). The unbounded RS region shares some resemblance with the corresponding (although $d$-discretized) region of $\mathbb{G}_{PR}(d, c)$ and is at variance with the compact area of the RS phase in $\mathbb{G}_{RP}$.

We can compute the MSP relative size $\rho$ through (25) and obtain

$$\rho = \frac{d}{c}\,(1 - p_*) + (1 - d\,p_*)\,e^{d\,(p_* - 1)} \quad \text{for } d < d_s(c), \tag{60}$$

which holds only as an approximation in the RSB phase. We can see from Figure 8 that $\rho$ is decreasing both in $c$ and $d$, as was observed in the other ensembles as well.

Figure 7: RS cavity method analytical prediction (57) for the MSP size $\rho$ in $\mathbb{G}_{PR}(d, c)$ as a function of the mean variable degree $d$, for $c = 2, 3, 4, 5, 6$ (dot-dashed lines), compared with the set packing size from phase 1 of the GKS algorithm (dashed lines) and with a complete run of the algorithm (continuous lines).

Figure 8: MSP relative size for the ensemble $\mathbb{G}_{PP}(d, c)$ as given by (60), shown as isolines in the plane of mean factor degree $c$ and mean variable degree $d$, together with the phase boundary. Above the boundary separating the RS and RSB phases, values are only approximations to the exact value of $\rho$.

7.4. $\mathbb{G}_{RR}(d, c)$. The MSP problem on the ensemble $\mathbb{G}_{RR}(d, c)$, the straightforward generalization to factor graphs of the random regular graph ensemble, allows some simplifications of the cavity formalism thanks to its homogeneity. It has already been the object of a preliminary study by Weigt and Hartmann [16] and then of a much deeper work by Hansen-Goos and Weigt [17], who cast it as a hard spheres model on a generalized Bethe lattice. The authors studied this hard spheres model on $\mathbb{G}_{RR}(d, c)$ through the cavity method, both at finite chemical potential $\mu$ and in the close packing limit, and found that the 1RSB solution is unstable in the close packing limit, therefore suggesting the need for a full RSB treatment of the problem.

8. Conclusions

We studied the average asymptotic behaviour of random instances of the maximum set packing problem, both from a mathematical and a physical viewpoint. We contributed to the known list of models where the replica symmetric cavity method can be proven to give exact results, thanks to the generalization of an algorithm (and of its analysis) first proposed by Karp and Sipser [8]. Moreover, our analysis addresses a problem reported in recent work on weighted maximum matchings and independent sets on random graphs [22], where the authors could not extend their results to the unweighted cases. We achieve here the desired result by making use of the grand canonical potential instead of the direct computation of single variable expectations. We also extend their condition for the system to be in what physicists call a replica symmetric phase, namely, the uniqueness of the fixed point of the square of a certain operator (which is the analogue of the one defined in (42)), to the more general setting of maximum set packing (although without weights). On some problem ensembles where the assumptions of Theorems 2 and 3 no longer hold and the RS cavity method fails, we used the 1RSB cavity method machinery to obtain an analytical estimate of the MSP size. Numerical simulations show very good agreement of the 1RSB estimate with the exact values, although comparisons have been made only with small random problems, because the exact algorithm has exponential time complexity. The GKS algorithm, instead, generally fails to find any MSPs, except for some special instances of the problem.

Some questions remain open for further investigation. To validate the 1RSB approach, the stability of the 1RSB solution has to be checked against more steps of replica symmetry breaking. Moreover, a thorough analysis of the second phase of the GKS algorithm could shed some light on the mechanism of replica symmetry breaking and give a rigorous lower bound to the average maximum packing size.

Regarding the problem of looking for optimal solutions in single samples, we believe that an efficient heuristic algorithm, able to obtain near-optimal configurations also in the RSB phase, could be constructed following the ideas of [33].

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research has received financial support from the European Research Council (ERC) through Grant agreement no. 247328 and from the Italian Research Minister through the FIRB project no. RBFR086NN1.

References

[1] S. Micali and V. V. Vazirani, "An $O(\sqrt{|V|}\,|E|)$ algorithm for finding maximum matching in general graphs," in Proceedings of the 21st Annual IEEE Symposium on Foundations of Computer Science, pp. 17–27, 1980.

[2] N. Alon, J.-H. Kim, and J. Spencer, "Nearly perfect matchings in regular simple hypergraphs," Israel Journal of Mathematics, vol. 100, pp. 171–187, 1997.

[3] Z. Füredi, J. Kahn, and P. D. Seymour, "On the fractional matching polytope of a hypergraph," Combinatorica, vol. 13, no. 2, pp. 167–180, 1993.

[4] Y. H. Chan and L. C. Lau, "On linear and semidefinite programming relaxations for hypergraph matching," Mathematical Programming, vol. 135, pp. 123–148, 2011.

[5] V. Rödl and L. Thoma, "Asymptotic packing and the random greedy algorithm," Random Structures & Algorithms, vol. 8, no. 3, pp. 161–177, 1996.

[6] M. M. Halldórsson, "Approximations of weighted independent set and hereditary subset problems," Journal of Graph Algorithms and Applications, vol. 4, no. 1, pp. 1–6, 2000.

[7] E. Hazan, S. Safra, and O. Schwartz, "On the complexity of approximating k-set packing," Computational Complexity, vol. 15, no. 1, pp. 20–39, 2006.

[8] R. M. Karp and M. Sipser, "Maximum matchings in sparse random graphs," in Proceedings of the 22nd Annual Symposium on Foundations of Computer Science, vol. 12, pp. 364–375, IEEE Computer Society, 1981.

[9] N. C. Wormald, "The differential equation method for random graph processes and greedy algorithms," in Lectures on Approximation and Randomized Algorithms, pp. 73–155, 1999.

[10] G. Parisi and M. Mézard, "Replicas and optimization," Journal de Physique Lettres, vol. 46, no. 17, pp. 771–778, 1985.

[11] G. Parisi and M. Mézard, "A replica analysis of the travelling salesman problem," Journal de Physique, vol. 47, pp. 1285–1296, 1986.

[12] H. Zhou and Z.-C. Ou-Yang, "Maximum matching on random graphs," In press, http://arxiv.org/abs/cond-mat/0309348.

[13] L. Zdeborová and M. Mézard, "The number of matchings in random graphs," Journal of Statistical Mechanics, vol. 2006, Article ID P05003, 2006.

[14] L. Dall'Asta, A. Ramezanpour, and P. Pin, "Statistical mechanics of maximal independent sets," Physical Review E, vol. 80, Article ID 061136, 2009.

[15] M. Mézard and M. Tarzia, "Statistical mechanics of the hitting set problem," Physical Review E, vol. 76, no. 4, Article ID 041124, 10 pages, 2007.

[16] M. Weigt and A. K. Hartmann, "Glassy behavior induced by geometrical frustration in a hard-core lattice-gas model," Europhysics Letters (EPL), vol. 62, Article ID 021005, p. 533, 2003.

[17] H. Hansen-Goos and M. Weigt, "A hard-sphere model on generalized Bethe lattices: statics," Journal of Statistical Mechanics, vol. 2005, Article ID P04006, 2005.

[18] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, "Factor graphs and the sum-product algorithm," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.

[19] A. Montanari and M. Mézard, Information, Physics and Computation, Oxford University Press, 2009.

[20] G. Parisi, M. Mézard, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, Singapore, 1987.

[21] M. Mézard and G. Parisi, "The cavity method at zero temperature," Journal of Statistical Physics, vol. 22, no. 1-2, pp. 1–34, 2003.

[22] D. Gamarnik, T. Nowicki, and G. Swirszcz, "Maximum weight independent sets and matchings in sparse random graphs. Exact results using the local weak convergence method," Random Structures and Algorithms, vol. 28, no. 1, pp. 76–106, 2006.

[23] A. Montanari, G. Parisi, and F. Ricci-Tersenghi, "Instability of one-step replica-symmetry-broken phase in satisfiability problems," Journal of Physics A, vol. 37, no. 6, p. 2073, 2004.

[24] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, "Gibbs states and the set of solutions of random constraint satisfaction problems," Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 25, pp. 10318–10323, 2007.

[25] M. Mézard, F. Ricci-Tersenghi, and R. Zecchina, "Two solutions to diluted p-spin models and XORSAT problems," Journal of Statistical Physics, vol. 111, no. 3-4, pp. 505–533, 2002.

[26] G. Semerjian, "On the freezing of variables in random constraint satisfaction problems," Journal of Statistical Physics, vol. 130, no. 2, pp. 251–293, 2008.

[27] M. Mézard and G. Parisi, "The Bethe lattice spin glass revisited," The European Physical Journal B, vol. 20, no. 2, pp. 217–233, 2001.

[28] S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, "A new algorithm for generating all the maximal independent sets," SIAM Journal on Computing, vol. 6, no. 3, pp. 505–517, 1977.

[29] G. Csardi and T. Nepusz, "The igraph software package for complex network research," InterJournal, Complex Systems, p. 1695, 2006.

[30] N. C. Wormald, "Models of random regular graphs," in Surveys in Combinatorics, London Mathematical Society Lecture Note, pp. 239–298, 1999.

[31] S. Takabe and K. Hukushima, "Minimum vertex cover problems on random hypergraphs: replica symmetric solution and a leaf removal algorithm," In press, http://arxiv.org/abs/1301.5769.

[32] S. Sanghavi, D. Shah, and A. S. Willsky, "Message passing for maximum weight independent set," IEEE Transactions on Information Theory, vol. 55, no. 11, pp. 4822–4834, 2009.

[33] M. Weigt and H. Zhou, "Message passing for vertex covers," Physical Review E, vol. 74, Article ID 046110, 19 pages, 2006.


Page 7: The Statistical Mechanics of Random Set Packing and a

International Journal of Statistical Mechanics 7

Require a factor graph 119866 = (119881 119865 119864)Ensure a set packing 119881

1015840

1198811015840

= 0

add to 1198811015840 all isolated variable nodes and remove them

from 119866

remove from 119866 any isolated factor nodewhile 119881 is not empty do

if 119866 has any pendant thenchoose a pendant 119894 uniformly at randomadd 119894 to 119881

1015840

remove 119894 from 119866 then remove its factor neighboursand their variable neighbours

elsepick uniformly at random a variable node and addit to 119881

1015840 remove it from 119866 then remove its factorneighbours and their variable neighbours

end ifremove from 119866 any isolated factor node

end whileReturn 119881

1015840

Algorithm 1 Algorithm 1 generalized Karp-Sipser (GKS)

Figure 1 (left) A representation of a factor graph containing apendant depicted in red (right) The factor graph after the removalof the pendant as occurs in the inner part of the while loop inAlgorithm 1

If other solutions are present the relevant one is the onewith smallest 119901 as the following theorems certify

Theorem 2. Let $(p, r)$ be the solution with smallest $p$ of (43). Then the density of factor nodes surviving phase 1 in the large graphs limit is given by

$$\psi_1 = 1 + p - r + c\,p\,\mathbb{E}_D\left[p^D - r^D\right], \tag{44}$$

with

$$r = \mathbb{E}_{C_0}\left(1 - \mathbb{E}_D\, p^D\right)^{C_0}, \qquad p = \mathbb{E}_{C_0}\left(1 - \mathbb{E}_D\, r^D\right)^{C_0}. \tag{45}$$

In particular, if the smallest solution is $p_* = r_*$, the graph is depleted with high probability in phase 1.

As already said, we will not give the proof of this and of the following theorem, as they are lengthy and can be obtained from the derivation given in Karp and Sipser's article [8] with little effort, even if not in a completely trivial way. Theorem 2 affirms that as soon as (43) develops another solution, a core survives phase 1. This phenomenon coincides with the need for replica symmetry breaking in the cavity formalism. Let us now establish the exactness of the cavity prediction for $\rho$ in the RS phase.

Theorem 3. Let $(p, r)$ be the solution with smallest $p$ of (43). Then the density of variables assigned in phase 1 in the large graphs limit is

$$\rho_1 = \frac{d}{c}\,(1 - r) + \mathbb{E}_{D_0}\,(1 - D_0)\,p^{D_0}, \tag{46}$$

where $p$ has been defined in the previous theorem. In particular, if the smallest solution is $p_* = r_*$, the replica symmetric cavity method prediction (25) is exact.

These two theorems imply that the RS prediction for $\rho$, (25), holds true as long as phase 1 manages to deplete, w.h.p., the whole factor graph. A similar behaviour has been observed in other combinatorial optimization problems, for example, random XORSAT [25]. Conversely, it is easy to prove, given the equivalence between (43) and (35), that if a solution with $p \neq p_*$ exists, the RS fixed point is unstable in the 1RSB distributional space. Therefore, the system is in the RS phase if and only if (35) admits a unique solution.

We notice that we could have also looked for the solutions of the single equation $p = f^2(p)$ instead of the two of (43), since the value of $r$ is uniquely determined by the value of $p$. The previous theorems then state that the RS results hold as long as $f^2$ has a single fixed point. In [22] it has been proven that the RS solution of the weighted maximum independent set and maximum weighted matching holds true as long as the corresponding squared cavity operator $f^2$ has a unique fixed point. Since those are special cases of the weighted MSP problem, we conjecture that the $f^2$ condition holds for the general case too, as it does for the unweighted case. Probably it would be possible to further generalize the Karp-Sipser algorithm to cover analytically the weighted case as well.

The scenario arising from the analysis of the algorithm and from the 1RSB cavity method is the following: in the RS phase, maximum set packings form a single cluster, and it is possible to connect any two of them with a path involving MSPs separated by a rearrangement of a finite number of variables [26]. In the RSB phase, instead, MSPs can be grouped into many connected (in the sense we mentioned before) clusters, each one of them defined by the assignment of the variables in the core. Two MSPs that differ on the core are always separated by a global rearrangement of the variables (i.e., $O(N)$). In the presence of a core, the GKS algorithm makes some suboptimal random choices (after phase 1).

We did not make an analytical study of the random variable removal part of the GKS algorithm. This has been done by Wormald for the matching problem, associating to the graph process a set of differential equations amenable to analysis [9] (something similar has been done for random XORSAT as well [25]). In that case it turns out that the algorithm still achieves optimality and yields an almost perfect matching on the core. An appropriate analysis of the second phase of the GKS algorithm and the reason for its failure for most of the SP ensembles we covered deserve further studies.

6. Numerical Simulations

While the RS cavity equations (24) and (25) can be easily computed, the numerical solution of the 1RSB density evolution (33) is much more involved (although simplified by the zero temperature limit) and has been obtained with a standard population dynamics algorithm [27].

The implementation of the GKS algorithm is straightforward. While it is very fast during phase 1, we noticed a huge slowing down in the random removal part. We were able to find set packings for factor graphs with hundreds of thousands of nodes.

In order to test the cavity and GKS predictions, we also computed the exact MSP size on factor graphs of small size. First we notice that a set packing problem coded in a factor graph structure $F$ is equivalent to an independent set problem on an appropriate ordinary graph $G$. The node set of $G$ will be the variable node set of $F$, and we add an edge to $G$ between each pair of nodes having a common factor node neighbour in $F$. Therefore, each neighbourhood of a factor node in $F$ forms a clique in $G$. We then solve the maximum independent set problem for $G$ using an exact algorithm [28] implemented in the igraph library [29]. Since the time complexity is exponential in the size of the graph, we performed our simulations on graphs containing up to only one hundred nodes.
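The reduction from set packing to independent set described above can be illustrated with a short Python sketch. It does not use igraph: the exact solver here is a plain branching routine written only for illustration, and the names `conflict_graph` and `max_independent_set_size` are hypothetical. Like any exact method for this problem, it takes exponential time in the worst case.

```python
from itertools import combinations


def conflict_graph(factor_to_vars):
    """Edges of G: one edge per pair of variables sharing a factor node."""
    edges = set()
    for variables in factor_to_vars.values():
        # The neighbourhood of each factor becomes a clique in G.
        for u, v in combinations(sorted(set(variables)), 2):
            edges.add((u, v))
    return edges


def max_independent_set_size(nodes, edges):
    """Exact maximum independent set size by branching on a max-degree node."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def solve(alive):
        if not alive:
            return 0
        v = max(alive, key=lambda x: len(adj[x] & alive))
        if not adj[v] & alive:
            # No edges left: every surviving node can be taken.
            return len(alive)
        without_v = solve(alive - {v})
        with_v = 1 + solve(alive - {v} - adj[v])
        return max(without_v, with_v)

    return solve(frozenset(nodes))
```

For example, with two factors `{"x": [0, 1, 2], "y": [2, 3]}` the conflict graph is a triangle on 0, 1, 2 plus the edge (2, 3), and the maximum set packing consists of variable 3 together with either 0 or 1.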

7. Applications to Problem Ensembles

We will now apply the methods developed in the previous sections to some factor graph ensembles, each modelling a class of MSP problem instances. We consider graph ensembles containing nodes with Poissonian random degree or regular degree: $\mathbb{G}_{PR}(N, d, c)$, $\mathbb{G}_{RP}(N, d, c)$, $\mathbb{G}_{PP}(N, d, c)$, and $\mathbb{G}_{RR}(N, d, c)$. The subscript $R$ or $P$ indicates whether the type of nodes to which it refers (variable nodes for the first subscript, factor nodes for the second) has regular or Poissonian random degree, respectively. We parametrize these ensembles by their average variable and factor degrees, that is, $d = \mathbb{E}_{D_0} D_0$ and $c = \mathbb{E}_{C_0} C_0$. They are constituted by factor graphs having $N$ variable nodes and $M = \lfloor N d / c \rfloor$ factor nodes, but they differ both in their elements and in their probability law. We define the ensembles by giving the probability of sampling one of their elements.

(i) $\mathbb{G}_{RP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. Every variable node has fixed degree $d$. $G$ is obtained by linking each variable with $d$ factors chosen uniformly and independently at random. The factor node degree distribution obtained is Poissonian of mean $c$ with high probability. This ensemble is a model for the $k$-set packing and will be the main focus of our attention.

(ii) $\mathbb{G}_{PR}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. Every factor node has fixed degree $c$. $G$ is built by linking each factor with $c$ variables chosen uniformly and independently at random. The variable node degree distribution obtained is Poissonian of mean $d$ with high probability.

(iii) $\mathbb{G}_{PP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. $G$ is built by adding an edge $(i, r)$ with probability $c/N$, independently for each choice of a variable $i$ and a factor $r$. The factor graph obtained has, w.h.p., Poissonian variable node degree distribution of mean $d$ and Poissonian factor node degree distribution of mean $c$.

(iv) $\mathbb{G}_{RR}(N, d, c)$: it is constituted of all factor graphs with $N$ variable nodes of degree $d$ and $M = N d / c$ factor nodes of degree $c$ ($N d$ has to be a multiple of $c$). Every factor graph of the ensemble is equiprobable and can be sampled using a generalization of the configuration model for random regular graphs [30]. This ensemble is a model for the $k$-set packing.

We will omit the argument $N$ when we refer to an ensemble in the $N \uparrow \infty$ limit.
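As an illustration of the definitions above, the following Python sketch samples an element of the first ensemble, $\mathbb{G}_{RP}(N, d, c)$. The function name `sample_g_rp` and the output format are hypothetical. One modelling choice is an assumption: the $d$ factors of each variable are drawn without repetition, so that every variable has exactly $d$ distinct neighbours; drawing them with repetition differs only by rare multi-edges in the sparse regime.

```python
import math
import random


def sample_g_rp(n, d, c, seed=0):
    """Sample a factor graph with n variables of degree d and floor(n*d/c) factors.

    Returns (var_to_factors, m), where var_to_factors[i] lists the d factor
    nodes attached to variable i and m is the number of factor nodes.
    """
    rng = random.Random(seed)
    m = math.floor(n * d / c)
    if m < d:
        raise ValueError("need at least d factor nodes")
    # ASSUMPTION: each variable picks d distinct factors uniformly at random.
    var_to_factors = {i: rng.sample(range(m), d) for i in range(n)}
    return var_to_factors, m
```

With this construction the factor degrees are approximately Poissonian with mean $N d / M \approx c$ when $N$ is large.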

7.1. $\mathbb{G}_{RP}(d, c)$. This is the ensemble with variable node degrees fixed to $d$ and Poissonian factor node degree, that is, $C \sim C_0 \sim \mathrm{Poisson}(c)$. The case $d = 2$ corresponds to the maximum matching problem on Erdős-Rényi random graphs [8, 12].

Density evolution (22) for $\mathbb{G}_{RP}$ reduces to

$$T' \overset{d}{=} \max\left[\{0\} \cup \left\{1 - \sum_{s=1}^{d-1} T_{sj}\right\}_{j \in \{1, \dots, C\}}\right]. \tag{47}$$

Considering distributions of the form $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$, the fixed points of (47) read

$$p_* = e^{-c\,p_*^{\,d-1}} \equiv f(p_*). \tag{48}$$


The last equation admits only one solution for each value of $d$ and $c$, as is easily seen through a monotonicity argument considering the left- and right-hand sides of the equation. The values of $c$ as a function of $d$ satisfying $|f'(p_*)| = 1$ are the critical points $c_s(d)$ delimiting the RS phase ($c < c_s(d)$) and are given by

$$c_s(d) = \frac{e}{d - 1}. \tag{49}$$

For $c > e/(d - 1)$ a core survives phase 1 of the GKS algorithm, as stated by Theorem 2 and shown in Figure 2. We notice that the same threshold value $e/(d - 1)$ is found in the minimum set covering problem on the dual factor graph ensemble [31], where the application of a leaf removal algorithm, that is, phase 1 of GKS, unveils an analogous core emergence phenomenon. In the RS phase the MSP density is equal to the minimum set covering density, but in the RSB phase we cannot compare the cavity predictions for these two dual problems, since in [31] the 1RSB solution to the minimum set covering is not provided.

In the matching case, that is, $d = 2$, equation (49) expresses the notorious $e$-phenomenon discovered by Karp and Sipser, while for higher values of $d$ it provides an extension of the critical threshold.
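The threshold (49) can be checked numerically from the fixed point equation (48). The sketch below is only a sanity check under the formulas stated in the text: it solves $p = f(p)$ by bisection (which works because $f$ is strictly decreasing on $[0, 1]$), evaluates $|f'(p_*)| = c\,(d-1)\,p_*^{d-1}$ using $f(p_*) = p_*$, and evaluates the RS density $\rho = (d/c)(1 - p_*) + (1 - d)\,p_*^{d}$. The function names are hypothetical.

```python
import math


def f_rp(p, d, c):
    """Right-hand side of the fixed point equation (48)."""
    return math.exp(-c * p ** (d - 1))


def solve_p_rp(d, c, tol=1e-14):
    """Unique root of p = f(p) on [0, 1], found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid < f_rp(mid, d, c):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def abs_slope_rp(d, c):
    """|f'(p*)|, which equals 1 exactly at the critical point c_s(d)."""
    p = solve_p_rp(d, c)
    return c * (d - 1) * p ** (d - 1)


def rho_rs_rp(d, c):
    """RS prediction for the MSP density, valid for c below c_s(d)."""
    p = solve_p_rp(d, c)
    return d / c * (1.0 - p) + (1 - d) * p ** d
```

Evaluating `abs_slope_rp(d, math.e / (d - 1))` for several values of $d$ returns values equal to 1 up to numerical precision, in agreement with (49).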

We can recover the same critical condition, equation (49), through the bug propagation method, as the only nonzero elements of the transfer matrix $P(b_0 \to b_1 \mid a_0 \to a_1)$ are

$$P(0 \to 1 \mid 1 \to 0) = p^{d-1}, \qquad P(1 \to 0 \mid 0 \to 1) = p^{d-1}, \tag{50}$$

which give $\lambda_M = p^{d-1}$. The average branching factor is $\overline{C D} = c\,(d - 1)$, so that (30) and (48) yield (49). The analytical value for the relative size of MSPs, that is, the particle density $\rho$, is

$$\rho(d, c) = \frac{d}{c}\,(1 - p_*) + (1 - d)\,p_*^{\,d} \quad \text{for } c < c_s(d). \tag{51}$$

In Figure 3 we compare $\rho$ from (51), as a function of $c$ for some values of $d$, with an exact algorithm applied to finite factor graphs (as explained in Section 6), both above and below $c_s$. Clearly, for $c > c_s$ the RS approximation becomes increasingly inaccurate.

We continue our analysis of $\mathbb{G}_{RP}$ above the critical value $c_s$ through the 1RSB cavity method as outlined in Section 4.2.

Fixed point messages of (33) are distributed as

$$P(q) = p_0\,\delta(q) + p_1\,\delta(q - 1) + p_2\,P_2(q), \tag{52}$$

where

$$p_1 = e^{-c\,(1 - p_0)^{d-1}}, \qquad p_0 = 1 - e^{-c\,p_1^{\,d-1}}, \qquad p_2 = 1 - p_0 - p_1, \tag{53}$$

and $P_2(q)$ has to be determined through (33). Equation (53) always admits an RS solution $p_1 = 1 - p_0 = p_*$, which is stable up to $c_s$, as already noticed. Above $c_s$ a new stable fixed point with $p_2 > 0$ continuously arises, and we study it numerically with a population dynamics algorithm.

Figure 2: Results from the GKS algorithm applied to $\mathbb{G}_{RP}(3, c)$, plotted against the mean factor degree $c$. The set packing size after phase 1 (blue line) and after the algorithm stops (red line) are reported. The vertical line is $c_s = e/2$ and is drawn as a visual aid to determine the point where the RS assumptions cease to hold. Above $c_s$ a nonempty core emerges continuously (green line).

Figure 3: RS cavity method analytical prediction, equation (51), for the MSP size $\rho$ in $\mathbb{G}_{RP}(d, c)$ as a function of the mean factor degree $c$ for $d = 2, 3, 4, 5$ (lines), compared with the average size of MSPs obtained from 1000 samples of factor graphs with 100 variable nodes (dots). Dashed parts of the lines are the RS estimations in the RSB phase, that is, for $c > e/(d - 1)$, where they are no longer exact.
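The weights $p_0$, $p_1$, $p_2$ of the discrete part of the message distribution can be obtained by simple iteration, without population dynamics. The Python sketch below is an illustration under two assumptions: both exponents in the weight equations are taken to be $d - 1$, which is what makes $p_1 = 1 - p_0 = p_*$ a solution, and the iteration is started from $p_1 = 0$, which drives it monotonically to the solution with smallest $p_1$. The function name `rsb_weights` is hypothetical, and the continuous part $P_2(q)$ is not computed here.

```python
import math


def rsb_weights(d, c, iters=5000):
    """Iterate the equations for (p0, p1) starting from p1 = 0.

    Returns (p0, p1, p2) with p2 = 1 - p0 - p1. Below the critical point the
    iteration lands on the RS solution and p2 vanishes; above it p2 > 0.
    """
    p1 = 0.0
    for _ in range(iters):
        p0 = 1.0 - math.exp(-c * p1 ** (d - 1))
        p1 = math.exp(-c * (1.0 - p0) ** (d - 1))
    # Recompute p0 so that it is consistent with the final p1.
    p0 = 1.0 - math.exp(-c * p1 ** (d - 1))
    return p0, p1, 1.0 - p0 - p1
```

For $d = 3$ the critical value is $e/2 \approx 1.36$: at $c = 1$ the returned $p_2$ is zero up to rounding, while at $c = 3$ most of the weight sits in $p_2$. Very close to the critical point the iteration converges slowly and more iterations may be needed.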


Figure 4: Maximum set packing size $\rho$ for $\mathbb{G}_{RP}(d, c)$ as a function of the mean factor degree $c$ for $d = 2$; the value $c = e$ is marked on the horizontal axis. The blue line corresponds to the RS analytical value (51), the red line is from the Karp-Sipser algorithm, and the dotted black line is the 1RSB solution (54) with $y \gg 1$. The Karp-Sipser algorithm and the 1RSB cavity method yield the same, exact result.

The 1RSB free energy as a function of the Parisi parameter $y$ takes the form

$$\phi(y) = -\frac{d}{y\,c}\,\mathbb{E}\log\left[(1 - e^{y})\prod_{j=1}^{C}\left(1 - \prod_{s=1}^{d-1} q_{sj}\right) + e^{y}\right] + \frac{d - 1}{y}\,\mathbb{E}\log\left[(1 - e^{y})\left(1 - \prod_{r=1}^{d} q_{r}\right) + e^{y}\right]. \tag{54}$$

As prescribed by the cavity method, the value $y_s$ which maximizes $\phi(y)$ over $[0, +\infty]$ yields the correct free energy; therefore we have $\phi(y_s) = -\rho_{\mathrm{1RSB}}$.

Unsurprisingly, as they belong to different computational classes, the cases $d = 2$ and $d \ge 3$ show qualitatively different pictures. In the case of maximum matching on the Poissonian graph ensemble, numerical estimates suggest that the complexity is an increasing function of $y$ on the whole real positive axis. The correct choice for the parameter $y$ is then $y_s = +\infty$, as already conjectured in [12], and we find that the maximum matching size prediction from the 1RSB cavity method fully agrees with rigorous results from Karp and Sipser [8] and with the size of the matchings given by their algorithm (see Figure 4). The 1RSB ansatz is therefore exact for $d = 2$.

The $d \ge 3$ case analysis does not yield such a definite result. The complexity of states $\Sigma$ is no longer a strictly increasing function of $y$. It reaches its maximum at $y_d$, the choice of $y$ that selects the most numerous states, which could be those where local search greedy algorithms are more likely to be trapped. Then it decreases up to the finite value $y_s$, where the complexity changes sign and takes negative values. Therefore $y_s$ is the correct choice for the Parisi parameter, which maximizes $\phi(y)$. Plotted as a function of $\rho$, the complexity $\Sigma$ has a convex nonphysical part, with extrema at the RS solution (on the right) and at the point corresponding to the dynamic 1RSB solution (on the left), and a concave, physically relevant part for $y \in [y_d, y_s]$ (see Figure 5). The 1RSB prediction seems to be in very good agreement with the exact algorithm, and we are inclined to believe that no further steps of replica symmetry breaking are needed in this ensemble. The GKS algorithm instead falls short of the exact value (see Figure 6); therefore it constitutes a lower bound which is not tight, but which could probably be made rigorous by carrying the analysis of the GKS algorithm beyond phase 1.

Figure 5: Complexity $\Sigma$ in $\mathbb{G}_{RP}(d, c)$ with $d = 3$ and $c = 4.0$ as a function of the relative MSP size $\rho$.

Figure 6: MSP size $\rho$ in $\mathbb{G}_{RP}(3, c)$ as a function of the mean factor degree $c$; the value $c = e/2$ is marked on the horizontal axis. The RS and 1RSB cavity method predictions are compared with the GKS algorithm and with an exact algorithm on factor graphs of $N = 100$ variable nodes.

7.2. $\mathbb{G}_{PR}(d, c)$. The ensemble $\mathbb{G}_{PR}(d, c)$ is constituted by factor graphs containing factor nodes of degree fixed to $c$ and variable nodes of Poissonian random degree of mean $d$. Its statistical properties with respect to the MSP problem are quite different from those encountered in $\mathbb{G}_{RP}$, as we will readily show. The MSP problem on $\mathbb{G}_{PR}(d, c)$ with $c = 2$ is equivalent to the well-known problem of maximum independent sets on Poissonian graphs [14, 32]. The real parameter $p$ characterizing discrete support distributions of messages has to satisfy the fixed point equation (24), which reads

$$p_* = \left(1 - e^{d\,(p_* - 1)}\right)^{c-1}. \tag{55}$$

At fixed value of $c$, the RS ansatz holds up to the critical value $d_s(c)$, which is implicitly given by the first derivative condition (27):

$$(c - 1)\,p_*^{(c-2)/(c-1)}\,e^{(p_* - 1)\,d_s}\,d_s = 1. \tag{56}$$

Although the critical condition, equation (56), is not as elegant as the one we obtained for the ensemble $\mathbb{G}_{RP}$, it can be easily solved numerically for $d_s$ as a function of $c$. For $c = 2$ the threshold value is exactly $d_s(2) = e$. For $c > 2$, instead, $d_s$ is an increasing function of the factor degree $c$. Thanks to (25) we readily compute the MSP size in the RS phase:

$$\rho = \frac{d}{c}\left(1 - p_*^{\,c/(c-1)}\right) + (1 - p_*\,d)\,e^{d\,(p_* - 1)} \quad \text{for } d < d_s(c). \tag{57}$$

We can see in Figure 7 that the MSP size $\rho(d, c)$ is a decreasing function in both arguments, as expected. Equation (57) can be taken as the RS estimate for the MSP size for $d > d_s(c)$. The RS estimate is strictly greater than the average size of SPs given by the GKS algorithm at all values of $d > d_s(c)$ (see Figure 7).
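The RS quantities for $\mathbb{G}_{PR}(d, c)$ are easy to evaluate numerically. The following Python sketch, with hypothetical function names, solves the fixed point equation $p = (1 - e^{d(p-1)})^{c-1}$ by bisection (the right-hand side is decreasing in $p$ for $c \ge 2$), evaluates the left-hand side of the first derivative condition, and computes the RS density. It is meant as a check of the formulas in the text, not as the code used for the figures.

```python
import math


def f_pr(p, d, c):
    """Right-hand side of the RS fixed point equation for G_PR."""
    return (1.0 - math.exp(d * (p - 1.0))) ** (c - 1)


def solve_p_pr(d, c, tol=1e-14):
    """Unique root of p = f(p) on [0, 1], found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid < f_pr(mid, d, c):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def abs_slope_pr(d, c):
    """|f'(p*)|: equals 1 when d is the critical value d_s(c)."""
    p = solve_p_pr(d, c)
    return (c - 1) * p ** ((c - 2) / (c - 1)) * math.exp((p - 1.0) * d) * d


def rho_rs_pr(d, c):
    """RS estimate of the MSP density for G_PR."""
    p = solve_p_pr(d, c)
    return d / c * (1.0 - p ** (c / (c - 1))) + (1.0 - p * d) * math.exp(d * (p - 1.0))
```

As a check, `abs_slope_pr(math.e, 2)` returns 1 up to numerical precision, consistent with $d_s(2) = e$; for other values of $c$ the critical $d_s(c)$ can be located by scanning $d$ for the point where the returned value crosses 1.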

7.3. $\mathbb{G}_{PP}(d, c)$. We will now briefly examine our MSP model on the ensemble $\mathbb{G}_{PP}(d, c)$, where both factor nodes and variable nodes have Poissonian random degrees, of mean $c$ and $d$, respectively. From (23) and (24) we obtain the fixed point condition for the probability distribution of messages:

$$p_* = e^{-c\,e^{d\,(p_* - 1)}}. \tag{58}$$

As usual, $p$ is the parameter characterizing the distribution of messages $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$. Equation (58) admits one and only one fixed point solution $p_*$ for each value of $d$ and $c$. In fact, $f$ is continuous and strictly decreasing, with $f(0) > 0$ and $f(1) < 1$. The first derivative condition $|f'(p_*)| = 1$ defines the critical line $d_s(c)$ through

$$d_s(c) = -\frac{1}{p_* \log(p_*)}. \tag{59}$$

The curve $d_s(c)$ separates the RS phase from the RSB phase in the $c$-$d$ parameter space (see Figure 8). The unbounded RS region shares some resemblance with the corresponding (although $d$-discretized) region of $\mathbb{G}_{PR}(d, c)$ and is at variance with the compact area of the RS phase in $\mathbb{G}_{RP}$.

We can compute the MSP relative size $\rho$ through (25) and obtain

$$\rho = \frac{d}{c}\,(1 - p_*) + (1 - d\,p_*)\,e^{d\,(p_* - 1)} \quad \text{for } d < d_s(c), \tag{60}$$

which holds only as an approximation in the RSB phase. We can see from Figure 8 that $\rho$ is decreasing both in $c$ and $d$, as was observed in the other ensembles as well.
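The same kind of numerical check applies to $\mathbb{G}_{PP}(d, c)$. The sketch below, again with hypothetical function names, solves $p = e^{-c\,e^{d(p-1)}}$ by bisection and uses the identity $\log p_* = -c\,e^{d(p_* - 1)}$, valid at the fixed point, to write $|f'(p_*)| = -d\,p_* \log p_*$; setting this to 1 gives the critical line $d_s(c) = -1/(p_* \log p_*)$.

```python
import math


def f_pp(p, d, c):
    """Right-hand side of the RS fixed point equation for G_PP."""
    return math.exp(-c * math.exp(d * (p - 1.0)))


def solve_p_pp(d, c, tol=1e-14):
    """Unique root of p = f(p) on [0, 1]; f is strictly decreasing."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid < f_pp(mid, d, c):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def abs_slope_pp(d, c):
    """|f'(p*)| = -d p* log(p*); the RS phase corresponds to values below 1."""
    p = solve_p_pp(d, c)
    return -d * p * math.log(p)


def rho_rs_pp(d, c):
    """RS estimate of the MSP density for G_PP."""
    p = solve_p_pp(d, c)
    return d / c * (1.0 - p) + (1.0 - d * p) * math.exp(d * (p - 1.0))
```

A point $(c, d)$ lies in the RS region when `abs_slope_pp(d, c)` is below 1, so the phase boundary can be traced by scanning $d$ at fixed $c$.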

Figure 7: RS cavity method analytical prediction (57) for the MSP size $\rho$ in $\mathbb{G}_{PR}(d, c)$ as a function of the mean variable degree $d$, for $c = 2, 3, 4, 5, 6$ (dot-dashed lines), compared with the set packing size from phase 1 of the GKS algorithm (dashed lines) and with a complete run of the algorithm (continuous lines).

Figure 8: MSP relative size for the ensemble $\mathbb{G}_{PP}(d, c)$ as given by (60), shown through isolines in the plane of mean factor degree $c$ and mean variable degree $d$, together with the phase boundary. Above the boundary separating the RS and RSB phases, values are only approximations to the exact value of $\rho$.

7.4. $\mathbb{G}_{RR}(d, c)$. The MSP problem on the ensemble $\mathbb{G}_{RR}(d, c)$, the straightforward generalization to factor graphs of the random regular graphs ensemble, allows some simplification of the cavity formalism thanks to its homogeneity. It has already been the object of a preliminary study by Weigt and Hartmann [16] and then of a much deeper work by Hansen-Goos and Weigt [17], who recast it as a hard spheres model on a generalized Bethe lattice. The authors studied this hard spheres model on $\mathbb{G}_{RR}(d, c)$ through the cavity method, both at finite chemical potential $\mu$ and in the close packing limit, and found that the 1RSB solution is unstable in the close packing limit, therefore suggesting the need for a full RSB treatment of the problem.

8. Conclusions

We studied the average asymptotic behaviour of random instances of the maximum set packing problem, both from a mathematical and a physical viewpoint. We contributed to the known list of models where the replica symmetric cavity method can be proven to give exact results, thanks to the generalization of an algorithm (and of its analysis) first proposed by Karp and Sipser [8]. Moreover, our analysis addresses a problem reported in recent work on weighted maximum matchings and independent sets on random graphs [22], where the authors could not extend their results to the unweighted cases. We achieve here the desired result making use of the grand canonical potential instead of the direct computation of single variable expectations. We also extend their condition for the system to be in what physicists call a replica symmetric phase, namely, the uniqueness of the fixed point of the square of a certain operator (which is the analogue of the one defined in (42)), to the more general setting of maximum set packing (although without weights). On some problem ensembles, where the assumptions of Theorems 2 and 3 no longer hold and the RS cavity method fails, we used the 1RSB cavity method machinery to obtain an analytical estimation of the MSP size. Numerical simulations show very good agreement of the 1RSB estimation with the exact values, although comparisons have been done only with small random problems, due to the exact algorithm being of exponential time complexity. The GKS algorithm, instead, generally fails to find any MSPs, except for some special instances of the problem.

Some questions remain open for further investigation. To validate the 1RSB approach, the stability of the 1RSB solution has to be checked against more steps of replica symmetry breaking. Moreover, a thorough analysis of the second phase of the GKS algorithm could shed some light on the mechanism of replica symmetry breaking and give a rigorous lower bound to the average maximum packing size.

Regarding the problem of looking for optimal solutions in single samples, we believe that an efficient heuristic algorithm, able to obtain near-to-optimal configurations also in the RSB phase, could be constructed following the ideas of [33].

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research has received financial support from the European Research Council (ERC) through Grant agreement no. 247328 and from the Italian Research Minister through the FIRB project no. RBFR086NN1.

References

[1] S. Micali and V. V. Vazirani, "An $O(\sqrt{|V|}\,|E|)$ algorithm for finding maximum matching in general graphs," in Proceedings of the 21st Annual IEEE Symposium on Foundations of Computer Science, pp. 17–27, 1980.

[2] N. Alon, J.-H. Kim, and J. Spencer, "Nearly perfect matchings in regular simple hypergraphs," Israel Journal of Mathematics, vol. 100, pp. 171–187, 1997.

[3] Z. Füredi, J. Kahn, and P. D. Seymour, "On the fractional matching polytope of a hypergraph," Combinatorica, vol. 13, no. 2, pp. 167–180, 1993.

[4] Y. H. Chan and L. C. Lau, "On linear and semidefinite programming relaxations for hypergraph matching," Mathematical Programming, vol. 135, pp. 123–148, 2011.

[5] V. Rödl and L. Thoma, "Asymptotic packing and the random greedy algorithm," Random Structures & Algorithms, vol. 8, no. 3, pp. 161–177, 1996.

[6] M. M. Halldórsson, "Approximations of weighted independent set and hereditary subset problems," Journal of Graph Algorithms and Applications, vol. 4, no. 1, pp. 1–6, 2000.

[7] E. Hazan, S. Safra, and O. Schwartz, "On the complexity of approximating k-set packing," Computational Complexity, vol. 15, no. 1, pp. 20–39, 2006.

[8] R. M. Karp and M. Sipser, "Maximum matchings in sparse random graphs," in Proceedings of the 22nd Annual Symposium on Foundations of Computer Science, vol. 12, pp. 364–375, IEEE Computer Society, 1981.

[9] N. C. Wormald, "The differential equation method for random graph processes and greedy algorithms," in Lectures on Approximation and Randomized Algorithms, pp. 73–155, 1999.

[10] G. Parisi and M. Mézard, "Replicas and optimization," Journal de Physique Lettres, vol. 46, no. 17, pp. 771–778, 1985.

[11] G. Parisi and M. Mézard, "A replica analysis of the travelling salesman problem," Journal de Physique, vol. 47, pp. 1285–1296, 1986.

[12] H. Zhou and Z.-C. Ou-Yang, "Maximum matching on random graphs," In press, http://arxiv.org/abs/cond-mat/0309348.

[13] L. Zdeborová and M. Mézard, "The number of matchings in random graphs," Journal of Statistical Mechanics, vol. 2006, Article ID P05003, 2006.

[14] L. Dall'Asta, A. Ramezanpour, and P. Pin, "Statistical mechanics of maximal independent sets," Physical Review E, vol. 80, Article ID 061136, 2009.

[15] M. Mézard and M. Tarzia, "Statistical mechanics of the hitting set problem," Physical Review E, vol. 76, no. 4, Article ID 041124, 10 pages, 2007.

[16] M. Weigt and A. K. Hartmann, "Glassy behavior induced by geometrical frustration in a hard-core lattice-gas model," Europhysics Letters (EPL), vol. 62, Article ID 021005, p. 533, 2003.

[17] H. Hansen-Goos and M. Weigt, "A hard-sphere model on generalized Bethe lattices: statics," Journal of Statistical Mechanics, vol. 2005, Article ID P04006, 2005.

[18] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, "Factor graphs and the sum-product algorithm," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.

[19] A. Montanari and M. Mézard, Information, Physics and Computation, Oxford University Press, 2009.

[20] G. Parisi, M. Mézard, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, Singapore, 1987.

[21] M. Mézard and G. Parisi, "The cavity method at zero temperature," Journal of Statistical Physics, vol. 22, no. 1-2, pp. 1–34, 2003.

[22] D. Gamarnik, T. Nowicki, and G. Swirszcz, "Maximum weight independent sets and matchings in sparse random graphs. Exact results using the local weak convergence method," Random Structures and Algorithms, vol. 28, no. 1, pp. 76–106, 2006.

[23] A. Montanari, G. Parisi, and F. Ricci-Tersenghi, "Instability of one-step replica-symmetry-broken phase in satisfiability problems," Journal of Physics A, vol. 37, no. 6, p. 2073, 2004.

[24] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, "Gibbs states and the set of solutions of random constraint satisfaction problems," Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 25, pp. 10318–10323, 2007.

[25] M. Mézard, F. Ricci-Tersenghi, and R. Zecchina, "Two solutions to diluted p-spin models and XORSAT problems," Journal of Statistical Physics, vol. 111, no. 3-4, pp. 505–533, 2002.

[26] G. Semerjian, "On the freezing of variables in random constraint satisfaction problems," Journal of Statistical Physics, vol. 130, no. 2, pp. 251–293, 2008.

[27] M. Mézard and G. Parisi, "The Bethe lattice spin glass revisited," The European Physical Journal B, vol. 20, no. 2, pp. 217–233, 2001.

[28] S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, "A new algorithm for generating all the maximal independent sets," SIAM Journal on Computing, vol. 6, no. 3, pp. 505–517, 1977.

[29] G. Csárdi and T. Nepusz, "The igraph software package for complex network research," InterJournal Complex Systems, p. 1695, 2006.

[30] N. C. Wormald, "Models of random regular graphs," in Surveys in Combinatorics, London Mathematical Society Lecture Note, pp. 239–298, 1999.

[31] S. Takabe and K. Hukushima, "Minimum vertex cover problems on random hypergraphs: replica symmetric solution and a leaf removal algorithm," In press, http://arxiv.org/abs/1301.5769.

[32] S. Sanghavi, D. Shah, and A. S. Willsky, "Message passing for maximum weight independent set," IEEE Transactions on Information Theory, vol. 55, no. 11, pp. 4822–4834, 2009.

[33] M. Weigt and H. Zhou, "Message passing for vertex covers," Physical Review E, vol. 74, Article ID 046110, 19 pages, 2006.

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

High Energy PhysicsAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

FluidsJournal of

Journal of Atomic and Molecular PhysicsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in Condensed Matter Physics

OpticsInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Astronomy

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Superconductivity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Statistical MechanicsInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GravityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

AstrophysicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Physics Research International

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Solid State PhysicsJournal of

 Computational  Methods in Physics

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Soft MatterJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

AerodynamicsJournal of

Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PhotonicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Biophysics

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ThermodynamicsJournal of

Page 8: The Statistical Mechanics of Random Set Packing and a

8 International Journal of Statistical Mechanics

problem we conjecture that the 1198912 condition holds for the

general case too as it does for the unweighted case Probablyit would be possible to further generalize the Karp-Sipseralgorithm to cover analytically the weighted case as well

The scenario arising from the analysis of the algorithmand from 1RSB cavity method is the following in the RSphase maximum set packings form a single cluster and it ispossible to connect any two of them with a path involvingMSPs separated by a rearrangement of a finite number ofvariables [26] In the RSB phase insteadMSPs can be groupedinto many connected (in the sense we mentioned before)clusters each one of them defined by the assigning of thevariables in the core Two MSPs who differ on the core arealways separated by a global rearrangement of the variables(ie 119874(119873)) In presence of a core the GKS algorithm makessome suboptimal random choices (after phase 1)

We did not make an analytical study of the random variable removal part of the GKS algorithm. This has been done by Wormald for the matching problem, associating to the graph process a set of differential equations amenable to analysis [9] (something similar has been done for random XORSAT as well [25]). In that case it turns out that the algorithm still achieves optimality and yields an almost perfect matching on the core. An appropriate analysis of the second phase of the GKS algorithm, and of the reasons for its failure on most of the SP ensembles we covered, deserves further study.

6. Numerical Simulations

While the RS cavity equations (24) and (25) can be easily computed, the numerical solution of the 1RSB density evolution (33) is much more involved (although simplified by the zero temperature limit) and has been obtained with a standard population dynamics algorithm [27].

The implementation of the GKS algorithm is straightforward. While it is very fast during phase 1, we noticed a huge slowing down in the random removal part. We were able to find set packings for factor graphs with hundreds of thousands of nodes.

In order to test the cavity and GKS predictions, we also computed the exact MSP size on factor graphs of small size. First we notice that a set packing problem coded in a factor graph structure $F$ is equivalent to an independent set problem on an appropriate ordinary graph $G$. The node set of $G$ is the variable node set of $F$, and we add an edge to $G$ between each pair of nodes having a common factor node neighbour in $F$. Therefore each neighbourhood of a factor node in $F$ forms a clique in $G$. We then solve the maximum independent set problem for $G$ using an exact algorithm [28] implemented in the igraph library [29]. Since the time complexity is exponential in the size of the graph, we performed our simulations on graphs containing up to only one hundred nodes.
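The reduction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `conflict_graph` and `max_independent_set_size` are hypothetical, the factor graph is assumed to be given as one list of adjacent variable nodes per factor node, and the exact solver is a plain branching routine, not the algorithm of [28] nor the igraph implementation used for the simulations.

```python
from itertools import combinations

def conflict_graph(factor_neighbors, num_variables):
    """Build the ordinary graph G from a factor graph F.

    factor_neighbors[r] lists the variable nodes adjacent to factor node r
    (an input format assumed for this sketch). Returns adjacency sets of G.
    """
    adj = [set() for _ in range(num_variables)]
    for variables in factor_neighbors:
        # The neighbourhood of each factor node becomes a clique in G.
        for i, j in combinations(set(variables), 2):
            adj[i].add(j)
            adj[j].add(i)
    return adj

def max_independent_set_size(adj):
    """Exact maximum independent set size by branching (exponential time)."""
    def solve(candidates):
        if not candidates:
            return 0
        # Branch on a vertex of maximum degree among the remaining candidates.
        v = max(candidates, key=lambda u: len(adj[u] & candidates))
        if not (adj[v] & candidates):
            # No edges are left: every remaining vertex can be taken.
            return len(candidates)
        take = 1 + solve(candidates - {v} - adj[v])
        skip = solve(candidates - {v})
        return max(take, skip)
    return solve(frozenset(range(len(adj))))
```

The MSP size of $F$ is then the value returned for its conflict graph, and the density $\rho$ is that value divided by the number of variable nodes. The branching routine is only practical for small instances, in line with the size limit mentioned above.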

7. Applications to Problem Ensembles

We will now apply the methods developed in the previous sections to some factor graph ensembles, each modelling a class of MSP problem instances. We consider graph ensembles containing nodes with Poissonian random degree or regular degree: $\mathcal{G}_{PR}(N, d, c)$, $\mathcal{G}_{RP}(N, d, c)$, $\mathcal{G}_{PP}(N, d, c)$, and $\mathcal{G}_{RR}(N, d, c)$. The subscript $R$ or $P$ indicates whether the nodes of the corresponding type (variable nodes for the first subscript, factor nodes for the second) have regular or Poissonian random degree, respectively. We parametrize these ensembles by their average variable and factor degrees, that is, $d = \mathbb{E}_{D_0} D_0$ and $c = \mathbb{E}_{C_0} C_0$. They are constituted by factor graphs having $N$ variable nodes and $M = \lfloor N d / c \rfloor$ factor nodes, but they differ both in their elements and in their probability law. We define each ensemble by giving the probability of sampling one of its elements.

(i) $\mathcal{G}_{RP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. Every variable node has fixed degree $d$. $G$ is obtained by linking each variable with $d$ factors chosen uniformly and independently at random. The factor node degree distribution obtained is Poissonian of mean $c$ with high probability. This ensemble is a model for the $k$-set packing and will be the main focus of our attention.

(ii) $\mathcal{G}_{PR}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. Every factor node has fixed degree $c$. $G$ is built by linking each factor with $c$ variables chosen uniformly and independently at random. The variable node degree distribution obtained is Poissonian of mean $d$ with high probability.

(iii) $\mathcal{G}_{PP}(N, d, c)$: each element $G$ has $N$ variables and $M = \lfloor N d / c \rfloor$ factors. $G$ is built by adding an edge $(i, r)$ with probability $c/N$, independently for each choice of a variable $i$ and a factor $r$. The factor graph obtained has, with high probability, a Poissonian variable node degree distribution of mean $d$ and a Poissonian factor node degree distribution of mean $c$.

(iv) $\mathcal{G}_{RR}(N, d, c)$: it is constituted by all factor graphs with $N$ variable nodes of degree $d$ and $M = N d / c$ factor nodes of degree $c$ ($N d$ has to be a multiple of $c$). Every factor graph of the ensemble is equiprobable and can be sampled using a generalization of the configuration model for random regular graphs [30]. This ensemble is a model for the $k$-set packing.

We will omit the argument $N$ when we refer to an ensemble in the $N \uparrow \infty$ limit.
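As an illustration of definition (i), the following Python sketch samples one element of $\mathcal{G}_{RP}(N, d, c)$. The function name `sample_g_rp` and the returned data layout are hypothetical. One assumption is made that the text does not state: each variable is linked to $d$ distinct factors, so that no multiple edges appear and every variable has degree exactly $d$.

```python
import math
import random

def sample_g_rp(n, d, c, seed=None):
    """Sample a factor graph with n variables of degree d and floor(n d / c) factors.

    Returns (variable_neighbors, factor_neighbors) as lists of lists.
    Assumption of this sketch: the d factors of a variable are distinct.
    """
    rng = random.Random(seed)
    m = math.floor(n * d / c)
    if m < d:
        raise ValueError("need at least d factor nodes")
    variable_neighbors = []
    factor_neighbors = [[] for _ in range(m)]
    for i in range(n):
        # Pick d distinct factors uniformly at random for variable i.
        chosen = rng.sample(range(m), d)
        variable_neighbors.append(chosen)
        for r in chosen:
            factor_neighbors[r].append(i)
    return variable_neighbors, factor_neighbors
```

The factor degrees produced this way are binomial, hence close to Poissonian of mean $c$ for large $N$, which is the property quoted in (i).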

7.1. $\mathcal{G}_{RP}(d, c)$. This is the ensemble with variable node degrees fixed to $d$ and Poissonian factor node degree, that is, $C \sim C_0 \sim \mathrm{Poisson}(c)$. The case $d = 2$ corresponds to the maximum matching problem on Erdős-Rényi random graphs [8, 12].

Density evolution (22) for $\mathcal{G}_{RP}$ reduces to

$$T' \overset{d}{=} \max\left( \{0\} \cup \left\{ 1 - \sum_{s=1}^{d-1} T^{s}_{j} \right\}_{j \in \{1, \dots, C\}} \right). \tag{47}$$

Considering distributions of the form $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$, the fixed points of (47) read

$$p_* = e^{-c\,p_*^{d-1}} \equiv f(p_*). \tag{48}$$


The last equation admits only one solution for each value of $d$ and $c$, as is easily seen through a monotonicity argument on the left and right hand sides of the equation. The values of $c$ as a function of $d$ satisfying $|f'(p_*)| = 1$ are the critical points $c_s(d)$ delimiting the RS phase ($c < c_s(d)$) and are given by

$$c_s(d) = \frac{e}{d - 1}. \tag{49}$$
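For completeness, the step from the fixed point equation (48) to the threshold (49) can be written out explicitly. The derivation below uses only (48) and the marginality condition $|f'(p_*)| = 1$.

```latex
\begin{align*}
f(p) &= e^{-c\,p^{d-1}}, \qquad f'(p) = -c\,(d-1)\,p^{d-2}\,f(p),\\
|f'(p_*)| &= c\,(d-1)\,p_*^{d-2}\,f(p_*) = c\,(d-1)\,p_*^{d-1} = 1
  \quad\Longrightarrow\quad c\,p_*^{d-1} = \frac{1}{d-1},\\
p_* &= e^{-c\,p_*^{d-1}} = e^{-1/(d-1)}
  \quad\Longrightarrow\quad p_*^{d-1} = e^{-1},\\
c_s(d) &= \frac{1}{(d-1)\,p_*^{d-1}} = \frac{e}{d-1}.
\end{align*}
```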

For $c > e/(d - 1)$ a core survives phase 1 of the GKS algorithm, as stated by Theorem 2 and shown in Figure 2. We notice that the same threshold value $e/(d - 1)$ is found in the minimum set covering problem on the dual factor graph ensemble [31], where the application of a leaf removal algorithm, that is, phase 1 of GKS, unveils an analogous core emergence phenomenon. In the RS phase the MSP density is equal to the minimum set covering density, but in the RSB phase we cannot compare the cavity predictions for these two dual problems, since in [31] the 1RSB solution to the minimum set covering is not provided.

In the matching case, that is, $d = 2$, equation (49) expresses the well-known $e$-phenomenon discovered by Karp and Sipser, while for higher values of $d$ it provides an extension of the critical threshold.

We can recover the same critical condition, equation (49), through the bug propagation method, as the only non-zero elements of the transfer matrix $P(b_0 \to b_1 \mid a_0 \to a_1)$ are

$$P(0 \to 1 \mid 1 \to 0) = p^{d-1}, \qquad P(1 \to 0 \mid 0 \to 1) = p^{d-1}, \tag{50}$$

which give $\lambda_M = p^{d-1}$. The average branching factor is $\bar{C}\bar{D} = c(d - 1)$, so that (30) and (48) yield (49). The analytical value for the relative size of MSPs, that is, the particle density $\rho$, is

$$\rho(d, c) = \frac{d}{c}\,(1 - p_*) + (1 - d)\,p_*^{d} \quad \text{for } c < c_s(d). \tag{51}$$
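Equations (48), (49), and (51) are simple to evaluate numerically. The sketch below is one possible way to do it, not the code used for the figures: the function names are hypothetical, and bisection is chosen because $p - f(p)$ is increasing on $[0, 1]$, negative at $p = 0$ and positive at $p = 1$.

```python
import math

def rs_fixed_point(d, c, tol=1e-14):
    """Solve p = exp(-c p^(d-1)), equation (48), by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # g(p) = p - f(p) is increasing, so the sign of g(mid) locates the root.
        if mid - math.exp(-c * mid ** (d - 1)) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rs_density(d, c):
    """RS prediction (51) for the relative MSP size; exact only for c < c_s(d)."""
    p = rs_fixed_point(d, c)
    return (d / c) * (1.0 - p) + (1 - d) * p ** d

def critical_c(d):
    """Critical mean factor degree c_s(d) = e / (d - 1), equation (49)."""
    return math.e / (d - 1)
```

For $d = 2$ and $c = 1$ the fixed point solves $p = e^{-p}$, and for $c \to 0$ the density tends to 1, as expected when almost all sets are disjoint. Above `critical_c(d)` the value returned by `rs_density` is only the RS estimate.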

In Figure 3 we compare $\rho$ from (51), as a function of $c$ for some values of $d$, with an exact algorithm applied to finite factor graphs (as explained in Section 6), both above and below $c_s$. Clearly, for $c > c_s$ the RS approximation is increasingly inaccurate.

We continue our analysis of $\mathcal{G}_{RP}$ above the critical value $c_s$ through the 1RSB cavity method, as outlined in Section 4.2. Fixed point messages of (33) are distributed as

$$P(q) = p_0\,\delta(q) + p_1\,\delta(q - 1) + p_2\,P_2(q), \tag{52}$$

where

$$\begin{aligned} p_1 &= e^{-c\,(1 - p_0)^{d-1}}, \\ p_0 &= 1 - e^{-c\,p_1^{d-1}}, \\ p_2 &= 1 - p_0 - p_1, \end{aligned} \tag{53}$$

and $P_2(q)$ has to be determined through (33). Equation (53) always admits an RS solution $p_1 = 1 - p_0 = p_*$, which is stable up to $c_s$, as already noticed. Above $c_s$ a new stable fixed point with $p_2 > 0$ continuously arises, and we study it numerically with a population dynamics algorithm.

Figure 2: Results from the GKS algorithm applied to $\mathcal{G}_{RP}(3, c)$. The set packing size after phase 1 (blue line) and after the algorithm stops (red line) are reported. The vertical line is $c_s = e/2$ and is drawn as a visual aid to determine the point where the RS assumptions stop holding true. Above $c_s$ a nonempty core emerges continuously (green line). (Axes: mean factor degree $c$ versus $\rho$.)

Figure 3: RS cavity method analytical prediction, equation (51), for the MSP size in $\mathcal{G}_{RP}(d, c)$, for $d = 2, 3, 4, 5$, compared with the average size of MSPs obtained from 1000 samples of factor graphs with 100 variable nodes (dots). Dashed parts of the lines are the RS estimations in the RSB phase, that is, for $c > e/(d - 1)$, where they are no longer exact. (Axes: mean factor degree $c$ versus set packing size $\rho$.)
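The weights of the frozen part of the distribution can be explored without population dynamics, since the first two lines of (53) close on themselves. The sketch below iterates them starting from $p_0 = 0$. It rests on two assumptions: the function name `frozen_weights` is hypothetical, and the exponent is taken to be $d - 1$ in both lines, which is what makes $p_1 = 1 - p_0 = p_*$ a solution. With $x = 1 - p_0$ the iteration is $x \mapsto f(f(x))$, where $f$ is the function defined in (48), so it converges to the largest fixed point of $f \circ f$.

```python
import math

def frozen_weights(d, c, iterations=10000):
    """Iterate the first two lines of (53) from p0 = 0; return (p0, p1, p2)."""
    p0 = 0.0
    p1 = 1.0
    for _ in range(iterations):
        # Weight of messages frozen to q = 1.
        p1 = math.exp(-c * (1.0 - p0) ** (d - 1))
        # Weight of messages frozen to q = 0.
        p0 = 1.0 - math.exp(-c * p1 ** (d - 1))
    return p0, p1, 1.0 - p0 - p1
```

Well below $c_s(d)$ the iteration returns $p_2 = 0$ up to rounding, reproducing the RS solution, while well above $c_s(d)$ it settles on a solution with $p_2 > 0$. Close to $c_s(d)$ convergence becomes slow and more iterations are needed. The shape of $P_2(q)$ is not determined by this sketch.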


Figure 4: Maximum set packing size for $\mathcal{G}_{RP}(d, c)$ as a function of $c$ for $d = 2$. The blue line corresponds to the RS analytical value (51), the red line is from the Karp-Sipser algorithm, and the dotted black line is the 1RSB solution (54) with $y \gg 1$. Karp-Sipser and the 1RSB cavity method yield the same and exact result. (Axes: mean factor degree $c$, with $c = e$ marked, versus set packing size $\rho$.)

The 1RSB free energy as a function of the Parisi parameter $y$ takes the form

$$\phi(y) = -\frac{d}{y c}\, \mathbb{E} \log\left[ (1 - e^{y}) \prod_{j=1}^{C} \left( 1 - \prod_{s=1}^{d-1} q^{s}_{j} \right) + e^{y} \right] + \frac{d - 1}{y}\, \mathbb{E} \log\left[ (1 - e^{y}) \left( 1 - \prod_{r=1}^{d} q_{r} \right) + e^{y} \right]. \tag{54}$$

As prescribed by the cavity method, the value $y_s$ which maximizes $\phi(y)$ over $[0, +\infty]$ yields the correct free energy; therefore we have $\phi(y_s) = -\rho_{\mathrm{1RSB}}$.

Unsurprisingly, as they belong to different computational classes, the cases $d = 2$ and $d \ge 3$ show qualitatively different pictures. In the case of maximum matching on the Poissonian graph ensemble, numerical estimates suggest that the complexity is an increasing function of $y$ on the whole real positive axis. The correct choice for the parameter $y$ is then $y_s = +\infty$, as already conjectured in [12], and we find that the maximum matching size prediction from the 1RSB cavity method fully agrees with the rigorous results of Karp and Sipser [8] and with the size of the matchings given by their algorithm (see Figure 4). The 1RSB ansatz is therefore exact for $d = 2$.

The analysis of the $d \ge 3$ case does not yield such a definite result. The complexity of states $\Sigma$ is no longer a strictly increasing function of $y$. It reaches its maximum at $y_d$, the choice of $y$ that selects the most numerous states, which could be those where local search greedy algorithms are more likely to be trapped. Then it decreases up to the finite value $y_s$, where the complexity changes sign and takes negative values. Therefore $y_s$ is the correct choice for the Parisi parameter, which maximizes $\phi(y)$. Plotted as a function of $\rho$, the complexity $\Sigma$ has a convex nonphysical part, whose extrema are the RS solution (on the right) and the point corresponding to the dynamic 1RSB solution (on the left), and a concave physically relevant part for $y \in [y_d, y_s]$ (see Figure 5). The 1RSB solution seems to be in very good agreement with the exact algorithm, and we are inclined to believe that no further steps of replica symmetry breaking are needed in this ensemble. The GKS algorithm instead falls short of the exact value (see Figure 6); it therefore constitutes a lower bound which is not tight, but which could probably be made rigorous by carrying the analysis of the GKS algorithm beyond phase 1.

Figure 5: Complexity in $\mathcal{G}_{RP}(d, c)$ with $d = 3$ and $c = 4.0$ as a function of the relative MSP size $\rho$. (Axes: set packing size $\rho$ versus complexity $\Sigma$.)

Figure 6: MSP size in $\mathcal{G}_{RP}(3, c)$ as a function of $c$. The RS and 1RSB cavity methods are compared with the GKS algorithm and with an exact algorithm on factor graphs of $N = 100$ variable nodes. (Axes: mean factor degree $c$, with $c = e/2$ marked, versus set packing size $\rho$.)

7.2. $\mathcal{G}_{PR}(d, c)$. The ensemble $\mathcal{G}_{PR}(d, c)$ is constituted by factor graphs containing factor nodes of degree fixed to $c$ and variable nodes of Poissonian random degree of mean $d$. Its statistical properties with respect to the MSP problem are quite different from those encountered in $\mathcal{G}_{RP}$, as we will readily show. The MSP problem on $\mathcal{G}_{PR}(d, c)$ with $c = 2$ is equivalent to the well-known problem of maximum independent sets on Poissonian graphs [14, 32]. The real parameter $p$ characterizing discrete support distributions of messages has to satisfy the fixed point equation (24), which reads

$$p_* = \left( 1 - e^{d(p_* - 1)} \right)^{c-1}. \tag{55}$$

At a fixed value of $c$, the RS ansatz holds up to the critical value $d_s(c)$, which is implicitly given by the first derivative condition (27):

$$(c - 1)\, p_*^{(c-2)/(c-1)}\, e^{(p_* - 1) d_s}\, d_s = 1. \tag{56}$$

Although the critical condition, equation (56), is not as elegant as the one we obtained for the ensemble $\mathcal{G}_{RP}$, it can be easily solved numerically for $d_s$ as a function of $c$. For $c = 2$ the threshold value is exactly $d_s(2) = e$. For $c > 2$, instead, $d_s$ is an increasing function of the factor degree $c$. Thanks to (25), we readily compute the MSP size in the RS phase:

$$\rho = \frac{d}{c} \left( 1 - p_*^{c/(c-1)} \right) + (1 - p_* d)\, e^{d(p_* - 1)} \quad \text{for } d < d_s(c). \tag{57}$$

We can see in Figure 7 that the MSP size $\rho(d, c)$ is a decreasing function of both arguments, as expected. Equation (57) can be taken as the RS estimate for the MSP size for $d > d_s(c)$. The RS estimate is strictly greater than the average size of SPs given by the GKS algorithm at all values of $d > d_s(c)$ (see Figure 7).
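The RS prediction for this ensemble can be evaluated with the same bisection idea, since the right hand side of (55) is decreasing in $p$. The sketch below is illustrative only and uses hypothetical function names. For $c = 2$ it can be checked against the closed form in terms of $w = d\,(1 - p_*)$, which solves $w e^{w} = d$: substituting into (57) gives $\rho = (2w + w^2)/(2d)$.

```python
import math

def rs_fixed_point_pr(d, c, tol=1e-14):
    """Solve p = (1 - exp(d (p - 1)))^(c - 1), equation (55), by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # p - f(p) is increasing because f is decreasing in p.
        if mid - (1.0 - math.exp(d * (mid - 1.0))) ** (c - 1) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rs_density_pr(d, c):
    """RS estimate (57) of the relative MSP size (c is an integer >= 2)."""
    p = rs_fixed_point_pr(d, c)
    first = (d / c) * (1.0 - p ** (c / (c - 1)))
    second = (1.0 - p * d) * math.exp(d * (p - 1.0))
    return first + second
```

At $c = 2$ and $d = e$ the fixed point is $p_* = 1 - 1/e$, where the left hand side of (56) equals 1, consistently with $d_s(2) = e$.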

7.3. $\mathcal{G}_{PP}(d, c)$. We will now briefly examine our MSP model on the ensemble $\mathcal{G}_{PP}(d, c)$, where both factor nodes and variable nodes have Poissonian random degrees, of mean $c$ and $d$, respectively. From (23) and (24) we obtain the fixed point condition for the probability distribution of messages:

$$p_* = e^{-c\, e^{d(p_* - 1)}}. \tag{58}$$

As usual, $p$ is the parameter characterizing the distribution of messages $P(t) = p\,\delta(t) + (1 - p)\,\delta(1 - t)$. Equation (58) admits one and only one fixed point solution $p_*$ for each value of $d$ and $c$. In fact, the right hand side of (58), $f(p) = e^{-c\, e^{d(p - 1)}}$, is continuous and strictly decreasing, with $f(0) > 0$ and $f(1) < 1$. The first derivative condition $|f'(p_*)| = 1$ defines the critical line $d_s(c)$ through

$$d_s(c) = -\frac{1}{p_* \log(p_*)}. \tag{59}$$

The curve $d_s(c)$ separates the RS phase from the RSB phase in the $c$-$d$ parametric space (see Figure 8). The unbounded RS region shares some resemblance with the corresponding (although $d$-discretized) region of $\mathcal{G}_{PR}(d, c)$ and is at variance with the compact area of the RS phase in $\mathcal{G}_{RP}$.

We can compute the MSP relative size $\rho$ through (25) and obtain

$$\rho = \frac{d}{c}\,(1 - p_*) + (1 - d\, p_*)\, e^{d(p_* - 1)} \quad \text{for } d < d_s(c), \tag{60}$$

which holds only as an approximation in the RSB phase. We can see from Figure 8 that $\rho$ is decreasing both in $c$ and $d$, as was observed in the other ensembles as well.
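The critical line (59) can be traced parametrically: for a chosen $p_* \in (0, 1)$, equation (59) gives $d_s$, and inverting (58), that is, $\log p_* = -c\, e^{d(p_* - 1)}$, gives the corresponding $c$. The sketch below does this and also evaluates (60); the function names are hypothetical and the bisection solver is an implementation choice of this example.

```python
import math

def critical_point_pp(p_star):
    """Return (d_s, c) on the critical line (59) for a given p* in (0, 1)."""
    d_s = -1.0 / (p_star * math.log(p_star))
    # Invert the fixed point equation (58) to recover c.
    c = -math.log(p_star) * math.exp(d_s * (1.0 - p_star))
    return d_s, c

def rs_fixed_point_pp(d, c, tol=1e-14):
    """Solve p = exp(-c exp(d (p - 1))), equation (58), by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - math.exp(-c * math.exp(d * (mid - 1.0))) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rs_density_pp(d, c):
    """RS estimate (60) of the relative MSP size."""
    p = rs_fixed_point_pp(d, c)
    return (d / c) * (1.0 - p) + (1.0 - d * p) * math.exp(d * (p - 1.0))
```

Since $-p \log p \le 1/e$ on $(0, 1)$, every point of the line has $d_s \ge e$, with equality at $p_* = 1/e$, where $c = e^{e - 1}$. This is consistent with the RS region being unbounded in $c$.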

Figure 7: RS cavity method analytical prediction (57) for the MSP size in $\mathcal{G}_{PR}(d, c)$ (dot-dashed lines), for $c = 2, 3, 4, 5, 6$, compared with the set packing size from phase 1 of the GKS algorithm (dashed lines) and with a complete run of the algorithm (continuous lines). (Axes: mean variable degree $d$ versus set packing size $\rho$.)

Figure 8: MSP relative size for the ensemble $\mathcal{G}_{PP}(d, c)$ as given by (60), shown through isolines together with the phase boundary. Above the boundary separating the RS and RSB phases, values are only an approximation to the exact value of $\rho$. (Axes: mean factor degree $c$ versus mean variable degree $d$.)

7.4. $\mathcal{G}_{RR}(d, c)$. The MSP problem on the ensemble $\mathcal{G}_{RR}(d, c)$, the straightforward generalization to factor graphs of the random regular graph ensemble, allows some simplification of the cavity formalism thanks to its homogeneity. It has already been the object of preliminary studies by Weigt and Hartmann [16] and then of a much deeper work by Hansen-Goos and Weigt [17], who recast it as a hard spheres model on a generalized Bethe lattice. The authors studied this hard spheres model on $\mathcal{G}_{RR}(d, c)$ through the cavity method, both at finite chemical potential $\mu$ and in the close packing limit, and found that the 1RSB solution is unstable in the close packing limit, therefore suggesting the need for a full RSB treatment of the problem.

8. Conclusions

We studied the average asymptotic behaviour of random instances of the maximum set packing problem, both from a mathematical and a physical viewpoint. We contributed to the known list of models where the replica symmetric cavity method can be proven to give exact results, thanks to the generalization of an algorithm (and of its analysis) first proposed by Karp and Sipser [8]. Moreover, our analysis addresses a problem reported in recent work on weighted maximum matchings and independent sets on random graphs [22], where the authors could not extend their results to the unweighted cases. We achieve here the desired result by making use of the grand canonical potential instead of the direct computation of single variable expectations. We also extend their condition for the system to be in what physicists call a replica symmetric phase, namely, the uniqueness of the fixed point of the square of a certain operator (which is the analogue of the one defined in (42)), to the more general setting of maximum set packing (although without weights). On some problem ensembles, where the assumptions of Theorems 2 and 3 no longer hold and the RS cavity method fails, we used the 1RSB cavity method machinery to obtain an analytical estimation of the MSP size. Numerical simulations show very good agreement of the 1RSB estimation with the exact values, although comparisons have been done only with small random problems, due to the exact algorithm being of exponential time complexity. The GKS algorithm, instead, generally fails to find any MSPs, except for some special instances of the problem.

Some questions remain open for further investigation. To validate the 1RSB approach, the stability of the 1RSB solution has to be checked against more steps of replica symmetry breaking. Moreover, a thorough analysis of the second phase of the GKS algorithm could shed some light on the mechanism of replica symmetry breaking and give a rigorous lower bound to the average maximum packing size.

Regarding the problem of looking for optimal solutions in single samples, we believe that an efficient heuristic algorithm, able to obtain near-optimal configurations also in the RSB phase, could be constructed following the ideas of [33].

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research has received financial support from the European Research Council (ERC) through Grant agreement no. 247328 and from the Italian Research Minister through the FIRB project no. RBFR086NN1.

References

[1] S. Micali and V. V. Vazirani, “An $O(\sqrt{|V|}\,|E|)$ algorithm for finding maximum matching in general graphs,” in Proceedings of the 21st Annual IEEE Symposium on Foundations of Computer Science, pp. 17–27, 1980.

[2] N. Alon, J.-H. Kim, and J. Spencer, “Nearly perfect matchings in regular simple hypergraphs,” Israel Journal of Mathematics, vol. 100, pp. 171–187, 1997.

[3] Z. Füredi, J. Kahn, and P. D. Seymour, “On the fractional matching polytope of a hypergraph,” Combinatorica, vol. 13, no. 2, pp. 167–180, 1993.

[4] Y. H. Chan and L. C. Lau, “On linear and semidefinite programming relaxations for hypergraph matching,” Mathematical Programming, vol. 135, pp. 123–148, 2011.

[5] V. Rödl and L. Thoma, “Asymptotic packing and the random greedy algorithm,” Random Structures & Algorithms, vol. 8, no. 3, pp. 161–177, 1996.

[6] M. M. Halldórsson, “Approximations of weighted independent set and hereditary subset problems,” Journal of Graph Algorithms and Applications, vol. 4, no. 1, pp. 1–6, 2000.

[7] E. Hazan, S. Safra, and O. Schwartz, “On the complexity of approximating K-set packing,” Computational Complexity, vol. 15, no. 1, pp. 20–39, 2006.

[8] R. M. Karp and M. Sipser, “Maximum matchings in sparse random graphs,” in Proceedings of the 22nd Annual Symposium on Foundations of Computer Science, vol. 12, pp. 364–375, IEEE Computer Society, 1981.

[9] N. C. Wormald, “The differential equation method for random graph processes and greedy algorithms,” in Lectures on Approximation and Randomized Algorithms, pp. 73–155, 1999.

[10] G. Parisi and M. Mézard, “Replicas and optimization,” Journal de Physique Lettres, vol. 46, no. 17, pp. 771–778, 1985.

[11] G. Parisi and M. Mézard, “A replica analysis of the travelling salesman problem,” Journal de Physique, vol. 47, pp. 1285–1296, 1986.

[12] H. Zhou and Z.-C. Ou-Yang, “Maximum matching on random graphs,” In press, http://arxiv.org/abs/cond-mat/0309348.

[13] L. Zdeborová and M. Mézard, “The number of matchings in random graphs,” Journal of Statistical Mechanics, vol. 2006, Article ID P05003, 2006.

[14] L. Dall'Asta, A. Ramezanpour, and P. Pin, “Statistical mechanics of maximal independent sets,” Physical Review E, vol. 80, Article ID 061136, 2009.

[15] M. Mézard and M. Tarzia, “Statistical mechanics of the hitting set problem,” Physical Review E, vol. 76, no. 4, Article ID 041124, 10 pages, 2007.

[16] M. Weigt and A. K. Hartmann, “Glassy behavior induced by geometrical frustration in a hard-core lattice-gas model,” Europhysics Letters (EPL), vol. 62, Article ID 021005, p. 533, 2003.

[17] H. Hansen-Goos and M. Weigt, “A hard-sphere model on generalized Bethe lattices: statics,” Journal of Statistical Mechanics, vol. 2005, Article ID P04006, 2005.

[18] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.

[19] A. Montanari and M. Mézard, Information, Physics, and Computation, Oxford University Press, 2009.

[20] G. Parisi, M. Mézard, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, Singapore, 1987.

[21] M. Mézard and G. Parisi, “The cavity method at zero temperature,” Journal of Statistical Physics, vol. 22, no. 1-2, pp. 1–34, 2003.

[22] D. Gamarnik, T. Nowicki, and G. Swirszcz, “Maximum weight independent sets and matchings in sparse random graphs. Exact results using the local weak convergence method,” Random Structures and Algorithms, vol. 28, no. 1, pp. 76–106, 2006.

[23] A. Montanari, G. Parisi, and F. Ricci-Tersenghi, “Instability of one-step replica-symmetry-broken phase in satisfiability problems,” Journal of Physics A, vol. 37, no. 6, p. 2073, 2004.

[24] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, “Gibbs states and the set of solutions of random constraint satisfaction problems,” Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 25, pp. 10318–10323, 2007.

[25] M. Mézard, F. Ricci-Tersenghi, and R. Zecchina, “Two solutions to diluted p-spin models and XORSAT problems,” Journal of Statistical Physics, vol. 111, no. 3-4, pp. 505–533, 2002.

[26] G. Semerjian, “On the freezing of variables in random constraint satisfaction problems,” Journal of Statistical Physics, vol. 130, no. 2, pp. 251–293, 2008.

[27] M. Mézard and G. Parisi, “The Bethe lattice spin glass revisited,” The European Physical Journal B, vol. 20, no. 2, pp. 217–233, 2001.

[28] S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, “A new algorithm for generating all the maximal independent sets,” SIAM Journal on Computing, vol. 6, no. 3, pp. 505–517, 1977.

[29] G. Csárdi and T. Nepusz, “The igraph software package for complex network research,” InterJournal, Complex Systems, p. 1695, 2006.

[30] N. C. Wormald, “Models of random regular graphs,” in Surveys in Combinatorics, London Mathematical Society Lecture Note, pp. 239–298, 1999.

[31] S. Takabe and K. Hukushima, “Minimum vertex cover problems on random hypergraphs: replica symmetric solution and a leaf removal algorithm,” In press, http://arxiv.org/abs/1301.5769.

[32] S. Sanghavi, D. Shah, and A. S. Willsky, “Message passing for maximum weight independent set,” IEEE Transactions on Information Theory, vol. 55, no. 11, pp. 4822–4834, 2009.

[33] M. Weigt and H. Zhou, “Message passing for vertex covers,” Physical Review E, vol. 74, Article ID 046110, 19 pages, 2006.

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

High Energy PhysicsAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

FluidsJournal of

Journal of Atomic and Molecular PhysicsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in Condensed Matter Physics

OpticsInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Astronomy

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Superconductivity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Statistical MechanicsInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GravityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

AstrophysicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Physics Research International

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Solid State PhysicsJournal of

 Computational  Methods in Physics

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Soft MatterJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

AerodynamicsJournal of

Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PhotonicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Biophysics

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ThermodynamicsJournal of

Page 9: The Statistical Mechanics of Random Set Packing and a

International Journal of Statistical Mechanics 9

The last equation admits only one solution for each value of119889 and 119888 as it is easily seen through a monotony argumentconsidering the left and right hand side of the equation Thevalues of 119888 as a function of 119889 satisfying |119891

1015840

(119901lowast)| = 1 are the

critical points 119888119904(119889) delimiting the RS phase (119888 lt 119888

119904(119889)) and

are given by

119888119904(119889) =

119890

119889 minus 1

(49)

For 119888 gt 119890(119889 minus 1) a core survives phase 1 of the GKSalgorithm as stated by Theorem 2 and shown in Figure 2We notice that the same threshold value 119890(119889 minus 1) is foundin the minimum set covering problem on the dual factorgraph ensemble [31] where the application of a leaf removalalgorithm that isphase 1 ofGKS unveils an analogous coreemergence phenomena In the RS phase the MSP density isequal to the minimum set covering density but in the RSBphase we cannot compare the cavity predictions for thesetwo dual problems since in [31] the 1RSB solution to theminimum set covering is not provided

In the matching case that is 119889 = 2 equation (49)expresses the notorious 119890-phenomena discovered by Karpand Sipser while for higher values of 119889 it provides anextension of the critical threshold

We can recover the same critical condition equation (49)through the bug propagation method as the transfer matrix119875(119887

0rarr 119887

1| 119886

0rarr 119886

1) non-zero elements are only

119875 (0 997888rarr 1 | 1 997888rarr 0) = 119901119889minus1

119875 (1 997888rarr 0 | 0 997888rarr 1) = 119901119889minus1

(50)

which give 120582119872

= 119901119889minus1 The average branching factor is 119862119863 =

119888(119889 minus 1) so that (30) and (48) yield (49) The analytical valuefor the relative size of MSPs that is the particle density 120588 is

120588 (119889 119888) =

119889

119888

(1 minus 119901lowast) + (1 minus 119889) 119901

119889

lowastfor 119888 lt 119888

119904(119889) (51)

In Figure 3 we compared 120588 from (51) as a function of 119888 forsome values 119889with an exact algorithm applied to finite factorgraphs (as explained in Section 6) both above and below 119888

119904

Clearly for 119888 gt 119888119904the RS approximation is increasingly more

inaccurateWe continue our analysis of G

119877119875above the critical value

119888119904through the 1RSB cavity method as outlined in Section 42

Fixed point messages of (33) are distributed as

119875 (119902) = 1199010120575 (119902) + 119901

1120575 (119902 minus 1) + 119901

21198752(119902) (52)

where

1199011= 119890

minus119888(1minus1199010)119889minus1

1199010= 1 minus 119890

minus119888119901119889

1

1199012= 1 minus 119901

0minus 119901

1

(53)

and $P_2(q)$ has to be determined through (33). Equation (53) always admits an RS solution $p_1 = 1 - p_0 = p_*$, which is stable up to $c_s$, as already noticed. Above $c_s$ a new stable fixed point with $p_2 > 0$ continuously arises, and we study it numerically with a population dynamics algorithm.

Figure 2: Results from the GKS algorithm applied to $G_{RP}(3, c)$. The set packing size after phase 1 (blue line) and after the algorithm stops (red line) are reported. The vertical line is $c_s = e/2$ and is drawn as a visual aid to determine the point where the RS assumptions stop holding true. Above $c_s$ a nonempty core emerges continuously (green line). (Plot axes: $\rho$ versus mean factor degree $c$; legend: Core, Phase 1.)

Figure 3: RS cavity method analytical prediction, equation (51), for MSP size in $G_{RP}(d, c)$ compared with the average size of MSPs obtained from 1000 samples of factor graphs with 100 variable nodes (dots). Dashed parts of the lines are the RS estimations in the RSB phase, that is, for $c > e/(d - 1)$, where it is no longer exact. (Plot axes: set packing size $\rho$ versus mean factor degree $c$; legend: RS and exact results for $d = 2, 3, 4, 5$.)
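The onset of $p_2 > 0$ can be seen by iterating the weights in (53) directly. This is only a sketch of the weight recursion: it does not compute $P_2(q)$, which requires the population dynamics on (33). It assumes the exponent $d - 1$ in both maps of (53) and starts from $p_0 = 0$, a choice of ours that selects the branch with $p_2 \ge 0$.

```python
import math

def iterate_53(d, c, sweeps=5000):
    """Alternate the two maps of (53), starting from p0 = 0 (our choice of start)."""
    p0, p1 = 0.0, 0.0
    for _ in range(sweeps):
        p1 = math.exp(-c * (1.0 - p0) ** (d - 1))
        p0 = 1.0 - math.exp(-c * p1 ** (d - 1))
    p2 = 1.0 - p0 - p1          # weight of the nontrivial part P_2(q)
    return p0, p1, p2

# d = 2 (matching): c_s = e, so c = 2 is in the RS phase and c = 3 is above it
below = iterate_53(2, 2.0)
above = iterate_53(2, 3.0)
```

Below $c_s$ the recursion collapses onto the RS solution and $p_2$ vanishes up to rounding; above $c_s$ it settles on a solution with $p_1 \neq 1 - p_0$.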

Figure 4: Maximum set packing size for $G_{RP}(d, c)$ as a function of $c$ for $d = 2$. The blue line corresponds to the RS analytical value (51), the red line is from the Karp-Sipser algorithm, and the dotted black line is the 1RSB solution (54) with $y \gg 1$. Karp-Sipser and the 1RSB cavity method yield the same and exact result. (Plot axes: set packing size $\rho$ versus mean factor degree $c$, with $c = e$ marked; legend: RS, 1RSB, GKS.)

The 1RSB free energy as a function of the Parisi parameter $y$ takes the form

$$\phi(y) = -\frac{d}{yc}\,\mathbb{E}\log\left[(1 - e^{y})\prod_{j=1}^{C}\left(1 - \prod_{s=1}^{d-1} q_{sj}\right) + e^{y}\right] + \frac{d-1}{y}\,\mathbb{E}\log\left[(1 - e^{y})\left(1 - \prod_{r=1}^{d} q_{r}\right) + e^{y}\right]. \tag{54}$$
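As a consistency check, (54) can be evaluated on the RS solution. The computation below assumes (our reading, not stated explicitly in this section) that $C$ is a Poisson variable of mean $c$ and that the $q$'s are drawn independently from (52) with $p_2 = 0$, $p_1 = p_*$, and $p_0 = 1 - p_*$, where $p_* = e^{-c\,p_*^{d-1}}$. Under these assumptions the free energy does not depend on $y$ and reduces to minus the RS density (51).

```latex
% Each q equals 1 with probability p_* and 0 otherwise (RS solution of (53)).
\begin{align*}
\Pr\Big[\prod_{j=1}^{C}\Big(1-\prod_{s=1}^{d-1} q_{sj}\Big)=1\Big]
  &= \mathbb{E}_C\big[(1-p_*^{\,d-1})^{C}\big] = e^{-c\,p_*^{\,d-1}} = p_* ,\\
\mathbb{E}\log\Big[(1-e^{y})\prod_{j=1}^{C}\Big(1-\prod_{s=1}^{d-1} q_{sj}\Big)+e^{y}\Big]
  &= p_*\cdot 0 + (1-p_*)\,y ,\\
\mathbb{E}\log\Big[(1-e^{y})\Big(1-\prod_{r=1}^{d} q_{r}\Big)+e^{y}\Big]
  &= p_*^{\,d}\,y ,\\
\phi_{\mathrm{RS}}(y) &= -\frac{d}{c}\,(1-p_*) + (d-1)\,p_*^{\,d} = -\rho(d,c).
\end{align*}
```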

As prescribed by the cavity method, the value $y_s$ which maximizes $\phi(y)$ over $[0, +\infty]$ yields the correct free energy; therefore, we have $\phi(y_s) = -\rho_{\mathrm{1RSB}}$.

Unsurprisingly, as they belong to different computational classes, the cases $d = 2$ and $d \ge 3$ show qualitatively different pictures. In the case of maximum matching on the Poissonian graph ensemble, numerical estimates suggest that complexity is an increasing function of $y$ on the whole real positive axis. The correct choice for the parameter $y$ is then $y_s = +\infty$, as already conjectured in [12], and we find that the maximum matching size prediction from the 1RSB cavity method fully agrees with rigorous results from Karp and Sipser [8] and with the size of the matchings given by their algorithm (see Figure 4). The 1RSB ansatz is therefore exact for $d = 2$.

The analysis of the $d \ge 3$ case does not yield such a definite result. The complexity of states $\Sigma$ is no longer a strictly increasing function of $y$. It reaches its maximum at $y_d$, the choice of $y$ that selects the most numerous states, which could be those where local search greedy algorithms are more likely to be trapped. Then it decreases up to the finite value $y_s$, where complexity changes sign and takes negative values. Therefore $y_s$ is the correct choice for the Parisi parameter, which maximizes $\phi(y)$. Plotted as a function of $\rho$, complexity $\Sigma$ has a convex nonphysical part, with extrema the RS solution (on the right) and the point corresponding to the dynamic 1RSB solution (on the left), and a concave, physically relevant part for $y \in [y_d, y_s]$ (see Figure 5). The 1RSB prediction seems to be in very good agreement with the exact algorithm, and we are inclined to believe that no further steps of replica symmetry breaking are needed in this ensemble. The GKS algorithm instead falls short of the exact value (see Figure 6); therefore, it constitutes a lower bound which is not strict, but at least it could probably be made rigorous by carrying on the analysis of the GKS algorithm beyond phase 1.

Figure 5: Complexity in $G_{RP}(d, c)$ with $d = 3$ and $c = 4.0$ as a function of the relative MSP size $\rho$. (Plot axes: complexity $\Sigma$ versus set packing size $\rho$.)

Figure 6: MSP size in $G_{RP}(3, c)$ as a function of $c$. The RS and 1RSB cavity method predictions are compared with the GKS algorithm and with an exact algorithm on factor graphs of 100 variable nodes. (Plot axes: set packing size $\rho$ versus mean factor degree $c$, with $c = e/2$ marked; legend: RS, 1RSB, GKS, $N = 100$.)
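For readers who want to reproduce this kind of finite-size comparison on very small instances, the following sketch computes an exact maximum set packing by branch and bound. It is not the exact algorithm of Section 6, and the sampler is a hypothetical simplification: each set draws $d$ distinct elements uniformly, so element degrees are only approximately Poissonian with mean $c$.

```python
import random

def sample_sets(n_sets, d, c, rng):
    """Hypothetical sampler: n_sets sets of d distinct elements out of M = round(n_sets*d/c)."""
    m = max(d, round(n_sets * d / c))
    return [frozenset(rng.sample(range(m), d)) for _ in range(n_sets)]

def max_set_packing(sets):
    """Exact MSP size by branch and bound over element bitmasks (exponential time)."""
    masks = [sum(1 << e for e in s) for s in sets]
    best = 0

    def search(i, used, size):
        nonlocal best
        if size + (len(masks) - i) <= best:    # cannot improve on the incumbent
            return
        if i == len(masks):
            best = size
            return
        if masks[i] & used == 0:               # set i is disjoint from the current packing
            search(i + 1, used | masks[i], size + 1)
        search(i + 1, used, size)

    search(0, 0, 0)
    return best

# Example: relative MSP size of one small random instance with d = 3, c = 2
instance = sample_sets(12, 3, 2.0, random.Random(0))
rho_sample = max_set_packing(instance) / len(instance)
```

Averaging `rho_sample` over many instances gives a finite-size estimate of the kind shown as dots in Figure 3, subject to the sampling assumption above.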

7.2. $G_{PR}(d, c)$. The ensemble $G_{PR}(d, c)$ is constituted by factor graphs containing factor nodes of degree fixed to $c$ and variable nodes of Poissonian random degree of mean $d$. It has statistical properties with respect to the MSP problem quite different from those encountered in $G_{RP}$, as we will readily show. The MSP problem on $G_{PR}(d, c)$ with $c = 2$ is equivalent to the well-known problem of maximum independent sets on Poissonian graphs [14, 32]. The real parameter $p$ characterizing discrete support distributions of messages has to satisfy the fixed point condition (24), which reads

$$p_* = \left(1 - e^{d(p_* - 1)}\right)^{c-1}. \tag{55}$$

At fixed value of $c$ the RS ansatz holds up to the critical value $d_s(c)$, which is implicitly given by the first derivative condition (27):

$$(c - 1)\,p_*^{(c-2)/(c-1)}\,e^{(p_* - 1)\,d_s}\,d_s = 1. \tag{56}$$

Although the critical condition (56) is not as elegant as the one we obtained for the ensemble $G_{RP}$, it can be easily solved numerically for $d_s$ as a function of $c$. For $c = 2$ the threshold value is exactly $d_s(2) = e$. For $c > 2$, instead, $d_s$ is an increasing function of the factor degree $c$. Thanks to (25) we readily compute the MSP size in the RS phase:

$$\rho = \frac{d}{c}\left(1 - p_*^{c/(c-1)}\right) + (1 - p_* d)\,e^{d(p_* - 1)} \qquad \text{for } d < d_s(c). \tag{57}$$

We can see in Figure 7 that the MSP size $\rho(d, c)$ is a decreasing function in both arguments, as expected. Equation (57) can be taken as the RS estimate for the MSP size for $d > d_s(c)$. The RS estimate is strictly greater than the average size of SPs given by the GKS algorithm at all values of $d > d_s(c)$ (see Figure 7).
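The numerical solution of (56) mentioned above can be sketched with two nested bisections: an inner one for the fixed point of (55) at given $d$, and an outer one on $d$ for the derivative condition. The bracketing interval and the assumption that the left-hand side of (56) crosses 1 only once inside it are ours; function names are illustrative.

```python
import math

def f_pr(p, d, c):
    """Right-hand side of (55)."""
    return (1.0 - math.exp(d * (p - 1.0))) ** (c - 1)

def fixed_point_pr(d, c, iters=200):
    """Unique root of p = f_pr(p) on [0, 1]; p - f_pr(p) is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - f_pr(mid, d, c) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def slope_pr(d, c):
    """Left-hand side of (56), evaluated at the fixed point for this d (assumes c >= 2)."""
    p = fixed_point_pr(d, c)
    return (c - 1) * p ** ((c - 2) / (c - 1)) * math.exp((p - 1.0) * d) * d

def d_s_pr(c, lo=1e-3, hi=20.0, iters=200):
    """Bisection on d for slope = 1; assumes a single crossing inside [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slope_pr(mid, c) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rho_rs_pr(d, c):
    """RS estimate of equation (57)."""
    p = fixed_point_pr(d, c)
    return (d / c) * (1.0 - p ** (c / (c - 1))) + (1.0 - d * p) * math.exp(d * (p - 1.0))
```

For $c = 2$ the routine recovers $d_s(2) = e$, which is a convenient sanity check before using it at other values of $c$.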

7.3. $G_{PP}(d, c)$. We will now briefly examine our MSP model on the ensemble $G_{PP}(d, c)$, where both factor nodes and variable nodes have Poissonian random degrees, of mean $c$ and $d$, respectively. From (23) and (24) we obtain the fixed point condition for the probability distribution of messages:

$$p_* = e^{-c\,e^{d(p_* - 1)}}. \tag{58}$$

As usual, $p$ is the parameter characterizing the distribution of messages, $P(t) = p\,\delta(t) + (1 - p)\,\delta(t - 1)$. Equation (58) admits one and only one fixed point solution $p_*$ for each value of $d$ and $c$. In fact, the map $f(p)$ on the right-hand side of (58) is continuous and strictly decreasing, with $f(0) > 0$ and $f(1) < 1$. The first derivative condition $|f'(p_*)| = 1$ defines the critical line $d_s(c)$ through

$$d_s(c) = -\frac{1}{p_* \log(p_*)}. \tag{59}$$

The curve $d_s(c)$ separates the RS phase from the RSB phase in the $c$-$d$ parametric space (see Figure 8). The unbounded RS region shares some resemblance with the corresponding (although $d$-discretized) region of $G_{PR}(d, c)$ and is at variance with the compact area of the RS phase in $G_{RP}$.

We can compute the MSP relative size $\rho$ through (25) and obtain

$$\rho = \frac{d}{c}\,(1 - p_*) + (1 - d\,p_*)\,e^{d(p_* - 1)} \qquad \text{for } d < d_s(c), \tag{60}$$

which holds only as an approximation in the RSB phase. We can see from Figure 8 that $\rho$ is decreasing both in $c$ and $d$, as was observed in the other ensembles as well.
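One convenient way to trace the phase boundary is to use $p_*$ itself as a parameter: (59) gives $d_s$ directly, and (58) can then be solved for $c$ in closed form, $c = -\log(p_*)\,e^{d_s(1 - p_*)}$. The sketch below follows this route and also evaluates (60); the parametrization is our rearrangement of (58) and (59), not a formula from the text.

```python
import math

def critical_point_pp(p):
    """Point (c, d_s) on the critical line for a fixed-point value p in (0, 1)."""
    d = -1.0 / (p * math.log(p))                  # equation (59)
    c = -math.log(p) * math.exp(d * (1.0 - p))    # equation (58) solved for c
    return c, d

def rho_rs_pp(d, c, iters=200):
    """RS estimate (60); p_* obtained from (58) by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - math.exp(-c * math.exp(d * (mid - 1.0))) < 0.0:
            lo = mid
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    return (d / c) * (1.0 - p) + (1.0 - d * p) * math.exp(d * (p - 1.0))

# A few boundary points (c, d_s), obtained by sweeping the parameter p
boundary = [critical_point_pp(p) for p in (0.1, 1.0 / math.e, 0.7)]
```

Since $-p \log p \le 1/e$ on $(0, 1)$, this parametrization never yields $d_s$ below $e$, which is consistent with an unbounded RS region.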

Figure 7: RS cavity method analytical prediction (57) for MSP size in $G_{PR}(d, c)$ (dot-dashed lines) compared with the set packing size from phase 1 of the GKS algorithm (dashed lines) and with a complete run of the algorithm (continuous lines). (Plot axes: set packing size $\rho$ versus mean variable degree $d$; curves for $c = 2, 3, 4, 5, 6$.)

Figure 8: MSP relative size for the ensemble $G_{PP}(d, c)$ as given by (60). Above the boundary separating the RS and RSB phases, values are only an approximation to the exact value of $\rho$. (Plot axes: mean variable degree $d$ versus mean function degree $c$; legend: isolines, phase boundary; colour scale for $\rho$ from 0 to 1.)

7.4. $G_{RR}(d, c)$. The MSP problem on the ensemble $G_{RR}(d, c)$, the straightforward generalization to factor graphs of the random regular graphs ensemble, allows some simplification of the cavity formalism thanks to its homogeneity. It has already been the object of preliminary studies by Weigt and Hartmann [16] and then of a much deeper work by Hansen-Goos and Weigt [17], who cast it as a hard spheres model on a generalized Bethe lattice. The authors studied this hard spheres model on $G_{RR}(d, c)$ through the cavity method, both at finite chemical potential $\mu$ and in the close packing limit, and found that the 1RSB solution is unstable in the close packing limit, therefore suggesting the need of a full RSB treatment of the problem.

8. Conclusions

We studied the average asymptotic behaviour of random instances of the maximum set packing problem, both from a mathematical and a physical viewpoint. We contributed to the known list of models where the replica symmetric cavity method can be proven to give exact results, thanks to the generalization of an algorithm (and of its analysis) first proposed by Karp and Sipser [8]. Moreover, our analysis addresses a problem reported in recent work on weighted maximum matchings and independent sets on random graphs [22], where the authors could not extend their results to the unweighted cases. We achieve here the desired result by making use of the grand canonical potential instead of the direct computation of single variable expectations. We also extend their condition for the system to be in what physicists call a replica symmetric phase, namely, the uniqueness of the fixed point of the square of a certain operator (which is the analogue of the one defined in (42)), to the more general setting of maximum set packing (although without weights). On some problem ensembles, where the assumptions of Theorems 2 and 3 no longer hold and the RS cavity method fails, we used the 1RSB cavity method machinery to obtain an analytical estimation of the MSP size. Numerical simulations show very good agreement of the 1RSB estimation with the exact values, although comparisons have been done only with small random problems, due to the exact algorithm being of exponential time complexity. The GKS algorithm instead generally fails to find any MSPs, except for some special instances of the problem.

Some questions remain open for further investigation. To validate the 1RSB approach, the stability of the 1RSB solution has to be checked against more steps of replica symmetry breaking. Moreover, a thorough analysis of the second phase of the GKS algorithm could shed some light on the mechanism of replica symmetry breaking and give a rigorous lower bound to the average maximum packing size.

Regarding the problem of looking for optimal solutions in single samples, we believe that an efficient heuristic algorithm, able to obtain near-to-optimal configurations also in the RSB phase, could be constructed following the ideas of [33].

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research has received financial support from the European Research Council (ERC) through Grant agreement no. 247328 and from the Italian Research Minister through the FIRB project no. RBFR086NN1.

References

[1] S. Micali and V. V. Vazirani, “An $O(\sqrt{|V|}\,|E|)$ algorithm for finding maximum matching in general graphs,” in Proceedings of the 21st Annual IEEE Symposium on Foundations of Computer Science, pp. 17–27, 1980.

[2] N. Alon, J.-H. Kim, and J. Spencer, “Nearly perfect matchings in regular simple hypergraphs,” Israel Journal of Mathematics, vol. 100, pp. 171–187, 1997.

[3] Z. Füredi, J. Kahn, and P. D. Seymour, “On the fractional matching polytope of a hypergraph,” Combinatorica, vol. 13, no. 2, pp. 167–180, 1993.

[4] Y. H. Chan and L. C. Lau, “On linear and semidefinite programming relaxations for hypergraph matching,” Mathematical Programming, vol. 135, pp. 123–148, 2011.

[5] V. Rödl and L. Thoma, “Asymptotic packing and the random greedy algorithm,” Random Structures & Algorithms, vol. 8, no. 3, pp. 161–177, 1996.

[6] M. M. Halldórsson, “Approximations of weighted independent set and hereditary subset problems,” Journal of Graph Algorithms and Applications, vol. 4, no. 1, pp. 1–6, 2000.

[7] E. Hazan, S. Safra, and O. Schwartz, “On the complexity of approximating K-set packing,” Computational Complexity, vol. 15, no. 1, pp. 20–39, 2006.

[8] R. M. Karp and M. Sipser, “Maximum matchings in sparse random graphs,” in Proceedings of the 22nd Annual Symposium on Foundations of Computer Science, vol. 12, pp. 364–375, IEEE Computer Society, 1981.

[9] N. C. Wormald, “The differential equation method for random graph processes and greedy algorithms,” in Lectures on Approximation and Randomized Algorithms, pp. 73–155, 1999.

[10] G. Parisi and M. Mézard, “Replicas and optimization,” Journal de Physique Lettres, vol. 46, no. 17, pp. 771–778, 1985.

[11] G. Parisi and M. Mézard, “A replica analysis of the travelling salesman problem,” Journal de Physique, vol. 47, pp. 1285–1296, 1986.

[12] H. Zhou and Z.-C. Ou-Yang, “Maximum matching on random graphs,” In press, http://arxiv.org/abs/cond-mat/0309348.

[13] L. Zdeborová and M. Mézard, “The number of matchings in random graphs,” Journal of Statistical Mechanics, vol. 2006, Article ID P05003, 2006.

[14] L. Dall'Asta, A. Ramezanpour, and P. Pin, “Statistical mechanics of maximal independent sets,” Physical Review E, vol. 80, Article ID 061136, 2009.

[15] M. Mézard and M. Tarzia, “Statistical mechanics of the hitting set problem,” Physical Review E, vol. 76, no. 4, Article ID 041124, 10 pages, 2007.

[16] M. Weigt and A. K. Hartmann, “Glassy behavior induced by geometrical frustration in a hard-core lattice-gas model,” Europhysics Letters (EPL), vol. 62, Article ID 021005, p. 533, 2003.

[17] H. Hansen-Goos and M. Weigt, “A hard-sphere model on generalized Bethe lattices: statics,” Journal of Statistical Mechanics, vol. 2005, Article ID P04006, 2005.

[18] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.

[19] A. Montanari and M. Mézard, Information, Physics, and Computation, Oxford University Press, 2009.

[20] G. Parisi, M. Mézard, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, Singapore, 1987.

[21] M. Mézard and G. Parisi, “The cavity method at zero temperature,” Journal of Statistical Physics, vol. 22, no. 1-2, pp. 1–34, 2003.

[22] D. Gamarnik, T. Nowicki, and G. Swirszcz, “Maximum weight independent sets and matchings in sparse random graphs. Exact results using the local weak convergence method,” Random Structures and Algorithms, vol. 28, no. 1, pp. 76–106, 2006.

[23] A. Montanari, G. Parisi, and F. Ricci-Tersenghi, “Instability of one-step replica-symmetry-broken phase in satisfiability problems,” Journal of Physics A, vol. 37, no. 6, p. 2073, 2004.

[24] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, “Gibbs states and the set of solutions of random constraint satisfaction problems,” Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 25, pp. 10318–10323, 2007.

[25] M. Mézard, F. Ricci-Tersenghi, and R. Zecchina, “Two solutions to diluted p-spin models and XORSAT problems,” Journal of Statistical Physics, vol. 111, no. 3-4, pp. 505–533, 2002.

[26] G. Semerjian, “On the freezing of variables in random constraint satisfaction problems,” Journal of Statistical Physics, vol. 130, no. 2, pp. 251–293, 2008.

[27] M. Mézard and G. Parisi, “The Bethe lattice spin glass revisited,” The European Physical Journal B, vol. 20, no. 2, pp. 217–233, 2001.

[28] S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, “A new algorithm for generating all the maximal independent sets,” SIAM Journal on Computing, vol. 6, no. 3, pp. 505–517, 1977.

[29] G. Csardi and T. Nepusz, “The igraph software package for complex network research,” InterJournal Complex Systems, p. 1695, 2006.

[30] N. C. Wormald, “Models of random regular graphs,” in Surveys in Combinatorics, London Mathematical Society Lecture Note, pp. 239–298, 1999.

[31] S. Takabe and K. Hukushima, “Minimum vertex cover problems on random hypergraphs: replica symmetric solution and a leaf removal algorithm,” In press, http://arxiv.org/abs/1301.5769.

[32] S. Sanghavi, D. Shah, and A. S. Willsky, “Message passing for maximum weight independent set,” IEEE Transactions on Information Theory, vol. 55, no. 11, pp. 4822–4834, 2009.

[33] M. Weigt and H. Zhou, “Message passing for vertex covers,” Physical Review E, vol. 74, Article ID 046110, 19 pages, 2006.



[24] F Krzakala AMontanari F Ricci-Tersenghi G Semerjian andL Zdeborova ldquoGibbs states and the set of solutions of randomconstraint satisfaction problemsrdquo Proceedings of the NationalAcademy of Sciences of the United States of America vol 104 no25 pp 10318ndash10323 2007

[25] MMezard F Ricci-Tersenghi and R Zecchina ldquoTwo solutionsto diluted p-spin models and XORSAT problemsrdquo Journal ofStatistical Physics vol 111 no 3-4 pp 505ndash533 2002

[26] G Semerjian ldquoOn the freezing of variables in random con-straint satisfaction problemsrdquo Journal of Statistical Physics vol130 no 2 pp 251ndash293 2008

[27] MMezard and G Parisi ldquoThe bethe lattice spin glass revisitedrdquoTheEuropean Physical Journal B vol 20 no 2 pp 217ndash233 2001

[28] S Tsukiyama M Ide H Ariyoshi and I Shirakawa ldquoA newalgorithm for generating all the maximal independent setsrdquoSIAM Journal on Computing vol 6 no 3 pp 505ndash517 1977

[29] G Csardi and T Nepusz ldquoThe igraph software package forcomplex network researchrdquo InterJournal Complex Systems p1695 2006

[30] N C Wormald ldquoModels of random regular graphsrdquo in Surveysin Combinatorics London Mathematical Society Lecture Notepp 239ndash298 1999

[31] S Takabe andKHukushima ldquoMinimumvertex cover problemson random hypergraphs replica symmetric solutionand a leafremoval algorithmrdquo In press httparxivorgabs13015769

[32] S Sanghavi D Shah and A S Willsky ldquoMessage passingfor maximum weight independent setrdquo IEEE Transactions onInformation Theory vol 55 no 11 pp 4822ndash4834 2009

[33] M Weigt and H Zhou ldquoMessage passing for vertex coversrdquoPhysical Review E vol 74 Article ID 046110 19 pages 2006


with $c = 2$ is equivalent to the well-known problem of maximum independent sets on Poissonian graphs [14, 32]. The real parameter $p$ characterizing the discrete-support distributions of messages has to satisfy the fixed point condition (24), which reads

$$
p_* = \left(1 - e^{d(p_* - 1)}\right)^{c-1}. \tag{55}
$$

At a fixed value of $c$, the RS ansatz holds up to the critical value $d_s(c)$, which is implicitly given by the first derivative condition (27):

$$
(c - 1)\, p_*^{(c-2)/(c-1)}\, e^{(p_* - 1) d_s}\, d_s = 1. \tag{56}
$$

Although the critical condition (56) is not as elegant as the one we obtained for the ensemble $\mathcal{G}_{RP}$, it can be easily solved numerically for $d_s$ as a function of $c$. For $c = 2$ the threshold value is exactly $d_s(2) = e$. For $c > 2$, instead, $d_s$ is an increasing function of the factor degree $c$. Thanks to (25) we readily compute the MSP size in the RS phase:

$$
\rho = \frac{d}{c}\left(1 - p_*^{c/(c-1)}\right) + \left(1 - p_* d\right) e^{d(p_* - 1)} \quad \text{for } d < d_s(c). \tag{57}
$$

We can see in Figure 7 that the MSP size $\rho(d, c)$ is a decreasing function of both arguments, as expected. Equation (57) can be taken as the RS estimate of the MSP size for $d > d_s(c)$. The RS estimate is strictly greater than the average size of SPs given by the GKS algorithm at all values of $d > d_s(c)$ (see Figure 7).
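The numerical solution mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (`fixed_point_pr`, `rho_rs_pr`, `critical_degree_pr`) are hypothetical, and the substitution $u = e^{d(p_* - 1)}$ used for the threshold is our own rearrangement of (55) and (56). With it, (55) gives $p_* = (1 - u)^{c-1}$ and (56) gives $d_s = 1 / [(c - 1)\, u\, (1 - u)^{c-2}]$, leaving a single scalar equation in $u$ that plain bisection can solve.

```python
import math


def fixed_point_pr(d, c, iters=200):
    """Solve p = (1 - exp(d (p - 1)))**(c - 1), i.e. (55), by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # The right-hand side of (55) is decreasing in p, so mid - f(mid) is increasing.
        if mid - (1.0 - math.exp(d * (mid - 1.0))) ** (c - 1) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rho_rs_pr(d, c):
    """RS estimate (57) of the relative MSP size for mean variable degree d, factor degree c."""
    p = fixed_point_pr(d, c)
    return (d / c) * (1.0 - p ** (c / (c - 1))) + (1.0 - p * d) * math.exp(d * (p - 1.0))


def critical_degree_pr(c, iters=200):
    """Solve (55) and (56) jointly; returns (d_s, p_star) for integer c >= 2.

    Assumption of this sketch: we eliminate d and p through u = exp(d (p - 1))
    and bisect on the remaining equation u = exp(d(u) (p(u) - 1)).
    """
    def d_of(u):
        return 1.0 / ((c - 1) * u * (1.0 - u) ** (c - 2))

    def residual(u):
        p = (1.0 - u) ** (c - 1)
        return u - math.exp(d_of(u) * (p - 1.0))

    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    u = 0.5 * (lo + hi)
    return d_of(u), (1.0 - u) ** (c - 1)


d_s, p_s = critical_degree_pr(2)
print(d_s, p_s)            # for c = 2 the threshold should be close to e
print(rho_rs_pr(1.0, 2))   # RS estimate of rho at d = 1, c = 2
```

Bisection is used instead of iterating (55) directly because the map is decreasing and its iteration may oscillate close to the threshold, whereas the sign change of the residual is always bracketed.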

7.3. $\mathcal{G}_{PP}(d, c)$. We will now briefly examine our MSP model on the ensemble $\mathcal{G}_{PP}(d, c)$, where both factor nodes and variable nodes have Poissonian random degrees, of mean $c$ and $d$, respectively. From (23) and (24) we obtain the fixed point condition for the probability distribution of messages:

$$
p_* = e^{-c\, e^{d(p_* - 1)}}. \tag{58}
$$

As usual, $p$ is the parameter characterizing the distribution of messages, $P(t) = p\,\delta(t) + (1 - p)\,\delta(t - 1)$. Equation (58) admits one and only one fixed point solution $p_*$ for each value of $d$ and $c$. In fact, $f$, the right-hand side of (58), is continuous and strictly decreasing, with $f(0) > 0$ and $f(1) < 1$. The first derivative condition $|f'(p_*)| = 1$ defines the critical line $d_s(c)$ through

$$
d_s(c) = -\frac{1}{p_* \log(p_*)}. \tag{59}
$$

The curve $d_s(c)$ separates the RS phase from the RSB phase in the $c$-$d$ parametric space (see Figure 8). The unbounded RS region shares some resemblance with the corresponding (although $c$-discretized) region of $\mathcal{G}_{PR}(d, c)$ and is at variance with the compact area of the RS phase in $\mathcal{G}_{RP}$.

We can compute the MSP relative size $\rho$ through (25) and obtain

$$
\rho = \frac{d}{c}\left(1 - p_*\right) + \left(1 - d p_*\right) e^{d(p_* - 1)} \quad \text{for } d < d_s(c), \tag{60}
$$

which holds only as an approximation in the RSB phase. We can see from Figure 8 that $\rho$ is decreasing both in $c$ and $d$, as was observed in the other ensembles as well.
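Equations (58)–(60) are straightforward to evaluate numerically; the sketch below is one hedged way to do so. All function names are hypothetical. The parametrization of the critical line by $p_* \in (0, 1)$ is our own rearrangement: (59) gives $d_s$ directly from $p_*$, and solving (58) for $c$ then gives $c = -\log(p_*)\, e^{d_s (1 - p_*)}$. The stability check uses $f'(p_*) = d\, p_* \log p_*$, which follows from differentiating the right-hand side of (58) at the fixed point.

```python
import math


def fixed_point_pp(d, c, iters=200):
    """Solve p = exp(-c exp(d (p - 1))), i.e. (58), by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # f is strictly decreasing with f(0) > 0 and f(1) < 1, so the root is unique.
        if mid - math.exp(-c * math.exp(d * (mid - 1.0))) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rho_rs_pp(d, c):
    """RS expression (60) for the relative MSP size (requires c > 0)."""
    p = fixed_point_pp(d, c)
    return (d / c) * (1.0 - p) + (1.0 - d * p) * math.exp(d * (p - 1.0))


def is_rs_stable(d, c):
    """True when |f'(p*)| = -d p* log(p*) < 1 at the fixed point of (58)."""
    p = fixed_point_pp(d, c)
    return -d * p * math.log(p) < 1.0


def critical_point_pp(p):
    """Point (c, d_s(c)) of the critical line, parametrized by p* in (0, 1)."""
    d = -1.0 / (p * math.log(p))                   # (59)
    c = -math.log(p) * math.exp(d * (1.0 - p))     # (58) solved for c
    return c, d


# Trace a few points of the phase boundary and evaluate rho inside the RS region.
for p in (0.2, 1.0 / math.e, 0.6):
    print(critical_point_pp(p))
print(rho_rs_pp(1.0, 2.0), is_rs_stable(1.0, 2.0))
```

Sweeping `p` over a grid in (0, 1) traces the whole boundary without any root finding, which is why the parametric form is used here instead of solving for $d_s$ at each fixed $c$.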

Figure 7: RS cavity method analytical prediction (57) for the MSP size in $\mathcal{G}_{PR}(d, c)$ (dot-dashed), compared with the set packing size from phase 1 of the GKS algorithm (dashed lines) and with a complete run of the algorithm (continuous lines). Horizontal axis: mean variable degree ($d$); vertical axis: set packing size ($\rho$); curves for $c = 2, 3, 4, 5, 6$.

Figure 8: MSP relative size for the ensemble $\mathcal{G}_{PP}(d, c)$ as given by (60), shown as isolines together with the phase boundary. Above the boundary separating the RS and RSB phases, values are only an approximation to the exact value of $\rho$. Horizontal axis: mean function-node degree ($c$); vertical axis: mean variable degree ($d$).

7.4. $\mathcal{G}_{RR}(d, c)$. The MSP problem on the ensemble $\mathcal{G}_{RR}(d, c)$, the straightforward generalization to factor graphs of the random regular graphs ensemble, brings some simplification to the cavity formalism thanks to its homogeneity. It has already been the object of a preliminary study by Weigt and Hartmann [16] and then of a much deeper work by Hansen-Goos and Weigt [17], who cast it as a hard-spheres model on a generalized Bethe lattice. The authors studied this hard-spheres model on $\mathcal{G}_{RR}(d, c)$ through the cavity method, both at finite chemical potential $\mu$ and in the close packing limit, and found that the 1RSB solution is unstable in the close packing limit, therefore suggesting the need for a full RSB treatment of the problem.

8. Conclusions

We studied the average asymptotic behaviour of random instances of the maximum set packing problem, both from a mathematical and a physical viewpoint. We contributed to the known list of models where the replica symmetric cavity method can be proven to give exact results, thanks to the generalization of an algorithm (and of its analysis) first proposed by Karp and Sipser [8]. Moreover, our analysis addresses a problem reported in recent work on weighted maximum matchings and independent sets on random graphs [22], where the authors could not extend their results to the unweighted cases. We achieve here the desired result by making use of the grand canonical potential instead of the direct computation of single variable expectations. We also extend their condition for the system to be in what physicists call a replica symmetric phase, namely, the uniqueness of the fixed point of the square of a certain operator (which is the analogue of the one defined in (42)), to the more general setting of maximum set packing (although without weights). On some problem ensembles, where the assumptions of Theorems 2 and 3 no longer hold and the RS cavity method fails, we used the 1RSB cavity method machinery to obtain an analytical estimate of the MSP size. Numerical simulations show very good agreement of the 1RSB estimate with the exact values, although comparisons have been done only with small random problems, since the exact algorithm has exponential time complexity. The GKS algorithm, instead, generally fails to find any MSPs, except for some special instances of the problem.

Some questions remain open for further investigation. To validate the 1RSB approach, the stability of the 1RSB solution has to be checked against more steps of replica symmetry breaking. Moreover, a thorough analysis of the second phase of the GKS algorithm could shed some light on the mechanism of replica symmetry breaking and give a rigorous lower bound to the average maximum packing size.

Regarding the problem of looking for optimal solutions in single samples, we believe that an efficient heuristic algorithm able to obtain near-to-optimal configurations also in the RSB phase could be constructed following the ideas of [33].

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research has received financial support from the European Research Council (ERC) through Grant agreement no. 247328 and from the Italian Research Minister through the FIRB project no. RBFR086NN1.

References

[1] S. Micali and V. V. Vazirani, “An $O(\sqrt{|V|}\,|E|)$ algorithm for finding maximum matching in general graphs,” in Proceedings of the 21st Annual IEEE Symposium on Foundations of Computer Science, pp. 17–27, 1980.

[2] N. Alon, J.-H. Kim, and J. Spencer, “Nearly perfect matchings in regular simple hypergraphs,” Israel Journal of Mathematics, vol. 100, pp. 171–187, 1997.

[3] Z. Füredi, J. Kahn, and P. D. Seymour, “On the fractional matching polytope of a hypergraph,” Combinatorica, vol. 13, no. 2, pp. 167–180, 1993.

[4] Y. H. Chan and L. C. Lau, “On linear and semidefinite programming relaxations for hypergraph matching,” Mathematical Programming, vol. 135, pp. 123–148, 2011.

[5] V. Rödl and L. Thoma, “Asymptotic packing and the random greedy algorithm,” Random Structures & Algorithms, vol. 8, no. 3, pp. 161–177, 1996.

[6] M. M. Halldórsson, “Approximations of weighted independent set and hereditary subset problems,” Journal of Graph Algorithms and Applications, vol. 4, no. 1, pp. 1–6, 2000.

[7] E. Hazan, S. Safra, and O. Schwartz, “On the complexity of approximating K-set packing,” Computational Complexity, vol. 15, no. 1, pp. 20–39, 2006.

[8] R. M. Karp and M. Sipser, “Maximum matchings in sparse random graphs,” in Proceedings of the 22nd Annual Symposium on Foundations of Computer Science, vol. 12, pp. 364–375, IEEE Computer Society, 1981.

[9] N. C. Wormald, “The differential equation method for random graph processes and greedy algorithms,” in Lectures on Approximation and Randomized Algorithms, pp. 73–155, 1999.

[10] G. Parisi and M. Mézard, “Replicas and optimization,” Journal de Physique Lettres, vol. 46, no. 17, pp. 771–778, 1985.

[11] G. Parisi and M. Mézard, “A replica analysis of the travelling salesman problem,” Journal de Physique, vol. 47, pp. 1285–1296, 1986.

[12] H. Zhou and Z.-C. Ou-Yang, “Maximum matching on random graphs,” In press, http://arxiv.org/abs/cond-mat/0309348.

[13] L. Zdeborová and M. Mézard, “The number of matchings in random graphs,” Journal of Statistical Mechanics, vol. 2006, Article ID P05003, 2006.

[14] L. Dall'Asta, A. Ramezanpour, and P. Pin, “Statistical mechanics of maximal independent sets,” Physical Review E, vol. 80, Article ID 061136, 2009.

[15] M. Mézard and M. Tarzia, “Statistical mechanics of the hitting set problem,” Physical Review E, vol. 76, no. 4, Article ID 041124, 10 pages, 2007.

[16] M. Weigt and A. K. Hartmann, “Glassy behavior induced by geometrical frustration in a hard-core lattice-gas model,” Europhysics Letters (EPL), vol. 62, Article ID 021005, p. 533, 2003.

[17] H. Hansen-Goos and M. Weigt, “A hard-sphere model on generalized Bethe lattices: statics,” Journal of Statistical Mechanics, vol. 2005, Article ID P04006, 2005.

[18] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.

[19] A. Montanari and M. Mézard, Information, Physics and Computation, Oxford University Press, 2009.

[20] G. Parisi, M. Mézard, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, Singapore, 1987.

[21] M. Mézard and G. Parisi, “The cavity method at zero temperature,” Journal of Statistical Physics, vol. 22, no. 1-2, pp. 1–34, 2003.

[22] D. Gamarnik, T. Nowicki, and G. Swirszcz, “Maximum weight independent sets and matchings in sparse random graphs. Exact results using the local weak convergence method,” Random Structures and Algorithms, vol. 28, no. 1, pp. 76–106, 2006.

[23] A. Montanari, G. Parisi, and F. Ricci-Tersenghi, “Instability of one-step replica-symmetry-broken phase in satisfiability problems,” Journal of Physics A, vol. 37, no. 6, p. 2073, 2004.

[24] F. Krzakala, A. Montanari, F. Ricci-Tersenghi, G. Semerjian, and L. Zdeborová, “Gibbs states and the set of solutions of random constraint satisfaction problems,” Proceedings of the National Academy of Sciences of the United States of America, vol. 104, no. 25, pp. 10318–10323, 2007.

[25] M. Mézard, F. Ricci-Tersenghi, and R. Zecchina, “Two solutions to diluted p-spin models and XORSAT problems,” Journal of Statistical Physics, vol. 111, no. 3-4, pp. 505–533, 2002.

[26] G. Semerjian, “On the freezing of variables in random constraint satisfaction problems,” Journal of Statistical Physics, vol. 130, no. 2, pp. 251–293, 2008.

[27] M. Mézard and G. Parisi, “The Bethe lattice spin glass revisited,” The European Physical Journal B, vol. 20, no. 2, pp. 217–233, 2001.

[28] S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, “A new algorithm for generating all the maximal independent sets,” SIAM Journal on Computing, vol. 6, no. 3, pp. 505–517, 1977.

[29] G. Csardi and T. Nepusz, “The igraph software package for complex network research,” InterJournal, Complex Systems, p. 1695, 2006.

[30] N. C. Wormald, “Models of random regular graphs,” in Surveys in Combinatorics, London Mathematical Society Lecture Note, pp. 239–298, 1999.

[31] S. Takabe and K. Hukushima, “Minimum vertex cover problems on random hypergraphs: replica symmetric solution and a leaf removal algorithm,” In press, http://arxiv.org/abs/1301.5769.

[32] S. Sanghavi, D. Shah, and A. S. Willsky, “Message passing for maximum weight independent set,” IEEE Transactions on Information Theory, vol. 55, no. 11, pp. 4822–4834, 2009.

[33] M. Weigt and H. Zhou, “Message passing for vertex covers,” Physical Review E, vol. 74, Article ID 046110, 19 pages, 2006.


Page 12: The Statistical Mechanics of Random Set Packing and a

12 International Journal of Statistical Mechanics

packing limit therefore suggesting the need of a full RSBtreatment of the problem

8 Conclusions

We studied the average asymptotic behaviour of randominstances of the maximum set packing problem both froma mathematical and a physical viewpoint We contributedto the known list of models where the replica symmetriccavity method can be proven to give exact results thanksto the generalization of an algorithm (and of its analysis)first proposed by Karp and Sipser [8] Moreover our analysisaddress a problem reported in recent work on weighted max-imum matchings and independent sets on random graphs[22] where the authors could not extend their results tothe unweighted cases We achieve here the desired resultmaking use the grand canonical potential instead of the directcomputation of single variable expectations We also extendtheir condition for the system to be in what physicists calla replica symmetric phase namely the uniqueness of thefixed point of the square of a certain operator (which is theanalogue of the one defined in (42)) to the more generalsetting of maximum set packing (although without weights)On some problem ensembles where the assumptions ofTheorems 2 and 3 no longer hold and the RS cavity methodfails we used the 1RSB cavity methodmachinery to obtain ananalytical estimation of the MSP size Numerical simulationsshow very good agreement of the 1RSB estimation with theexact values although comparisons have been done only withsmall random problems due to the exact algorithm beingof exponential time complexity The GKS algorithm insteadgenerally fails to find any MSPs except for some specialinstances of the problem

Some questions remain open for further investigationTo validate the 1RSB approach the stability of the 1RSBsolution has to be checked against more steps of replicasymmetry breaking Moreover a thorough analysis of thesecond phase of the GKS algorithm could shed some lighton the mechanism of replica symmetry breaking and give arigorous lower bound to the average maximum packing size

Regarding the problem of looking for optimal solutionsin single samples we believe that an efficient heuristicalgorithm able to obtain near-to-optimal configurations alsoin the RSB phase could be constructed following the ideas of[33]

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This research has received financial support from the Euro-pean Research Council (ERC) through Grant agreement no247328 and from the Italian Research Minister through theFIRB project no RBFR086NN1

References

[1] S Micali and V V Vazirani ldquoAn O(|V||v||c|E||) algoithm forfinding maximum matching in general graphsrdquo in Proceedingsof the 21st Annual IEEE Symposium on Foundations of ComputerScience pp 17ndash27 1980

[2] N Alon J-H Kim and J Spencer ldquoNearly perfect matchings inregular simple hypergraphsrdquo Israel Journal of Mathematics vol100 pp 171ndash187 1997

[3] Z Furedi J Kahn and PD Seymu ldquoOn the fractionalmatchingpolytope of a hypergraphrdquoCombinatorica vol 13 no 2 pp 167ndash180 1993

[4] Y H Chan and L C Lau ldquoOn linear and semidefinite pro-gramming relaxations for hypergraphmatchingrdquoMathematicalProgramming vol 135 pp 123ndash148 2011

[5] V Rodl and L Thoma ldquoAsymptotic packing and the randomgreedy algorithmrdquo Random Structures amp Algorithms vol 8 no3 pp 161ndash177 1996

[6] M M Halldorsson ldquoApproximations of weighted independentset and hereditary subset problemsrdquo Journal of Graph Algo-rithms and Applications vol 4 no 1 pp 1ndash6 2000

[7] E Hazan S Safra and O Schwartz ldquoOn the complexity ofapproximating K-set packingrdquo Computational Complexity vol15 no 1 pp 20ndash39 2006

[8] R M Karp and M Sipser ldquoMaximum matchings in sparserandom graphsrdquo in Proceedings of the 22nd Annual Symposiumon Foundations of Computer Science vol 12 pp 364ndash375 IEEEComputer Society 1981

[9] N C Wormald ldquoThe differential equation method for randomgraph processes and greedy algorithmsrdquo in Lectures on Approx-imation and Randomized Algorithms pp 73ndash155 1999

[10] G Parisi and M Mezard ldquoReplicas and optimizationrdquo Journalde Physique Lettres vol 46 no 17 pp 771ndash778 1985

[11] G Parisi and M Mezard ldquoA replica analysis of the travellingsalesman problemrdquo Journal de Physique vol 47 pp 1285ndash12961986

[12] H Zhou and Z-C Ou-Yang ldquoMaximummatching on random-graphsrdquo In press httparxivorgabscond-mat0309348

[13] L Zdeborova and M Mezard ldquoThe number of matchings inrandom graphsrdquo Journal of Statistical Mechanics vol 2006Article ID P05003 2006

[14] L Dallasta A Ramezanpour and P Pin ldquoStatistical mechanicsofmaximal independent setsrdquo Physical Review E vol 80 ArticleID 061136 2009

[15] M Mezard and M Tarzia ldquoStatistical mechanics of the hittingset problemrdquo Physical Review E vol 76 no 4 Article ID 04112410 pages 2007

[16] M Weigt and A K Hartmann ldquoGlassy behavior inducedby geometrical frustration in a hard-core lattice-gas modelrdquoEurophysics Letters (EPL) vol 62 Article ID 021005 p 5332003

[17] H Hansen-Goos and M Weigt ldquoA hard-sphere model on gen-eralized bethe lattices staticsrdquo Journal of Statistical Mechanicsvol 2005 Article ID P04006 2005

[18] F R Kschischang B J Frey and H-A Loeliger ldquoFactorgraphs and the sum-product algorithmrdquo IEEE Transactions onInformation Theory vol 47 no 2 pp 498ndash519 2001

[19] AMontanari andMMezard Information Physics and Compu-tation Oxford University Press 2009

[20] G Parisi MMezard andM A Virasoro Spin GlassTheory andBeyond World Scientific Singapore 1987

International Journal of Statistical Mechanics 13

[21] M Mezard and G Parisi ldquoThe cavity method at zero tempera-turerdquo Journal of Statistical Physics vol 22 no 1-2 pp 1ndash34 2003

[22] D Gamarnik T Nowicki and G Swirszcz ldquoMaximum weightindependent sets andmatchings in sparse randomgraphs Exactresults using the local weak convergence methodrdquo RandomStructures and Algorithms vol 28 no 1 pp 76ndash106 2006

[23] A Montanari G Parisi and F Ricci-Tersenghi ldquoInstabilityof one-step replica-symmetry-broken phase in satisfiabilityproblemsrdquo Journal of Physics A vol 37 no 6 p 2073 2004

[24] F Krzakala AMontanari F Ricci-Tersenghi G Semerjian andL Zdeborova ldquoGibbs states and the set of solutions of randomconstraint satisfaction problemsrdquo Proceedings of the NationalAcademy of Sciences of the United States of America vol 104 no25 pp 10318ndash10323 2007

[25] MMezard F Ricci-Tersenghi and R Zecchina ldquoTwo solutionsto diluted p-spin models and XORSAT problemsrdquo Journal ofStatistical Physics vol 111 no 3-4 pp 505ndash533 2002

[26] G Semerjian ldquoOn the freezing of variables in random con-straint satisfaction problemsrdquo Journal of Statistical Physics vol130 no 2 pp 251ndash293 2008

[27] MMezard and G Parisi ldquoThe bethe lattice spin glass revisitedrdquoTheEuropean Physical Journal B vol 20 no 2 pp 217ndash233 2001

[28] S Tsukiyama M Ide H Ariyoshi and I Shirakawa ldquoA newalgorithm for generating all the maximal independent setsrdquoSIAM Journal on Computing vol 6 no 3 pp 505ndash517 1977

[29] G Csardi and T Nepusz ldquoThe igraph software package forcomplex network researchrdquo InterJournal Complex Systems p1695 2006

[30] N C Wormald ldquoModels of random regular graphsrdquo in Surveysin Combinatorics London Mathematical Society Lecture Notepp 239ndash298 1999

[31] S Takabe andKHukushima ldquoMinimumvertex cover problemson random hypergraphs replica symmetric solutionand a leafremoval algorithmrdquo In press httparxivorgabs13015769

[32] S Sanghavi D Shah and A S Willsky ldquoMessage passingfor maximum weight independent setrdquo IEEE Transactions onInformation Theory vol 55 no 11 pp 4822ndash4834 2009

[33] M Weigt and H Zhou ldquoMessage passing for vertex coversrdquoPhysical Review E vol 74 Article ID 046110 19 pages 2006

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

High Energy PhysicsAdvances in

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

FluidsJournal of

Journal of Atomic and Molecular PhysicsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in Condensed Matter Physics

OpticsInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Astronomy

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Superconductivity

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Statistical MechanicsInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

GravityJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

AstrophysicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Physics Research International

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Solid State PhysicsJournal of

 Computational  Methods in Physics

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Soft MatterJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

AerodynamicsJournal of

Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

PhotonicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Biophysics

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ThermodynamicsJournal of

Page 13: The Statistical Mechanics of Random Set Packing and a

International Journal of Statistical Mechanics 13

[21] M Mezard and G Parisi ldquoThe cavity method at zero tempera-turerdquo Journal of Statistical Physics vol 22 no 1-2 pp 1ndash34 2003

[22] D Gamarnik T Nowicki and G Swirszcz ldquoMaximum weightindependent sets andmatchings in sparse randomgraphs Exactresults using the local weak convergence methodrdquo RandomStructures and Algorithms vol 28 no 1 pp 76ndash106 2006

[23] A Montanari G Parisi and F Ricci-Tersenghi ldquoInstabilityof one-step replica-symmetry-broken phase in satisfiabilityproblemsrdquo Journal of Physics A vol 37 no 6 p 2073 2004

[24] F Krzakala AMontanari F Ricci-Tersenghi G Semerjian andL Zdeborova ldquoGibbs states and the set of solutions of randomconstraint satisfaction problemsrdquo Proceedings of the NationalAcademy of Sciences of the United States of America vol 104 no25 pp 10318ndash10323 2007

[25] MMezard F Ricci-Tersenghi and R Zecchina ldquoTwo solutionsto diluted p-spin models and XORSAT problemsrdquo Journal ofStatistical Physics vol 111 no 3-4 pp 505ndash533 2002

[26] G Semerjian ldquoOn the freezing of variables in random con-straint satisfaction problemsrdquo Journal of Statistical Physics vol130 no 2 pp 251ndash293 2008

[27] MMezard and G Parisi ldquoThe bethe lattice spin glass revisitedrdquoTheEuropean Physical Journal B vol 20 no 2 pp 217ndash233 2001

[28] S Tsukiyama M Ide H Ariyoshi and I Shirakawa ldquoA newalgorithm for generating all the maximal independent setsrdquoSIAM Journal on Computing vol 6 no 3 pp 505ndash517 1977

[29] G Csardi and T Nepusz ldquoThe igraph software package forcomplex network researchrdquo InterJournal Complex Systems p1695 2006

[30] N C Wormald ldquoModels of random regular graphsrdquo in Surveysin Combinatorics London Mathematical Society Lecture Notepp 239ndash298 1999

[31] S. Takabe and K. Hukushima, “Minimum vertex cover problems on random hypergraphs: replica symmetric solution and a leaf removal algorithm,” In press, http://arxiv.org/abs/1301.5769.

[32] S. Sanghavi, D. Shah, and A. S. Willsky, “Message passing for maximum weight independent set,” IEEE Transactions on Information Theory, vol. 55, no. 11, pp. 4822–4834, 2009.

[33] M. Weigt and H. Zhou, “Message passing for vertex covers,” Physical Review E, vol. 74, Article ID 046110, 19 pages, 2006.
