spectrahedral lifts of combinatorial polytopes
James R. Lee (University of Washington)
Prasad Raghavendra (UC Berkeley), David Steurer (Cornell)
combinatorial optimization
Traveling Salesman Problem: Given n cities {1, 2, …, n} and costs c_ij ≥ 0 for traveling between cities i and j, find the permutation π of {1, 2, …, n} that minimizes
c_{π(1)π(2)} + c_{π(2)π(3)} + ⋯ + c_{π(n)π(1)}
Gian Carlo Rota, 1969
TSP_n = conv{ 1_τ ∈ {0,1}^{\binom{n}{2}} : τ is a tour } ⊆ ℝ^{\binom{n}{2}}
Can find an optimal tour by minimizing a linear function over TSP_n: min{ ⟨c, x⟩ : x ∈ TSP_n }
Problem: TSP𝑛 has exponentially (in 𝑛) many facets.
One can tell the same (short) story for many polytopes associated to NP-complete problems.
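Since a linear function on a polytope attains its minimum at a vertex, the reduction above can be checked by brute force on a tiny instance. A minimal sketch (the 4-city cost matrix is a hypothetical example, not from the talk):

```python
from itertools import permutations

def tour_cost(perm, c):
    """Cost of the cyclic tour visiting cities in the order given by perm."""
    n = len(perm)
    return sum(c[perm[i]][perm[(i + 1) % n]] for i in range(n))

def tour_indicator(perm, n):
    """0/1 edge-indicator vector of the tour, indexed by pairs (i, j), i < j."""
    edges = {tuple(sorted((perm[i], perm[(i + 1) % len(perm)])))
             for i in range(len(perm))}
    return [1 if (i, j) in edges else 0 for i in range(n) for j in range(i + 1, n)]

# A small symmetric cost matrix (hypothetical instance).
c = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [10, 4, 3, 0]]
n = 4

# Linear objective on edge space: one coordinate c_e per edge e = (i, j), i < j.
cvec = [c[i][j] for i in range(n) for j in range(i + 1, n)]

# Minimizing <c, x> over the vertices of TSP_n recovers the best tour cost,
# since a linear function attains its minimum at a vertex.
best_direct = min(tour_cost(p, c) for p in permutations(range(n)))
best_linear = min(sum(ci * xi for ci, xi in zip(cvec, tour_indicator(p, n)))
                  for p in permutations(range(n)))
assert best_direct == best_linear
```

The hardness is of course hidden in the enumeration: the brute force touches all n! tours, exactly what a small description of TSP_n would let us avoid.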
polyhedral lifts
Minimum Spanning Tree: Given n cities {1, 2, …, n} and costs c_ij ≥ 0 between cities i and j, find a spanning tree of minimum cost.
ST_n = conv{ 1_τ ∈ {0,1}^{\binom{n}{2}} : τ is a spanning tree }
Again, has exponentially (in 𝑛) many facets.
There is a lift of ST_n in n³ dimensions with only O(n³) facets.
A lift Q of P ⊆ ℝ^d is a polytope Q ⊆ ℝ^N for N ≥ d such that Q linearly projects to P. If we can optimize over Q, then we can optimize over P: with π the projection,
max{ ⟨c, x⟩ : x ∈ P } = max{ ⟨c, π(y)⟩ : y ∈ Q }
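A classical illustration of this definition (a standard example, not from the talk) is the cross-polytope: P = {x ∈ ℝ^d : Σ|x_i| ≤ 1} has 2^d facets, but Q = {(x, y) : −y_i ≤ x_i ≤ y_i, Σ y_i ≤ 1} ⊆ ℝ^{2d} has only 2d + 1 facets and projects onto P via (x, y) ↦ x. A sketch checking the optimization claim on the known vertex sets:

```python
import random

d = 5

# Vertices of P = {x : sum |x_i| <= 1} (the cross-polytope, 2^d facets): +-e_i.
P_vertices = []
for i in range(d):
    for s in (1, -1):
        v = [0.0] * d
        v[i] = float(s)
        P_vertices.append(v)

# Vertices of the lift Q = {(x, y) : -y_i <= x_i <= y_i, sum y_i <= 1} in R^{2d}
# (only 2d + 1 facets): the origin and the points (+-e_i, e_i).
Q_vertices = [[0.0] * (2 * d)]
for i in range(d):
    for s in (1, -1):
        w = [0.0] * (2 * d)
        w[i] = float(s)   # x-part
        w[d + i] = 1.0    # y-part
        Q_vertices.append(w)

# The projection (x, y) -> x maps Q onto P, so optimizing <c, x> over either
# polytope gives the same value (checked here on the vertex sets).
random.seed(0)
c = [random.uniform(-1, 1) for _ in range(d)]
max_P = max(sum(ci * vi for ci, vi in zip(c, v)) for v in P_vertices)
max_Q = max(sum(ci * wi for ci, wi in zip(c, w[:d])) for w in Q_vertices)
assert abs(max_P - max_Q) < 1e-12
```

Both maxima equal ‖c‖_∞, computed over exponentially many facets on one side and linearly many on the other.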
example: the permutahedron
Permutahedron: Π_n ⊆ ℝ^n is the convex hull of the vectors (i_1, i_2, …, i_n) such that {i_1, i_2, …, i_n} = {1, 2, …, n}.
Number of facets is 2^n − 2.
Let Π̂_n ⊆ ℝ^{n²} be the convex hull of the n × n permutation matrices.
The map A ↦ (1, 2, …, n)A is a linear projection from Π̂_n onto Π_n (i.e., Π̂_n is a lift of Π_n).
Birkhoff-von Neumann: Π̂_n is the set of matrices A = (a_ij) satisfying a_ij ≥ 0 for all i, j, with all row and column sums equal to one. Thus Π̂_n has n² facets.
[Goemans 2013]: The smallest lift of Π_n has Θ(n log n) facets.
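The projection can be checked directly. A sketch (n = 4) confirming that A ↦ (1, 2, …, n)A sends the permutation matrices exactly onto the vertices of Π_n, and sends the uniform doubly stochastic matrix (a convex combination of permutation matrices) to the centroid:

```python
from itertools import permutations

n = 4
base = list(range(1, n + 1))  # the row vector (1, 2, ..., n)

def project(A):
    """The linear map A -> (1, 2, ..., n) A (row vector times matrix)."""
    return tuple(sum(base[i] * A[i][j] for i in range(n)) for j in range(n))

def perm_matrix(p):
    """Permutation matrix P with P[i][p[i]] = 1."""
    return [[1 if p[i] == j else 0 for j in range(n)] for i in range(n)]

# The images of the n! permutation matrices are exactly the n! orderings
# of (1, ..., n), i.e. the vertices of the permutahedron.
images = {project(perm_matrix(p)) for p in permutations(range(n))}
assert images == {tuple(v) for v in permutations(base)}

# The uniform doubly stochastic matrix (all entries 1/n) projects to the
# centroid ((n+1)/2, ..., (n+1)/2) of the permutahedron.
J = [[1.0 / n] * n for _ in range(n)]
assert project(J) == tuple([(n + 1) / 2] * n)
```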
a general model of (small) linear programs
Powerful model of computation:For a polytope 𝑃 ⊆ ℝ𝑑 , what is the minimal # of facets in a lift of 𝑃?
Even more powerful when we allow approximation.
Indication of power: Integrality gaps for LPs often lead to NP-hardness of approximation.
Polynomial-size LPs for NP-hard problems would show that NP ⊆ P/poly. [Rothvoss 2013]
an analog for semi-definite programs
Semidefinite programs (aka spectrahedral lifts):
Let 𝒮_+^k denote the cone of k × k symmetric, positive semidefinite matrices.
A spectrahedron is the intersection 𝒮_+^k ∩ ℒ for some affine subspace ℒ.
This is precisely the feasible region of an SDP.
Definition: A polytope P admits a PSD lift of size k if P is a linear projection of a spectrahedron 𝒮_+^k ∩ ℒ.
• Easy to see that the minimal size of a PSD lift is ≤ the minimal size of a polyhedral lift.
• Assuming the Unique Games Conjecture, integrality gaps for SDPs translate directly into NP-hardness of approximation results.
[Khot-Kindler-Mossel-O’Donnell 2004, Austrin 2007, Raghavendra 2008]
a brief history of extended formulations
1980s: Swart tries to prove that 𝑃 = 𝑁𝑃 by giving a linear program for TSP.
1989: Yannakakis (the referee for Swart’s submissions) shows that every symmetric LP for TSP must have exponential size.
. . .
2012: Fiorini, Massar, Pokutta, Tiwary, de Wolf show that every LP for TSP must have exponential size.
2013: Chan, L, Raghavendra, Steurer show that no polynomial-size LP can approximate the MAX-CUT problem within a factor better than 2. [Goemans-Williamson 1998: SDPs can do factor ≈ 1.139.]
2014: Rothvoss shows that every LP for the matching polytope must haveexponential size.
????: Spectrahedral lifts? [Briët-Dadush-Pokutta 2014]: Random {0,1}-polytopes do not have PSD lifts of size e^{o(n)}.
our results [L-Raghavendra-Steurer]
Lower bounds on PSD lift size
The TSP_n, CUT_n, and STAB(G_n) polytopes do not admit PSD lifts of size c^{n^{2/11}} (for some constant c > 1 and some family {G_n} of n-vertex graphs).
Lower bounds on PSD rank via sums-of-squares / Lasserre hierarchy
Approximation hardness for constraint satisfaction problems
For instance, no family of polynomial-size SDP relaxations can achieve better than a 7/8-approximation for MAX 3-SAT.
slack matrices & psd factorizations
Consider a non-negative matrix M ∈ ℝ_+^{m×n}.
rank(M) = min{ r : M_ij = ⟨u_i, v_j⟩, {u_i}, {v_j} ⊆ ℝ^r }
rank_+(M) = min{ r : M_ij = ⟨u_i, v_j⟩, {u_i}, {v_j} ⊆ ℝ_+^r }
rank_psd(M) = min{ r : M_ij = ⟨U_i, V_j⟩, {U_i}, {V_j} ⊆ 𝒮_+^r }
where ⟨U, V⟩ = Tr(U^T V) for U, V ∈ 𝒮_+^r.
Suppose P ⊆ ℝ^d is a polytope defined by
P = { x : ⟨a_i, x⟩ ≤ b_i, i = 1, …, m }, and P = conv{ x_1, …, x_n }.
The corresponding slack matrix M ∈ ℝ^{m×n} is given by M_ij = b_i − ⟨a_i, x_j⟩.
Yannakakis Factorization Theorem[Yannakakis 1989, Fiorini-Massar-Pokutta-Tiwary-de Wolf, Gouveia-Parrilo-Thomas 2011]:
The minimum size of a polyhedral (resp., PSD) lift for 𝑃 is precisely rank+ 𝑀(resp., rankpsd(𝑀)) for any slack matrix 𝑀 of 𝑃.
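A minimal sketch of a slack matrix, using the unit square (an illustrative example, not from the talk): its entries are non-negative, and its linear rank is d + 1 = 3, as holds for any full-dimensional polytope.

```python
from fractions import Fraction

# Unit square P = {x : <a_i, x> <= b_i}: facets x1 >= 0, x1 <= 1, x2 >= 0, x2 <= 1.
a = [(-1, 0), (1, 0), (0, -1), (0, 1)]
b = [0, 1, 0, 1]
verts = [(0, 0), (1, 0), (1, 1), (0, 1)]  # vertices of P

# Slack matrix M_ij = b_i - <a_i, x_j>.
M = [[Fraction(b[i] - sum(ai * xj for ai, xj in zip(a[i], verts[j])))
      for j in range(len(verts))] for i in range(len(a))]

# Entries are non-negative: each vertex satisfies each inequality.
assert all(Mij >= 0 for row in M for Mij in row)

def rank(rows):
    """Rank over the rationals, by Gaussian elimination."""
    rows = [list(r) for r in rows]
    r = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

# For a full-dimensional polytope in R^d, the slack matrix has rank d + 1.
assert rank(M) == 3
```

The theorem says the interesting quantities are the harder ones, rank_+(M) and rank_psd(M); computing those in general requires more than Gaussian elimination.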
polytopes as a set of “theorems”
Yannakakis Factorization Theorem[Yannakakis 1989, Fiorini-Massar-Pokutta-Tiwary-de Wolf, Gouveia-Parrilo-Thomas 2011]:
The minimum size of a polyhedral (resp., PSD) lift for P is precisely rank_+(M) (resp., rank_psd(M)) for any slack matrix M of P.
Polytope 𝑃 is a collection of valid linear inequalities.
Farkas’ Lemma: Every valid linear inequality is a coniccombination of defining inequalities.
Polyhedral lift size: Smallest set of "axioms" that generate all valid inequalities for P. But we are allowed to use auxiliary variables.
For the rest of the talk, we restrict attention to the polytope CUT_n ⊆ ℝ^{n²}:
Convex hull of cuts in the complete graph.
valid linear inequalities for CUT𝑛
Let f : ℝ^n → ℝ be a quadratic polynomial such that the restriction of f to {0,1}^n is non-negative.
For instance: f(x) = (x_1 + x_2 + ⋯ + x_n − 1)² or
f(x) = (x_1 + x_2 + ⋯ + x_n − n/2)² − 1/4 for n odd.
Then "f(x) ≥ 0" (for x ∈ {0,1}^n) is a valid "linear" inequality, and every valid inequality is of this form.
[Can consider this as a linear inequality in 𝑥 ⊗ 𝑥.]
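Both example functions can be checked non-negative on the hypercube by brute force; the second uses that for odd n, Σx_i − n/2 is a half-odd-integer, so its square is at least 1/4.

```python
from itertools import product
from fractions import Fraction

def f1(x):
    """(x_1 + ... + x_n - 1)^2"""
    return (sum(x) - 1) ** 2

def f2(x):
    """(x_1 + ... + x_n - n/2)^2 - 1/4, non-negative on {0,1}^n for n odd."""
    n = len(x)
    return (sum(x) - Fraction(n, 2)) ** 2 - Fraction(1, 4)

# Exhaustive check over the hypercube for small odd n.  (For even n, f2 dips
# to -1/4 at points with sum(x) = n/2, which is why n odd is required.)
for n in (1, 3, 5):
    for x in product((0, 1), repeat=n):
        assert f1(x) >= 0
        assert f2(x) >= 0
```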
Let 𝒬 be a subspace of L²({0,1}^n), and define
sos(𝒬) = cone{ q² : q ∈ 𝒬 } ⊆ L²({0,1}^n)
minimal size of PSD lift for CUT_n ≈ min{ dim 𝒬 : sos(𝒬) ⊇ QML_n^+ }
minimal size of polyhedral lift for CUT_n = min{ |𝒬| : sos(𝒬) ⊇ QML_n^+ }
where QML_n^+ = { f : {0,1}^n → ℝ_+ : deg(f) ≤ 2 }
sums of (low-degree) squares
For f : {0,1}^n → ℝ_+, define
deg_sos(f) = min{ d : f ∈ sos(𝒫_d) }
where 𝒫_d is the space of degree ≤ d multi-linear polynomials on {0,1}^n.
Easy: degsos 𝑓 ≤ 𝑛. (Trivial case of the Krivine-Stengle Positivstellensatz.)
Fix a function g : {0,1}^m → ℝ_+ and a number n ≥ m.
For every subset S ⊆ {1, …, n} with |S| = m, let g_S(x) = g(x|_S).
Let ℱ(g, n) = { g_S : {0,1}^n → ℝ_+ : |S| = m }.
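A small sketch of the family ℱ(g, n), with m = 3, n = 6, and a Grigoriev-style g (parameters chosen purely for illustration): there are C(6, 3) = 20 restrictions g_S, each non-negative on {0,1}^n because g is non-negative on {0,1}^m.

```python
from itertools import combinations, product

def restrict(g, S):
    """g_S(x) = g(x restricted to the coordinates in S), as a function on {0,1}^n."""
    def gS(x):
        return g(tuple(x[i] for i in S))
    return gS

m, n = 3, 6

def g(y):
    # (y_1 + y_2 + y_3 - 3/2)^2 - 1/4, non-negative on {0,1}^3 since m = 3 is odd.
    return (sum(y) - 1.5) ** 2 - 0.25

# The family F(g, n): one restricted copy of g per m-subset of coordinates.
family = [restrict(g, S) for S in combinations(range(n), m)]
assert len(family) == 20  # C(6, 3) = 20

# Each g_S inherits non-negativity on {0,1}^n from g on {0,1}^m.
for gS in family:
    assert all(gS(x) >= 0 for x in product((0, 1), repeat=n))
```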
Theorem: For any g : {0,1}^m → ℝ_+ and n ≥ 2m,
1 + n^d ≥ min{ dim 𝒬 : ℱ(g, n) ⊆ sos(𝒬) } ≥ C (n / log n)^{(d−1)/2}
where d = deg_sos(g) and C = C(g) > 0.
a “hard” quadratic function
Theorem [Grigoriev 2001]: For every odd integer m ≥ 1, the function g : {0,1}^m → ℝ_+ given by
g(x) = (x_1 + ⋯ + x_m − m/2)² − 1/4
has deg_sos(g) ≥ (m+1)/2.
Main Theorem [L-Raghavendra-Steurer 2015]: For any g : {0,1}^m → ℝ_+ and n ≥ 2m,
1 + n^d ≥ min{ dim 𝒬 : ℱ(g, n) ⊆ sos(𝒬) } ≥ C (n / log n)^{(d−1)/2}
where d = deg_sos(g) and C = C(g) > 0.
Corollary: The minimal dimension of a PSD lift of CUT𝑛 grows faster thanany polynomial.
proof outline
Let 𝒬 ⊆ L²({0,1}^n) be a subspace of dimension k.
Pre-processing: sos(𝒬) ≈ sos(𝒬′) for some "well-conditioned" 𝒬′ with dim 𝒬 = dim 𝒬′ [John's theorem (factorization through a cone), via Briët-Dadush-Pokutta 2014]
Approximation: Then sos(𝒬′) ⊆ sos(𝒫_{Õ(log k)}) (with respect to a certain class of test functionals)
Degree reduction: For a uniformly random subset S ⊆ {1, 2, …, n} with |S| = m, with high probability sos(𝒬|_S) ⊆ sos(𝒫_{Õ(log k / log n)})
quantum learning
[Figure: iterative learning of a simple approximator, starting from 𝒬_0 = span(𝟏) and ending at 𝒬_T = 𝒬.]
dim 𝒬 small ⇒ 𝒬 has small von Neumann entropy ⇒ learning cannot go on for too long
quantum learning
Consider a map Q : {0,1}^n → 𝒮_+^k with 𝔼_x Tr(Q(x)) = 1.
Want to approximate Q by a simpler map Q̃ (s.t. 𝔼_x Tr(Q̃(x)) = 1) with respect to a class 𝒯 of test functionals Λ : {0,1}^n → 𝒮^k:
𝔼_x Tr(Λ(x)(Q(x) − Q̃(x))) ≤ ε
Theorem: If deg(Λ) ≤ κ and max_x ‖Λ(x)‖ ≤ ω for all Λ ∈ 𝒯, then there is a function R : {0,1}^n → 𝒮_+^k with 𝔼_x Tr(R(x)²) = 1 such that
deg(R) ≤ O( (κω/ε) · S(U_Q ‖ 𝒰) ) and
𝔼_x Tr(Λ(x)(Q(x) − R(x)²)) ≤ ε for all Λ ∈ 𝒯,
where U_Q = 𝔼_x e_x e_x^T ⊗ Q(x), 𝒰 = Id/Tr(Id), and S(A ‖ B) = Tr(A(log A − log B)).
construction of 𝑄
Consider a map Q : {0,1}^n → 𝒮_+^k with 𝔼_x Tr(Q(x)) = 1.
Want to approximate Q by a simpler map Q̃ (s.t. 𝔼_x Tr(Q̃(x)) = 1) with respect to a class 𝒯 of test functionals Λ : {0,1}^n → 𝒮^k:
𝔼_x Tr(Λ(x)(Q(x) − Q̃(x))) ≤ ε
Minimize S(U_Q̃ ‖ 𝒰) subject to these constraints, where U_Q̃ = 𝔼_x e_x e_x^T ⊗ Q̃(x).
Optimality dictates: Q̃*(x) ∝ exp( Σ_{Λ∈𝒯} c_Λ Λ(x) ).
If one can achieve Σ_Λ |c_Λ| small, then the Taylor series of the exponential can be truncated at a low degree.
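The truncation step can be illustrated numerically (on a generic small symmetric matrix, not the operators from the proof): when the exponent has small norm, the degree-d Taylor truncation of exp(·) is accurate up to ‖A‖^{d+1} e^{‖A‖} / (d+1)!.

```python
import numpy as np

def exp_taylor(A, degree):
    """Taylor expansion of the matrix exponential, truncated at the given degree."""
    term = np.eye(A.shape[0])
    total = term.copy()
    for k in range(1, degree + 1):
        term = term @ A / k
        total += term
    return total

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
A = (B + B.T) / 2
A *= 0.3 / np.linalg.norm(A, 2)   # rescale so the spectral norm is 0.3

exact = exp_taylor(A, 60)   # degree 60 is effectively exact at this norm
approx = exp_taylor(A, 4)   # low-degree truncation
err = np.linalg.norm(exact - approx, 2)

# Tail bound: sum_{k > d} ||A||^k / k!  <=  ||A||^{d+1} e^{||A||} / (d+1)!.
assert err <= 0.3 ** 5 * np.e ** 0.3 / 120
```

This is why small Σ|c_Λ| matters: it controls the norm of the exponent, and hence the degree at which the series can be cut off.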
future directions
Work modulo ideals other than ⟨X_i² − X_i : i = 1, …, n⟩:
• Perfect matchings
• Cliques in G(n, p)
Sums of squares of sparse polynomials:
Let 𝒱_s be the subset of multi-linear polynomials on {0,1}^n that have at most s non-zero monomials.
Can one find an explicit f : {0,1}^n → ℝ_+ such that f ∉ sos(𝒱_s)?