This article was downloaded by: [New York University]
On: 25 November 2013, At: 22:52
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Linear and Multilinear Algebra
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/glma20

On matrix algebras associated to sum-of-squares semidefinite programs
Kristijan Cafuta, Univerza v Ljubljani, Fakulteta za elektrotehniko, Laboratorij za uporabno matematiko, Ljubljana, Slovenia. Published online: 17 Jan 2013.

To cite this article: Kristijan Cafuta (2013) On matrix algebras associated to sum-of-squares semidefinite programs, Linear and Multilinear Algebra, 61:11, 1496-1509, DOI: 10.1080/03081087.2012.758261

To link to this article: http://dx.doi.org/10.1080/03081087.2012.758261

Taylor & Francis makes every effort to ensure the accuracy of all the information (the "Content") contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions

On matrix algebras associated to sum-of-squares semidefinite programs



Linear and Multilinear Algebra, 2013, Vol. 61, No. 11, 1496–1509, http://dx.doi.org/10.1080/03081087.2012.758261

On matrix algebras associated to sum-of-squares semidefinite programs

Kristijan Cafuta∗

Univerza v Ljubljani, Fakulteta za elektrotehniko, Laboratorij za uporabno matematiko, Ljubljana, Slovenia

Communicated by M. Chebotar

(Received 21 November 2012; final version received 9 December 2012)

To each semidefinite program (SDP) in primal form, we associate the matrix algebra generated by its constraint matrices. In this note, we show that this algebra is always a full matrix algebra for SDPs arising from (commutative or non-commutative) sum of squares (SOS) problems. For SDPs arising from non-commutative sums of squares and commutators problems, the situation is less clear. We identify an exceptional case, where the corresponding matrix algebra is not the full matrix algebra, and use it to reprove the Burgdorf–Klep non-commutative variant of Hilbert's ternary quartics theorem: a bivariate non-commutative polynomial of degree at most 4 is trace positive if and only if it is a sum of four squares and commutators.

Keywords: semidefinite program; sum of squares; matrix algebra

AMS Subject Classifications: Primary 13J30; 15B99; Secondary 90C22

1. Introduction

1.1. Semidefinite programming

Semidefinite programming (SDP) is a subfield of convex optimization concerned with the optimization of a linear objective function over the intersection of the cone of positive semidefinite matrices with an affine space. More precisely, given s × s self-adjoint matrices C, A_1, …, A_m over R and a vector b ∈ R^m, we formulate an SDP in standard primal form as follows:

inf ⟨C, G⟩
s.t. ⟨A_i, G⟩ = b_i,  i = 1, …, m,
     G ⪰ 0.                          (PSDP)

Here ⟨·, ·⟩ stands for the standard inner product of matrices, ⟨A, B⟩ = tr(B*A), and G ⪰ 0 means that G is positive semidefinite.

The dual problem to (PSDP) is the SDP in the standard dual form

sup ⟨b, y⟩
s.t. Σ_i y_i A_i ⪯ C.                (DSDP)

Here y ∈ R^m, and the difference C − Σ_i y_i A_i is usually denoted by Z.

∗Email: [email protected]

© 2013 Taylor & Francis



Many problems in control theory, system identification and signal processing can be formulated using SDPs.[1,2] Combinatorial optimization problems can often be modelled or approximated by SDPs.[3] We will focus on SDP's role in real algebraic geometry, where it is used e.g. for finding sums of squares decompositions of polynomials.[4–6]

Recently, the applicability of SDP was spurred by the development of practically efficient methods to obtain optimal solutions. If the problem is of medium size (i.e. s ≤ 1000 and m ≤ 10 000), solvers based on interior-point methods are used, while packages for larger SDPs use some variant of the first-order methods (see [7] for a comprehensive list of state-of-the-art SDP solvers). Nevertheless, once s ≥ 3000 or m ≥ 250 000, the problem must possess some special property; otherwise, state-of-the-art solvers will fail to solve it for complexity reasons. One way of reducing the size of an SDP is by using symmetries, cf. [8,9]. An alternative is to block diagonalize the constraint matrices A_j from (PSDP) and (DSDP), i.e. to study the matrix algebra A generated by A_1, …, A_m. We do this for SDPs arising from (commutative or non-commutative) sum of squares (SOS) problems.

1.1.1. Contribution and reader’s guide.

The contribution of this note is twofold. First, the matrix algebra A generated by the s × s constraint matrices A_j in a SOS (commutative or non-commutative) SDP is always the full matrix algebra M_s(R) (see Section 2 below). Second, the situation seems to be different for SDPs arising from non-commutative sums of squares and commutators problems. We identify an exceptional case (Section 3), and use it to reprove the Burgdorf–Klep [10] non-commutative variant of Hilbert's ternary quartics theorem (see Section 4): a bivariate non-commutative polynomial of degree 4 is trace positive iff it is a sum of four squares and commutators.

1.1.2. Motivation

This note was motivated by the insight of [11] that the complexity of SDPs is governed by the C*-algebra A generated by the A_j. Note that by the Artin–Wedderburn theorem, each C*-subalgebra A of M_s(C) can be uniquely decomposed as a finite direct sum of matrix algebras, that is,

A = ⊕_{i=1}^r M_{n_i}(C).            (1)

If the matrices A_j in (PSDP) pairwise commute, then they can be simultaneously diagonalized, so the algebra A they generate is just a direct sum of copies of C. Such SDPs are particularly easy to solve, since they are really instances of linear programming problems. In general, the smaller the 'algebraic complexity' of (PSDP) (i.e. the smaller the maximal n_i in (1)), the easier it is to solve. Indeed, by block diagonalizing the matrices A_j according to (1) we can employ computational improvements for block SDPs, cf. [12].

Loosely speaking, in this note we shall show that SDPs associated to SOS problems (in two settings: the classical commutative one, and the nc setting) are as bad as they get: the algebras their constraint matrices generate are full matrix algebras, and thus do not allow for a non-trivial block diagonalization.

1.2. Sum of squares

One of the main interests in real algebraic geometry is the study of positivity of real multivariate polynomials, which appear in numerous problems of applied mathematics and optimization.[4,5,13–15]


Fix n ∈ N and let R[x] be the ring of real polynomials in n commuting variables x_1, …, x_n. The set of all monomials of degree ≤ d will be denoted by [x]_{≤d}. A polynomial p ∈ R[x] is non-negative (PSD) if p(a) ≥ 0 for all a ∈ R^n. We say that a polynomial p is SOS if there exist m ∈ N and polynomials p_i ∈ R[x], 1 ≤ i ≤ m, such that p = Σ_{i=1}^m p_i². It is obvious that every polynomial which is SOS is PSD. But there exist polynomials which are PSD and not SOS; among the most famous is the Motzkin polynomial x⁴y² + x²y⁴ − 3x²y² + 1.

The problem whether a multivariate polynomial is PSD is well known to be NP-hard, see e.g. [16]. On the other hand, testing for SOS is on the tractable level, since it is equivalent to the question whether an underlying SDP is feasible.[15] This is based on the Gram matrix method:

Proposition 1.1 Let f ∈ R[x], and let V be the vector of all monomials v satisfying 2 deg(v) ≤ deg(f). Then f is SOS if and only if there exists a positive semidefinite matrix G_f (called a Gram matrix for f) such that f = V^t G_f V.

1.3. Sum of hermitian squares

Unlike classical real algebraic geometry, where real polynomial rings in commuting variables are the objects of study, free real algebraic geometry deals with real polynomials in non-commuting (nc) variables and their finite-dimensional representations. Of interest are notions of positivity induced by these. For instance, positivity via positive semidefiniteness,[17–19] which can be reformulated and studied using sums of hermitian squares and SDP. A nice survey on connections to control theory, systems engineering and optimization is given by de Oliveira et al. [20]. Applications to quantum physics are explained by Pironio et al. [21], who also consider computational aspects related to nc SOS.

Fix n ∈ N and let ⟨X⟩ be the set of words in the n non-commuting letters X_1, …, X_n (including the empty word, denoted by 1), i.e. ⟨X⟩ is the monoid freely generated by X := (X_1, …, X_n). The set of all words from ⟨X⟩ of length ≤ d will be denoted by ⟨X⟩_{≤d}. We consider linear combinations Σ_w a_w w with a_w ∈ R, w ∈ ⟨X⟩, of words in the n letters X, which we call nc polynomials. The set of all nc polynomials is actually a free algebra, which we denote by R⟨X⟩. The length of the longest word in an nc polynomial f ∈ R⟨X⟩ is the degree of f and is denoted by deg f. The set of all nc polynomials of degree ≤ d will be denoted by R⟨X⟩_{≤d}. If an nc polynomial f involves only two variables, we use R⟨X, Y⟩ instead of R⟨X_1, X_2⟩.

We equip the algebra R⟨X⟩ with the involution * that fixes R ∪ {X} pointwise and thus reverses words, e.g. (X_1³ X_2 − 3X_3² X_1 X_2)* = X_2 X_1³ − 3X_2 X_1 X_3². Hence, R⟨X⟩ is the *-algebra freely generated by n symmetric letters. The involution * extends naturally to matrices (in particular, to vectors) over R⟨X⟩. For instance, if V = (v_i) is a (column) vector of nc polynomials v_i ∈ R⟨X⟩, then V* is the row vector with components v_i*. We use V^t to denote the row vector with components v_i.

Let Sym R⟨X⟩ denote the set of all symmetric elements, that is,

Sym R⟨X⟩ := { f ∈ R⟨X⟩ | f = f* }.

An nc polynomial of the form g*g is called a hermitian square, and the set of all sums of hermitian squares will be denoted by Σ². Clearly, Σ² ⊊ Sym R⟨X⟩.


Example 1.2 The nc polynomial f(X, Y) = 1 + XY + YX + Y² − 2YXY + XY²X + YX²Y is a sum of hermitian squares; in fact, f = (1 + YX)*(1 + YX) + (Y − XY)*(Y − XY). In particular, f(A, B) is positive semidefinite for all symmetric matrices A, B. For a concrete example, with

A = [ 1 −2 ]       B = [ 0  1 ]
    [−2  3 ],          [ 1 −1 ],

we have

f(A, B) = I + AB + BA + B² − 2BAB + AB²A + BA²B = [ 18 −27 ]
                                                  [−27  45 ] ⪰ 0.

Testing whether a given nc polynomial f ∈ R⟨X⟩ is an element of Σ² can be done efficiently by using SDP.[22,23] This is the Gram matrix method, which is based on the following proposition,[17,18] the non-commutative version of the classical result for commuting variables; see Proposition 1.1.

Proposition 1.3 Suppose the nc polynomial f ∈ Sym R⟨X⟩ is of degree ≤ 2d and let W be the vector of all words w ∈ ⟨X⟩ of degree ≤ d. Then f ∈ Σ² if and only if there exists a positive semidefinite matrix G_f (called a Gram matrix for f) satisfying f = W* G_f W.

2. Algebras associated to sums of (hermitian) squares

In this section, we prove our first main result: the algebra associated to a SOS SDP is the full matrix algebra; see Theorem 2.3 for the commutative case and Theorem 2.6 for the non-commutative case.

Throughout, fix the number of variables n ∈ N. We can stack all monomials from [x]_{≤d} using the graded lexicographic order into a column vector V_d, and similarly, words from ⟨X⟩_{≤d} using the graded lexicographic order into a column vector W_d. The sizes of these vectors will be denoted by τ(d) and σ(d), hence

τ(d) := |V_d| = C(n + d, d),   σ(d) := |W_d| = Σ_{k=0}^d n^k = (n^{d+1} − 1)/(n − 1).   (2)

Elements of the vector V_d will be denoted by v_d(i), i = 1, …, τ(d), and elements of the vector W_d will be denoted by w_d(i), i = 1, …, σ(d). Whenever d is fixed, we will omit d and write v(i) and w(i).
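The counting formulas in (2) are straightforward to evaluate; a small sketch (the function names `tau` and `sigma` are ours):

```python
from math import comb

def tau(n: int, d: int) -> int:
    """Number of commutative monomials of degree <= d in n variables."""
    return comb(n + d, d)

def sigma(n: int, d: int) -> int:
    """Number of words of length <= d in n letters (geometric sum)."""
    return (n ** (d + 1) - 1) // (n - 1) if n > 1 else d + 1

# For n = d = 2 these match the vectors V_2 (6 monomials) and W_2 (7 words)
# used in the examples below.
print(tau(2, 2), sigma(2, 2))  # 6 7
```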

2.1. Commutative sos SDPs

Definition 2.1 For a fixed d ∈ N and a given monomial w ∈ [x]_{≤2d}, we define the matrix A^com_w of order τ(d) × τ(d):

A^com_w(i, j) = { 1 if w = v(i)v(j); 0 otherwise },

and let A^com_d(n) be the subalgebra of M_{τ(d)}(R) generated by { A^com_w | w ∈ [x]_{≤2d} }.


Example 2.2 Let d = 2 and x = (x, y). Then V_2 = [1 x y x² xy y²]^t. Since xy = 1·xy = x·y = y·x = xy·1 and xy² = x·y² = y·xy = xy·y = y²·x, it follows that

A^com_xy =
[0 0 0 0 1 0]
[0 0 1 0 0 0]
[0 1 0 0 0 0]
[0 0 0 0 0 0]
[1 0 0 0 0 0]
[0 0 0 0 0 0],

A^com_{xy²} =
[0 0 0 0 0 0]
[0 0 0 0 0 1]
[0 0 0 0 1 0]
[0 0 0 0 0 0]
[0 0 1 0 0 0]
[0 1 0 0 0 0].

Theorem 2.3 For each d, n ∈ N, the algebra A^com_d(n) is the full matrix algebra M_{τ(d)}(R).

Proof We will show that E_{i,j} ∈ A^com_d(n) holds for every i, j = 1, …, τ(d), where the E_{i,j} are the matrix units (i.e. τ(d) × τ(d) matrices whose only non-zero entry is 1 at the (i, j) position). Obviously, E_{1,1} ∈ A^com_d(n) since

E_{1,1} = A^com_1 = A^com_{v(1)}.

For every w ∈ [x] there is at most one non-zero entry (namely, a 1) in every row and every column of A^com_w. In particular, in the first row and column of a matrix A^com_{v(i)} there is exactly one 1, in the i-th position, and 0 elsewhere. Therefore

A^com_{v(i)} E_{1,1} = E_{i,1} and E_{1,1} A^com_{v(j)} = E_{1,j} for all i, j.

Finally,

E_{i,j} = E_{i,1} E_{1,j} = A^com_{v(i)} E_{1,1} E_{1,1} A^com_{v(j)} = A^com_{v(i)} A^com_{v(1)} A^com_{v(j)} ∈ A^com_d(n). □

2.2. Non-commutative sos SDPs

Definition 2.4 For a fixed d ∈ N and a given word w ∈ ⟨X⟩_{≤2d}, we define the matrix A^nc_w of order σ(d) × σ(d):

A^nc_w(i, j) = { 1 if w = w(i)* w(j); 0 otherwise },

and let A^nc_d(n) be the subalgebra of M_{σ(d)}(R) generated by { A^nc_w | w ∈ ⟨X⟩_{≤2d} }.

Example 2.5 Let d = 2 and X = (X, Y). Then W_2 = [1 X Y X² XY YX Y²]^t. Since XY² = X*·Y² = (YX)*·Y and YXY = Y*·XY = (XY)*·Y, then

A^nc_{XY²} =
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 1]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 1 0 0 0 0]
[0 0 0 0 0 0 0],

A^nc_{YXY} =
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 1 0 0]
[0 0 0 0 0 0 0]
[0 0 1 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0].


Theorem 2.6 For each d, n ∈ N, the algebra A^nc_d(n) is the full matrix algebra M_{σ(d)}(R).

Proof Analogously to the proof of Theorem 2.3, we will show that E_{i,j} ∈ A^nc_d(n) holds for every i, j = 1, …, σ(d). Clearly, E_{1,1} ∈ A^nc_d(n) since

E_{1,1} = A^nc_1 = A^nc_{w(1)}.

For every w ∈ ⟨X⟩ there is at most one 1 (and 0 elsewhere) in every row and every column of A^nc_w. In particular, in the first row and column of a matrix A^nc_{w(i)} there is exactly one 1 and 0 elsewhere. In the first row the 1 is in the i-th position, but this is not necessarily true anymore for the first column in the case d ≥ 2, since the involution reverses the order. For every w(i) there exists an ℓ_i such that w(i) = w(ℓ_i)*. Therefore, in the first column of A^nc_{w(i)} the 1 is in the ℓ_i-th position, and the matrix with the sole 1 in the i-th position of the first column is A^nc_{w(ℓ_i)}. Therefore,

E_{i,1} = A^nc_{w(ℓ_i)} E_{1,1} and E_{1,j} = E_{1,1} A^nc_{w(j)} for all i, j.

Finally,

E_{i,j} = E_{i,1} E_{1,j} = A^nc_{w(ℓ_i)} E_{1,1} E_{1,1} A^nc_{w(j)} = A^nc_{w(ℓ_i)} A^nc_{w(1)} A^nc_{w(j)}. □

3. Algebras associated to sums of hermitian squares and commutators

In this section, we make the first step towards our second main result. We will exhibit an example of a matrix algebra associated to a class of non-commutative sums of squares and commutators SDPs which is not the full matrix algebra; see Theorem 3.10 below.

The next notion we need is cyclic equivalence,[24] whose definition is motivated by the fact that we are interested in the trace of a given nc polynomial under matrix evaluations.

Definition 3.1 An element of the form [p, q] := pq − qp, where p, q are nc polynomials from R⟨X⟩, is a commutator. Nc polynomials f, g ∈ R⟨X⟩ are called cyclically equivalent (f ∼cyc g) if f − g is a sum of commutators:

f − g = Σ_{i=1}^k [p_i, q_i] = Σ_{i=1}^k (p_i q_i − q_i p_i) for some k ∈ N and p_i, q_i ∈ R⟨X⟩.

It is clear that ∼cyc is an equivalence relation. The following remark shows how to test whether given nc polynomials are cyclically equivalent.

Remark 3.2

(a) For words v, w ∈ ⟨X⟩, we have v ∼cyc w if and only if there are words v_1, v_2 ∈ ⟨X⟩ such that v = v_1 v_2 and w = v_2 v_1. That is, v ∼cyc w if and only if w is a cyclic permutation of v.

(b) Nc polynomials f = Σ_{w∈⟨X⟩} a_w w and g = Σ_{w∈⟨X⟩} b_w w (a_w, b_w ∈ R) are cyclically equivalent if and only if for each word v ∈ ⟨X⟩,

Σ_{w∈⟨X⟩, w ∼cyc v} a_w = Σ_{w∈⟨X⟩, w ∼cyc v} b_w.   (3)


Example 3.3 We have YX²Y + X²YXYX + 3XYXYX² ∼cyc XY²X + 4YX³YX, as

YX²Y + X²YXYX + 3XYXYX² − (XY²X + 4YX³YX) = [YX, XY] + [3XYX, YX²] + [X²YX, YX].

Definition 3.4 Let

Θ² := { f ∈ R⟨X⟩ | ∃ g ∈ Σ² : f ∼cyc g }

denote the convex cone of all nc polynomials cyclically equivalent to a sum of hermitian squares. By definition, the elements of Θ² are exactly the nc polynomials which can be written as sums of hermitian squares and commutators.

Example 3.5 Consider f = 3YX²Y − 2YXYX ∈ R⟨X, Y⟩. This nc polynomial is of the form

f = (XY²X + YX²Y − XYXY − YXYX) + YX²Y + (XYXY − YXYX) + (YX²Y − XY²X)
  = (XY − YX)*(XY − YX) + (XY)*(XY) + [X, YXY] + [YX, XY],

hence we have f ∈ Θ², since f ∼cyc g_1*g_1 + g_2*g_2 for g_1 = XY − YX and g_2 = XY. In particular, tr(f(A, B)) ≥ 0 for all symmetric matrices A, B of the same size, but in general f(A, B) is not positive semidefinite. For a concrete example, with

A = [−1 1]       B = [1 0]
    [ 1 0],          [0 2],

we have

f(A, B) = [ 0 −4]
          [−2  8] ⋡ 0 and tr(f(A, B)) = 8 > 0.
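The concrete evaluation in Example 3.5 can be replayed numerically (a sketch; numpy assumed):

```python
import numpy as np

A = np.array([[-1.0, 1.0], [1.0, 0.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])

# f = 3YX^2Y - 2YXYX evaluated at (A, B)
f_AB = 3 * B @ A @ A @ B - 2 * B @ A @ B @ A

print(f_AB)            # [[ 0. -4.], [-2.  8.]]
print(np.trace(f_AB))  # 8.0, consistent with tr(f(A, B)) >= 0
# Note f(A, B) is not even symmetric here, so it is certainly not PSD.
```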

Definition 3.6 An nc polynomial f ∈ R⟨X⟩ is called trace-positive if

tr(f(A)) ≥ 0 for all tuples of symmetric matrices A of the same size.   (4)

Clearly, every nc polynomial cyclically equivalent to a sum of hermitian squares is trace-positive. But there are trace-positive nc polynomials which are not members of Θ². The easiest example is the non-commutative Motzkin polynomial, f = XY⁴X + YX⁴Y − 3XY²X + 1 [24, Example 4.4]; see also [25] for further examples. Nevertheless, the obvious Θ²-certificate for trace-positivity turns out to be very useful. Testing whether a given nc polynomial f ∈ R⟨X⟩ is an element of Θ² can be done efficiently by using SDP, as first observed in [26]; see also [22,27]. The method behind it is a variant of the Gram matrix method and is based on the following proposition.

Proposition 3.7 Let W be the vector of all words w ∈ ⟨X⟩ satisfying 2 deg(w) ≤ deg(f), where f ∈ R⟨X⟩. Then f ∈ Θ² if and only if there exists a positive semidefinite matrix G_f (called a tracial Gram matrix for f) such that f ∼cyc W* G_f W.

On the theoretical level, trace-positive nc polynomials occur naturally in von Neumann algebras and functional analysis, for instance in Connes' embedding conjecture.[24,28] In addition, trace-positive nc polynomials arise in the Lieb–Seiringer reformulation of the recently solved [29] Bessis–Moussa–Villani (BMV) conjecture [30] from statistical quantum mechanics. We refer to [31] for more on trace-positive polynomials.


3.1. Non-commutative tracial sos SDPs

Definition 3.8 For a fixed d ∈ N and a given word w ∈ ⟨X⟩_{≤2d}, we define the matrix A^cyc_w of order σ(d) × σ(d):

A^cyc_w(i, j) = { 1 if w ∼cyc w(i)* w(j); 0 otherwise },

and let A^cyc_d(n) be the subalgebra of M_{σ(d)}(R) generated by { A^cyc_w | w ∈ ⟨X⟩_{≤2d} }.

Example 3.9 Let d = 2 and X = (X, Y). Then W_2 = [1 X Y X² XY YX Y²]^t. Since XY² ∼cyc YXY ∼cyc Y²X and XY² = X*·Y² = (YX)*·Y, YXY = Y*·XY = (XY)*·Y, Y²X = Y*·YX = (XY)*·X, then

A^cyc_{XY²} = A^cyc_{YXY} = A^cyc_{Y²X} =
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 1]
[0 0 0 0 1 1 0]
[0 0 0 0 0 0 0]
[0 0 1 0 0 0 0]
[0 0 1 0 0 0 0]
[0 1 0 0 0 0 0].

It is obvious that for every ∼cyc equivalence class there is only one matrix A^cyc_w.

In Theorems 2.3 and 2.6, we saw that for each d, n ∈ N, the algebras A^com_d(n) and A^nc_d(n) are full matrix algebras. Our next theorem shows that this fails in general for A^cyc_d(n).

Theorem 3.10 The algebra A^cyc_2(2) is isomorphic to M₆(R) ⊕ R.

Proof Let W_2 = [1 X Y X² XY YX Y²]^t. The generators of A^cyc_2(2) are therefore the following matrices:

g_1 = A^cyc_1 = E_{1,1},
g_2 = A^cyc_X = E_{1,2} + E_{2,1},
g_3 = A^cyc_Y = E_{1,3} + E_{3,1},
g_4 = A^cyc_{X²} = E_{1,4} + E_{2,2} + E_{4,1},
g_5 = A^cyc_{XY} = E_{1,5} + E_{1,6} + E_{2,3} + E_{3,2} + E_{5,1} + E_{6,1},
g_6 = A^cyc_{Y²} = E_{1,7} + E_{3,3} + E_{7,1},
g_7 = A^cyc_{X³} = E_{2,4} + E_{4,2},
g_8 = A^cyc_{X²Y} = E_{2,5} + E_{2,6} + E_{3,4} + E_{4,3} + E_{5,2} + E_{6,2},
g_9 = A^cyc_{XY²} = E_{2,7} + E_{3,5} + E_{3,6} + E_{5,3} + E_{6,3} + E_{7,2},
g_10 = A^cyc_{Y³} = E_{3,7} + E_{7,3},
g_11 = A^cyc_{X⁴} = E_{4,4},
g_12 = A^cyc_{X³Y} = E_{4,5} + E_{4,6} + E_{5,4} + E_{6,4},
g_13 = A^cyc_{X²Y²} = E_{4,7} + E_{5,5} + E_{6,6} + E_{7,4},
g_14 = A^cyc_{XYXY} = E_{5,6} + E_{6,5},


g_15 = A^cyc_{XY³} = E_{5,7} + E_{6,7} + E_{7,5} + E_{7,6},
g_16 = A^cyc_{Y⁴} = E_{7,7}.

Define

e_1 =
[1 0 0 0  0    0   0]
[0 1 0 0  0    0   0]
[0 0 1 0  0    0   0]
[0 0 0 1  0    0   0]
[0 0 0 0 1/2  1/2  0]
[0 0 0 0 1/2  1/2  0]
[0 0 0 0  0    0   1],

e_2 =
[0 0 0 0   0     0   0]
[0 0 0 0   0     0   0]
[0 0 0 0   0     0   0]
[0 0 0 0   0     0   0]
[0 0 0 0  1/2  −1/2  0]
[0 0 0 0 −1/2   1/2  0]
[0 0 0 0   0     0   0].

It is straightforward to check that e_1, e_2 are central idempotents of the algebra A^cyc_2(2) and that e_1 e_2 = e_2 e_1 = 0 and e_1 + e_2 = I. It follows that A^cyc_2(2) is an (internal) direct sum of algebras:

A^cyc_2(2) = e_1 A^cyc_2(2) ⊕ e_2 A^cyc_2(2).
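This check is also easy to carry out numerically; the following sketch (numpy assumed) rebuilds the sixteen generators listed in the proof and verifies that e_1, e_2 are central idempotents summing to the identity:

```python
import numpy as np

def E(i, j, n=7):
    """Matrix unit E_{i,j} (1-indexed, n x n)."""
    M = np.zeros((n, n))
    M[i - 1, j - 1] = 1.0
    return M

# The generators g_1, ..., g_16 of A^cyc_2(2), as listed in the proof.
gens = [
    E(1, 1),
    E(1, 2) + E(2, 1),
    E(1, 3) + E(3, 1),
    E(1, 4) + E(2, 2) + E(4, 1),
    E(1, 5) + E(1, 6) + E(2, 3) + E(3, 2) + E(5, 1) + E(6, 1),
    E(1, 7) + E(3, 3) + E(7, 1),
    E(2, 4) + E(4, 2),
    E(2, 5) + E(2, 6) + E(3, 4) + E(4, 3) + E(5, 2) + E(6, 2),
    E(2, 7) + E(3, 5) + E(3, 6) + E(5, 3) + E(6, 3) + E(7, 2),
    E(3, 7) + E(7, 3),
    E(4, 4),
    E(4, 5) + E(4, 6) + E(5, 4) + E(6, 4),
    E(4, 7) + E(5, 5) + E(6, 6) + E(7, 4),
    E(5, 6) + E(6, 5),
    E(5, 7) + E(6, 7) + E(7, 5) + E(7, 6),
    E(7, 7),
]

e2 = 0.5 * (E(5, 5) + E(6, 6) - E(5, 6) - E(6, 5))
e1 = np.eye(7) - e2  # so e1 + e2 = I by construction

assert np.allclose(e1 @ e1, e1) and np.allclose(e2 @ e2, e2)  # idempotents
assert np.allclose(e1 @ e2, 0) and np.allclose(e2 @ e1, 0)    # orthogonal
assert all(np.allclose(e2 @ g, g @ e2) for g in gens)         # central
print("e1, e2 are central idempotents")
```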

Let

P =
[1 0 0 0   0     0    0]
[0 1 0 0   0     0    0]
[0 0 1 0   0     0    0]
[0 0 0 1   0     0    0]
[0 0 0 0 1/√2  1/√2   0]
[0 0 0 0   0     0    1]
[0 0 0 0 1/√2 −1/√2   0]

be a unitary matrix. Then

P A^cyc_2(2) P^t = P(e_1 A^cyc_2(2) ⊕ e_2 A^cyc_2(2)) P^t = P(e_1 A^cyc_2(2)) P^t ⊕ P(e_2 A^cyc_2(2)) P^t.

Let us first look at the subalgebra P(e_1 A^cyc_2(2)) P^t, which is generated by the matrices P(e_1 g_i) P^t for i = 1, …, 16. We see that

P(e_1 g_i) P^t = g_i for i = 1, 2, 3, 4, 7, 11,
P(e_1 g_5) P^t = √2 E_{1,5} + E_{2,3} + E_{3,2} + √2 E_{5,1},
P(e_1 g_6) P^t = E_{1,6} + E_{3,3} + E_{6,1},
P(e_1 g_8) P^t = √2 E_{2,5} + E_{3,4} + E_{4,3} + √2 E_{5,2},
P(e_1 g_9) P^t = E_{2,6} + √2 E_{3,5} + √2 E_{5,3} + E_{6,2},
P(e_1 g_10) P^t = E_{3,6} + E_{6,3},
P(e_1 g_12) P^t = √2 E_{4,5} + √2 E_{5,4},
P(e_1 g_13) P^t = E_{4,6} + E_{5,5} + E_{6,4},
P(e_1 g_14) P^t = E_{5,5},
P(e_1 g_15) P^t = √2 E_{5,6} + √2 E_{6,5},
P(e_1 g_16) P^t = E_{6,6}.


The last row and the last column of each of these generators are all zeros. For the top left 6 × 6 block we use steps analogous to those in the proof of Theorem 2.3; we show that E_{i,j} ∈ P(e_1 A^cyc_2(2)) P^t holds for every i, j = 1, …, 6. We see

E_{1,1} = P(e_1 g_1) P^t,
E_{1,5} = (1/√2) E_{1,1} P(e_1 g_5) P^t,
E_{5,1} = (1/√2) P(e_1 g_5) P^t E_{1,1},
E_{1,j} = E_{1,1} P(e_1 g_j) P^t for j = 1, 2, 3, 4, 6,
E_{i,1} = P(e_1 g_i) P^t E_{1,1} for i = 1, 2, 3, 4, 6.

Since E_{i,j} = E_{i,1} E_{1,j}, this establishes that P(e_1 A^cyc_2(2)) P^t ≅ M₆(R) and therefore

e_1 A^cyc_2(2) ≅ M₆(R).

Now, let us have a look at the subalgebra P(e_2 A^cyc_2(2)) P^t. The only non-zero generators are

P(e_2 g_13) P^t = E_{7,7},   P(e_2 g_14) P^t = −E_{7,7},

and therefore P(e_2 A^cyc_2(2)) P^t ≅ R. It follows that

e_2 A^cyc_2(2) ≅ R.

Finally, all this implies

A^cyc_2(2) ≅ M₆(R) ⊕ R.  □

It is easy to see that whenever d = 1 or n = 1, then A^cyc_d(n) is a full matrix algebra. In fact, we conjecture that the case presented in Theorem 3.10 (d = n = 2) is the only example where A^cyc_d(n) is not a full matrix algebra. This has been verified using GAP and Mathematica for small values of d, n.

4. An application of the lone outlier

In this section, we present our second main result. We give an alternative proof of the Burgdorf–Klep [10] non-commutative version of Hilbert's ternary quartics theorem: a bivariate non-commutative polynomial of degree 4 is trace positive iff it is a sum of four squares and commutators. Our proof is inspired by the original proof, which relies on ad hoc computations. The key ingredients in the proof are Hilbert's original theorem [32] and our Theorem 3.10.

Definition 4.1 Let ˇ : R⟨X⟩ → R[x] be the natural algebra homomorphism which maps a non-commuting variable X_i to the commuting variable x_i for all i = 1, …, n. The image f̌ ∈ R[x] of a given nc polynomial f ∈ R⟨X⟩ is called the commutative collapse of f.

Lemma 4.2 Let f ∈ R⟨X, Y⟩_4 be trace-positive on pairs of symmetric 2 × 2 matrices, and assume a_{X⁴} = 0 or a_{Y⁴} = 0. Then a_{X²Y²} − a_{XYXY} ≥ 0.


Proof Without loss of generality assume a_{Y⁴} = 0. Set

A_x := [0 x]       B_{λx} := [λx   0 ]
       [x 0],                [ 0 −λx]

for λ > √| a_{X⁴} / (a_{X²Y²} − a_{XYXY}) |. Form the univariate polynomial

q(x) := tr( f(A_x, B_{λx}) ) = a_1 + (a_{X²} + a_{Y²} λ²) x² + ( (a_{X²Y²} − a_{XYXY}) λ² + a_{X⁴} ) x⁴.

Since f is trace-positive on pairs of symmetric 2 × 2 matrices, q is non-negative on R. If a_{X²Y²} − a_{XYXY} < 0, then q has a negative leading coefficient, a contradiction. □

Theorem 4.3 Suppose the nc polynomial f ∈ R⟨X, Y⟩ is trace-positive on pairs of symmetric 2 × 2 matrices, and let deg f = 4. Then f is cyclically equivalent to a sum of (at most) 4 hermitian squares.

Proof Let f̌ be the commutative collapse of f. Then f̌ is a (commutative) polynomial of degree at most 4. Since every matrix of the form λI₂ (where λ ∈ R and I₂ is the 2 × 2 identity matrix) lies in the center of M₂(R), we have that f(λ₁I₂, λ₂I₂) = f̌(λ₁, λ₂)I₂ and therefore tr(f(λ₁I₂, λ₂I₂)) = 2 f̌(λ₁, λ₂). From the trace-positivity of f on 2 × 2 matrices therefore follows the non-negativity of f̌ on R². Hence, by Hilbert's theorem, f̌ is a sum of (at most) 3 squares.[32] Equivalently, there exists a Gram matrix G_f̌ for f̌ with rank (at most) 3. We next show that G_f̌ lifts to a positive semidefinite tracial Gram matrix G^cyc_f for f with rank (at most) 4.

We retain the notations and definitions used in the proof of Theorem 3.10. Let

Q =
[1 0 0 0  0 0 0]
[0 1 0 0  0 0 0]
[0 0 1 0  0 0 0]
[0 0 0 1  0 0 0]
[0 0 0 0 √2 0 0]
[0 0 0 0  0 1 0].

Since the last row and the last column of all matrices in P(e_1 A^cyc_2(2)) P^t are all zeros, we can identify each such matrix with its top left 6 × 6 block. Then it follows that

Q^t A^com_1 Q = P(e_1 A^cyc_1) P^t,        Q^t A^com_x Q = P(e_1 A^cyc_X) P^t,
Q^t A^com_y Q = P(e_1 A^cyc_Y) P^t,        Q^t A^com_{x²} Q = P(e_1 A^cyc_{X²}) P^t,
Q^t A^com_{xy} Q = P(e_1 A^cyc_{XY}) P^t,   Q^t A^com_{y²} Q = P(e_1 A^cyc_{Y²}) P^t,
Q^t A^com_{x³} Q = P(e_1 A^cyc_{X³}) P^t,   Q^t A^com_{x²y} Q = P(e_1 A^cyc_{X²Y}) P^t,
Q^t A^com_{xy²} Q = P(e_1 A^cyc_{XY²}) P^t,  Q^t A^com_{y³} Q = P(e_1 A^cyc_{Y³}) P^t,
Q^t A^com_{x⁴} Q = P(e_1 A^cyc_{X⁴}) P^t,   Q^t A^com_{x³y} Q = P(e_1 A^cyc_{X³Y}) P^t,
Q^t A^com_{xy³} Q = P(e_1 A^cyc_{XY³}) P^t,  Q^t A^com_{y⁴} Q = P(e_1 A^cyc_{Y⁴}) P^t,
Q^t A^com_{x²y²} Q = P(e_1 (A^cyc_{X²Y²} + A^cyc_{XYXY})) P^t.   (5)


Moreover, it follows that Q P(e1Gcycf e1)Pt Qt for a given tracial Gram matrix Gcyc

f of

f is a Gram matrix for its commutative collapse f :

G f = Q P(e1Gcycf e1)Pt Qt . (6)

In addition, it is easy to see that

P(e2Gcycf e2)Pt = (Gcyc

f )XY,XY + (Gcycf )Y X,Y X − (Gcyc

f )XY,Y X − (Gcycf )Y X,XY

2E7,7

= aX2Y 2 − aXY XY − 2(Gcycf )X2,Y 2

2E7,7.

Case 1: Suppose $a_{X^2Y^2} - a_{XYXY} - 2\max\{a_{X^4}, a_{Y^4}\} \geq 0$. Then we can lift $G_{\bar f}$ to a positive semidefinite tracial Gram matrix
$$G^{\mathrm{cyc}}_f \cong G_{\bar f} \oplus \frac{a_{X^2Y^2} - a_{XYXY} - 2(G_{\bar f})_{x^2,y^2}}{2}:$$
$$\begin{aligned}
(G^{\mathrm{cyc}}_f)_{w,XY} &= (G^{\mathrm{cyc}}_f)_{w,YX} = (G^{\mathrm{cyc}}_f)_{XY,w} = (G^{\mathrm{cyc}}_f)_{YX,w} = \frac{(G_{\bar f})_{w,xy}}{2} \quad (w = 1, X, Y, X^2, Y^2),\\
(G^{\mathrm{cyc}}_f)_{XY,YX} &= (G^{\mathrm{cyc}}_f)_{YX,XY} = \frac{a_{XYXY}}{2},\\
(G^{\mathrm{cyc}}_f)_{XY,XY} &= (G^{\mathrm{cyc}}_f)_{YX,YX} = \frac{a_{X^2Y^2} - 2(G_{\bar f})_{x^2,y^2}}{2},\\
(G^{\mathrm{cyc}}_f)_{u,v} &= (G_{\bar f})_{u,v} \quad (u, v = 1, X, Y, X^2, Y^2).
\end{aligned}$$

Then it follows from the positive semidefiniteness of $G_{\bar f}$ that
$$\max\{a_{X^4}, a_{Y^4}\} = \max\{a_{x^4}, a_{y^4}\} \geq (G_{\bar f})_{x^2,y^2} = (G^{\mathrm{cyc}}_f)_{X^2,Y^2}.$$
Since the rank of $G_{\bar f}$ is (at most) 3, the rank of $G^{\mathrm{cyc}}_f$ is (at most) 4, and therefore $f$ is cyclically equivalent to a sum of (at most) 4 hermitian squares.

Moreover, if $a_{X^4} = 0$ or $a_{Y^4} = 0$, then $(G_{\bar f})_{x^2,y^2} = 0$; this follows from the fact that all principal minors of $G_{\bar f} \succeq 0$ are non-negative, in particular
$$(G_{\bar f})_{x^2,x^2}(G_{\bar f})_{y^2,y^2} - (G_{\bar f})_{x^2,y^2}^2 \geq 0,$$
where $(G_{\bar f})_{x^2,x^2} = a_{x^4} = a_{X^4}$, $(G_{\bar f})_{y^2,y^2} = a_{y^4} = a_{Y^4}$ and $(G_{\bar f})_{y^2,x^2} = (G_{\bar f})_{x^2,y^2}$. It follows from Lemma 4.2 that $a_{X^2Y^2} - a_{XYXY} \geq 0$. As before, we can lift $G_{\bar f}$ to a positive semidefinite tracial Gram matrix $G^{\mathrm{cyc}}_f \cong G_{\bar f} \oplus \frac{a_{X^2Y^2} - a_{XYXY}}{2}$ of $f$.

Case 2: Suppose $a_{X^2Y^2} - a_{XYXY} - 2\max\{a_{X^4}, a_{Y^4}\} < 0$. Without loss of generality assume $\max\{a_{X^4}, a_{Y^4}\} = a_{X^4} \neq 0$. Since $\bar f$ is non-negative on $\mathbb{R}^2$, it follows that $a_{X^4} = a_{x^4} > 0$. Setting
$$s \geq \frac{a_{X^3Y} + \sqrt{a_{X^3Y}^2 + 8\, a_{X^4} (a_{XYXY} + 2 a_{X^4} - a_{X^2Y^2})}}{4\, a_{X^4}}$$
and looking at the coefficients $b_w$ of the canonical cyclic representative of
$$h(X, Y) := f(X + sY, -Y) \in \mathbb{R}\langle X, Y\rangle_4$$



we get
$$b_{X^2Y^2} - 2 b_{X^4} = 4 a_{X^4} s^2 - 2 a_{X^3Y} s + a_{X^2Y^2} - 2 a_{X^4} \geq 2 a_{X^4} s^2 - a_{X^3Y} s + a_{XYXY} = b_{XYXY}.$$

Thus, the nc polynomial $h(X, Y)$ satisfies the assumptions of Case 1. Hence, $h(X, Y)$ is cyclically equivalent to a sum of (at most) 4 hermitian squares. This implies that $f(X, Y) = h(X + sY, -Y)$ is cyclically equivalent to a sum of (at most) 4 hermitian squares. $\square$
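To see the role of the threshold on $s$, the following sketch (with made-up coefficient values satisfying the Case 2 hypothesis, not data from the paper) verifies numerically that the displayed inequality $b_{X^2Y^2} - 2b_{X^4} \geq b_{XYXY}$ holds with equality exactly at the threshold value of $s$ and strictly beyond it:

```python
import math

# Hypothetical coefficients of f satisfying the Case 2 assumption
# a_{X^2Y^2} - a_{XYXY} - 2*max(a_{X^4}, a_{Y^4}) < 0, with a_{X^4} > 0
# taken to be the maximum.
aX4, aX3Y, aX2Y2, aXYXY = 1.0, 1.0, 1.0, 2.0
assert aX2Y2 - aXYXY - 2 * aX4 < 0 and aX4 > 0

def b_gap(s):
    # (b_{X^2Y^2} - 2 b_{X^4}) - b_{XYXY} for h(X,Y) = f(X + sY, -Y),
    # using the two coefficient expressions displayed in the proof.
    lhs = 4 * aX4 * s**2 - 2 * aX3Y * s + aX2Y2 - 2 * aX4
    rhs = 2 * aX4 * s**2 - aX3Y * s + aXYXY
    return lhs - rhs

# Threshold from the proof; for these coefficients s0 = (1 + 5)/4 = 1.5.
s0 = (aX3Y + math.sqrt(aX3Y**2 + 8 * aX4 * (aXYXY + 2 * aX4 - aX2Y2))) / (4 * aX4)

assert abs(b_gap(s0)) < 1e-12   # equality at the threshold (s0 is a root)
assert b_gap(s0 + 1.0) > 0      # strict inequality above it
```

The gap is a quadratic in $s$ with positive leading coefficient $2 a_{X^4}$, so any $s$ at or above its larger root (the threshold in the proof) makes $h$ satisfy the Case 1 hypothesis.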

References

[1] Boyd S, El Ghaoui L, Feron E, Balakrishnan V. Linear matrix inequalities in system and control theory. Vol. 15, SIAM Studies in Applied Mathematics. Philadelphia (PA): Society for Industrial and Applied Mathematics (SIAM); 1994.

[2] Parrilo PA. Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization [Ph.D. thesis]. Pasadena (CA): California Institute of Technology; 2000.

[3] Goemans MX. Semidefinite programming in combinatorial optimization. Math. Program. 1997;79:143–161.

[4] Lasserre JB. Moments, positive polynomials and their applications. Vol. 1, Imperial College Press Optimization Series. London: Imperial College Press; 2009.

[5] Marshall M. Positive polynomials and sums of squares. Vol. 146, Mathematical Surveys and Monographs. Providence (RI): American Mathematical Society; 2008.

[6] Laurent M. Sums of squares, moment matrices and optimization over polynomials. In: Putinar M, Sullivant S, editors. Emerging applications of algebraic geometry. IMA Vol. Math. Appl. 2009;149:157–270.

[7] Mittelmann H. Available from: http://plato.asu.edu/sub/pns.html.

[8] Bachoc C, Gijswijt DC, Schrijver A, Vallentin F. Invariant semidefinite programs. In: Handbook on semidefinite, conic and polynomial optimization. Internat. Ser. Oper. Res. Management Sci. 2012;166:219–269.

[9] Gatermann K, Parrilo PA. Symmetry groups, semidefinite programs, and sums of squares. J. Pure Appl. Algebra. 2004;192:95–128.

[10] Burgdorf S, Klep I. Trace-positive polynomials and the quartic tracial moment problem. C. R. Math. Acad. Sci. Paris. 2010;348:721–726.

[11] Helton JW, Klep I, McCullough S. The matricial relaxation of a linear matrix inequality. Math. Program. In press. Available from: http://dx.doi.org/10.1007/s10107-012-0525-z.

[12] de Klerk E, Dobre C, Pasechnik DV. Numerical block diagonalization of matrix ∗-algebras with application to semidefinite programming. Math. Program. 2011;129:91–111.

[13] Lasserre JB. Global optimization with polynomials and the problem of moments. SIAM J. Optim. 2000/01;11:796–817.

[14] Parrilo PA, Sturmfels B. Minimizing polynomial functions. In: Algorithmic and quantitative real algebraic geometry (Piscataway, NJ, 2001). DIMACS Ser. Discrete Math. Theoret. Comput. Sci., Amer. Math. Soc. 2003;60:83–99.

[15] Parrilo PA. Semidefinite programming relaxations for semialgebraic problems. Math. Program. 2003;96:293–320.

[16] Murty KG, Kabadi SN. Some NP-complete problems in quadratic and nonlinear programming. Math. Program. 1987;39:117–129.

[17] Helton JW. 'Positive' noncommutative polynomials are sums of squares. Ann. Math. 2002;156:675–694.

[18] McCullough S, Putinar M. Noncommutative sums of squares. Pacific J. Math. 2005;218:167–171.



[19] Helton JW, Klep I, McCullough S. The convex Positivstellensatz in a free algebra. Adv. Math. 2012;231:516–534.

[20] de Oliveira MC, Helton JW, McCullough S, Putinar M. Engineering systems and free semi-algebraic geometry. In: Putinar M, Sullivant S, editors. Emerging applications of algebraic geometry. IMA Vol. Math. Appl. 2008;149:17–61.

[21] Pironio S, Navascués M, Acín A. Convergent relaxations of polynomial optimization problems with noncommuting variables. SIAM J. Optim. 2010;20:2157–2180.

[22] Klep I, Povh J. Semidefinite programming and sums of hermitian squares of noncommutative polynomials. J. Pure Appl. Algebra. 2010;214:740–749.

[23] Cafuta K, Klep I, Povh J. NCSOStools: a computer algebra system for symbolic and numerical computation with noncommutative polynomials. Optim. Methods Softw. 2011;26:363–380. Available from: http://ncsostools.fis.unm.si/.

[24] Klep I, Schweighofer M. Connes' embedding conjecture and sums of Hermitian squares. Adv. Math. 2008;217:1816–1837.

[25] Quarez R. Some examples of trace-positive quaternary quartics. Available from: http://hal.archives-ouvertes.fr/hal-00685397.

[26] Klep I, Schweighofer M. Sums of Hermitian squares and the BMV conjecture. J. Stat. Phys. 2008;133:739–760.

[27] Burgdorf S, Cafuta K, Klep I, Povh J. The tracial moment problem and trace-optimization of polynomials. Math. Program. In press. Available from: http://dx.doi.org/10.1007/s10107-011-0505-8.

[28] Connes A. Classification of injective factors. Cases $\mathrm{II}_1$, $\mathrm{II}_\infty$, $\mathrm{III}_\lambda$, $\lambda \neq 1$. Ann. of Math. 1976;104:73–115.

[29] Stahl HR. Proof of the BMV conjecture. Available from: http://arxiv.org/abs/1107.4875.

[30] Bessis D, Moussa P, Villani M. Monotonic converging variational approximations to the functional integrals in quantum statistical mechanics. J. Math. Phys. 1975;16:2318–2325.

[31] Klep I. Trace-positive polynomials. Pacific J. Math. 2011;250:339–352.

[32] Hilbert D. Über die Darstellung definiter Formen als Summe von Formenquadraten. Math. Ann. 1888;32:342–350.
