
Semidefinite programming

Semidefinite programming (SDP) is a subfield of convex programming that optimizes a linear objective function (the objective function is a user-specified function whose value the user wants to minimize or maximize) over the intersection of the cone of positive semidefinite matrices with an affine space.

Semidefinite programming is a relatively new area of optimization, and interest in it is growing for several reasons. Many practical problems in operations research and combinatorial optimization can be modeled or approximated as semidefinite programs. In automatic control theory, SDPs are used in the context of linear matrix inequalities. SDPs are in fact a special case of conic programming and can be efficiently solved by interior point methods. All linear programming problems can be expressed as SDPs, and hierarchies of SDPs can be used to approximate the solutions of polynomial optimization problems. Semidefinite programming has been used in the optimization of complex systems. In recent years, some quantum query complexity problems have been formulated in terms of semidefinite programs.


Motivation and definition

Initial motivation

A linear programming problem is one in which we wish to maximize or minimize a linear objective function of real variables over a polytope. In semidefinite programming, we instead use real-valued vectors and are allowed to take the dot product of vectors; the nonnegativity constraints on real variables of an LP problem are replaced by semidefiniteness constraints on a matrix variable in an SDP problem. Specifically, a general semidefinite programming problem can be defined as any mathematical programming problem of the form

$$\min_{x^1,\ldots,x^n \in \mathbb{R}^n} \sum_{i,j \in [n]} c_{i,j}\,(x^i \cdot x^j)$$

subject to

$$\sum_{i,j \in [n]} a_{i,j,k}\,(x^i \cdot x^j) \leq b_k \quad \text{for all } k.$$

Equivalent formulations

An $n \times n$ matrix $M$ is said to be positive semidefinite if it is the Gram matrix of some vectors (i.e., if there exist vectors $x^1,\ldots,x^n$ such that $m_{i,j} = x^i \cdot x^j$ for all $i,j$); in this case we write $M \succeq 0$. Note that there are several other equivalent definitions of positive semidefiniteness: for example, positive semidefinite matrices have only nonnegative eigenvalues, and every positive semidefinite matrix has a positive semidefinite square root.
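
These equivalent characterizations are easy to check numerically. A minimal sketch in Python with NumPy (the vectors below are arbitrary illustrative data, not taken from the text):

    import numpy as np

    rng = np.random.default_rng(0)
    V = rng.standard_normal((3, 5))   # rows play the role of x^1, x^2, x^3
    M = V @ V.T                       # Gram matrix: M[i, j] = x^i . x^j

    # Equivalent check: a Gram matrix has only nonnegative eigenvalues.
    assert np.all(np.linalg.eigvalsh(M) >= -1e-10)

    # Equivalent check: M has a positive semidefinite square root.
    w, Q = np.linalg.eigh(M)
    sqrt_M = Q @ np.diag(np.sqrt(np.clip(w, 0, None))) @ Q.T
    assert np.allclose(sqrt_M @ sqrt_M, M)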

Denote by $\mathbb{S}^n$ the space of all $n \times n$ real symmetric matrices. This space is equipped with the inner product

$$\langle A, B \rangle_{\mathbb{S}^n} = \mathrm{tr}(A^T B) = \sum_{i=1,\,j=1}^{n} A_{ij} B_{ij},$$

where $\mathrm{tr}$ denotes the trace.

We can rewrite the mathematical programming problem from the previous section in the equivalent form

$$\min_{X \in \mathbb{S}^n} \langle C, X \rangle_{\mathbb{S}^n}$$

subject to

$$\langle A_k, X \rangle_{\mathbb{S}^n} \leq b_k, \quad k = 1,\ldots,m,$$
$$X \succeq 0,$$

where the entry $i,j$ of the matrix $C$ is given by $c_{i,j}$ from the previous section, and $A_k$ is the $n \times n$ matrix whose entry $i,j$ is $a_{i,j,k}$ from the previous section.

Note that, by adding slack variables appropriately, this SDP can be converted to the equality form

$$\min_{X \in \mathbb{S}^n} \langle C, X \rangle_{\mathbb{S}^n}$$

subject to

$$\langle A_k, X \rangle_{\mathbb{S}^n} = b_k, \quad k = 1,\ldots,m,$$
$$X \succeq 0.$$
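
For concreteness, here is a minimal sketch of this standard form in Python using the CVXPY modeling library (an SDP-capable solver such as SCS, which typically installs alongside CVXPY, is assumed; the data C, A1, b1 form a tiny hypothetical instance, not one from the text):

    import cvxpy as cp
    import numpy as np

    # Tiny instance: minimize <C, X> subject to <A1, X> = b1 and X PSD.
    C = np.array([[1.0, 0.0], [0.0, 2.0]])
    A1 = np.eye(2)
    b1 = 1.0

    X = cp.Variable((2, 2), symmetric=True)
    constraints = [X >> 0, cp.trace(A1 @ X) == b1]
    problem = cp.Problem(cp.Minimize(cp.trace(C @ X)), constraints)
    problem.solve()
    print(problem.value)  # 1.0: all weight goes to the smaller eigenvalue of C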

For convenience, an SDP may be specified in a slightly different but equivalent form. For example, linear expressions involving nonnegative scalar variables may be added to the problem specification. The problem remains an SDP, since each such variable can be incorporated into the matrix $X$ as a diagonal entry ($X_{ii}$ for some $i$). To ensure that $X_{ii} \geq 0$, one can add the constraints $X_{ij} = 0$ for all $j \neq i$. As another example, note that for any positive semidefinite matrix $X$ there exists a set of vectors $\{v_i\}$ such that the entry $i,j$ of $X$ is $X_{ij} = (v_i, v_j)$, the scalar product of $v_i$ and $v_j$. Therefore, SDPs are often formulated in terms of linear expressions in scalar products of vectors. Given a solution of the SDP in standard form, the vectors $\{v_i\}$ can be recovered in $O(n^3)$ time (for example, using an incomplete Cholesky decomposition of X).
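
A sketch of this recovery step with NumPy (the small diagonal jitter is an implementation convenience for matrices that are only semidefinite, not part of the method itself):

    import numpy as np

    def vectors_from_psd(X, jitter=1e-12):
        """Recover vectors v_i with X[i, j] = v_i . v_j from a PSD matrix X."""
        n = X.shape[0]
        L = np.linalg.cholesky(X + jitter * np.eye(n))  # X ~= L @ L.T
        return L  # row i is the vector v_i

    X = np.array([[1.0, 0.5], [0.5, 1.0]])
    V = vectors_from_psd(X)
    assert np.allclose(V @ V.T, X, atol=1e-6)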

Duality Theory

Definitions

By analogy with linear programming, if the general SDP problem is given as

$$\min_{X \in \mathbb{S}^n} \langle C, X \rangle_{\mathbb{S}^n}$$

subject to

$$\langle A_i, X \rangle_{\mathbb{S}^n} = b_i, \quad i = 1,\ldots,m,$$
$$X \succeq 0$$

(the primal problem, or P-SDP), we define the dual semidefinite problem (D-SDP) as

$$\max_{y \in \mathbb{R}^m} \langle b, y \rangle_{\mathbb{R}^m}$$

subject to

$$\sum_{i=1}^{m} y_i A_i \preceq C,$$

where for any two matrices $P$ and $Q$, $P \succeq Q$ means $P - Q \succeq 0$.

Weak duality

The weak duality theorem states that the value of the primal SDP is at least the value of the dual SDP. Therefore, any feasible solution of the dual SDP bounds the primal SDP value from below, and conversely, any feasible value of the primal SDP bounds the dual SDP value from above. This holds because

$$\langle C, X \rangle - \langle b, y \rangle = \langle C, X \rangle - \sum_{i=1}^{m} y_i b_i = \langle C, X \rangle - \sum_{i=1}^{m} y_i \langle A_i, X \rangle = \left\langle C - \sum_{i=1}^{m} y_i A_i,\ X \right\rangle \geq 0,$$

where the last inequality follows from the fact that both matrices are positive semidefinite. The value of this quantity is sometimes called the duality gap.
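
As a numerical sanity check, the tiny CVXPY instance from above can be solved together with its dual; on this particular instance the gap closes because strong duality (see below) holds:

    import cvxpy as cp
    import numpy as np

    C = np.array([[1.0, 0.0], [0.0, 2.0]])
    A1 = np.eye(2)
    b1 = 1.0

    # Primal: min <C, X> s.t. <A1, X> = b1 and X PSD.
    X = cp.Variable((2, 2), symmetric=True)
    primal = cp.Problem(cp.Minimize(cp.trace(C @ X)),
                        [X >> 0, cp.trace(A1 @ X) == b1])
    primal.solve()

    # Dual: max b1 * y s.t. y * A1 <= C in the PSD order.
    y = cp.Variable()
    dual = cp.Problem(cp.Maximize(b1 * y), [C - y * A1 >> 0])
    dual.solve()

    print(primal.value, dual.value)  # primal >= dual; here both are 1.0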

Strong duality

Under a condition known as Slater's condition, the values of the primal and dual SDPs are equal. This is called strong duality. Unlike for linear programs, strong duality does not hold for every SDP problem; in general, the value of the dual SDP can be strictly less than the value of the primal.

(i) Suppose the primal problem (P-SDP) is bounded below and strictly feasible (that is, there exists $X_0 \in \mathbb{S}^n$, $X_0 \succ 0$, such that $\langle A_i, X_0 \rangle_{\mathbb{S}^n} = b_i$ for $i = 1,\ldots,m$). Then there is an optimal solution $y^*$ of the dual problem (D-SDP), and

$$\langle C, X^* \rangle_{\mathbb{S}^n} = \langle b, y^* \rangle_{\mathbb{R}^m}.$$

(ii) Suppose the dual problem (D-SDP) is bounded above and strictly feasible (that is, $\sum_{i=1}^{m} (y_0)_i A_i \prec C$ for some $y_0 \in \mathbb{R}^m$). Then there is an optimal solution $X^*$ of the primal problem (P-SDP), and the equality from (i) holds.

Examples

Example 1

Consider three random variables $A$, $B$, and $C$. By definition, their correlation coefficients $\rho_{AB}, \rho_{AC}, \rho_{BC}$ are valid if and only if

$$\begin{pmatrix} 1 & \rho_{AB} & \rho_{AC} \\ \rho_{AB} & 1 & \rho_{BC} \\ \rho_{AC} & \rho_{BC} & 1 \end{pmatrix} \succeq 0$$

Suppose that we know from some source (for example, from empirical or experimental data) that $-0.2 \leq \rho_{AB} \leq -0.1$ and $0.4 \leq \rho_{BC} \leq 0.5$. The problem of determining the smallest and largest possible values of $\rho_{AC}$ can be written as:

minimize/maximize $x_{13}$
subject to
$$-0.2 \leq x_{12} \leq -0.1,$$
$$0.4 \leq x_{23} \leq 0.5,$$
$$x_{11} = x_{22} = x_{33} = 1,$$
$$\begin{pmatrix} 1 & x_{12} & x_{13} \\ x_{12} & 1 & x_{23} \\ x_{13} & x_{23} & 1 \end{pmatrix} \succeq 0$$

Here we set $\rho_{AB} = x_{12}$, $\rho_{AC} = x_{13}$, $\rho_{BC} = x_{23}$. The problem can be put into SDP form by padding the matrix of variables and introducing slack variables; for example,

$$\mathrm{tr}\left( \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \cdot \begin{pmatrix} 1 & x_{12} & x_{13} & 0 & 0 & 0 \\ x_{12} & 1 & x_{23} & 0 & 0 & 0 \\ x_{13} & x_{23} & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & s_1 & 0 & 0 \\ 0 & 0 & 0 & 0 & s_2 & 0 \\ 0 & 0 & 0 & 0 & 0 & s_3 \end{pmatrix} \right) = x_{12} + s_1 = -0.1$$

Solving this SDP gives the minimum and maximum values of $\rho_{AC} = x_{13}$ ($-0.978$ and $0.872$, respectively).
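
A sketch of this example in CVXPY, imposing the bound constraints directly rather than through the padded slack-variable form (same CVXPY and solver assumptions as above):

    import cvxpy as cp

    X = cp.Variable((3, 3), symmetric=True)
    constraints = [
        X >> 0,
        cp.diag(X) == 1,                   # unit diagonal: a correlation matrix
        X[0, 1] >= -0.2, X[0, 1] <= -0.1,
        X[1, 2] >= 0.4,  X[1, 2] <= 0.5,
    ]
    lo = cp.Problem(cp.Minimize(X[0, 2]), constraints)
    hi = cp.Problem(cp.Maximize(X[0, 2]), constraints)
    lo.solve(); hi.solve()
    print(lo.value, hi.value)  # approximately -0.978 and 0.872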

Example 2

Consider the problem

minimize $\dfrac{(c^T x)^2}{d^T x}$
subject to $Ax + b \geq 0$,

where we assume that $d^T x > 0$ whenever $Ax + b \geq 0$.

Introducing an auxiliary variable $t$, we can rewrite the problem in the form:

minimize $t$
subject to $Ax + b \geq 0, \quad \dfrac{(c^T x)^2}{d^T x} \leq t$

In this formulation, the objective is a linear function of the two variables $(x, t)$.

The first constraint can be rewritten as

$$\mathbf{diag}(Ax + b) \succeq 0,$$

where $\mathbf{diag}(Ax+b)$ denotes the square matrix whose diagonal entries are the components of the vector $Ax + b$.

The second constraint can be written as

$$t\,d^T x - (c^T x)^2 \geq 0.$$

Define the matrix $D$ as follows:

$$D = \begin{bmatrix} t & c^T x \\ c^T x & d^T x \end{bmatrix}$$

We can use the theory of Schur complements to show that the second constraint is equivalent to

$$D \succeq 0$$ [1]
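
Concretely, since $d^T x > 0$ on the feasible set, the $2 \times 2$ Schur complement criterion reads

$$\begin{bmatrix} t & c^T x \\ c^T x & d^T x \end{bmatrix} \succeq 0 \quad \Longleftrightarrow \quad t - \frac{(c^T x)^2}{d^T x} \geq 0,$$

which is exactly the constraint introduced above.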

The semidefinite program for this problem is then

minimize $t$
subject to
$$\begin{bmatrix} \mathbf{diag}(Ax+b) & 0 & 0 \\ 0 & t & c^T x \\ 0 & c^T x & d^T x \end{bmatrix} \succeq 0$$
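
A CVXPY sketch of this formulation (the data A, b, c, d below are hypothetical illustrative values; the feasible set $x \geq 1$ componentwise guarantees $d^T x > 0$, as the example assumes):

    import cvxpy as cp
    import numpy as np

    A = np.eye(2); b = -np.ones(2)            # encodes x >= 1 componentwise
    c = np.array([1.0, -1.0]); d = np.array([1.0, 0.5])

    x = cp.Variable(2)
    t = cp.Variable()
    Z = cp.Variable((2, 2), symmetric=True)   # plays the role of the block D

    constraints = [
        A @ x + b >= 0,   # original linear constraints
        Z >> 0,           # D >= 0, i.e. (c^T x)^2 / (d^T x) <= t
        Z[0, 0] == t,
        Z[0, 1] == c @ x,
        Z[1, 1] == d @ x,
    ]
    cp.Problem(cp.Minimize(t), constraints).solve()
    print(t.value)  # approximately 0, attained where c^T x = 0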

Example 3 (the Goemans–Williamson MAX CUT approximation algorithm)

Semidefinite programming is an important tool for designing approximation algorithms for NP-hard maximization problems. The first approximation algorithm based on an SDP is due to Michel Goemans and David Williamson [2]. They studied the MAX CUT problem: given a graph G = (V, E), split the vertices of V into two parts so as to maximize the number of edges connecting the two parts. The problem can be stated as an integer quadratic program:

Maximize $\sum_{(i,j) \in E} \dfrac{1 - v_i v_j}{2}$ subject to $v_i \in \{1, -1\}$ for every $i$.

Unless P = NP, we cannot solve this maximization problem efficiently. However, Goemans and Williamson proposed a general three-step procedure for attacking problems of this kind:

  1. Relax the integer quadratic program to an SDP.
  2. Solve the SDP (to within any arbitrarily small additive error $\epsilon$).
  3. Round the SDP solution to obtain an approximate solution of the original integer quadratic program.

For MAX CUT, the most natural relaxation is

$$\max \sum_{(i,j) \in E} \frac{1 - \langle v_i, v_j \rangle}{2} \quad \text{subject to} \quad \lVert v_i \rVert^2 = 1,$$

where the maximization is over vectors $\{v_i\}$ rather than scalar integer variables.

This is an SDP problem, since both the objective function and the constraints are linear functions of scalar products of vectors. Solving the SDP yields a set of unit vectors in $\mathbb{R}^n$; since the vectors are not required to be collinear, the value of the relaxed problem can only be greater than or equal to the value of the original integer quadratic program. A final rounding procedure is needed to obtain a partition. Goemans and Williamson choose a uniformly random hyperplane through the origin and split the vertices according to which side of it they lie on. A direct analysis shows that this procedure achieves an expected approximation ratio of 0.87856 − ε. (The expected value of the cut is the sum over all edges of the probability that the edge enters the cut, and this probability is proportional to the angle $\cos^{-1}\langle v_i, v_j \rangle$ between the vectors at the endpoints of the edge. Comparing this probability with $(1 - \langle v_i, v_j \rangle)/2$, the ratio of the two is always at least 0.87856.) Assuming the unique games conjecture, it can be shown that this approximation ratio is essentially optimal.
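
A compact sketch of the relax-and-round pipeline (CVXPY and NumPy as above; the 5-cycle is an arbitrary test graph, and the vector recovery reuses the factorization idea from earlier):

    import cvxpy as cp
    import numpy as np

    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]  # a 5-cycle
    n = 5

    # Steps 1-2: solve the relaxation max sum (1 - X_ij)/2 with X_ii = 1.
    X = cp.Variable((n, n), symmetric=True)
    objective = cp.Maximize(sum((1 - X[i, j]) / 2 for i, j in edges))
    cp.Problem(objective, [X >> 0, cp.diag(X) == 1]).solve()

    # Recover unit vectors v_i with X_ij = <v_i, v_j>.
    w, Q = np.linalg.eigh(X.value)
    vecs = Q @ np.diag(np.sqrt(np.clip(w, 0, None)))  # rows are the v_i

    # Step 3: random-hyperplane rounding.
    rng = np.random.default_rng(0)
    side = np.sign(vecs @ rng.standard_normal(n))     # +1 / -1 per vertex
    cut = sum(1 for i, j in edges if side[i] != side[j])
    print(cut)  # size of the cut found (at most 4 on a 5-cycle)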

Since Goemans and Williamson, SDPs have been applied to the design of a large number of approximation algorithms. More recently, Prasad Raghavendra developed a general framework for constraint satisfaction problems based on the unique games conjecture [3].

Algorithms

There are several types of algorithms for solving SDPs. These algorithms output the value of the SDP up to an additive error $\epsilon$ in time that is polynomial in the problem size and in $\log(1/\epsilon)$.

Interior point methods

Most solvers are based on the interior point method (CSDP, SeDuMi, SDPT3, DSDP, SDPA), which is robust and efficient for general-purpose linear SDPs. This approach is limited by the fact that these are second-order methods, which must store and factorize large (and often dense) matrices.

First-order methods

First-order methods for conic optimization avoid storing and factorizing the large Hessian matrices and scale to problems much larger than those tractable by interior point methods, at some cost in accuracy. A first-order method is implemented in the SCS solver.
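
In CVXPY, for instance, the backend solver can be selected per solve, so the same model can be handed either to a first-order solver or to an interior point solver (SCS typically installs alongside CVXPY; CVXOPT is one interior point option and must be installed separately):

    import cvxpy as cp

    X = cp.Variable((2, 2), symmetric=True)
    problem = cp.Problem(cp.Minimize(cp.trace(X)),
                         [X >> 0, X[0, 1] == 1])
    problem.solve(solver=cp.SCS)       # first-order splitting method
    # problem.solve(solver=cp.CVXOPT)  # interior point, if installed
    print(problem.value)  # 2.0: the optimum is X = [[1, 1], [1, 1]]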

Bundle method

The SDP is formulated as a nonsmooth optimization problem and solved by the spectral bundle method. This approach is very efficient for special classes of linear SDPs.

Others

Algorithms based on the augmented Lagrangian method (PENSDP) are similar in behavior to interior point methods and can be adapted to some very large problems. Other algorithms exploit low-rank structure and reformulate the SDP as a nonlinear programming problem (SDPLR).

Applications

Semidefinite programming has been used to find approximate solutions of combinatorial optimization problems, such as the maximum cut problem with an approximation ratio of 0.87856. SDPs are also used in geometry to determine tensegrity graphs, and arise in control theory as linear matrix inequalities.

Notes

  1. ↑ Boyd, Vandenberghe, 1996.
  2. ↑ Goemans, Williamson, 1995.
  3. ↑ Raghavendra, 2008, pp. 245–254.

Literature

  • Lieven Vandenberghe, Stephen Boyd. Semidefinite Programming // SIAM Review 38. - 1996. - March. - P. 49–95.
  • Monique Laurent, Franz Rendl. Semidefinite Programming and Integer Programming / Report PNA-R0210, CWI, Amsterdam. - 2002. - April.
  • E. de Klerk. Aspects of Semidefinite Programming: Interior Point Algorithms and Selected Applications. - Kluwer Academic Publishers, 2002. - ISBN 1-4020-0547-4.
  • P. Raghavendra. Optimal algorithms and inapproximability results for every CSP? // Proceedings of the 40th Annual ACM Symposium on Theory of Computing (Victoria, British Columbia, Canada, May 17–20, 2008). STOC '08. - New York, NY: ACM, 2008. - P. 245–254.
  • Robert M. Freund. Introduction to Semidefinite Programming (SDP).
  • Michel X. Goemans, David P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming // JACM. - 1995. - November (Vol. 42, issue 6). - P. 1115–1145. - DOI: 10.1145/227683.227684.

Links

  • Links to introductions and events in the field
  • Lecture notes from László Lovász on Semidefinite Programming

