
A little about conic duality

When studying theoretical courses in machine learning, mathematical economics, optimization, finance, and so on, one often encounters the concept of a "dual problem".

Dual problems are often used to obtain lower (or upper) bounds on the objective functional in optimization problems. Moreover, for almost any meaningful formulation of an optimization problem, the dual problem has a meaningful interpretation. That is, if you are faced with an important optimization problem, its dual is most likely also important.

In this article I will talk about conic duality. This way of constructing dual problems is, in my opinion, undeservedly overlooked...

Math ahead...

How are dual problems usually constructed?


Let some optimization problem be given:

\min_{x \in \mathbb{R}^n} f(x)
f_i(x) \leq 0, \quad 1 \leq i \leq k
h_i(x) = 0, \quad 1 \leq i \leq m



The dual problem is constructed as follows:


L(x, \lambda, \mu) = f(x) + \sum_{i=1}^{k} \lambda_i f_i(x) + \sum_{i=1}^{m} \mu_i h_i(x)



g(\lambda, \mu) = \inf_x L(x, \lambda, \mu)



\max_{\lambda, \mu} g(\lambda, \mu)
\lambda \geq 0
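The scheme above can be checked numerically on a one-dimensional toy problem (my own illustration, not from the article): minimize x^2 subject to 1 - x <= 0. Here the inner minimization can be done by hand, and a grid search over the multiplier shows the dual optimum matching the primal one.

```python
import numpy as np

# Toy problem:  min x^2  s.t.  1 - x <= 0
# Lagrangian:   L(x, lam) = x^2 + lam * (1 - x)
# Inner step:   dL/dx = 2x - lam = 0  =>  x(lam) = lam / 2
def g(lam):
    x = lam / 2.0                      # argmin of L over x
    return x**2 + lam * (1.0 - x)      # g(lam) = lam - lam^2 / 4

# Outer step: maximize g over lam >= 0 (grid search for illustration).
lams = np.linspace(0.0, 4.0, 4001)
dual_opt = max(g(l) for l in lams)     # attained at lam = 2

primal_opt = 1.0                       # f(x*) = 1 at x* = 1
print(abs(dual_opt - primal_opt) < 1e-6)  # True: zero duality gap here
```

Since the problem is convex and strictly feasible, strong duality holds and the two optimal values coincide.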



The main difficulty in this scheme is hidden in the step of computing \inf_x L(x, \lambda, \mu).

If the problem is not convex, the situation is hopeless: in general it cannot be solved in polynomial time (assuming P \neq NP), and we will not discuss such problems further in this article.

Suppose that the problem is convex, then what?

If the problem is smooth, we can use the first-order optimality condition \nabla_x L(x, \lambda, \mu) = 0. From this condition, if everything works out, one can derive either x(\lambda, \mu) = \arg\min_x L(x, \lambda, \mu) together with g(\lambda, \mu) = L(x(\lambda, \mu), \lambda, \mu), or the function g(\lambda, \mu) directly.
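As a sketch of this smooth recipe (my own example, not from the article), take the equality-constrained quadratic min (1/2)||x||^2 s.t. Ax = b. The first-order condition gives x in closed form, and maximizing g recovers the minimum-norm solution of the linear system:

```python
import numpy as np

# L(x, mu) = (1/2) x^T x + mu^T (A x - b)
# grad_x L = x + A^T mu = 0        =>  x(mu) = -A^T mu
# g(mu) = -(1/2) mu^T A A^T mu - b^T mu; maximizing:  A A^T mu = -b
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))
b = rng.standard_normal(3)

mu = np.linalg.solve(A @ A.T, -b)   # stationary point of g
x_star = -A.T @ mu                  # primal solution recovered from the dual

# The result is the minimum-norm solution of Ax = b:
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(A @ x_star, b), np.allclose(x_star, x_lstsq))
```

Both checks pass: the recovered point is feasible and coincides with the least-squares minimum-norm solution.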

If the problem is not smooth, we could use the analogous first-order condition 0 \in \partial_x L(x, \lambda, \mu) (here \partial_x L(x, \lambda, \mu) denotes the subdifferential of L(x, \lambda, \mu) with respect to x), but this procedure is usually much more involved.

Sometimes there is an equivalent "smooth" optimization problem, and one can construct the dual for it instead. But the price for improving the structure (from non-smooth to smooth) is, as a rule, an increase in dimension.

Conic duality


Quite a lot of optimization problems (examples below) admit the following form:

\min_{x \in \mathbb{R}^n} c^T x
Ax + b \in K


where A is a matrix, b is a vector, and K is a nondegenerate convex cone.
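The simplest instance of this form is K = \mathbb{R}_+^k, which turns the conic program into an ordinary linear program min c^T x s.t. Ax >= b. A quick numerical check of strong duality on a hand-picked toy LP (my own instance; the dual is written in the standard LP convention max b^T y, A^T y = c, y >= 0):

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([2.0, 1.0])
A = np.array([[1.0, 1.0],
              [1.0, 0.0]])
b = np.array([3.0, 1.0])

# Primal: linprog minimizes with A_ub @ x <= b_ub, so flip signs of Ax >= b.
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(None, None)] * 2)

# Dual: max b^T y  =>  min -b^T y,  A^T y = c,  y in R_+^k (= K^* here).
dual = linprog(-b, A_eq=A.T, b_eq=c, bounds=[(0, None)] * 2)

print(primal.fun, -dual.fun)  # both equal 4: the duality gap is zero
```

For this cone the dual cone is the cone itself, which is why the familiar sign constraint y >= 0 appears in the LP dual.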

In this case, the dual problem is constructed according to the following scheme:


L(x, \lambda) = c^T x + \lambda^T (Ax + b)



g(\lambda) = \inf_x L(x, \lambda) = \begin{cases} \lambda^T b, & c + A^T \lambda = 0 \\ -\infty, & c + A^T \lambda \neq 0 \end{cases}



\max_{\lambda} b^T \lambda
c + A^T \lambda = 0
\lambda \in K^*


where the dual cone K^* of the cone K is defined as K^* = \{ y \in \mathbb{R}^k \mid z^T y \geq 0 \quad \forall z \in K \}.
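The definition of the dual cone can be sanity-checked numerically by sampling (an illustration of mine, not from the article). A useful fact used later: the dual of the l1-norm cone K_1 = {(x, t): ||x||_1 <= t} is the l-infinity-norm cone {(y, s): ||y||_inf <= s}.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_K1(n, m):
    """m random points (x, t) with ||x||_1 <= t, i.e. points of K_1."""
    x = rng.standard_normal((m, n))
    t = np.abs(x).sum(axis=1) + rng.random(m)  # t >= ||x||_1
    return np.hstack([x, t[:, None]])

Z = sample_K1(3, 100_000)

y_in  = np.array([0.5, -0.5, 0.3, 1.0])   # ||y||_inf = 0.5 <= s = 1.0
y_out = np.array([2.0,  0.0, 0.0, 1.0])   # ||y||_inf = 2.0 >  s = 1.0

# y is in K_1^* iff z^T y >= 0 for every z in K_1.
print((Z @ y_in).min() >= 0)    # True: no sampled witness against y_in
print((Z @ y_out).min() >= 0)   # False: some z in K_1 has z^T y_out < 0
```

Sampling only refutes membership, it cannot prove it, but it is a handy way to catch errors when working out a dual cone by hand.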

As we can see, all the difficulty of constructing the dual problem has been shifted to constructing the dual cone. The good news is that there is a well-developed calculus for dual cones, and very often the dual cone can be written down immediately.

Example


Suppose we need to construct the dual of the following optimization problem:

\min_{x \in \mathbb{R}^n} \|x\|_2 + \|x\|_1
Ax \geq b



Here \|x\|_1 = \sum_{i=1}^{n} |x_i| and \|x\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}.

The first thing to notice: the objective function can always be made linear!

More precisely, there is always an equivalent problem with a linear objective function:

\min_{x \in \mathbb{R}^n,\, y \in \mathbb{R},\, z \in \mathbb{R}} y + z
\|x\|_2 \leq y, \quad \|x\|_1 \leq z, \quad Ax \geq b



Now we need a bit of secret knowledge: the sets

K_1 = \{ (x, t) \in \mathbb{R}^n \times \mathbb{R} \mid \|x\|_1 \leq t \}

and

K_2 = \{ (x, t) \in \mathbb{R}^n \times \mathbb{R} \mid \|x\|_2 \leq t \}

are convex cones.
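This cone property follows from the triangle inequality and the homogeneity of norms; a quick randomized spot check (my own illustration) confirms that K_2 is closed under nonnegative combinations:

```python
import numpy as np

rng = np.random.default_rng(2)

def in_K2(z, eps=1e-9):
    """Membership test for K_2 = {(x, t): ||x||_2 <= t}."""
    return np.linalg.norm(z[:-1]) <= z[-1] + eps

ok = True
for _ in range(10_000):
    x1, x2 = rng.standard_normal((2, 4))
    z1 = np.append(x1, np.linalg.norm(x1) + rng.random())  # point of K_2
    z2 = np.append(x2, np.linalg.norm(x2) + rng.random())  # point of K_2
    a, b = rng.random(2) * 5.0                             # a, b >= 0
    ok &= in_K2(a * z1 + b * z2)
print(ok)  # True: nonnegative combinations stay in the cone
```

The same check works verbatim for K_1 by swapping the norm.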

Thus, we arrive at an equivalent formulation of the problem:

\min_{x \in \mathbb{R}^n,\, y \in \mathbb{R},\, z \in \mathbb{R}} y + z
I_{n+1} \begin{pmatrix} x \\ y \end{pmatrix} + 0_{n+1} \in K_2
I_{n+1} \begin{pmatrix} x \\ z \end{pmatrix} + 0_{n+1} \in K_1
Ax - b \in \mathbb{R}_+^k



Now we can immediately write down the dual problem:

\max_{\lambda, \mu, \nu} b^T \nu
\lambda_i + \mu_i + [A^T \nu]_i = 0, \quad 1 \leq i \leq n
\lambda_{n+1} + 1 = 0
\mu_{n+1} + 1 = 0
\lambda \in K_2^* \;(= K_2)
\mu \in K_1^* \;(= K_\infty)
\nu \in \mathbb{R}_+^k


or, simplifying a little,

\max_{\lambda, \mu, \nu} b^T \nu
\lambda + \mu + A^T \nu = 0
\|\lambda\|_2 \leq 1
\|\mu\|_\infty \leq 1
\nu \in \mathbb{R}_+^k



where \|\mu\|_\infty = \max_i |\mu_i|.
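Without any solver, we can still verify weak duality for this pair numerically: any dual-feasible point gives a lower bound on any primal-feasible value. Below is a sketch with random data and hand-constructed feasible points (all choices here, including setting lambda = 0 and scaling nu, are my own illustration):

```python
import numpy as np

# primal: min ||x||_2 + ||x||_1  s.t.  Ax >= b
# dual:   max b^T nu  s.t.  lambda + mu + A^T nu = 0,
#         ||lambda||_2 <= 1, ||mu||_inf <= 1, nu >= 0
rng = np.random.default_rng(3)
A = rng.standard_normal((4, 6))
b = rng.standard_normal(4)

# A primal-feasible point: min-norm solution of Ax = b (so Ax >= b holds).
x, *_ = np.linalg.lstsq(A, b, rcond=None)
primal_val = np.linalg.norm(x) + np.abs(x).sum()

# A dual-feasible point: take lambda = 0, mu = -A^T nu, and shrink
# nu = t * nu0 so that ||mu||_inf <= 1.
nu0 = rng.random(4)                       # nu0 >= 0
t = 1.0 / np.abs(A.T @ nu0).max()
nu = t * nu0
dual_val = b @ nu

print(dual_val <= primal_val + 1e-9)  # True: every dual value is a lower bound
```

Closing the gap completely would require actually solving both problems, e.g. with a conic solver, but the bound itself already illustrates what the dual buys us.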

Links for further study:

Source: https://habr.com/ru/post/431756/

