SciPy (pronounced "sigh pie") is a NumPy-based math package that also relies on C and Fortran libraries. With SciPy, an interactive Python session turns into a full-fledged processing environment comparable to MATLAB, IDL, Octave, R, or SciLab.
In this article we will look at the basic techniques of mathematical programming: solving constrained optimization problems for a scalar function of several variables with the scipy.optimize package. Algorithms for unconstrained optimization were covered in the previous article. More detailed and up-to-date help on scipy functions can always be obtained with the help() command, Shift+Tab, or in the official documentation.
A common interface for solving both constrained and unconstrained optimization problems in the scipy.optimize package is provided by the minimize() function. It is known, however, that there is no universal way to solve all problems, so choosing an adequate method, as always, falls on the shoulders of the researcher.
The appropriate optimization algorithm is specified with the argument minimize(..., method="").
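For instance, a minimal sketch of selecting an algorithm by name (the objective and starting point here are made up purely for illustration):

import numpy as np
from scipy.optimize import minimize

# a toy quadratic objective, just to show how the method is chosen
f = lambda x: (x[0] - 1)**2 + (x[1] + 2)**2

x0 = np.array([0.0, 0.0])
res = minimize(f, x0, method="SLSQP")  # the method argument selects the algorithm
print(res.x)  # expected to be close to [1, -2]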
For constrained optimization of a function of several variables, implementations of the following methods are available:
trust-constr - search for a local minimum in a trust region. Wiki article, Habr article;
SLSQP - Sequential Least Squares Programming, sequential quadratic programming with constraints; a Newton-type method for solving the Lagrangian system. Wiki article;
TNC - Truncated Newton Constrained, a limited number of iterations, good for nonlinear functions with a large number of independent variables. Wiki article;
L-BFGS-B - a method from the Broyden-Fletcher-Goldfarb-Shanno four, implemented with reduced memory consumption thanks to partial loading of vectors from the Hessian matrix. Wiki article, Habr article;
COBYLA - Constrained Optimization By Linear Approximation, a derivative-free method. Wiki article.
Depending on the method chosen, the conditions and constraints for the problem are specified differently:
a Bounds object for the L-BFGS-B, TNC, SLSQP, and trust-constr methods;
a list of (min, max) pairs for the same methods L-BFGS-B, TNC, SLSQP, and trust-constr;
LinearConstraint and NonlinearConstraint objects for COBYLA, SLSQP, and trust-constr;
a dictionary {'type': str, 'fun': callable, 'jac': callable (optional), 'args': sequence (optional)} for the COBYLA and SLSQP methods.
Article layout:
1) Consider the constrained optimization algorithm in a trust region (method="trust-constr") with constraints specified as Bounds, LinearConstraint, and NonlinearConstraint objects;
2) Consider sequential least squares programming (method="SLSQP") with constraints specified as dictionaries of the form {'type', 'fun', 'jac', 'args'};
3) Work through an example of optimizing output products using a web studio as an example.
The implementation of the trust-constr method is based on EQSQP for problems with equality constraints and on TRIP for problems with inequality constraints. Both are local minimum search algorithms in a trust region and are well suited for large-scale problems.
The general mathematical formulation of the minimum search problem is:

min f(x), x in R^n,
subject to: c^l <= c(x) <= c^u,
            x^l <= x <= x^u.

For strict equality constraints, the lower bound is set equal to the upper one, c^l_j = c^u_j. For a one-sided constraint, the upper or lower bound is set to np.inf with the appropriate sign (c^u_j = np.inf or c^l_j = -np.inf).
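A small sketch of how such bounds look in code (the three variables here are hypothetical and exist only for illustration):

import numpy as np
from scipy.optimize import Bounds

# x0 is fixed at 1 (lower bound equals upper bound - an equality constraint),
# x1 is bounded only from below (the upper bound is +inf),
# x2 is bounded only from above (the lower bound is -inf)
bounds_demo = Bounds([1.0, 0.0, -np.inf], [1.0, np.inf, 5.0])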
Suppose we need to find the minimum of the well-known Rosenbrock function of two variables:

f(x0, x1) = 100*(x1 - x0^2)^2 + (1 - x0)^2
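For reference, this function can be written by hand or taken ready-made from scipy.optimize as rosen (a quick check, just to make the example concrete):

import numpy as np
from scipy.optimize import rosen

def rosenbrock2(x):
    # two-variable Rosenbrock function
    return 100.0*(x[1] - x[0]**2)**2 + (1.0 - x[0])**2

x = np.array([1.0, 1.0])           # the unconstrained minimum
print(rosenbrock2(x), rosen(x))    # both print 0.0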
The following constraints are imposed on its domain:

x0 + 2*x1 <= 1
x0^2 + x1 <= 1
x0^2 - x1 <= 1
2*x0 + x1 = 1
0 <= x0 <= 1
-0.5 <= x1 <= 2.0

In our case there is a unique solution, x = [0.4149, 0.1701], at which only the first and fourth constraints are active.
Let's go through the constraints from the bottom up and see how they can be written in scipy.
The constraints 0 <= x0 <= 1 and -0.5 <= x1 <= 2.0 are defined with a Bounds object.
from scipy.optimize import Bounds

bounds = Bounds([0, -0.5], [1.0, 2.0])
The constraints x0 + 2*x1 <= 1 and 2*x0 + x1 = 1 are written in linear (matrix) form:

[-inf, 1]^T <= [[1, 2], [2, 1]] . [x0, x1]^T <= [1, 1]^T

We define these constraints as a LinearConstraint object:
import numpy as np
from scipy.optimize import LinearConstraint

linear_constraint = LinearConstraint([[1, 2], [2, 1]], [-np.inf, 1], [1, 1])
Finally, the nonlinear constraint in matrix form:

[-inf, -inf]^T <= [x0^2 + x1, x0^2 - x1]^T <= [1, 1]^T

For this constraint we define the Jacobian matrix

J(x) = [[2*x0, 1], [2*x0, -1]]

and a linear combination of the Hessians of its components with an arbitrary vector v:

H(x, v) = v0*[[2, 0], [0, 0]] + v1*[[2, 0], [0, 0]]

Now we can define the nonlinear constraint as a NonlinearConstraint object:
from scipy.optimize import NonlinearConstraint

def cons_f(x):
    return [x[0]**2 + x[1], x[0]**2 - x[1]]

def cons_J(x):
    return [[2*x[0], 1], [2*x[0], -1]]

def cons_H(x, v):
    return v[0]*np.array([[2, 0], [0, 0]]) + v[1]*np.array([[2, 0], [0, 0]])

nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 1, jac=cons_J, hess=cons_H)
If the problem size is large, the matrices can also be specified in sparse form:
from scipy.sparse import csc_matrix

def cons_H_sparse(x, v):
    return v[0]*csc_matrix([[2, 0], [0, 0]]) + v[1]*csc_matrix([[2, 0], [0, 0]])

nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 1, jac=cons_J, hess=cons_H_sparse)
or as a LinearOperator object:
from scipy.sparse.linalg import LinearOperator

def cons_H_linear_operator(x, v):
    def matvec(p):
        return np.array([p[0]*2*(v[0]+v[1]), 0])
    return LinearOperator((2, 2), matvec=matvec)

nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 1, jac=cons_J, hess=cons_H_linear_operator)
When computing the Hessian matrix is expensive, you can use the HessianUpdateStrategy class. The BFGS and SR1 strategies are available.
from scipy.optimize import BFGS

nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 1, jac=cons_J, hess=BFGS())
The Hessian can also be approximated with finite differences:
nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 1, jac=cons_J, hess='2-point')
The Jacobian matrix of the constraints can also be computed with finite differences. In that case, however, the Hessian matrix can no longer be computed with finite differences: it must be defined as a function or with the HessianUpdateStrategy class.
nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 1, jac='2-point', hess=BFGS())
The solution of the optimization problem is as follows:
from scipy.optimize import minimize
from scipy.optimize import rosen, rosen_der, rosen_hess, rosen_hess_prod

x0 = np.array([0.5, 0])
res = minimize(rosen, x0, method='trust-constr', jac=rosen_der, hess=rosen_hess,
               constraints=[linear_constraint, nonlinear_constraint],
               options={'verbose': 1}, bounds=bounds)
print(res.x)
`gtol` termination condition is satisfied. Number of iterations: 12, function evaluations: 8, CG iterations: 7, optimality: 2.99e-09, constraint violation: 1.11e-16, execution time: 0.033 s. [0.41494531 0.17010937]
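A quick sanity check of the result (a sketch reusing the functions defined above): the equality constraint 2*x0 + x1 = 1 should hold at the found point, and both nonlinear constraint values should not exceed 1:

print(2*res.x[0] + res.x[1])  # ~1.0, the equality constraint is satisfied
print(cons_f(res.x))          # both values are <= 1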
If necessary, the function that computes the Hessian can be defined using the LinearOperator class:
def rosen_hess_linop(x):
    def matvec(p):
        return rosen_hess_prod(x, p)
    return LinearOperator((2, 2), matvec=matvec)

res = minimize(rosen, x0, method='trust-constr', jac=rosen_der, hess=rosen_hess_linop,
               constraints=[linear_constraint, nonlinear_constraint],
               options={'verbose': 1}, bounds=bounds)
print(res.x)
or the product of the Hessian with an arbitrary vector can be passed through the hessp parameter:
res = minimize(rosen, x0, method='trust-constr', jac=rosen_der, hessp=rosen_hess_prod,
               constraints=[linear_constraint, nonlinear_constraint],
               options={'verbose': 1}, bounds=bounds)
print(res.x)
Alternatively, the first and second derivatives of the function being optimized can be approximated. For example, the Hessian can be approximated with the SR1 class (a quasi-Newton approximation), and the gradient with finite differences.
from scipy.optimize import SR1

res = minimize(rosen, x0, method='trust-constr', jac="2-point", hess=SR1(),
               constraints=[linear_constraint, nonlinear_constraint],
               options={'verbose': 1}, bounds=bounds)
print(res.x)
The SLSQP method is designed to solve minimization problems of the form:

min F(x)
subject to: c_j(x) = 0,  j in E
            c_j(x) >= 0, j in I
            lb_i <= x_i <= ub_i, i = 1, ..., N

where E and I are the sets of indices of the expressions describing the equality and inequality constraints, and lb and ub are the lower and upper bounds on the variables' domain.
Linear and nonlinear constraints are described as dictionaries with the keys type, fun, and jac.
ineq_cons = {'type': 'ineq',
             'fun': lambda x: np.array([1 - x[0] - 2*x[1],
                                        1 - x[0]**2 - x[1],
                                        1 - x[0]**2 + x[1]]),
             'jac': lambda x: np.array([[-1.0, -2.0],
                                        [-2*x[0], -1.0],
                                        [-2*x[0], 1.0]])}

eq_cons = {'type': 'eq',
           'fun': lambda x: np.array([2*x[0] + x[1] - 1]),
           'jac': lambda x: np.array([2.0, 1.0])}
The search for the minimum is as follows:
x0 = np.array([0.5, 0])
res = minimize(rosen, x0, method='SLSQP', jac=rosen_der,
               constraints=[eq_cons, ineq_cons],
               options={'ftol': 1e-9, 'disp': True}, bounds=bounds)
print(res.x)
Optimization terminated successfully.    (Exit mode 0)
            Current function value: 0.34271757499419825
            Iterations: 4
            Function evaluations: 5
            Gradient evaluations: 4
[0.41494475 0.1701105 ]
In connection with the transition to the fifth technological paradigm, let us consider production optimization using the example of a web studio that brings us a small but steady income. Imagine yourself the director of a galley that produces three types of products: landing pages, medium-sized sites, and online stores.
Our friendly working crew consists of four juniors, two middles, and one senior. Their monthly working-time fund is:
4 * 150 = 600 person-hours for the juniors,
2 * 150 = 300 person-hours for the middles,
150 person-hours for the senior.
Suppose the first available junior spends (10, 20, 30) hours on developing and deploying one site of type (x0, x1, x2) respectively, a middle spends (7, 15, 20) hours, and the senior spends (5, 10, 15) of the best hours of his life.
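For clarity, this data can be gathered into a time matrix and a vector of monthly hour budgets (a small sketch; the variable names are made up for illustration and reused in a check further below):

import numpy as np

# hours needed to produce one product of type (x0, x1, x2)
# by a junior, a middle and the senior respectively
time_per_site = np.array([[10, 20, 30],
                          [ 7, 15, 20],
                          [ 5, 10, 15]])

# monthly working-time funds: juniors, middles, senior
hours_budget = np.array([600, 300, 150])

# the working-time constraint then reads: time_per_site @ x <= hours_budget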
Like any normal director, we want to maximize the monthly profit. The first step to success is to write down the objective function value as the sum of the income from the products produced during the month:
def value(x):
    return - 10*x[0] - 20*x[1] - 30*x[2]
This is not a mistake: when searching for a maximum, the objective function is minimized with the opposite sign.
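A tiny illustration of this sign trick on a made-up one-dimensional function (the maximum value itself is recovered by flipping the sign back):

from scipy.optimize import minimize

f_max = lambda x: -(x[0] - 3)**2              # the function we actually want to maximize
res_demo = minimize(lambda x: -f_max(x), x0=[0.0])
print(res_demo.x, -res_demo.fun)              # argmax ~3, maximum value ~0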
The next step: we forbid overtime for our employees and impose constraints on the working-time fund:

10*x0 + 20*x1 + 30*x2 <= 600
7*x0 + 15*x1 + 20*x2 <= 300
5*x0 + 10*x1 + 15*x2 <= 150

which is equivalent to:
ineq_cons = {'type': 'ineq',
             'fun': lambda x: np.array([600 - 10*x[0] - 20*x[1] - 30*x[2],
                                        300 - 7*x[0] - 15*x[1] - 20*x[2],
                                        150 - 5*x[0] - 10*x[1] - 15*x[2]])}
A formal constraint: the output must be non-negative:
bnds = Bounds([0, 0, 0], [np.inf, np.inf, np.inf])
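As a side note (a sketch of an equivalent form), the same bounds could be passed as a list of (min, max) pairs, with None meaning "no bound":

bnds_alt = [(0, None), (0, None), (0, None)]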
And finally, the most optimistic assumption: thanks to the low prices and high quality, a queue of satisfied customers is constantly lining up for us, so we can choose the monthly production volumes ourselves by solving the constrained optimization problem with scipy.optimize:
x0 = np.array([10, 10, 10])
res = minimize(value, x0, method='SLSQP', constraints=ineq_cons, bounds=bnds)
print(res.x)
[7.85714286 5.71428571 3.57142857]
We round the untidy fractional values to integers (rounding the last one down so as to stay within the time budget) and compute the monthly load on the rowers at the optimal output x = (8, 6, 3):
8*10 + 6*20 + 3*30 = 290 person-hours out of 600 for the juniors;
8*7 + 6*15 + 3*20 = 206 person-hours out of 300 for the middles;
8*5 + 6*10 + 3*15 = 145 person-hours out of 150 for the senior.
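This arithmetic is easy to double-check in a couple of lines (a sketch reusing the hypothetical time matrix defined earlier):

x_int = np.array([8, 6, 3])
print(time_per_site @ x_int)  # [290 206 145] - hours used by the juniors, middles and senior
print(-value(x_int))          # the profit (in conventional units) for the rounded plan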
Conclusion: to earn his well-deserved maximum, the director should have 8 landing pages, 6 medium-sized sites, and 3 stores made per month. The senior will have to plow away without looking up from the machine, the middles will be loaded at roughly 2/3, and the juniors at less than half.
This article covers the basic techniques of working with the scipy.optimize package for solving constrained minimization problems. Personally, I use scipy purely for academic purposes, which is why the example given is so tongue-in-cheek.
Plenty of theory and classic examples can be found, for example, in I. L. Akulich's book "Mathematical Programming in Examples and Problems". A more hardcore application of scipy.optimize to building a 3D structure from a set of images (article on Habr) can be found in the scipy-cookbook.
The main source of information is docs.scipy.org; those wishing to contribute to the translation of this and other sections of the scipy documentation are welcome on GitHub.
Thank you mephistopheies for participating in the preparation of the publication.
Source: https://habr.com/ru/post/448054/