NOMAD: Optimizing Black Boxes at Home

The article will discuss a convenient tool that my colleagues and I often use in our practice. This tool is called NOMAD . This package is designed to optimize functionals of different complexity, mainly - difficult to compute, functionals with a gradient that is inaccessible for some reason, noisy, etc.

The optimized (minimized) functionality is considered as a “black box”; a separately implemented script or program (when working in batch-mode) or a specially implemented C ++ class (when working in library-mode) are responsible for calculating it. Optimization is carried out using the MADS algorithm

Sources are shipped under the LGPL license, versions for Unix, Linux, Mac OS X and Windows are available, registration is not required for downloading, but you need to fill out a small form (name, organization, city, country).
')
NOMAD: A blackbox optimization software

Why do you need it

The basic scenario is as follows. You are developing a vundervaflu, which makes a feasible contribution for the benefit of all. In the course of work, cool super-modern methods and algorithms are used, which are configured through numerical parameters that are passed to your configs (if they simply do not glow in the code in the form of magic numbers). The system has a certain quality indicator, which is calculated as if it is unknown, and now it is at around 98%, and you have a desire to reach 99.5%. I would like to have a tool that could be used to give out “twisters” and go and drink some tea, while he will raise the quality of the system a little bit higher. A good tool for solving this class of problems is NOMAD.

A special case. To solve the same problem, you use two different algorithms. The task implies some classification or assessment of the input object (for example, recognition), at the output you get a vector of assessment values (for example, the output vector of alternatives for a neural network). The first algorithm works better on one type of source data, the second algorithm works better on the other, and for some reason you cannot clearly distinguish between types of input data. A possible way out is to use both algorithms, and “averaging” the results of a-posteriori. The only question is with what coefficients (weights) to averaging the results. To determine the optimal averaging coefficients, you can use NOMAD, if you assign as a functional, say, the total number of errors or a certain total penalty on a given sample of input data.

How to use it

Consider a simple example of using NOMAD in batch mode. The task is this: you want to build a simple function that divides objects from a set of X into two classes: A and B. After conducting a small statistical study, you found that there are signs of f1 , f2 and f3 , indicating that object x belongs to some class. f1 , f2 and f3 are functions of an object whose value is a real number. We will look for the classifying function in the form C (x) = a1 * f1 (x) + a2 * f2 (x) + a3 * f3 (x) + b , where a1 , a2 , a3 are real numbers, and b is an integer from - 1 to 1 . If C (x)> = 0 , then we say that x belongs to A , otherwise x belongs to B. The catch is that if we make a mistake and define object x to set B , although it is actually from A , then this is certainly unpleasant, but not fatal, but if we defined x in A , but we had to in B , then 100 times worse.

There are several clever ways to solve this problem, we, for example, construct an estimate functional and find the coefficients using NOMAD.

Let there is a training base containing such data: the true class (where we should define
x) and the values of attributes f1 (x) , f2 (x) , f3 (x) . The base is such a text file: (say, base.data)

A 1.0 3.0 4.5555 B 2.3 2.3 0.0 B 2.4 2.5 9.0 ...

We define the functional in the following way: we scan the database, if an error has occurred, then we add the value of this error to the value of the functional. The cost of the error “it is necessary A , and we said B ” is equal to one , the cost of the error “it is necessary B , and we said A ” is equal to 100 .

We write a simple program that takes as its sole argument the environment file with coefficients a1 , a2 , a3 , b and outputting the value of the functional to the standard output stream.

 #!/usr/bin/python #  evaluator.py import sys #     def C(f, a): return a[3] + sum([x * y for x, y in zip(f, a[:3])]) #      def penalty(correct, f, a): answer = 'A' if C(f, a) >= 0.0 else 'B' #    -   if answer == correct: return 0.0 #   ,    elif correct == 'A': return 1.0 #   ,    else: return 100.0 if __name__ == '__main__': F = 0.0 #    sys.argv[1] a = list(map(float, list(open(sys.argv[1], 'r'))[0].split())) #      with open('base.data', 'r') as base: for line in base: line = line.strip().split() correct = line[0] f = map(float, line[1:]) F += penalty(correct, f, a) #    print(F)

The next step is the NOMAD configuration file: (say, params.nomad)

 #    DIMENSION 4 # ,    #    BB_EXE "$python evaluator.py" #   : #       BB_INPUT_TYPE ( RRRI ) #   : #    (OBJ) #    - # (   ) BB_OUTPUT_TYPE OBJ #   X0 ( 0 0 0 0 ) #    LOWER_BOUND ( -10 -10 -10 -1 ) UPPER_BOUND ( 10 10 10 1 ) #     MAX_BB_EVAL 1000 #   TMP_DIR /tmp

Run NOMAD:

 kbulatov@node ~> ./NOMAD.3.6.0/bin/nomad params.nomad NOMAD - version 3.6.0 - www.gerad.ca/nomad Copyright (C) 2001-2013 { Mark A. Abramson - The Boeing Company Charles Audet - Ecole Polytechnique de Montreal Gilles Couture - Ecole Polytechnique de Montreal John E. Dennis, Jr. - Rice University Sebastien Le Digabel - Ecole Polytechnique de Montreal Christophe Tribes - Ecole Polytechnique de Montreal } Funded in part by AFOSR and Exxon Mobil. License : '$NOMAD_HOME/src/lgpl.txt' User guide: '$NOMAD_HOME/doc/user_guide.pdf' Examples : '$NOMAD_HOME/examples' Tools : '$NOMAD_HOME/tools' Please report bugs to nomad@gerad.ca MADS run { BBE OBJ 1 41100.0000000000 5 27814.0000000000 12 22459.0000000000 14 5070.0000000000 36 4853.0000000000 44 4828.0000000000 49 4720.0000000000 58 4657.0000000000 78 4583.0000000000 93 4514.0000000000 106 4509.0000000000 115 4495.0000000000 117 4494.0000000000 118 4484.0000000000 119 4453.0000000000 133 4379.0000000000 145 4376.0000000000 153 4217.0000000000 156 4158.0000000000 177 4034.0000000000 181 3982.0000000000 184 3942.0000000000 216 3918.0000000000 237 3905.0000000000 262 3903.0000000000 458 3903.0000000000 } end of run (mesh size reached NOMAD precision) blackbox evaluations : 458 best feasible solution : ( 1.300140381 0.6046962738 -0.9088993073 -1 ) h=0 f=3903

In the last line - the desired coefficients.

Additional features

The configuration file has the ability to specify a huge number of additional parameters of the optimization algorithm, this review describes only the main ones.

It is also possible to use NOMAD as a static C ++ library (library mode). In this case, you need to write a class that calculates the functional, something like:

 class MyEvaluator: public NOMAD::Evaluator { public: MyEvaluator(NOMAD::Parameters const& p) : NOMAD::Evaluator(p) {} ~MyEvaluator() {} bool eval_x(NOMAD::Eval_Point& x, NOMAD::Double const& h_max, bool& count_eval) const { ///   ///  false,  -    ... count_eval = true; return true; } }

Afterword

The example turned out to be slightly strained, for a more efficient search it was necessary of course to use the library mode. In addition, the functionality in this case is the sum of the scaled signal functions, it was possible to significantly simplify the work of the algorithm, if these signal functions were approximated by some kind of sigmoid. The functional, at least, would turn out continuous. But the goal - to illustrate - has been achieved.

I would be glad to hear about other tools that allow you to perform something like this, if someone knows and has experience of successful application.

Source: https://habr.com/ru/post/181828/

All Articles

NOMAD: Optimizing Black Boxes at Home

Why do you need it

How to use it

Additional features

Afterword

More articles: