Assignment task

The problem of the best distribution of a certain number of works between the same number of performers. When solving it, an optimal assignment is sought from the condition of maximum overall performance, which is equal to the sum of the performers' performance. The most effective method of solving it is the Hungarian method. The assignment problem has many interpretations: the distribution of work between mechanisms, the distribution of targets between fire means to maximize the mathematical expectation of the number of targets hit or average damage, etc.

1. Task (wording).

We will be simpler (in terms). =)
')
Imagine a task: you need to effectively implement some relatively large project. For the project was selected a team of first-class developers, versed in various fields. And so lucky that there is information (accumulated statistics) about the degree (efficiency) of technology ownership for everyone. But, unfortunately, there are not so many developers in the team, or rather, as many as there will be technologies used in the project.

Necessary: the most effective way to use the largest number of technologies, assigning each developer their own task. those. optimally distribute among them the areas of responsibility for technology, taking into account their personal abilities.

2. Approach to the solution

In the end, all words can be reduced to three words: people, product, and profits © . If we try to depict the task set for us and the task on a piece of paper, then we will certainly get a similar illustration, as in the figure (with such pictures ).

It turned out - the graph . Vertices on the left are developers, vertices on the right are technologies (tasks). The ribs that connect them - mean how much the developer understands it. These numbers, i.e. The degree of ownership of the developer of this technology is very important, but we turn to them a little later. In the meantime, we have already correctly identified the direction in which this task is being effectively solved:

3. Counts

The very basics of the graphs were presented in the article ( max cost ), so I will immediately turn to terminology relating to this task:

A bipartite graph is a graph for which there is a partition of the set of vertices into two parts (parts) such that the ends of each edge belong to different parts. There is also a clear separation in our task: some vertices are developers, others are tasks, and connections (efficiency of ownership) are only between developers and tasks.
A mapping of an undirected graph G is a subset M of its edges E, chosen so that no two edges of M are adjacent, that is, do not have a common vertex. In terms of our task, synonym for this would be “assignment” , so that each involved developer takes on a separate task. And it did not happen that two developers are working on one problem, or one “poor man” was responsible for two tasks.

In graph theory, our problem, oddly enough, is called the Assignment Task (MF) . =) It is a special case of the problem of finding the maximum matching. In fact, we are trying to maximize the use of resources so that the maximum number of technologies is worked out at the same time, so in terms of graphs we are trying to find the “maximum matching”, to make up the maximum number of developer-task pairs.

4. Maximum matching

To simplify our lives, we are not paying attention to people's abilities. Just want everyone to find a job. Several first-hand developers to offer to work with the familiar technology will not be a problem. Continuing in the same vein, we will distribute several more tasks, but the matching constructed in such a way is unlikely to be maximal. The situation is possible as the one shown in the figure:

How to increase matchings (appointments)?

Select an untapped developer who has not yet been assigned a task. Let's see what he could handle, i.e. familiar to him technology. If you find a free one among them, great, this is what we were looking for. Assign. And if the task is already "busy" by another developer? Let's try a busy developer to find another free technology, because in this case we would assign this to our unused ward. In general, from the top of an untapped developer or developer to whom we are trying to reassign a task, we scan all the familiar technologies for the presence of a free one:

if you find a free one, the search is complete
if the task has already been assigned to someone, then having passed along this edge of the matching, we will try to “reassign” the technology to the developer participating in this assignment

In the course of such a traversal of the graph, we are trying to get into the "free task" from the "untapped developer". Thus, the search "unfolds" in the following tree:

Add some more terminology from graph theory, in simple words:

An exposed vertex is a vertex that does not participate in the current matching. Those. either "untapped developer" or "free task".
An alternating chain is a chain whose edges alternately lie or do not lie in a matching. (... - technology ownership - assigned task - technology ownership - assigned task - ...)
An alternating tree - a tree of alternating chains
An augmental chain is such an alternating chain, the initial and final vertices of which are exposed. This is what we are looking for! =)
An augmental tree is a tree, respectively, in which at least one of the branches is an augmental chain.

So we found a way to increase matching, trying to get the maximum. You need to build an altera-tree. When it becomes augmental, look for augmental chains from the “untapped developer” to the “free task” and then “reassign” the tasks along them. It is beneficial because increases the number of "tasks in processing" by 1:

Now we can already use the most technologies in the project. It's time to take into account another important condition of the problem we are facing: the efficiency of technology ownership. We want to optimally assign tasks to all.

5. Hungarian method.

Find a solution with the maximum total efficiency (cost). It sounds, in a sense, like the task of optimal packing of a backpack. Makes me think. Now, if we had the opportunity to act on the principle of "greedy algorithms."

We would have to start all the developers to stop assigned tasks in accordance with their maximum abilities. If all the developers managed to distribute the tasks right away - great. But this does not happen often. Suddenly two people optimally possess the same technology, who will get it and what to do in this situation? One of the developers will need to find another free task, also the most appropriate to his abilities. If under the current (maximum requirements) conditions there is no free problem, then we will need to try to find among the tasks, having a little underestimated our requirements. How to artificially underestimate the ability of developers in the graph. If under such conditions we find a free problem, we obtain an augmental tree. "Change" in the matching chain, after which it will be +1. And we will continue to appoint in such an optimal way until we find a job for everyone.

The strategy is clear.

We almost guessed the principle of the Hungarian algorithm. But how can we build a solution based on the principle of “greedy algorithms”: assign to max abilities according to max abilities, then slightly lower the capabilities and add new tasks to the analysis, assign them to the limit, underestimate ... etc.? How to assess the ability and optimality of the current assignment?

The whole "trick" of this algorithm is as follows. We are given only one parameter in the graph - the effectiveness of the solution of a specific task by the developer, which is indicated on the edges. This value is assigned to the developer-task pairs. We "divide" (separate from pairs) these values by two. Artificially add two additional parameters. Some values will be assigned to task tops, others to developer tops.

I will try to give this interpretation:

for developers, we indicate their “abilities” , let's say in units of “forces”, simply indicating how effectively we can use them or use them.
for tasks, we will indicate their “knowledge” , or, if I may say so, “oversupply”. This parameter will also be measured in the "force". An overabundance of attention to the problem arises in the following situation. If we are a developer “underloaded”, i.e. he is able to solve a problem on 5, and he only got on 3. Then he still has 2 “forces” that he, in principle, can devote to some of his familiar tasks. To run between cabinets, advise on the phone, give advice to those who are engaged in his beloved technology.

Thus, the values indicated on the edges we “divide” into 2 values assigned to the vertices: the efficiency of the solution of the problem = the developer's ability + the knowledge of the problem. In principle, it is logical. The more capable the developer or the more well-known the technology, the better this part will be implemented in the project. More effective.

In the end, after we find a solution, we will of course consider only the values on the edges, but now this "trick" will help us find a solution. =)

6. Description of the algorithm

Initialize the graph. Being “ stubborn optimists ”, for each developer we will calculate his maximum “ability” according to the technologies familiar to him, and assign this number to him. He is the best suited © . Nothing is known about the tasks yet, therefore we initialize their “knowledge” with zeros.

When searching for a “free task” for an “untapped developer”, we now confine ourselves only to (we call them) the optimal edges of the graph, i.e. those for which the equality is fulfilled: the efficiency of the solution of the problem (edge) = the ability of the developer (vertex) + knowledge of the problem (vertex) .

Further, we do the same as in the search for the maximum matching. We seize unused developers in turn and, looking for free tasks for them, we build an alternating tree (consisting of alternating chains), but only along the optimal edges. Then there are 2 possible situations:

It was possible to find a free task. The tree became augmental. "Reassign" tasks, increasing the matching. We begin to build the alternating tree again, because you never know how the graph has changed
We did not find (did not reach) the free problem on optimal edges. And she is, because After all, we started with a free developer, and in the graph we have the same number of tasks and developers. The resulting alternating tree becomes the so-called Hungarian (the whole method is also called). In this case, we will need to slightly lower our requirements for developers and start the search again. Failure is to intelligently (again) this time more intelligently (c) .

That came to the last moment of the Hungarian method for which all these additional parameters and "abilities" were conceived. Suppose that by building on the alternating tree, we ended up with a Hungarian tree. Consider which vertices fall into this tree:

Uninvolved developers, because it is with them that we begin to cost a tree
Developers and tasks to which you can reach the optimal edges from unused developers. Because it is through their “reassignment” that we are trying to employ the latter.

Outside this tree, in the remaining column will be present:

Developers and tasks that are in the matching, but inaccessible from free vertices (developers). Found them work - there is nothing to touch them.
Tasks that are unattainable by optimal edges - and we need to get to them. Therefore, when building a tree, we will mark the peaks that we managed to hit. And these tasks, accordingly, will remain unmarked.

Further in the cycle, we will run along the “border” of the tree: along those edges that connect unaffected developers or developers that are reachable from them (maybe they can be “reassigned”) to adjacent tasks (along non-optimal edges). For these edges, we calculate the currently minimal “mismatch” of the developer’s abilities so that he can proceed to this task: delta = min (developer’s ability (vertex) + task knowledge (vertex) - efficiency of problem solution (edge)) .

Then inside the Hungarian tree we:

Reduce the ability of developers on delta to “join” in the least painless way, at least one edge to the alternating tree, which next time we will continue to search for a free task
Let us increase the “knowledge” of the tasks on delta, so that the edges of the already constructed augmental graph are optimal. Those. to maintain equality: the efficiency of solving the problem (edge) = the ability of the developer (vertex) + knowledge of the problem (vertex)

Mini-interpretation: we reduce the ability of developers to subsequently "attach" at least one of them. We will attach it, but it will not work in accordance with their qualifications. He could have more. Therefore, he has released some amount of time to advise colleagues on the task in which he is most competent. She becomes more studied in the team. She, in turn, probably was engaged in another developer, who is now also able to be replaced in case of anything. You can also reduce its competence on the study of the problem. And so on, “along the chain” in the team, the “knowledge” of the tasks increases and the developers' abilities to find some purpose for them are slightly reduced.

Everything. All steps of this method are considered. We continue in the same vein ... It is the courage to continue that counts .

7. Algorithm words, very briefly.

Now let's collect everything to the heap:

Initialization. To developers - max abilities. Tasks - not studied.
Not all developers have found the tasks yet.
- So far it is possible to construct an augmental tree (to find free problems) along optimal edges.
  - "Reassign" tasks, increasing matching
- Not achieved free task. Hungarian tree.
  - We lower developers abilities by min

8. Listing

The code, of course, will be shorter than all my description. =)

I took it here . In my opinion, very good implementation. The only difference is that the author has a code for the method of minimizing appointments (if, for example, the salary is on the ribs), and in the article we distributed tasks in order to obtain maximum efficiency. Therefore, slightly modifying the code, here is the implementation of the maximum method:

int n; vector < vector< int > > a; // a[][] vector< int > xy, yx; // : xy[], yx[] vector< char > vx, vy; // vx[], vy[] vector< int > maxrow, mincol; // , bool dotry ( int i) { if (vx[i]) return false ; vx[i] = true ; for ( int j=0; j<n; ++j) if (a[i][j]-maxrow[i]-mincol[j] == 0) vy[j] = true ; for ( int j=0; j<n; ++j) if (a[i][j]-maxrow[i]-mincol[j] == 0 && yx[j] == -1) { xy[i] = j; yx[j] = i; return true ; } for ( int j=0; j<n; ++j) if (a[i][j]-maxrow[i]-mincol[j] == 0 && dotry (yx[j])) { xy[i] = j; yx[j] = i; return true ; } return false ; } int main() { // ... a ... mincol.assign (n, 0); minrow.assign (n, 0); for ( int i=0; i<n; ++i) for ( int j=0; j<n; ++j) maxrow[i] = max (maxrow[i], a[i][j]); xy.assign (n, -1); yx.assign (n, -1); for ( int c=0; c<n; ) { vx.assign (n, 0); vy.assign (n, 0); int k = 0; for ( int i=0; i<n; ++i) if (xy[i] == -1 && dotry (i)) ++k; c += k; if (k == 0) { int z = INF; for ( int i=0; i<n; ++i) if (vx[i]) for ( int j=0; j<n; ++j) if (!vy[j]) z = min (z, maxrow[i]+mincol[j]-a[i][j]); for ( int i=0; i<n; ++i) { if (vx[i]) maxrow[i] -= z; if (vy[i]) mincol[i] += z; } } } int ans = 0; for ( int i=0; i<n; ++i) ans += a[i][xy[i]]; printf ( "%d\n" , ans); for ( int i=0; i<n; ++i) printf ( "%d " , xy[i]+1); } * This source code was highlighted with Source Code Highlighter .

9. Total

If someone sees Hungarian for the first time. And after reading the description, and the listing behind it, there will be a sure impression “yes here on the listing and without these descriptions everything is clear that it was a rattle”. I would still hope that at least in part the description added understanding to the operation of the algorithm. I will be sincerely happy for you! and in turn, it will give me a little sense that I wrote, probably, for good reason. =)

Source: https://habr.com/ru/post/63982/

All Articles