On the market, a cow man sold (computational solution)

As I wrote , recently, when analyzing the structure of one specific service market, a rather entertaining task emerged, the solution of which at that time was not at all obvious to me, and, as the experience of the previous publication showed, it was not at all obvious to many readers here. This decision I owe a lot to your comments and advice. For me, it seems quite instructive that the task, at first glance, from the theory of mass service found a solution by invoking a field of knowledge that is far from it, thus showing the unity of all branches of mathematics.

First of all, I would like to note that the reasoning given below, although they give correct recommendations for action with sufficient accuracy for practice, is also conceptually correct, unfortunately, does not pretend to the simplicity of a qualitative description of characteristic types of strategies. You can compare the proposed strategy search method with the solution of a quadratic equation by the Newton method: the result is correct, and the nature of the solution is hidden.

To begin, let's try to find out if there is any way to act on the market without loss for yourself. The only reason for damages is the need to bear the cost of maintaining the cows. It is clear that the seller applies the most economical strategy, when he has no more than one cow in the stall, and only after selling it, he orders the next one. Applying this approach, you need to sell the available cow to the first buyer, belonging to any class, profitable from the point of view of the Man. It remains to decide which classes recognize profitable. For example, take two classes of customers: let the representatives of the first appear on average 3 times per hour and give 1 rubles, and the representatives of the second - 2 times and give 7. It’s easy to realize that by selling only the first class, you can earn 3 steering wheels for each idle hour , only to the second class of buyers - 14 rubles, and if you sell to any representative of these classes, then 14 + 3 rubles for every hour of paid cow idleness. Indeed, on average, 3 + 2 customers will arrive per idle hour, while on average 3 of them will be from the first class, and 2 from the second. From these considerations, we can conclude that a profitable strategy will be optimal in terms of profit when a cow is sold to any buyer who gives a non-negative premium to the purchase price. If the average profit per hour is less than the cost of feeding, it hardly makes sense to remain on the market.

Let's now move on to a discrete in time analog. To get a good approximation, we divide the time axis into so small equal intervals so that their size does not exceed 1/10 of the time for delivery of cows from the village and the probability will appear for at least one buyer, no matter what class, for such a period did not exceed all that same 1/10. It also suggests a condition that somehow should limit the size of the discretization error of spending on relatively expensive food, as, for example, in the case of a very rich and rarely appearing buyer and a constant opportunity if you want to sell a cow very quickly at the purchase price. However, the condition for selecting such a small period that 1/10 of the buyer appears on average during the course of time limits the speed of reaction to an extra cow, making a further decrease in the time interval impractical.
')
The next step in the construction of the analogy is the transition from independent Poisson flows to dependent discrete flows of customers. Let in a continuous model of probability appear for an infinitely small period of time the buyers of each category were treated as a: b: ...: h, and the probability appears at least one buyer in the time interval obtained during splitting was equal to p, then in the discrete model at the beginning of each step we will carry out a one-time Bernul test for “Success” with probability p, and in the case of “Success” there will be some test with many incompatible outcomes: one for each class of customers and the same ratio of probabilities between outcomes. Due to the fact that in a continuous model, the probability of two buyers appearing at once during one split period compared to the probability of occurrence exactly one was extremely low, the discrete model thus constructed, even despite the relationship between the streams, will give a good approximation.

It is time to describe the very process of action in the discrete market. Let the delivery time T be divided into r intervals. The following is the order of actions within each split interval, hereinafter referred to as rounds or moves. At the beginning of each tour, the Man receives the cows ordered for this time and pays for the feed of all the livestock he has, then the tests described above determine whether the buyer has come and what is his potential profit. At the end of the tour, the Mujik will have to perform two actions: to sell the cow, or to refuse the deal and order the required number of cows, which will be delivered to him exactly in r moves. On this move is completed and a new one begins.

In the described scheme of market functioning, the state of the Guy at the time of the decision is fully described.

1) the number of cows in his stall,
2) the distribution of all the cows he ordered in the nearest r moves,
3) the presence and class of the buyer on the current turn.

I must emphasize here that in 2) it is important not only the total number of cows in the order, but also their distribution on the nearest moves, for which the order is no longer possible.

Hypothetically, the Man has a countable number of possible states, but it seems quite obvious that there are so many M cows, to have an order in excess of which is not beneficial, regardless of the distribution of their delivery on r nearest moves. As a result, the number of states that actually participate in the game with any optimal strategy turns out to be finite.

Another almost obvious statement, used below, is this: let's expand the class of admissible strategies, allowing the peasant to act also based on random experiments, like tossing a coin. Such strategies in game theory are called mixed and make it possible for the opponent not to give an opportunity, having studied the style of your game well enough, to make their strategy more effective. It seems plausible enough that since in the described experiment there is no opponent, then replacing in any optimal strategy all the results of throwing a coin into an "eagle", you will not degrade the strategy itself.

I turn, finally, to the description of the algorithm. We draw on paper all the admissible states of the Man at the beginning of the tour, when the cows have already arrived, and at the end before making a decision with the total number of available and ordered cows not exceeding M. Our goal will be to define between these states the directed edges with slyly chosen costs so that To solve the problem, it was possible to apply a search algorithm for a closed stream of unit power (the stream entering each vertex is equal to the stream coming from it, the sum over all vertices of the intensities of the incoming streams is 1) . So, consider some state at the beginning of the turn. From it we draw the directed edges to the states possible at its end. Each such edge corresponds to the class of the incoming client or their complete absence. Let us attribute these ribs of weight equal to their probabilities and cost equal to the value of the cost of feeding. Now consider the state corresponding to the moment before making a decision on the sale and order. For each pair (order quantity, sell / no), we draw an arrow into the state to which these actions will lead. We will not attribute a fixed weight to these arrows, and we will put their cost equal to the value of the order price minus the value of the order value. For most practical applications, it seems that you can order no more than one cow at a time.

We reduce the problem to the analysis of Markov chains. In this approach, the choice of strategy corresponds to the assignment at each vertex corresponding to the state at the end of the turn to the probabilistic weight of all edges emanating from it. Finding the limit probability distribution (in nondegenerate cases, all the vertices are reachable and non-periodic), we define the limit probability flow and by it the power of income or loss. The strategy will be optimal if it maximizes the power of income.

The search for marginal flow and profit maximization should be solved simultaneously using the linear programming method, which in terms of average computational complexity in experiments is significantly better than the brute force method, also applicable to solving a discrete problem, at least theoretically.

From the comments I understood that the last paragraph requires clarification. If at some point there is a probability distribution in a particular state of a Markov chain, then the product of the probability of being in a given state by the probability of a transition to some other specific state is, by definition, the value of the probability flow at the specified transition. If a chain has a limit stationary distribution, then it is reasonable to call the corresponding probabilistic flow limit. The naturalness of these definitions can be seen by observing, for example, the movement of bees between trees when an apiary is displayed in the garden. After the sun rises, the proportion of bees on each tree will quickly come to almost constant values, exactly like the magnitudes of the flows from one tree to another.

Now, as to the application of the method of solving a linear programming problem: if the probabilistic flow on the constructed state graph is given by its magnitudes on each edge, then the only constraints in the form of inequalities that are imposed are that they are not less than zero. As for the linear connections, they are as follows:

1) the total incoming flow to any vertex corresponding to the state at the beginning of the turn is redistributed along the outgoing edges corresponding to the probability of a customer’s arrival of a given class, proportional to the probabilities for this class or the same as indicated in the text above — the edge weight;
2) flows along edges emanating from vertices indicating states at the end of a turn can be chosen by any, as long as their sum is equal to the sum of the flows entering their peak;
3) the sum of all components of the stream is equal to 1.

Above was given the method of assigning all the edges of their values. The scalar product of the flux vector and cost vector is used as the objective function of the linear programming problem.

Source: https://habr.com/ru/post/324632/

All Articles

On the market, a cow man sold (computational solution)

More articles: