Our warehouse covers the area of two Red Squares, stands five floors high, and runs around the clock, 364 days a year (the only day off is January 1). It stores and handles more than 8,000,000 items, and every day more than 300 operators move them around. They process goods arriving from all over the world and assemble orders for customers in four countries: Russia, Ukraine, Belarus and Kazakhstan. At this scale, the business needs flawless automation.
Below the cut, I, Pasha Finkelstein, team lead of warehouse development and automation, will tell you what an open source solution can grow into when you apply a good development team and a very specific business problem to it.

The basic logic of the work
Any warehouse has three main processes: receiving goods, storing them and shipping them. Simplified, the cycle of our warehouse looks like this: primary identification, quality control, placement, selection and reservation for an order, picking, sorting, packaging, and handover to the delivery service. When a customer returns a product, the cycle repeats. Each physical entity involved in these processes has its own informational representation: truck, product, storage cell, parcel, packaging material, container, and so on. All significant movements and status changes of the goods are passed to the accounting systems, and absolutely every action with goods inside the warehouse is logged.
WMS (Warehouse Management System) controls the life cycle of each item in stock from the moment a truck arrives at the warehouse to the shipment of goods to the customer.

Automation specifics in fashion
Our company works in fashion and lifestyle, which sets particular tasks for the warehouse: goods can be fragile (glasses, watches), of non-standard size (winter boots or jewelry), premium (in special packaging), or have other specific characteristics that the warehouse must take into account. For this reason, manual labor in the storage areas cannot be abandoned completely.

All other processes are automated: receiving goods, moving them to the shipping area, sorting, packaging and preparing for shipment. Each of these processes requires special equipment and its own operational procedure. The magic happens when all these processes come together and start working as one, thanks to our systems.
Any mistake in warehouse automation, whether it is an interface that invites operator errors, a non-optimal process, or anything else, means delayed shipments, downtime for the entire complex, and huge losses. In addition, every mistake creates a negative customer experience. That is why it is important for us that the warehouse runs like clockwork.
Open source and the path to in-house development
When the company launched, we used an external warehouse. As volumes grew, we realized that we needed complete control over operational processes and the ability to change those processes quickly, so we decided to move to our own warehouse and our own development.

The main question we then faced was working out the operational processes in full detail, down to where and how employees walk, how many scans they make, and so on. On top of these processes we needed to deploy a WMS that manages operations and automates the routine ones.
To begin with, we took an open source solution in Java, and later decided to put together our own development team, especially since a suitable foundation was already in place. We extended the functionality, then took on the core of the system: we got rid of legacy code and the fat client, refactored, and developed new services to support operational processes.
Automation Stages
The main changes were made in "waves", along with the restructuring of the processes themselves.
To date, the warehouse has gone through nine stages of modernization, and we do not plan to stop there.
- At the first and second stages, we automated order shipment: we added conveyors and logic for sorting goods, and automated the sorting of orders onto pallets.
- At the third and fourth stages, we focused on receiving: we learned to split the flows of incoming goods by type and by storage zone.
- The fifth stage added automated elevators between floors, and work began in the storage area.
- The sixth stage was the most critical: we linked the receiving and shipping zones, closing the loop across all the automation.
- At the seventh and eighth stages, we changed the processes in the receiving zone and added new zones, elevators and conveyors, scaling the existing automation.
- At the ninth stage, a new building was attached to the warehouse and integrated with the existing automation system.
Implementation
Our core technologies: Java, PostgreSQL, WildFly, Redis, ActiveMQ.
The WMS is written in Java 8. We recently fixed the last module that was blocking the move to Java 11, so we will upgrade in the near future.
The WMS has a dedicated server rack installed right in the warehouse. This gives us much more confidence that the WMS will keep working even if the power and/or the internet connection goes down. The only thing that suffers is that messages reach the accounting system with a delay. We use WildFly as the application server, though not yet the latest version. Migration to the latest version is also planned. Everything for the move has already been written, but we have not had time to run functional and load testing, and the load before the New Year is relatively high. We also use the proven ActiveMQ.

We store the data in PostgreSQL. The main entity in our system is, obviously, the product. Warehouse staff sometimes come up with workarounds to simplify their work. For example, they scan the same barcode 50 times and simply toss the actual items across by hand without scanning them, not caring whether they are jeans or T-shirts. So we introduced labels that identify each specific unit of a product and supported them throughout the infrastructure. Information about these units is stored in a 2-terabyte PostgreSQL database.
Most of the space there is taken up not by the goods themselves but by the audit trail of warehouse workers' actions. As a business-critical system, the warehouse must know why something appeared in the system or disappeared from it; we cannot allow untraceable changes. Right now we are considering moving this part of the database into a separate store in MongoDB.
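To illustrate why unit-level labels close the "scan one barcode 50 times" loophole, here is a minimal sketch. The class name `UnitScanTask` and the label format are hypothetical, not taken from our WMS; the only idea shown is that a label identifying a specific unit can be accepted at most once per task.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: a task that accepts each unit label only once. */
final class UnitScanTask {
    // Labels of the specific units already scanned within this task.
    private final Set<String> scannedUnitLabels = new HashSet<>();

    /** Returns true if the label is new for this task, false if it is a repeat scan. */
    boolean scan(String unitLabel) {
        return scannedUnitLabels.add(unitLabel);
    }

    /** Number of distinct physical units accepted so far. */
    int acceptedCount() {
        return scannedUnitLabels.size();
    }
}
```

With a product-level barcode, 50 scans of one item look identical to 50 different items. With a unit-level label, the second scan of the same label is simply rejected.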
Warehouse workers' workstations are thin web clients. At the start of automation everything ran as a fat client, which caused certain difficulties, in particular with large releases that included interface changes: about 150 workstations had to be updated manually. This, plus the fact that we could not release without downtime, constrained us: we could deploy no more than twice a week, early in the morning when the night shift was finishing, which is hardly a convenient schedule. We have now moved the WMS to the web, and by the end of the year we will finally retire the fat clients, which will make it much easier to change the user interface. The web client and the clustering added at one of the stages remove the restrictions on release frequency and timing: users will now notice a release only if something goes wrong.

There is also some interesting exotica in our warehouse. For example, Haskell, mentioned in the Technoradar article, is used for the visualization backend of the item sorter (a machine that brings together the goods belonging to one parcel and hands them to the operator for packing). It is a purely computational problem that is conveniently solved in a functional style. Naturally, no one is going to use Haskell for any large-scale projects.
Another element of the warehouse that we mentioned in the Technoradar article is a homegrown state machine that "monitors" the correct sequence of actions for each product. Like the entire system, it evolved iteratively, starting from a simple set of constraints. It is now a very handy tool, deeply integrated into our system. We hope to release it as open source in the near future; perhaps it will be useful not only to us.
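The idea of a state machine that guards the sequence of actions on each product can be sketched roughly as follows. This is not our actual implementation: the state names are assumptions loosely based on the warehouse cycle described earlier, and `ItemLifecycle` is a hypothetical class that starts, as the real one did, from a simple set of constraints.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Illustrative states; the real set of states is an assumption here. */
enum ItemState { RECEIVED, QUALITY_CHECKED, PLACED, RESERVED, PICKED, SORTED, PACKED, HANDED_TO_DELIVERY }

/** Hypothetical sketch: rejects any action that breaks the allowed sequence. */
final class ItemLifecycle {
    private static final Map<ItemState, Set<ItemState>> ALLOWED = new EnumMap<>(ItemState.class);

    static {
        allow(ItemState.RECEIVED, ItemState.QUALITY_CHECKED);
        allow(ItemState.QUALITY_CHECKED, ItemState.PLACED);
        allow(ItemState.PLACED, ItemState.RESERVED);
        allow(ItemState.RESERVED, ItemState.PICKED);
        allow(ItemState.PICKED, ItemState.SORTED);
        allow(ItemState.SORTED, ItemState.PACKED);
        allow(ItemState.PACKED, ItemState.HANDED_TO_DELIVERY);
        // A customer return starts the cycle again.
        allow(ItemState.HANDED_TO_DELIVERY, ItemState.RECEIVED);
    }

    private static void allow(ItemState from, ItemState to) {
        ALLOWED.computeIfAbsent(from, s -> EnumSet.noneOf(ItemState.class)).add(to);
    }

    private ItemState state = ItemState.RECEIVED;

    ItemState state() {
        return state;
    }

    /** Moves the item to the next state or throws if the transition is not allowed. */
    void moveTo(ItemState next) {
        Set<ItemState> allowedNext = ALLOWED.get(state);
        if (allowedNext == null || !allowedNext.contains(next)) {
            throw new IllegalStateException("Illegal transition " + state + " -> " + next);
        }
        state = next;
    }
}
```

Keeping the transitions in a single table means a new constraint is one extra `allow(...)` line, which suits an iterative style of development.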
Automation equipment
What is automation without equipment? The entire warehouse is laced with a network of conveyors.
The item sorter mentioned above works at the shipment stage and distributes tens of thousands of units picked from stock into specific orders. It freed our operators from having to push a trolley across the whole warehouse to collect the goods for an order. Orders are now split up: each operator picks goods only on their own floor (saving time on walking), and the sorter makes sure that products from different floors end up in the right orders automatically. This change to the operational process made order assembly four times faster and significantly reduced the number of errors.
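The split-then-merge logic described above can be shown with a small sketch. The names `OrderLine` and `PickPlanner` are hypothetical, and the real sorter is physical equipment driven by our partner's control system; the code only models the data flow: order lines are split into per-floor pick lists, and after picking they are regrouped by order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** One item of a customer order and the floor where it is stored (illustrative model). */
final class OrderLine {
    final String orderId;
    final String unitLabel;
    final int floor;

    OrderLine(String orderId, String unitLabel, int floor) {
        this.orderId = orderId;
        this.unitLabel = unitLabel;
        this.floor = floor;
    }
}

final class PickPlanner {
    /** Splits order lines into per-floor pick lists, so each operator stays on one floor. */
    static Map<Integer, List<OrderLine>> splitByFloor(List<OrderLine> lines) {
        Map<Integer, List<OrderLine>> byFloor = new TreeMap<>();
        for (OrderLine line : lines) {
            byFloor.computeIfAbsent(line.floor, f -> new ArrayList<>()).add(line);
        }
        return byFloor;
    }

    /** Models the sorter's job: regroups units picked on different floors by order. */
    static Map<String, List<OrderLine>> regroupByOrder(Map<Integer, List<OrderLine>> pickedByFloor) {
        Map<String, List<OrderLine>> byOrder = new TreeMap<>();
        for (List<OrderLine> floorLines : pickedByFloor.values()) {
            for (OrderLine line : floorLines) {
                byOrder.computeIfAbsent(line.orderId, id -> new ArrayList<>()).add(line);
            }
        }
        return byOrder;
    }
}
```

An order with items on floors 1 and 3 thus produces two independent pick tasks, and the operator at the sorter exit receives the complete order without anyone having walked between floors.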
All automated equipment is supplied by our partner. They have their own system that controls the individual units; it runs in the server rack next to our WMS. The two systems are integrated over a fairly high-level protocol: we communicate using SOAP. From the operational processes inside the WMS we call their system when, for example, we need to move a container of goods from point A to point B. That is, from our system's point of view all this automation looks quite simple, despite its real internal complexity.
Of course, this apparent simplicity did not come right away. In the early stages of automation the technologies went through a "running-in" period. Once the conveyor literally burned our goods: the belt speed was too high, the goods were scorched and burned, which blocked the assembly of other orders. Perhaps the hardest story happened at the very start of automation, when we launched the first stage. Yesterday the warehouse was completely manual, and today, after flipping the switch, it was supposed to become automatic. But it did not work: because of an integration error, the two systems misinterpreted each other's messages, which cost us several days of warehouse downtime and multi-million losses.
Now the partner is present at our warehouse, plans equipment placement with us when a new round of automation comes up, and helps test new units.
Team and scrumban
A team of 12 people now develops this whole system. At one of the recent stages, at the peak of modernization, when separately automated processes had to be merged into a single whole, up to 20 developers were involved (that stage took 132 person-months and more than 1,500 commits). But as the large-scale transformations wound down, some people decided to learn Go or Python and moved to other development teams.
The team has "classic" product managers who combine product and project functions on the IT side (on average, one PM per 5-6 people). Their task is to communicate with our main customer: the warehouse, represented by its director and the operational process development department. For our part, we are more concerned with technical modernization (choosing the right stack, updates, and so on), while the people from the warehouse think about process optimization.
Sometimes we devote time to "R&D in the field" ourselves. Quite literally, we come to the warehouse, talk to shift supervisors and ordinary operators, and find out what problems they have and what is convenient or inconvenient to work with. In other words, we conduct user experience research.
Thanks to this approach we have, for example, transformed the workstation interface of the employee who receives goods. Initially it was a complex enterprise interface with many fields, buttons and abbreviations instead of text explanations. We optimized both the process and the design, making it look more like the Google search home page: not so pretty, but very functional. The simpler the interface and the fewer choices the operator has about where to click and what to scan, the fewer errors occur (and the less time is spent correcting them).
The habit of optimizing details now catches up with us at the most unexpected moments: once our team was sitting in a café, and at some point almost everyone found themselves watching the cashier's sequence of actions. After 40 seconds, a colleague voiced the common thought: "Not very optimal, this could be simplified."
Although the relationship between roles in the team is quite classic, we chose scrumban as our development methodology.
We experimented a lot with methodologies, and our starting conditions were non-standard. For example, we had rather rare releases. The limit of two releases per week mentioned above was imposed by the processes, but in practice we deployed much less often, on average once every two weeks. In addition, the hardware part of the warehouse automation is developed by an external company using pure waterfall, where all changes are specified two years in advance with all the necessary documentation. We could not follow their example: we needed to make changes to the system regularly, and it was pointless to force the customer to write a detailed specification for each of them.
So scrumban is a compromise that suits everyone. We use an iterative process, but for us the sprint is the release. Once a month we meet with the customer for release planning and discuss what we roll out in which week. Inside the sprint we run kanban, with a backlog of tasks, work in progress, and so on. That said, this process is gradually changing: for example, we no longer have kanban boards. When a developer finishes a task, they are simply given the next one from the pool, in line with the plans for the next release and the developer's own competencies.
We like this approach. It provides the necessary flexibility within iterations and gives the business customer predictable dates by which particular commitments will be delivered. What the methodology is called does not matter much to us. The main thing is that everything works.
Not like everyone else: the examples of inventory and monitoring
When developing operational processes, we started from the needs of our industry, so we have quite a few features of our own.
A good example is inventory. By law, it must be performed in a warehouse once a year, but our business requirements call for much closer control of stock. First, we want the site to show up-to-date information on the availability of goods, and second, our B2B partners, fashion brands, require the same up-to-date information. That is why our inventory runs daily, 364 days a year, shelf after shelf across the entire five-story complex of several buildings. This process is fully supported by our WMS; it would be hard to implement on an off-the-shelf solution.
Inventory is currently going through another update to make the process more efficient.

Another example of in-house design is monitoring. It is implemented as a web client and lets us display and track a wide range of metrics. It is also important for us to visualize these metrics. In effect, monitoring is the warehouse drawn as a simple graphic, where we can clearly see where everything works well and where there are problems (down to a particular operator). Most importantly, with such a view we can understand why these problems arise.

Warehouse workers' KPIs and Redis
Introducing new technologies, updating and refactoring are all great. But our WMS runs in a real business, so these are not the only tasks we have to solve. Part of our work is protection against internal "hackers": resourceful warehouse staff who invent new ways to hit their KPIs while bypassing the actual work.
For example, not long ago we had to add Redis to the stack in order to prevent users from logging in from several workstations at the same time and to implement a session timeout. Warehouse workers had figured out that working under one shared login and getting a bonus for exceeding KPIs is much more profitable than increasing their own productivity.
Since solving this business problem required changes in various parts of the system, it was a very interesting task from a technical point of view.
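The rule "one login, one workstation, with a timeout" can be sketched as follows. This is an assumption-laden illustration, not our production code: the class `SessionRegistry` is hypothetical, and an in-memory map stands in for Redis so that the example is self-contained. In Redis, a similar effect is commonly achieved with a key per login that is set only if absent and carries an expiry time.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: at most one live session per login, with an inactivity timeout. */
final class SessionRegistry {
    private static final class Session {
        final String workstationId;
        final long expiresAtMillis;

        Session(String workstationId, long expiresAtMillis) {
            this.workstationId = workstationId;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    // In-memory stand-in for a shared store such as Redis (assumption for the example).
    private final Map<String, Session> sessionsByLogin = new HashMap<>();
    private final long timeoutMillis;

    SessionRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Succeeds unless the login already has a live session on another workstation. */
    synchronized boolean login(String login, String workstationId, long nowMillis) {
        Session current = sessionsByLogin.get(login);
        boolean live = current != null && current.expiresAtMillis > nowMillis;
        if (live && !current.workstationId.equals(workstationId)) {
            return false;
        }
        sessionsByLogin.put(login, new Session(workstationId, nowMillis + timeoutMillis));
        return true;
    }

    /** Operator activity extends the session; returns false if it expired or belongs elsewhere. */
    synchronized boolean touch(String login, String workstationId, long nowMillis) {
        Session current = sessionsByLogin.get(login);
        if (current == null || current.expiresAtMillis <= nowMillis
                || !current.workstationId.equals(workstationId)) {
            return false;
        }
        sessionsByLogin.put(login, new Session(workstationId, nowMillis + timeoutMillis));
        return true;
    }
}
```

The current time is passed in explicitly so that the behavior is easy to test; a real service would take it from a clock and keep the state in a store shared by all cluster nodes.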
The surprises from the warehouse staff did not end there. Almost immediately after the session release, PostgreSQL started going down. We spent several days looking for the cause of the unexpected database degradation, until we discovered that it was, once again, a matter of resourcefulness. One operator often went out to smoke. When she left her workplace, her session was terminated, and to log back in she had to find a shift supervisor and scan their badge. To cut down on wandering around the warehouse, she simply tore the barcode off one of the carts and taped down the scanner button, so the barcode was scanned continuously. This could have gone unnoticed for a long time if the barcode had not come from a cart holding 800 units of goods. Each scan generated a huge SQL query to validate the products, and this "internal DDoS" killed the database. We had to introduce limits on the number of scans per unit of time and on the number of goods in a cart.
Quite a few such stories have accumulated, and we keep running into new ones. Each time, the system has to adapt to the new conditions. In such situations administrative measures alone are not enough: what happened once may well happen again.
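A limit on scans per unit of time can be implemented in several ways; the sketch below uses a sliding window and is only an illustration. The class `ScanRateLimiter` and its parameters are hypothetical, and the actual limits in our WMS are not shown here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: allows at most maxScans scans within any window of windowMillis. */
final class ScanRateLimiter {
    private final int maxScans;
    private final long windowMillis;
    // Timestamps of accepted scans that are still inside the current window.
    private final Deque<Long> recentScans = new ArrayDeque<>();

    ScanRateLimiter(int maxScans, long windowMillis) {
        this.maxScans = maxScans;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the scan may be processed, false if it must be rejected cheaply. */
    synchronized boolean tryScan(long nowMillis) {
        // Drop timestamps that have left the window.
        while (!recentScans.isEmpty() && nowMillis - recentScans.peekFirst() >= windowMillis) {
            recentScans.pollFirst();
        }
        if (recentScans.size() >= maxScans) {
            return false;
        }
        recentScans.addLast(nowMillis);
        return true;
    }
}
```

The point of such a check is that it runs before any expensive validation query, so a taped-down scanner button is rejected in memory and never reaches the database. One limiter instance per workstation or per session would be a natural choice, though that is a design assumption.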

Where are we going next?
Process optimization and warehouse automation, it seems, can never be finished. This work has been going on in the company for 5 years, and, as I said above, even after stage 9 we are not going to stop. The company continues to expand in both B2C and B2B, so in the near future we are planning another big project: opening another warehouse. This will require either a large-scale rewrite of the existing system or building a similar one from scratch at the new site. And this is a new, interesting challenge at the intersection of business, physical facilities, operational processes and technical solutions.