
Aeroexpress is a young company. A couple of years ago, when we started a project to upgrade its data network, the company was growing very fast. So fast that at some point its internal IT department realized it was time to rebuild the network: there were too many ticket offices and other terminals, and the manual network configuration procedures had to go. This is a logical stage in the evolution of any company. At this point the customer thought through the right architecture and began optimizing the infrastructure with a margin of capacity for further scaling. The goal was to do everything right the first time and avoid problems down the road.
The main task was to split the corporate network into two segments: one for users and cash-register terminals, the other for the cash transactions themselves. Under no circumstances should they overlap. The ideal way out here is to physically separate the networks, that is, to build two independent ones. However, that is quite expensive. So the right choice and configuration of network equipment became an almost ideal solution; in our case, these were hardware firewalls.
The next task was to make sure tickets could still be sold even if meteorites hit any two random infrastructure objects (including the Aeroexpress data center and the core switches at M9).
And also to build IP telephony inside the company that keeps working even when physically cut off from the Internet.
In parallel, we brought up Ethernet over IP (MAC over IP) and added a couple more fun and useful features.
How it was, how banks do it, and how to do it cheaper
When the design of the new network architecture started, the customer had a fairly simple topology: the main data center with shared servers and payment systems, a backup data center, office workstations, and ticketing terminals, including those on the trains. At each endpoint of the network an uplink was brought up, and it was the providers who, under contract, handled the tunneling, protection, traffic filtering and other things a corporate network is supposed to do. The customer was heavily dependent on the providers' capabilities, and the contract terms were rigidly fixed. For example, the customer could not even see equipment that was critical for it, since it belonged to the provider. So Aeroexpress could respond to a communication failure either quickly but manually, with the whole sysadmin team involved, or within the provider's SLA, which is usually at least 24 hours. And as you can probably guess, even a couple of minutes of ticketing downtime is a disaster. The customer's team did not wait for a crisis: they understood the limitations of the architecture, correctly anticipated the risks, and decided to prevent the problem before it appeared.
In such a situation a bank would connect all its main sites (here: the points at Moscow railway stations and airports, railway stations in other cities, the development center in St. Petersburg, and the data centers) with its own optical channels. Since that is not just expensive but very expensive, few besides banks, large state-owned companies and mobile operators can afford it. So the optimal solution is to create a single virtual network in which all traffic to the endpoints is wrapped in a tunnel and securely encrypted.
A virtual network with unified addressing, where a data center server sees both the developer in St. Petersburg and the terminal at Sheremetyevo as machines on its local network, lets you completely abstract away from the hardware level and switch providers right at the sites by simply changing the channel. This was probably the most expensive item in the project: equipment had to be purchased and installed at every node.
Architecture
There are two data centers: the main one at Sheremetyevo and the second at M9.
Initially, the LAN architecture at the central communications hub was a flat network with a single core switch; all the other switches were purely access switches with no redundant connections. The design called for a move to a fault-tolerant setup, so a second core switch was connected alongside the existing one. The integration work could only be done in a very short night window so as not to disrupt business systems, so a migration plan was carefully worked out and the work was carried out in several stages.
The entire IP addressing scheme changed, and all connections became redundant. For a while there were two parallel networks, and only after a two-week trial run was the network core migrated to the new equipment. In parallel with modernizing the network infrastructure at all of the company's sites, a new infrastructure was built in the data center. The information systems use Ethernet transport over IP networks, so-called "MAC over IP".
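The article does not say which specific MAC-over-IP implementation was used, so here is just a minimal sketch of the principle: a raw Ethernet frame travels as the payload of an IP/UDP datagram between two tunnel endpoints. The tunnel port, header layout and VNI value are invented for the illustration, not taken from the real setup.

```python
# Minimal sketch of the "MAC over IP" idea: an Ethernet frame is carried as
# the payload of a UDP/IP packet between two tunnel endpoints. Purely an
# illustration of the encapsulation principle; the port number and the tiny
# 8-byte tunnel header are made up for this example.
import socket
import struct

TUNNEL_PORT = 47000          # illustrative UDP port for the tunnel
VNI = 42                     # illustrative virtual network identifier

def encapsulate(eth_frame: bytes, vni: int = VNI) -> bytes:
    """Prefix the raw Ethernet frame with an 8-byte tunnel header."""
    header = struct.pack("!II", vni, len(eth_frame))
    return header + eth_frame

def decapsulate(datagram: bytes) -> tuple:
    """Strip the tunnel header and return (vni, original Ethernet frame)."""
    vni, length = struct.unpack("!II", datagram[:8])
    return vni, datagram[8:8 + length]

def send_frame(eth_frame: bytes, remote_ip: str) -> None:
    """Ship the encapsulated frame to the remote tunnel endpoint over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encapsulate(eth_frame), (remote_ip, TUNNEL_PORT))

if __name__ == "__main__":
    # A fake 14-byte Ethernet header (dst MAC, src MAC, EtherType) + payload.
    frame = bytes.fromhex("ffffffffffff" "0242ac110002" "0800") + b"payload"
    vni, recovered = decapsulate(encapsulate(frame))
    assert recovered == frame and vni == VNI
```

The point of the sketch is only that, once frames travel this way, two sites behind any IP providers look like one Layer 2 segment with common addressing, which is exactly what the unified virtual network above relies on.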
The customer approached service availability levels correctly and set priorities right away (in manufacturing, and even in banks, it is usually easy to identify one or two critical services, but after that the difficulties begin). The top priority was everything in the Cash zone: the payment systems through which ticket transactions constantly flow. A lower priority for recovery at the backup site was given to train service systems (depot, engineering subsystems) and document workflow. If those go down, engine repairs will not stop, although a lot will have to be written down on paper and entered into the IT systems later, once the network is back up.
The Aeroexpress IT service set up its own administration team. Before that, the customer could not fully administer the infrastructure, since the equipment was not its own. Now Aeroexpress sees every piece of hardware remotely, can respond to incidents, and can carry out maintenance itself.
Switching between last-mile providers is done directly on the routers in watchdog mode: they monitor channel quality and, when the boundary conditions are reached, simply switch over. In addition to the two active-active lines, a third, separate provider can be brought in as a backup (this is especially relevant for the office).
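The actual watchdog lives in the routers; as a hedged illustration of the boundary-condition logic described above, here is a small Python sketch that probes both uplinks and "switches" the active provider when a loss threshold is crossed. The provider names, probe targets, thresholds and the switch_to() stub are all assumptions made for the example.

```python
# Illustrative sketch of the router watchdog logic: measure link quality to
# each last-mile provider and switch the active uplink when a boundary
# condition (here, probe loss) is crossed. Names, IPs, thresholds and the
# probe/switch helpers are assumptions for this example only.
import subprocess
import time

PROVIDERS = ["isp-a", "isp-b"]                                   # active-active uplinks
PROBE_TARGETS = {"isp-a": "192.0.2.1", "isp-b": "198.51.100.1"}  # example probe IPs
MAX_LOSS_PCT = 20                                                # switch above 20% loss
PROBES = 5

def probe_loss(target_ip: str) -> float:
    """Return packet-loss percentage for a short burst of pings (Linux ping)."""
    lost = 0
    for _ in range(PROBES):
        rc = subprocess.call(
            ["ping", "-c", "1", "-W", "1", target_ip],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        )
        lost += rc != 0
    return 100.0 * lost / PROBES

def switch_to(provider: str) -> None:
    """Stub: a real router would change the default route / routing policy here."""
    print(f"switching active uplink to {provider}")

def watchdog() -> None:
    active = PROVIDERS[0]
    while True:
        if probe_loss(PROBE_TARGETS[active]) > MAX_LOSS_PCT:
            standby = [p for p in PROVIDERS if p != active][0]
            if probe_loss(PROBE_TARGETS[standby]) <= MAX_LOSS_PCT:
                switch_to(standby)
                active = standby
        time.sleep(10)
```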
This is how it looks from a security point of view:
On the left are the two internal zones created on the firewalls: Inside (the common zone for internal services) and Cash (the specially protected zone of the cash segment). The Cash zone includes the cashiers' workstations at the stations and airports as well as the ticket machines. Interaction between Cash and the other zones is subject to the most severe restrictions and the strictest rules for scanning traffic for malware. The Inside zone covers the company's other services: telephony, video surveillance, device management and the general internal user segment, plus guest Wi-Fi in some places. These segments are split into VLANs, which are terminated on the Palo Alto equipment. The equipment in the Inside zone consists of switches, telephones, video cameras, access points and ordinary users' computers.
Before this, let me remind you, users and systems all had the same rights. There were no roles or access zones, so well-crafted malware planted in, say, a ticket sales terminal at the Kursk railway station could, in theory, take down the entire system. Now everything is properly isolated.
Access from the inside to the outside goes through the Outside, Internet and VPN zones (shown on the right). Outside is the external WAN segment, where the routers are connected. Internet is the Internet; the provider's last mile is connected there. VPN is the zone for VPN connections to the data center; it was used as a backup for the WAN channel. If the WAN channel dropped, traffic went over the VPN. At some sites it was the only way to reach corporate resources (at one of the points in Vladivostok there was no WAN at all).
At the traffic-processing level, applications and flows are filtered. For users, the Palo Alto hardware pulls rights from their AD group, and for the subsystems there are rule sets defining what is allowed and what is not.
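The real policy lives on the Palo Alto firewalls; purely as an illustration of the zone model and the "AD group plus allowed applications" rule idea, here is a sketch of a rule table and its evaluation. The zone names follow the article, while the groups, applications and rules are invented for the example.

```python
# Illustration of the zone model and rule evaluation described above. The zone
# names (Cash, Inside, Outside, Internet, VPN) come from the article; the
# concrete rules, AD groups and applications are invented for the example.
from dataclasses import dataclass

ZONES = {"Cash", "Inside", "Outside", "Internet", "VPN"}

@dataclass(frozen=True)
class Rule:
    src_zone: str
    dst_zone: str
    ad_groups: frozenset      # AD groups the rule applies to
    applications: frozenset   # applications the rule allows
    action: str               # "allow" or "deny"

RULES = [
    # Cashier workstations may only talk to the payment application in Cash.
    Rule("Cash", "Cash", frozenset({"cashiers"}), frozenset({"payment-app"}), "allow"),
    # Office users may browse the web through the Internet zone.
    Rule("Inside", "Internet", frozenset({"office-users"}), frozenset({"web-browsing"}), "allow"),
]

def evaluate(src_zone: str, dst_zone: str, ad_group: str, application: str) -> str:
    """Return the action of the first matching rule, or deny by default."""
    for rule in RULES:
        if (rule.src_zone == src_zone and rule.dst_zone == dst_zone
                and ad_group in rule.ad_groups
                and application in rule.applications):
            return rule.action
    return "deny"   # implicit default deny, matching the strict-isolation approach

assert evaluate("Cash", "Cash", "cashiers", "payment-app") == "allow"
assert evaluate("Inside", "Cash", "office-users", "payment-app") == "deny"
```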
A sandbox was set up on the same hardware to control what users download or receive by mail. First comes signature analysis, then the sandbox, and a couple of minutes later the message reaches the end user. In the office and the data center this is done with redundancy; at the ticket offices there is no fault-tolerant cluster, but there is a bypass.
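To make the two-stage check concrete, here is a hedged sketch of the flow: a quick signature scan first, then sandbox detonation, and only then delivery. The scanner and sandbox functions are stubs and placeholders, not the vendor's actual API.

```python
# Illustrative two-stage mail check as described above: signature analysis
# first, then sandbox detonation, and only then delivery to the user.
# The scanner/sandbox functions are stubs, not the real vendor API.
import hashlib

KNOWN_BAD_HASHES = {"e3b0c44298fc1c149afbf4c8996fb924"}   # example signature DB

def signature_scan(attachment: bytes) -> bool:
    """Stage 1: quick check against known-bad signatures (here, MD5 hashes)."""
    return hashlib.md5(attachment).hexdigest() not in KNOWN_BAD_HASHES

def sandbox_detonate(attachment: bytes) -> bool:
    """Stage 2 stub: a real sandbox runs the file in an isolated VM and watches
    its behaviour; here we simply treat everything that passed stage 1 as clean."""
    return True

def deliver(message: dict) -> None:
    print(f"delivered to {message['to']}")

def process(message: dict) -> None:
    for attachment in message.get("attachments", []):
        if not signature_scan(attachment) or not sandbox_detonate(attachment):
            print(f"quarantined message for {message['to']}")
            return
    deliver(message)   # reaches the user a couple of minutes after arrival

process({"to": "cashier@example.com", "attachments": [b"quarterly report"]})
```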
There is also DDoS protection deployed (simple for now): attack signatures plus a standard analyzer. If necessary, this module can be switched on quickly and start repelling "industrial-scale" attacks.
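The article only mentions "signatures plus an analyzer", so the following is just a toy sketch of that combination: known attack patterns are dropped outright, and a per-source rate counter flags flooding sources. The signature, threshold and window values are invented.

```python
# Toy illustration of the "signatures plus analyzer" idea for DDoS
# mitigation: drop known attack patterns, and drop sources that exceed a
# per-window packet budget. Signature, threshold and window are invented.
import time
from collections import defaultdict, deque

ATTACK_SIGNATURES = [b"\x00\x00\x00\x00evil"]   # example payload pattern
RATE_LIMIT = 100                                 # packets per source per window
WINDOW_SECONDS = 1.0

recent = defaultdict(deque)                      # src_ip -> recent packet timestamps

def inspect(src_ip: str, payload: bytes) -> str:
    now = time.monotonic()
    # 1. Signature check: drop anything matching a known attack pattern.
    if any(sig in payload for sig in ATTACK_SIGNATURES):
        return "drop"
    # 2. Rate analyzer: drop sources exceeding the per-window packet budget.
    timestamps = recent[src_ip]
    timestamps.append(now)
    while timestamps and now - timestamps[0] > WINDOW_SECONDS:
        timestamps.popleft()
    return "drop" if len(timestamps) > RATE_LIMIT else "pass"
```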
Telephony
Before the project began, the customer already had an Asterisk-based PBX. It is a great solution for a young company, and it was tuned like clockwork. The catch is that Asterisk is very hard to maintain when a company is growing rapidly, and Aeroexpress grew rapidly. In general this is solved by sysadmin skill, but then a second task arises: distribution. At the time of the project, any network problem made the phones unreachable.
We built an infrastructure whose main business requirements were fault tolerance and distribution. In the data center there is a server cluster, and at each individual branch the customer has a survivability server.
When a branch is cut off from the corporate network, traffic starts running inside the branch via the survivability server. Fifteen seconds of unreachability, and all the phones re-register locally. A SIP link comes up, and you can call the city, for example the police or an ambulance. The phones work as usual, but the advanced functionality provided by the central servers drops out (which is more than enough for emergency mode). The basis of the infrastructure is Avaya, connected to the public telephone network via a SIP trunk and a two-wire analog trunk (local connections to the Russian Railways network).
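The survivability behaviour itself is handled by the Avaya gear; as a hedged sketch of the idea, here is a Python snippet in which a phone keeps preferring the central registrars and falls back to the local branch server when they stop answering. The hostnames, timings and the register() stub are examples, not the real configuration.

```python
# Hedged sketch of the survivability behaviour described above: a phone keeps
# registering with the central cluster and falls back to the local branch
# server when the corporate network is unreachable. Hostnames, timings and
# the register() stub are examples, not the Avaya implementation.
import socket

PRIMARY_REGISTRARS = ["sm1.hq.example", "sm2.hq.example"]   # data-center cluster
SURVIVABILITY_SERVER = "local.branch.example"               # per-branch server
SIP_PORT = 5060
TIMEOUT_SECONDS = 5        # the article quotes ~15 s for full re-registration

def reachable(host: str) -> bool:
    """Crude reachability probe of a registrar's SIP port."""
    try:
        with socket.create_connection((host, SIP_PORT), timeout=TIMEOUT_SECONDS):
            return True
    except OSError:
        return False

def register(host: str) -> None:
    """Stub: a real phone would send a SIP REGISTER here."""
    print(f"registered with {host}")

def choose_registrar() -> str:
    for host in PRIMARY_REGISTRARS:
        if reachable(host):
            return host
    # Branch cut off from the corporate network: calls stay inside the branch
    # and reach the city over the local SIP trunk.
    return SURVIVABILITY_SERVER

register(choose_registrar())
```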
Avaya IP DECT base stations and DECT handsets were introduced in the office. DECT was chosen so as not to run voice traffic through traditional Wi-Fi phones. Wi-Fi is technically better by far, but Wi-Fi handsets are very expensive: there are cheap Chinese ones, but they last about half a year, while decent ones cost about a thousand dollars. DECT handsets are only about $300.
Again, in theory we could have skipped DECT and installed operator femtocells, but the channels had to be protected (and a femtocell cannot work inside the corporate segment), with the company's own rules and its own filters. Plus, the customer had absolutely no need for additional contracts with mobile operators. The base stations are powered over PoE from the LAN, so they do not depend on separate power cabling.
In the same installation we also deployed a fax server (it really is needed) and UC, in particular presence status at the workplace: every employee can see when a contact's line is busy. This is very convenient: if you want to call the next department, you immediately see that the person you need is already on a call. The moment they hang up, you dial them.
Summary
Since the work was planned rather than "we have a problem, fix it urgently," it was, again, a pleasure to work together. No business processes stopped, passengers did not notice that anything had changed, and the IT service and the cashiers switched over to the new network smoothly. Room for expansion is built in. Everything is shipshape. In short, from our side it was just solid technical work, and from the Aeroexpress side, very competent work by the IT department, which planned its development several years ahead.
Just in case, here is my email for questions: avrublevsky@croc.ru.