It is commonly believed that information technology takes root with more difficulty in government agencies, and there are objective reasons for this view. However, as Alf said: “You don't like cats? You just don't know how to cook them!” Today we want to talk about how public-sector projects differ from the point of view of a business IT integrator, and why government agencies build large data warehouses for analytical projects.
Historically, government agencies have more inertia: each step takes longer to approve, the decision-making point is blurred, and the customer may change the task repeatedly while clarifying what is really needed. Most officials view IT projects without much enthusiasm. In government organizations there is usually no strong resistance to new things, but no desire for them either; in particular, it is hard to find a champion who is interested in the results of introducing new solutions. As a result, implementation goes more slowly, and from the outside it begins to look as if the customer does not need the project at all.

Even so, the attitude toward IT in the public sector has changed radically in recent years. Under the "Electronic Government" program, government bodies realized that using information systems was, at the very least, inevitable. A generational change, in which relatively young managers who are ready to develop are taking leadership positions, sets a very important precedent: government agencies now have people who can join a project on the customer's side to ensure its support and development.
What government agencies need
While a business knows clearly what it needs and by what date, government agencies cannot boast of such knowledge. At the same time, the needs of agencies usually turn out to be much broader than originally expected. For example, practice has shown that in the transition to the third version of the system of interdepartmental electronic interaction (SMEV 3.x), customers needed much more than just the introduction of new modules: they needed format converters that ensure backward compatibility with SMEV 2.x, as well as a number of additional modifications. This happens on every public-sector project, so the integrator needs a comprehensive approach and must be ready for anything.
Although solutions from industry giants such as SAP and Oracle are used everywhere in government organizations, and these platforms are stable, priorities have shifted recently. The cost of support matters more and more, and the drive toward import substitution continues to gain momentum. For example, where we previously planned to use the IBM Cognos TM1 and Oracle Hyperion Essbase OLAP engines for large analytical projects, today, because of currency fluctuations, such solutions no longer fit into budgets. As a result, worthy domestic solutions have been found, such as the Polymatica product. This system keeps OLAP cube data in RAM, which makes analysis very fast, and on a standard server it can process several billion records using both CPU and GPU cores.
However, there are very few Russian products of the required level on the market, and not every application has a worthy replacement. Government agencies are therefore positive about custom development on open source platforms, and about guaranteed support for systems from foreign suppliers. Of course, this does not achieve complete autonomy: dependence on the vendor is replaced by dependence on the developer. The customer begins to wonder who will maintain and develop the system. This is why contractors for government customers are most often the same companies: well-earned trust is one of the main arguments when launching a new project.
Given all of the above, information systems in government organizations resemble a jigsaw puzzle whose assembly requires boxed and custom-made products, open platforms and commercial solutions. With high-quality integration, this approach avoids lock-in to specific products and leaves room for adding new “pieces” and replacing existing ones if their development or support is in question.
Analytical needs of government agencies
The formation of a complex information space that covers not one agency but several at once has created the prerequisites for using a new generation of analytical solutions. For example, the third version of SMEV also includes an analytical module that collects statistics on the quantity and quality of public services provided electronically and on interagency interaction. With the number of SMEV users constantly growing (they now include regional multifunctional centers (MFCs) and financial organizations), the volume of interaction is growing exponentially. Such volumes of data open up opportunities for analytics, and the use of new IT tools is dictated by the real needs of the state. Unlike commercial organizations, where everything is aimed at making a profit and reducing costs, government organizations have somewhat different tasks. The main ones are:
- Producing regulated reports - every government agency has to report in a strictly defined form. Prepared manually, such reports paralyze the work of departments for several days, whereas analytical systems handle this as accurately and quickly as possible. For example, the automated FIU system prepares more than 300 different reports.
- Maintaining registries - in recent years most agencies have been charged with maintaining one registry or another. For example, Roskomnadzor keeps a register of banned sites and detects their clones, and the FIU keeps a register of insurers and policyholders. Keeping such a registry up to date is an almost ideal task for a BI system.
- Mobile analytics - heads of departments in all kinds of organizations find it convenient to work with operational indicators that they can view on a tablet. In government agencies, besides tracking the overall process, this also gives the manager a necessary degree of personal protection: when called “on the carpet” by superiors, he can promptly answer any question.
Federal data
Solving these problems requires special technologies, such as large data warehouses and dedicated tools for working with them. Federal agencies use databases of 50 TB or more. For example, the analytical data warehouse of the AIS-2 IAP in the FIU has the following characteristics:
- Estimated capacity up to 200 TB.
- More than 1500 users with the prospect of connecting up to 6000.
- 17 data sources of different types, some of which are geographically distributed.
- More than 2000 entities (tables).
- About 10,000 entity attributes.
- About 300 indicators and 500 data model dimensions.
- Over 15,000 data model attributes.
To run such systems in a distributed configuration (data is stored in different regions, so many sources run on geographically distributed platforms), we used the classic data warehouse design approaches of Ralph Kimball and Bill Inmon.
Three storage layers were created, complemented by one BI layer.
1. The data preparation layer consists of two levels: SRC, where the source data is stored, and Staging, where we apply data merging and cleansing algorithms. This is necessary because in distributed government systems reference data is not always uniform. A concrete example: the reference tables in each of the 85 regions differ, if only slightly.
2. The detailed data layer is where data is actually stored. There is no direct access to it, but this is where the information that matters to the customer is constantly updated. The detailed layer gives a complete picture of the organization’s core activities in terms of information. For example, the personalized accounting system (SPU) stores information on all insured persons, and the System of Administration of Insurance Contributions (DIA) holds data on payments from insurers. In addition, when an insured person retires, his data is entered into the Federal Pension Fund (FBDP). All of this data is integrated for each object in the detailed data layer, which involves substantial modification and transformation of the data and building links between the different systems.
3. Data marts serve information in response to queries. This is a rather complex layer, since a single data mart can provide information drawn from different parts of the detailed layer. It is the data marts that supply the various indicators and data samples, and it is through them that the business intelligence (BI) layer is connected. For example, when a user wants to know the value of some indicator and clicks a button in the graphical interface of the BI system, the data arrives from the data mart immediately, because it has already been prepared for that query. If information about people who retired in the current year has been collected for an entire region, the data marts will already contain views by district or by pension size, that is, by any predefined criterion.
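The flow through the three layers can be illustrated with a small sketch. This is only a toy model under stated assumptions, not the project's actual implementation: the field names, the region spellings, the `REGION_ALIASES` mapping, and the functions `stage`, `load_detailed` and `build_mart` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical SRC-level rows: raw records as received from regional sources.
# Field names and region spellings are illustrative assumptions.
SRC_ROWS = [
    {"person_id": 1, "region": "Moscow obl.", "district": "North", "pension": 15000},
    {"person_id": 2, "region": "Moscow Oblast", "district": "South", "pension": 18000},
    {"person_id": 3, "region": "MOSCOW OBLAST ", "district": "North", "pension": 21000},
]

# Hypothetical mapping that reconciles regional spellings of one reference entry.
REGION_ALIASES = {"moscow obl.": "Moscow Oblast", "moscow oblast": "Moscow Oblast"}


def stage(rows):
    """Staging level: clean and unify reference values from different regions."""
    staged = []
    for row in rows:
        key = row["region"].strip().lower()
        region = REGION_ALIASES.get(key, row["region"].strip())
        staged.append({**row, "region": region})
    return staged


def load_detailed(staged_rows):
    """Detailed layer: one integrated record per person, keyed by person_id."""
    return {row["person_id"]: row for row in staged_rows}


def build_mart(detailed, dimension):
    """Data mart: pre-aggregated retiree count and pension total per dimension value."""
    mart = defaultdict(lambda: {"retirees": 0, "total_pension": 0})
    for row in detailed.values():
        cell = mart[row[dimension]]
        cell["retirees"] += 1
        cell["total_pension"] += row["pension"]
    return dict(mart)


detailed = load_detailed(stage(SRC_ROWS))
mart_by_district = build_mart(detailed, "district")
mart_by_region = build_mart(detailed, "region")
print(mart_by_district)
```

Because the marts are computed in advance, a BI query for "retirees by district" becomes a simple dictionary lookup rather than a scan of the detailed layer.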

Operational specifics
Large-scale information systems used in government agencies must also cope with changes in data sources, which inevitably occur over time at this scale. Changes to the logic of the system itself, to the characteristics of queries, and to data representations cannot be ruled out either. A project to build a large-scale information system for a government agency therefore also requires its gradual adaptation to changing requirements and conditions.
In addition, the size of the warehouse and the complexity of the data model call for several modes of data loading: both full and incremental loads must be supported. For example, after a particular module is introduced, it is often necessary to load a certain data segment in full and then refine and supplement it asynchronously.
Finally, the most interesting feature of organizational and economic systems and their warehouses is support for data historicity. It is no secret that when errors are corrected or a failed communication channel is restored, data may change retroactively. Yet reports may require both current data and information “as of a specific date”. Practice has shown that supporting this functionality requires additional storage resources and adjustments to the design of the warehouse topology. Keep in mind that, in addition to storing different versions, every field has to carry timestamps and validity periods, which must also be taken into account in queries from the BI system.
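The difference between the two loading modes can be sketched as follows. This is a simplified illustration, assuming each source row carries a hypothetical `updated_at` marker (an integer here for brevity); the functions `full_load` and `incremental_load` are invented for the example and do not describe any particular ETL tool.

```python
def full_load(source_rows):
    """Rebuild the target segment from scratch, keyed by record id."""
    return {row["id"]: dict(row) for row in source_rows}


def incremental_load(target, source_rows, last_loaded_at):
    """Upsert only rows changed after the previous load; return the new watermark."""
    watermark = last_loaded_at
    for row in source_rows:
        if row["updated_at"] > last_loaded_at:
            target[row["id"]] = dict(row)
            watermark = max(watermark, row["updated_at"])
    return watermark


# Hypothetical source data; `updated_at` is an integer version for simplicity.
source = [
    {"id": 1, "value": "a", "updated_at": 10},
    {"id": 2, "value": "b", "updated_at": 12},
]

# Initial full load of the segment.
target = full_load(source)
watermark = max(row["updated_at"] for row in source)

# Later the source changes: record 2 is corrected and record 3 appears.
source[1] = {"id": 2, "value": "b2", "updated_at": 20}
source.append({"id": 3, "value": "c", "updated_at": 21})

# Only the changed and new rows are applied to the target.
watermark = incremental_load(target, source, watermark)
print(target, watermark)
```

The watermark approach assumes the source can reliably report when a row changed; where it cannot, a full reload of the segment remains the fallback.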
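A minimal sketch of validity periods and an "as of a specific date" lookup is shown below. It is an assumption-laden illustration: the record layout, the key `person-1/pension`, and the helpers `apply_change` and `as_of` are hypothetical, and the sketch tracks only when a value was valid, not when it was recorded, so retroactive corrections would need a second time axis.

```python
from datetime import date

OPEN_END = date.max  # marks the version that is currently valid


def apply_change(history, key, value, valid_from):
    """Close the current version of `key` and append a new one."""
    for version in history:
        if version["key"] == key and version["valid_to"] == OPEN_END:
            version["valid_to"] = valid_from
    history.append(
        {"key": key, "value": value, "valid_from": valid_from, "valid_to": OPEN_END}
    )


def as_of(history, key, on_date):
    """Return the value that was valid for `key` on `on_date`, or None."""
    for version in history:
        if version["key"] == key and version["valid_from"] <= on_date < version["valid_to"]:
            return version["value"]
    return None


history = []
apply_change(history, "person-1/pension", 15000, date(2015, 1, 1))
apply_change(history, "person-1/pension", 16500, date(2016, 2, 1))

print(as_of(history, "person-1/pension", date(2015, 6, 30)))  # value valid in mid-2015
print(as_of(history, "person-1/pension", date.today()))       # current value
```

Every version stays in storage, which is why historicity increases the required capacity, and every BI query has to filter on the validity interval.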
Parallel query processing
To achieve high performance and cope with the problems of distributed storage with a complex topology, we use asynchronous massively parallel processing (MPP) systems, for example IBM Netezza. In this case, huge amounts of data are placed on the nodes of a geographically distributed cluster, and each analytical query can be processed in parallel on a number of server nodes. We will describe the MPP architecture and how it works in the next post.
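Until then, the general idea of processing one query on several nodes can be sketched in a few lines. This is not how IBM Netezza is implemented; it is a toy scatter-gather model in which threads stand in for cluster nodes, and the names `NODE_COUNT`, `distribute`, `local_aggregate` and `parallel_average` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

NODE_COUNT = 4  # hypothetical cluster size


def distribute(rows, node_count):
    """Assign each row to a node by its distribution key (here: person_id)."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[row["person_id"] % node_count].append(row)
    return nodes


def local_aggregate(node_rows):
    """Work done independently on one node: partial sum and row count."""
    return sum(r["amount"] for r in node_rows), len(node_rows)


def parallel_average(nodes):
    """Run the partial aggregation on every node, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(local_aggregate, nodes))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total / count if count else 0.0


rows = [{"person_id": i, "amount": 100 * i} for i in range(1, 9)]
nodes = distribute(rows, NODE_COUNT)
average = parallel_average(nodes)
print(average)
```

Each node returns only a small partial result (a sum and a count), so the amount of data moved between nodes stays small regardless of how many rows each node holds.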