
How the SAP HANA Platform Works with Big Data

Hi, Habr! In the previous article we covered the SAP Business One solution for small businesses and briefly mentioned SAP HANA's computing and analytics capabilities. Today we look at how the SAP HANA platform works with big data and at business scenarios where these technologies apply.

SAP HANA: how it works


The core of SAP HANA is a database component that processes large volumes of data in memory and is queried with SQL. The SAP HANA database is built on the relational data model, but data can also be accessed with the WIPE graph query language. This flexibility in query language comes from the SAP HANA architecture, which keeps a single representation of the data in the in-memory store. Users can therefore access the data through different semantic structures while the DBMS holds only one copy of it in memory. The classical approach taken by a number of open-source DBMSs is different: it requires at least two data stores, with graph structures and relational tables stored separately.


Figure 1. The concept of data management
The figure above shows the general data management scheme in SAP HANA and the idea of managing data through different languages, in particular SQL and WIPE. The Data Processing engine lets you build a new semantic layer for working with data at the Data Manipulation level while still using a single copy of the source data. This considerably extends what the SAP HANA platform can do for problems that need information represented as graph structures.
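The single-copy idea can be illustrated with a small sketch. It does not use SAP HANA or WIPE: Python's built-in sqlite3 module stands in for the in-memory store, and the `route` table, its columns, and both helper functions are hypothetical names chosen for the example. One table is read once through a relational aggregate and once through a graph traversal, with no second copy of the data.

```python
import sqlite3
from collections import deque

# Single copy of the data: one in-memory relational table of edges.
# (sqlite3 is only a stand-in here; table and column names are invented.)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE route (src TEXT, dst TEXT, km INTEGER)")
conn.executemany(
    "INSERT INTO route VALUES (?, ?, ?)",
    [("A", "B", 5), ("B", "C", 7), ("A", "D", 2), ("D", "C", 4)],
)


def total_km(conn):
    """Relational view: an ordinary SQL aggregate over the table."""
    return conn.execute("SELECT SUM(km) FROM route").fetchone()[0]


def reachable(conn, start):
    """Graph view: breadth-first traversal that reads the same table."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        rows = conn.execute("SELECT dst FROM route WHERE src = ?", (node,))
        for (nxt,) in rows.fetchall():
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

Both functions work against the same rows, which is the point of the sketch: adding an edge to `route` changes the result of the SQL aggregate and of the traversal at once.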

The in-memory technology in the SAP HANA DBMS stores and processes data in memory using algorithms [1] developed by SAP for the Intel x86 platform. SAP also recently announced support for the IBM Power platform for SAP HANA. The high speed of query processing comes from the ability to store and process data in compressed form in RAM. These algorithms made it possible to implement the Unified Table approach, which provides fast reads and writes against a column-store table. As a result, one of the main advantages of SAP HANA is the ability to run analytical queries directly on transactional data as it is added in real time. The system provides transparent access to that data automatically, so new rows in a table are immediately available for analysis without preprocessing.
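A rough sketch of why a column store can stay writable and still be read in compressed form is given below. It is a simplification under stated assumptions, not SAP's implementation: the class name `UnifiedColumn` and its methods are hypothetical, and the sketch has only two stages (an uncompressed write buffer and a dictionary-compressed main part), whereas the real design is described in [1].

```python
from bisect import bisect_left


class UnifiedColumn:
    """Toy column: dictionary-compressed main part plus append-only delta.

    Hypothetical illustration only; names and structure are assumptions.
    """

    def __init__(self):
        self.dictionary = []  # sorted distinct values of the main part
        self.main = []        # integer codes pointing into the dictionary
        self.delta = []       # recent writes, kept uncompressed

    def insert(self, value):
        # Writes touch only the delta, so they stay cheap.
        self.delta.append(value)

    def count(self, value):
        # Reads combine main and delta, so new rows are visible immediately.
        in_main = 0
        i = bisect_left(self.dictionary, value)
        if i < len(self.dictionary) and self.dictionary[i] == value:
            in_main = self.main.count(i)  # scan compact integer codes
        return in_main + self.delta.count(value)

    def merge(self):
        # Fold the delta into the main part and rebuild the dictionary.
        values = [self.dictionary[c] for c in self.main] + self.delta
        self.dictionary = sorted(set(values))
        code = {v: i for i, v in enumerate(self.dictionary)}
        self.main = [code[v] for v in values]
        self.delta = []
```

The query result is the same before and after `merge()`; only the physical layout changes, which is what lets analytical reads run on data that has just been written.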


Figure 2. Unified Table concept architecture

Architecturally, SAP HANA supports a configuration in which one or more compute nodes form a single DBMS instance (scale-out, see Fig. 3 and www.hanatutorials.com/p/scale-up-or-scale-out-hana-configuration.html). This configuration is particularly relevant for processing large data sets in real time. A SQL query in SAP HANA is processed over the entire data volume at once, regardless of where the data is located.
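The general pattern behind querying partitioned data can be sketched as follows. This is a generic illustration of partial aggregation, not SAP HANA's query executor: the `NODES` partitions, the region names, and both functions are hypothetical, and threads stand in for separate compute nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions: each list stands for the rows held by one node.
NODES = [
    [("north", 10), ("south", 5)],
    [("north", 7)],
    [("south", 3), ("east", 1)],
]


def local_sum(rows):
    """Partial aggregate computed on one node: total amount per region."""
    out = {}
    for region, amount in rows:
        out[region] = out.get(region, 0) + amount
    return out


def distributed_sum(nodes):
    """Run the partial aggregate on every node concurrently, then combine."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(local_sum, nodes))
    total = {}
    for part in partials:
        for region, amount in part.items():
            total[region] = total.get(region, 0) + amount
    return total
```

The caller sees one result for the whole data set and does not need to know which partition held which rows.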


Figure 3. Scale-out HANA configuration

Compared with Spark and Hive in the Hadoop ecosystem, the SAP HANA platform offers a faster and simpler mechanism for loading large amounts of structured data and querying it with SQL.

For large volumes of unstructured data (for example, video or photographs), it is recommended to integrate SAP HANA with Spark on Hadoop through HANA Vora, a compact version of the in-memory DBMS that is integrated into Spark.

SAP HANA also offers a choice of programming languages for building applications under its Bring Your Own Language concept. The built-in SAP HANA XS Advanced application server lets you create independent application containers based on JavaScript (Google V8 and Node.js engines), Java (Tomcat), Python, Ruby, and C++.

Let us consider a machine learning example: image recognition and classification against an image base stored in Hadoop, combined with streaming data handled by the SAP HANA Smart Data Streaming component (see Figure 4).


Figure 4. Architecture of the moving objects control system based on SAP HANA

When implementing video-processing algorithms in SAP HANA, you can also use the popular Caffe, Theano, Torch, and TensorFlow packages and move already developed applications unchanged into containers based on HANA XS Advanced or into the Spark environment on Hadoop.

In the following articles, we will show real-world code implementation examples for machine learning tasks on the SAP HANA platform.

Examples of scenarios for using SAP HANA for working with big data in moving object monitoring systems:


Digital Warehouse based on SAP HANA

An important task for large distribution companies is managing the loading and unloading of goods and routing them for order picking and dispatch. Timely tracking of goods and trucks, together with monitoring and control of loading and unloading, makes it possible to plan and adjust shipment preparation quickly and to avoid goods sitting idle in the warehouse.

The “digital warehouse” model, built on SAP HANA and its Smart Data Streaming component, collects information on the availability of loading and unloading equipment and on location, and supports personnel management through timely plan adjustments. Specialized sensors collect information about the state of the conveyor belt and staff workplaces and monitor the status of loading and unloading bays.

In conventional warehouses, order picking is prone to human error. To minimize this, the “digital warehouse” uses SAP HANA's built-in capabilities to recognize specialized labels such as QR codes. The labels make it possible to determine the order and its product items automatically from the order code and the related information in SAP ERP.

With SAP HANA and its ability to analyze information in real time, companies can build a warehouse management system that accounts for plan changes as goods are processed and orders are picked, reduces the time goods spend idle, and keeps staff utilization at an adequate level.

In addition, the predictive analytics tools in SAP HANA can analyze statistics on completed work to optimize warehouse operations.

"Digital Parking" for cars

An important task in managing urban traffic is tracking available parking spaces in order to monitor the load on city parking. Specialized sensors installed in parking lots can track the number of free and occupied spaces. A monitoring system based on SAP HANA Smart Data Streaming monitors the sensor states in real time and maintains a map of parking spaces.

In addition, to enforce paid-parking rules, video recorders can be used to collect license plate numbers and monitor parking status.
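The sensor-driven parking map described above can be sketched in a few lines. This is plain Python, not the Smart Data Streaming API: `SensorEvent`, `ParkingMap`, and all field names are hypothetical, and the rule of discarding readings that arrive out of order is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorEvent:
    spot_id: str    # hypothetical identifier of a parking space
    occupied: bool  # True when the sensor reports a car
    ts: int         # event time in seconds


class ParkingMap:
    """Keeps the latest known state of every parking space."""

    def __init__(self, spot_ids):
        self.occupied = {spot: False for spot in spot_ids}
        self.last_ts = {spot: -1 for spot in spot_ids}

    def apply(self, event):
        # Skip unknown spaces and readings older than the one already applied.
        if event.spot_id not in self.occupied:
            return False
        if event.ts < self.last_ts[event.spot_id]:
            return False
        self.occupied[event.spot_id] = event.occupied
        self.last_ts[event.spot_id] = event.ts
        return True

    def free_spots(self):
        # Current list of free spaces, sorted for stable display.
        return sorted(s for s, taken in self.occupied.items() if not taken)
```

Each incoming event updates the map in place, so `free_spots()` always reflects the most recent reading per space.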

Digital product quality control system

Managing and tracking the delivery of goods is an important task for large urban delivery networks. In large cities, in conditions of limited delivery time and a large number of orders, it is necessary to react in time to changes in orders and plan the delivery of goods, taking into account changing requirements from customers.
Integrating SAP HANA Smart Data Streaming makes it possible to process several million delivery requests per minute and then, using specialized tools, adjust delivery plans in real time.
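A per-minute request rate of the kind mentioned above is typically computed with a tumbling window. The sketch below shows the idea in plain Python; it is not Smart Data Streaming code, and the function name and the use of integer timestamps in seconds are assumptions.

```python
from collections import Counter


def requests_per_minute(timestamps):
    """Count events per one-minute tumbling window.

    `timestamps` are integer event times in seconds (hypothetical input);
    the result maps the minute index to the number of requests in it.
    """
    counts = Counter(ts // 60 for ts in timestamps)
    return dict(sorted(counts.items()))
```

A planner could compare these counts against a threshold to decide when delivery plans need to be recalculated.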

Sources


[1] Vishal Sikka, Franz Färber, Wolfgang Lehner, Sang Kyun Cha, Thomas Peh, Christof Bornhövd. “Efficient Transaction Processing in the SAP HANA Database - The End of the Column Store Myth”. SIGMOD '12: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pages 731-742.

Source: https://habr.com/ru/post/321156/

