We continue our series of articles on SAP HANA Data Management Suite, a hybrid of on-premise and cloud technologies that includes four product components: SAP Data Hub, SAP HANA, SAP Enterprise Architecture Designer and SAP Cloud Platform Big Data Services.
The combination of these solutions allows you to build a complete data management landscape with the following capabilities:
- data lineage tracing
- tracking changes to data and its structure
- a comprehensive view of metadata
- maintaining the required level of security
- centralized monitoring
But today we will talk about the “core” of this system - the SAP HANA platform.
SAP has invested, and continues to invest, significant research and resources in data processing. The result is the SAP HANA platform (High-Performance Analytic Appliance). Over the years our company has accumulated unique experience in building technologies and services for business, and SAP applied that experience when creating a platform for real-time data processing. SAP HANA became the foundation and core for building a new type of intelligent enterprise. The platform is used for application development both inside SAP and by our customers and partners.

SAP HANA is a multipurpose solution for storing and processing information. One of its distinguishing features is the built-in calculation engine, which lets you move planning operations from the application level down to the database level. Thanks to the modern architecture of the hardware platform, calculations are performed more efficiently: the entire "avalanche" of data being processed is split into a fixed number of threads equal to the total number of cores on the platform, so the computing power of every core of every processor is used as efficiently as possible.
SAP HANA also provides in-memory data storage and processing. As a database, SAP HANA can store data in both row and column format. In-memory storage and processing ensures fast transaction handling, and combined with the Calculation View technology for data analysis it delivers high performance for analytical queries.
Forrester analysts have begun to use a new term: the "translytical" data platform. By their definition, such a platform "supports many types of use, including real-time information, machine learning, streaming analytics and extreme transaction processing."
A recent Forrester report states the following: "SAP HANA is a shared-nothing, in-memory platform. It is the basis of SAP's platform for transactions and data analytics and supports many application scenarios: applications for processing real-time data, analytics, translytical applications, and systems for deep and advanced analytics. Enterprises use the platform to organize in-memory data marts, to run a real-time data warehouse for SAP Business Warehouse, and to work with SAP S/4HANA and SAP Business Suite."
Translytical platforms are well suited to supporting real-time applications and services: stock trading, fraud detection, counter-terrorism, patient health monitoring, analysis of data from various sensors, earthquake monitoring and much more. Using a translytical platform, applications can exchange real-time data and keep the information stored in the enterprise consistent and accurate.
Another area of application for SAP HANA is support for machine learning, which allows you to apply complex analytical models to data to more accurately predict operations, business processes, customer behavior, etc.
How does SAP HANA support this functionality?
Let's start with the database service. Looking at HANA from an architecture and technology point of view, there are two ways to store data: row-based and column-based.
Row-based storage allows data to be written to a table at high speed. To add a new row, all you need to do is find free space in memory and write the new data there. However, with row storage there is a problem with data analysis: you have to use an index or a materialized representation of the data in a form convenient for analysis. Indexing, in turn, introduces delays, because extra time is needed during row insertion to rebuild the index or to materialize the data in a different format.
If the data is stored in columns, then adding a new row requires time to split the row's values into columns and to write them to different places in memory. All of this reduces write performance.
A column-store database processes queries much faster, because the data of the requested columns is located compactly and in compressed form in memory. In other words, a query does not need to scan the entire table; it only reads the columns it actually uses. Such a database is optimized for reading, and column storage makes it possible to organize the data in RAM in a particular way, using grouping. With this approach, different compression techniques can be applied far more effectively, leading to multiple-fold compression of the original data.
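As a rough illustration, here is a minimal SQL sketch of the two storage types (the table and column names are made up for this example): a row table for write-heavy workloads, a column table for analytics, and the kind of query that benefits from column storage because it only touches two columns.

```sql
-- Row store: optimized for frequent single-row inserts and updates
CREATE ROW TABLE sales_orders_row (
    order_id  INTEGER PRIMARY KEY,
    customer  NVARCHAR(100),
    region    NVARCHAR(40),
    amount    DECIMAL(15,2)
);

-- Column store: optimized for scans, aggregation and compression
CREATE COLUMN TABLE sales_orders (
    order_id  INTEGER PRIMARY KEY,
    customer  NVARCHAR(100),
    region    NVARCHAR(40),
    amount    DECIMAL(15,2)
);

-- An analytical query reads only the columns it references (region, amount),
-- so the column store can answer it without scanning whole rows
SELECT region, SUM(amount) AS total_amount
FROM sales_orders
GROUP BY region;
```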
To resolve this trade-off between fast writes and fast reads, the Unified Table approach was developed: it provides high-speed reading and writing for a column-store table. This mechanism makes it possible to execute transactions quickly (that is, to write new rows), to analyze data at high speed thanks to compressed column storage and parallel processing, and to keep all of the data in memory.
During writes, changes are not applied to the main storage of the table right away. Instead, all edits go into a separate data structure, the delta store (L1-delta in the picture). Here the data is kept in a write-optimized format. When the changes need to be moved out of the delta store, a special delta merge process is started: first the data from L1-delta is converted into columnar form in L2-delta, and then merged into the main store. For reads, all three storage areas (L1-delta, L2-delta and the main store) together present the data in a complete form. Thanks to this process, both high write speed and fast analysis are achieved.
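If you want to observe this mechanism yourself, the delta merge can also be triggered and monitored manually. A small sketch, assuming the column table sales_orders from the previous example exists (normally the merge is scheduled automatically by the system):

```sql
-- Explicitly trigger a delta merge for a column table
MERGE DELTA OF sales_orders;

-- Compare how much data currently sits in the delta store vs. the main store
SELECT table_name, record_count, memory_size_in_main, memory_size_in_delta
FROM m_cs_tables
WHERE table_name = 'SALES_ORDERS';

-- History of delta merge runs for this table
SELECT start_time, motivation, success
FROM m_delta_merge_statistics
WHERE table_name = 'SALES_ORDERS';
```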

One of the significant advantages of SAP HANA is that all aggregations are calculated directly while an analytical query is being executed and are returned immediately as its result. Because detailed (raw) data, rather than pre-aggregated values, can be kept in RAM, there is no need for the precalculated aggregate tables that are an integral part of classical analytical systems.
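A minimal sketch of what this looks like in practice (the table and columns are invented for the example): the totals are computed over raw line items at query time, and no precalculated aggregate table has to be created or maintained.

```sql
-- Raw, line-item level data stays in the column store in memory
CREATE COLUMN TABLE sales_items (
    item_id   BIGINT PRIMARY KEY,
    order_id  INTEGER,
    product   NVARCHAR(60),
    sold_on   DATE,
    quantity  INTEGER,
    amount    DECIMAL(15,2)
);

-- Aggregates are calculated on the fly while the query runs;
-- no separate "totals by month" table is needed
SELECT product,
       YEAR(sold_on)  AS sales_year,
       MONTH(sold_on) AS sales_month,
       SUM(amount)    AS revenue,
       SUM(quantity)  AS units_sold
FROM sales_items
GROUP BY product, YEAR(sold_on), MONTH(sold_on);
```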
SAP HANA also supports several built-in programming languages: R for creating predictive models and SQLScript for writing calculation logic. At the level of the XSA application server embedded in SAP HANA 2.0 you can develop in many other languages, thanks to support for the Bring Your Own Language concept (and the use of Cloud Foundry). Using these languages, you can perform the necessary calculations and forecasts directly at the data storage level. This removes unnecessary steps of transferring large amounts of data and lets you deliver the finished result of the calculations to the application level.
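To give a feel for SQLScript, here is a minimal sketch of a stored procedure that keeps the calculation logic inside the database, so only the small result set leaves HANA (the sales_orders table and the revenue threshold rule are assumptions made up for the example):

```sql
-- SQLScript procedure: segment customers by their total spend inside HANA
CREATE PROCEDURE classify_customers (
    IN  min_revenue DECIMAL(15,2),
    OUT result      TABLE (customer NVARCHAR(100),
                           revenue  DECIMAL(15,2),
                           segment  NVARCHAR(10))
)
LANGUAGE SQLSCRIPT
READS SQL DATA
AS
BEGIN
    -- intermediate table variable, evaluated by the HANA engine
    totals = SELECT customer, SUM(amount) AS revenue
             FROM sales_orders
             GROUP BY customer;

    result = SELECT customer,
                    revenue,
                    CASE WHEN revenue >= :min_revenue
                         THEN 'KEY' ELSE 'REGULAR' END AS segment
             FROM :totals;
END;

-- Usage: CALL classify_customers(100000, ?);
```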
Now consider the SAP HANA platform services.
SAP HANA Platform Services

SAP HANA contains not only a database but also a whole range of services for application development, tools for data integration and cleansing, libraries for analytical data processing, including machine learning, as well as capabilities for storing and processing special data types. SAP HANA lets you load data from various sources without additional tools and develop forms for data entry, editing and analysis. Tools for sophisticated, intelligent data processing are also available: transformation, pattern searching, data exploration. And, of course, the platform is open to visual data analysis through various tools.
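As one example of accessing data from external sources without additional tools, Smart Data Access lets you expose a remote table as a local virtual table. A sketch, assuming an administrator has already configured a remote source named MY_REMOTE_SRC and that the external schema and table names below exist:

```sql
-- Expose a table from an already-configured remote source as a virtual table
CREATE VIRTUAL TABLE v_ext_sales
    AT "MY_REMOTE_SRC"."<NULL>"."EXT_SCHEMA"."EXT_SALES";

-- The virtual table can now be queried, and even joined with local tables,
-- as if the data were stored in HANA itself
SELECT region, SUM(amount) AS total_amount
FROM v_ext_sales
GROUP BY region;
```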
Covering all of SAP HANA's features would take several more articles; many of them are already described in our blog.

Let's look at some of the available services:
SAP HANA includes an engine for storing and processing geodata, that is, data describing the position, shape and orientation of objects in space: it supports spatial data types and spatial processing methods. There is also a dedicated engine for another data structure, the graph. Here SAP HANA provides capabilities for processing highly connected data and its relationships, with built-in algorithms for neighborhood search, shortest paths, strongly connected components, pattern matching and more.
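A small sketch of the spatial side (the table, the coordinates and the use of SRID 4326 / WGS 84 are assumptions for the example): points are stored in a spatial column and distances are computed directly in SQL.

```sql
-- Column table with a spatial column holding store locations (WGS 84)
CREATE COLUMN TABLE stores (
    store_id INTEGER PRIMARY KEY,
    name     NVARCHAR(100),
    location ST_POINT(4326)
);

INSERT INTO stores VALUES (1, 'Downtown', NEW ST_POINT('POINT(37.6173 55.7558)', 4326));
INSERT INTO stores VALUES (2, 'North',    NEW ST_POINT('POINT(37.5847 55.8304)', 4326));

-- Distance in meters from each store to a reference point
SELECT store_id,
       name,
       location.ST_Distance(NEW ST_POINT('POINT(37.6000 55.7500)', 4326), 'meter') AS dist_m
FROM stores
ORDER BY dist_m;
```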
SAP HANA also has hundreds of pre-packaged machine learning and prediction algorithms, covering areas such as association analysis, clustering, classification, regression, probability distributions, time series and more. In addition, you can use the TensorFlow library and the R language.
SAP HANA has built-in capabilities for processing and analyzing text, including various text mining functions, for example fuzzy search, synonym search, semantic parsing and so on.
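A minimal sketch of fuzzy search in SQL (the table is invented for the example, and fuzzy search assumes a full-text index with the fuzzy search option enabled on the column):

```sql
-- Text column with a full-text index so it can be searched with CONTAINS
CREATE COLUMN TABLE support_tickets (
    ticket_id INTEGER PRIMARY KEY,
    subject   NVARCHAR(200)
);
CREATE FULLTEXT INDEX idx_tickets_subject ON support_tickets (subject)
    FUZZY SEARCH INDEX ON;

-- Fuzzy search tolerates typos: 'conection' still matches 'connection'
SELECT ticket_id, subject, SCORE() AS relevance
FROM support_tickets
WHERE CONTAINS(subject, 'conection problem', FUZZY(0.7))
ORDER BY relevance DESC;
```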
SAP HANA Streaming Analytics can capture, filter, analyze and act on millions of events per second in real time, saving the data or the results to the SAP HANA database and sending less critical data to cheaper storage such as Hadoop. SAP HANA Streaming Analytics is also integrated with the Apache Kafka messaging system.
Useful materials and resources to get started with SAP HANA:
- A free trial of SAP HANA, express edition is available for download on our official website.
- Before starting work with SAP HANA, you can study a set of tutorials: a virtual machine edition, a Server + XSA Applications edition of SAP HANA, and video installation instructions.
- The tutorial set offers a wide selection of topics; for example, for working with spatial data there is a first and a second tutorial.