📜 ⬆️ ⬇️

Should I use the tabular model SSAS?


You can not just take and answer this question, not taking into account a number of factors.

Many people think that it is necessary to switch to a tabular model, because it is newer and better. But often it is impractical or even impossible in principle. However, let's talk about this a little later.

Multidimensional model


The multidimensional database has a certain structure and allows us to generate reports very quickly. Once, to create multidimensional databases, a multidimensional model was the only solution. This model has not changed since the days of SQL Server 2005. If you look at what’s new in each edition of Analysis Services, it becomes clear that most of the innovations are related to the tabular model.
')

Tabular model


The tabular model appeared in SQL Server 2012 and is actively developing, and each subsequent version includes new features.

The tabular model works on a different engine (xVelocity) and it is designed to quickly perform column queries, because it uses column storage (multidimensional models use string storage), in addition to good data compression. Data is stored in RAM (in-memory mode), so it is very important that the server has a lot of memory and a very fast processor. Disks for tabular models are not so important. One of the main advantages of the tabular model is that some queries in it work faster (for example, they work very quickly with measurements based on the distinct count) and it has a high compression ratio - 1/10 (below is a link describing the principle of compression), While in the multidimensional model only 1/3. The degree of compression is indicated approximate, of course, it can vary, depending on the data.

Hardware


It should be noted that the hardware used for multidimensional databases cannot in most cases be used for the tabular model. The tabular model directly depends on the amount of RAM. The more memory, the better the performance. If the memory is not enough, the tabular model will simply stop working, without any warnings.

The frequency of the processor is also very important for the tabular model.

Once again: for the tabular model, the disks are of secondary importance, but the amount of RAM and CPU speed are very important.

So how much memory is needed? There is an expression - the more the better! But it is very abstract and impossible to understand, I would like something more tangible, right? On the one hand, there is a simple formula <The size of a relational database> / 10 * 2, but one should not forget that there are users who will connect to a tabular database, which means that SSAS needs more memory - to cache requests, it also needs memory for the OS and the SQL Server kernel cache (if SSAS and the relational database are on the same machine). In the tabular model, it is possible to create calculated tables and columns, therefore, they will increase the size of the tabular database, despite the fact that the relational database remains unchanged.

Why in the formula the result of dividing the size of the database is multiplied by two? Because by default, processing is performed in the buffer (in fact, a temporary copy of the tabular database is created next to the main model), while the main model continues to exist and work as before (until the processing completes successfully, after which the main model data is replaced with data from the buffer, and in case of an error everything remains unchanged). Therefore, be careful when choosing a SQL Server edition. If the size of a tabular database is more than 5 GB, then the Standard edition (with a limit of 16 GB for SSAS, which includes the cache ), most likely will not work! With a lack of cache will be terrible brakes.

More detailed articles on the amount of required memory can be found here and here.

Statistics on the use of tabular and multidimensional models


According to the data from this article, a survey was conducted among 440 participants of the webinar about comparing two models, of which 212 people responded (~ 48%), on the topic “Which model do you use - tabular or multidimensional?”:


Transition


If you are already using a multidimensional model and it suits you, then it is better to stay on it. If there is reason to believe that the tabular model can solve your needs, then it makes sense to think about the transition. But the transition from the multidimensional to the tabular model is not an easy task. Basically, everything will have to be done manually and there is no easy way to navigate using the magic converter. You can create specialized tools for the transition using tools such as SSIS, PowerShell, or others.

It should also be understood that in multidimensional models there is a functionality that is not supported in tabular models, for example, Writeback.

A complete list of unsupported functionality in tabular models, in comparison with a multidimensional model, can be found in the official documentation . This article discusses some ways to emulate functionality .

Pay attention to this before starting the transition.

New project


For new projects, it is usually recommended to use a tabular model, unless, of course, functionality is required, which is supported only in the multidimensional model. For those who have not come across analytics before, the tabular model will be more understandable, because it looks like a regular relational database. In addition, the functions that it supports are, in most cases, sufficient for full-fledged work.

Recommendations


If you still decide to use a tabular model, then let me give advice on how not to step on a rake.

There are calculated columns in the tabular model. With their help, you can add a field to any of the measurements, which is absent in the relational database, and prescribe the calculation formula. For example, add to the Clients table fields with the names of the country and region that are stored in the next table. Calculated columns are stored in a tabular database and are filled at the time of processing.

It would seem - beauty, logic is stored in the model's metadata, there is no need to correct the presentation to add a field, but there is a nuance. To understand it, let's look at the processing steps:

  1. Getting data from a relational database
  2. Data compression
  3. Calculation of calculated values ​​and indicators

From the stages of processing 2 problems emerge:

  1. The presence of a calculated column increases processing time
  2. Calculated columns are not compressed

Those. Do not abuse the calculated columns, despite their convenience.

In this article, the fingers are told due to what is data compression.

Some more tips on improving the speed of processing a tabular model

Source: https://habr.com/ru/post/342862/


All Articles