📜 ⬆️ ⬇️

Integrating InfoSphere Warehouse data mining with IBM Cognos reports

When working with information from IT departments, there is often a problem how to make the use of advanced data analysis tools accessible to analysts. As you know, such tools require special knowledge in database programming.

For example, there is customer data related to demographic aspects (age, profession, residence, etc.), as well as past transactions with these customers. The marketing department wants to make new offers aimed at specific groups of customers with similar properties. How to distinguish such typical groups? The solution to this problem is provided by data clustering technology. It automatically combines data arrays according to their properties or features. The analyst can then analyze these arrays and interactively refine them until they receive answers to their questions. An important step in the analysis process is to provide users with the results of clustered data. As a rule, analysts are not specialists in low-level database programming.

How to present the results of the in-depth data analysis to analysts and employees so that they reflect the business process in which the user participates? How to comply with security requirements, for example, so that each user can only see what he is supposed to?
')
To give satisfactory answers to these questions, it is necessary to shift the perspective from statistical analysis to the real end user and the business processes that interest him. How can this be achieved? One possible answer to this is to integrate two interesting solutions - InfoSphere Warehouse (the foundation for a corporate-wide data warehouse and a tool for in-depth data analysis directly in the DB2 database) and IBM Cognos (a reporting tool). Each of these products already represents powerful solutions for working with information, but unfortunately, Cognos doesn’t know how to do in-depth analysis, and InfoSphere Warehouse doesn’t provide the ability to consolidate and visualize relevant information. If you combine them (and the programs allow you to do this), then you can get a very interesting tool for work.

First, consider the possibility of such integration in theory. The InfoSphere Warehouse package uses DB2 for data storage. It is equipped with a database segmentation function (DPF) and provides a scalable, reliable and high-performance data warehouse, combining the advantages of a database endowed with on-line transaction processing capabilities with the capacity needed for large data warehouses. The package also includes a number of data mining algorithms, for example, clustering, regressions, associations, etc ... The data analysis process begins with loading information into the database. Then a model is created which, afterwards, can be applied to records for which the target value is not yet known, creating a forecast and calculating its confidence level (Schematically this process is shown in Figure 2). All data mining functions are invoked as normal SQL commands. This makes it easy to integrate analysis tools into almost any design, for example, in Web services.
It should also be noted that in case of a lack of statistical methods presented within the framework of InfoSphere Warehouse, models created in any statistical analysis tool supporting the PMML format can be applied to the data.

image

Fig. 2

The IBM Cognos 8 Business Intelligence module provides a complete set of business analysis functionality and is based on a flexible service-oriented architecture (SOA). The main functions of this module are reporting, analysis, dashboards and rating tables.

To include data in a report, you must perform the following steps:
In Framework Manager, a data architect creates Cognos metadata that describes data from a database from a business point of view (relationships between tables, business names of values, etc.).
After modeling the metadata, the entire package is uploaded to the Cognos 8 content repository. From there, you can retrieve them using Cognos Connection as a new report.

The ability to create reports from relational databases is key to integrating InfoSphere Warehouse and IBM Cognos.

Cognos reports contain a set of results obtained from a relational data source, as described above. The content of a particular report is determined by a (dynamic) SQL query to one or more data sources. This primary communication method can be used to integrate the data mining system and the Cognos reporting system in the following ways:
Cognos can be used to display rating tables, possibly with information about their validity.
Cognos can be used to display model information. This information is extracted from actual XML models using table extractor functions or XQuery queries.
Cognos can dynamically initiate in-depth analysis and evaluation of data by calling SQL stored procedures. This allows:
call the function of in-depth data analysis with different settings set by the user in the interface for working with reports.
Call the function of in-depth data analysis on different data subsets (for example, to create recursive in-depth reports)
Dynamically evaluate records based on user input.

Figure 3 shows the final scheme of the function call used in all the above methods.

image

The data mining function is called as a stored procedure and creates an XML model in the database. You can use this model to evaluate new data, or you can extract information from the model into a table and use such tables in Cognos. The user can call the data mining function interactively by referring to the corresponding database stored procedure from the Cognos report.

This integration provides a number of advantages:
It is very simple and requires only knowledge of SQL, without any additional programming.
Data mining models are stored in a database and can be accessed from Cognos in a safe, efficient way.
The entire data mining process can be launched and monitored from Cognos using stored procedures

Such integration can have a significant impact on the distribution of data mining, as users do not need to know some details of the data mining process.

If the topic is interesting, then in one of the following posts I can describe the step-by-step integration process using a real example.

Source: https://habr.com/ru/post/78189/


All Articles