
Process Mining: An Introduction

Greetings, Habrahabr!

In this article I will try to lift the veil on an interesting technology from the field of business process management (BPM). Process Mining focuses on discovering, analyzing, and optimizing business processes based on data from event logs. It is the missing link between the classical analysis of business processes through their models and data analysis (Data Mining).

Disclaimer
This article is based on the materials of the Coursera online course Process Mining: Data Science in Action, which belongs to Eindhoven University of Technology. The materials of this article may be used only with the permission of the course authors and with a link to the source.


Figure 1. Positioning Process Mining.

Next, we will expand on positioning, touch on use cases, discuss the source data, and look at the different types of Process Mining.

Positioning


Process Mining uses data to analyze business processes rather than analyzing the data for its own sake. In other words, Process Mining, unlike Data Mining, is not interested in low-level patterns in the source data and does not try to make decisions based on them. Its goal is to optimize the business processes (especially end-to-end ones) reflected in the source data.

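The questions answered by Process Mining can be divided into two groups (see the left and right arrows in Figure 1): questions about consistency (does the actual process match the expected one?) and questions about performance (how fast and how efficiently does the process run?).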

Use cases


The table below lists some use cases for Process Mining, together with the related questions, assigned to the groups above.
No. | Use case | Questions | Group of questions
1 | Discovering real business processes | What does the process that actually (and not in words or in theory) describes the current activity look like? | Consistency
2 | Finding bottlenecks in business processes | Where in the process are the places that limit its overall speed? What causes them? | Performance
3 | Identifying deviations in business processes | Where does the actual process deviate from the expected (ideal) process? Why do such deviations occur? | Consistency
4 | Finding fast / short ways to perform business processes | How can the process be completed in the least time? How can it be completed in the fewest steps? | Performance
5 | Predicting problems in business processes | Is it possible to predict delays / deviations / risks / ... while the process is running? | Performance / Consistency

Initial data


Data from event logs is often the starting point for Process Mining. Consider a typical log. Each line in such a log corresponds to a separate event. In turn, each event carries information about the case that gave rise to it, the activity performed within that case, and the time at which the event was recorded. Such event logs can be viewed as sets of cases, and individual cases as sequences of the events that refer to them.

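Based on the assumptions above, we can single out the main event attributes in a log: the case identifier, the activity, and the timestamp, often accompanied by the resource that performed the activity.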


Figure 2. Event log - patient admission data.

Of course, the choice of these attributes depends on the purpose of the analysis. For example (looking at Figure 2), if we are interested in the process that describes how patients receive treatment, we use patients as the case identifiers (column patient), the procedures received by patients as the activities (column activity), and the doctors performing these procedures as the resources (column doctor). If we are interested in a different process, one that describes how doctors perform procedures, then the case identifiers are the doctors themselves (column doctor), the activities are the procedures these doctors perform (column activity), and the resources are again the doctors (column doctor).
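To make the choice of case identifier more concrete, here is a minimal Python sketch. The rows in EVENTS are invented for illustration and are not the data from Figure 2; the function name build_traces is also hypothetical. The same events are grouped into cases once by patient and once by doctor, which yields two different views of the process.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows (patient, activity, doctor, timestamp); NOT the data from Figure 2.
EVENTS = [
    ("p1", "registration", "dr_a", "2014-12-01 09:00"),
    ("p2", "registration", "dr_a", "2014-12-01 09:10"),
    ("p1", "examination", "dr_b", "2014-12-01 09:30"),
    ("p2", "surgery", "dr_b", "2014-12-01 10:00"),
    ("p1", "discharge", "dr_a", "2014-12-01 11:00"),
]


def build_traces(events, case_column):
    """Group events by the chosen case column and order each case by time."""
    cases = defaultdict(list)
    for patient, activity, doctor, stamp in events:
        row = {"patient": patient, "doctor": doctor}
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        cases[row[case_column]].append((when, activity))
    # Sorting the (timestamp, activity) pairs puts each case in time order.
    return {case: [activity for _, activity in sorted(rows)]
            for case, rows in cases.items()}


by_patient = build_traces(EVENTS, "patient")  # one case per patient
by_doctor = build_traces(EVENTS, "doctor")    # one case per doctor

for case, trace in by_patient.items():
    print(case, "->", trace)
```

Switching the case column changes nothing in the log itself, only the question being asked of it.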

Types of Process Mining


Process Mining focuses on the relationship between business process models and event data. There are three kinds of such relationships, and they define the types of analysis.

Play-Out


We start with a ready-made process model. We then simulate various execution scenarios (according to the model) and fill the event log with the events recorded during the simulation.


Figure 3. A sample play-out.

Figure 3 shows an example of a simulation for a ready-made workflow model. The process model is drawn in a simplified BPMN notation. The steps along one of the possible paths through the process are shown in red, and the log below is filled with events in the order in which they were recorded while following this path.

Play-Out is used to check that the process models being developed produce the expected data (sequences of events) when they are executed.

Play-In


We start with ready-made data in an event log. We then derive a process model that can produce the event sequences present in the log (that is, we learn a process model from the data).


Figure 4. Sample Play-In.

Figure 4 shows an example of deriving a process model from existing sequences of events (shown in red). If you look closely, you can see that all sequences of events in the figure begin with step a and end with step g or h. The resulting process model reflects exactly these observations, which illustrates the basic principle of deriving a model from data.

Play-In is useful when a formal description of the processes that generated known data is needed.
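As a rough illustration of Play-In, the Python sketch below extracts the simplest possible "model" from a set of traces: the observed start steps, the observed end steps, and how often one step directly follows another. This is a deliberately simplified stand-in and not a full discovery algorithm. The traces in TRACES are invented; they only preserve the property mentioned above that every sequence starts with a and ends with g or h.

```python
from collections import Counter

# Hypothetical traces, NOT the ones from Figure 4.
TRACES = [
    ["a", "b", "d", "e", "g"],
    ["a", "c", "d", "e", "h"],
    ["a", "d", "c", "e", "g"],
]


def discover(traces):
    """Collect start steps, end steps, and directly-follows counts."""
    starts = {trace[0] for trace in traces}
    ends = {trace[-1] for trace in traces}
    follows = Counter()
    for trace in traces:
        # Each adjacent pair (x, y) means "y was observed directly after x".
        for x, y in zip(trace, trace[1:]):
            follows[(x, y)] += 1
    return starts, ends, follows


starts, ends, follows = discover(TRACES)
print("starts:", sorted(starts))
print("ends:", sorted(ends))
for (x, y), count in sorted(follows.items()):
    print(f"{x} -> {y}: {count}")
```

The pairs in follows can be read as the arcs of a graph, which is already enough to see which steps open and close the process and which orderings were actually observed.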

Replay


Here we use both a process model (possibly obtained with Play-In) and the data in an event log (possibly obtained with Play-Out), and we replay the real sequences of events on the model.


Figure 5. Example Replay.

Figure 5 shows an attempt to replay an existing sequence of events on a ready-made process model. The attempt fails because the model requires step d to be completed before the transition to step e becomes enabled (studying gateways in the BPMN notation helps clarify the reason for the failure).

Replay lets you find deviations between models and real processes, but it can also be used to analyze process performance: if you take the event timestamps into account during replay, delays and fast sections become visible along the process paths.
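Both uses of Replay can be sketched in Python under strong simplifying assumptions. The model here is a hypothetical set of allowed start steps, end steps, and step-to-step moves rather than a BPMN diagram with gateways, and the traces and timestamps are invented. The function replay reports deviations, while waiting_times measures the time between consecutive events of one case.

```python
from datetime import datetime

# Hypothetical model: allowed start steps, end steps, and step-to-step moves.
MODEL = {
    "starts": {"a"},
    "ends": {"g", "h"},
    "moves": {("a", "b"), ("b", "d"), ("d", "e"), ("e", "g"), ("e", "h")},
}


def replay(model, trace):
    """Return the deviations found while replaying a non-empty trace."""
    problems = []
    if trace[0] not in model["starts"]:
        problems.append(f"unexpected start: {trace[0]}")
    for x, y in zip(trace, trace[1:]):
        if (x, y) not in model["moves"]:
            problems.append(f"move not allowed: {x} -> {y}")
    if trace[-1] not in model["ends"]:
        problems.append(f"unexpected end: {trace[-1]}")
    return problems


def waiting_times(timed_trace):
    """Minutes elapsed between consecutive events of one case (same day)."""
    fmt = "%H:%M"
    result = []
    for (x, t1), (y, t2) in zip(timed_trace, timed_trace[1:]):
        delta = datetime.strptime(t2, fmt) - datetime.strptime(t1, fmt)
        result.append((x, y, delta.total_seconds() / 60))
    return result


ok = replay(MODEL, ["a", "b", "d", "e", "g"])   # fits the model
bad = replay(MODEL, ["a", "b", "e", "g"])       # skips step d before e
waits = waiting_times([("a", "09:00"), ("b", "09:05"), ("d", "10:35"), ("e", "10:40")])
slowest = max(waits, key=lambda item: item[2])  # candidate bottleneck

print(ok, bad, slowest)
```

In this toy example the second trace is rejected because it moves from b to e without passing through d, and the longest wait points to the section between b and d as the place to look for a bottleneck.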

Additionally


For those who want to try applying this knowledge in practice, there is a tool for the job. ProM is a free framework that includes everything needed for Process Mining. The stable version of ProM is available for download for Windows and for other operating systems. General information (including sample source data, tutorials, and exercises) is available on the ProM Tools website.

Conclusion


The existing gap between the analysis of business process models and the analysis of data makes it hard to solve many interesting and complex problems of the modern world, where the value of data has long been compared to the value of oil (see Data is the new oil). Process Mining aims to bridge this gap, taking business process analysis to a new level.

Thank you for your attention. I strongly recommend continuing to study the topic on your own! The Coursera online course mentioned above, Process Mining: Data Science in Action, is an excellent place to start.

Source: https://habr.com/ru/post/244879/

