📜 ⬆️ ⬇️

Measuring the intensity of the incoming flow of events in the decay model

In the class of flow algorithms there is a subclass that solves the problem of finding heavy elements (heavy hitters). In general, this task is formulated as “identifying the most frequently occurring events in the incoming stream and measuring their intensity”. In this publication, the employee of Qrator Labs, Artem janatem Shvorin, proposes an efficient algorithm for solving this problem.

Introduction


Algorithms for finding heavy elements help to solve problems, such as combating network overload, detecting network anomalies and attacks, and managing dynamic routing. For example, the well-known NGINX web server allows you to limit the intensity of requests to a specific resource, and in order to do this, the intensity must be measured quantitatively.

In this publication we want to show the reader another approach to measuring the intensity of the flow of events in the presence of many different (not identical) flows of events. Let a set of types of events be given. It is necessary to evaluate how often an event of this type occurs, and to pay attention to cases when an event of one type repeats “too often”.

An important characteristic of this tool is its high efficiency, so that it does not become a bottleneck in the entire system for analyzing and filtering traffic. The point in this case is to handle one event for several dozen clock cycles of the central processor of modern computers.
')
In addition to the intensity measurement itself, it is required to select the types of events that occur most frequently. Their intensity must be measured more accurately, while rare types of events do not play a large role, and their intensity need not be estimated with high accuracy.

The paper presents the Decay-based count algorithm developed by us, based on the model of decay of a radioactive substance (see the “Decay Model” section), with high efficiency solving the task of searching for heavy elements.

The task of finding heavy items


Classification of tasks


In the class of problems described, the following subclasses can be distinguished:


The proposed algorithm refers to the last of the listed subclasses.

Target architecture


Depending on the platform on which this threading algorithm is supposed to run, there is a set of hardware limitations. Sometimes you need to embed the necessary functionality in the network equipment (switch, switch, etc.). The proposed algorithm is supposed to be used on a universal processor, but for some values ​​of the parameters it can be embedded in network equipment.

Task parameters



Solution accuracy



Overhead


As a rule, the main characteristic of the algorithm is the maximum intensity of the incoming stream, which it is capable of processing. That is, only the processing time of one event is important. However, in practice, other hardware limitations also affect, such as the size of the memory used to store the accumulated information about the stream. Especially hard hardware constraints occur when the functionality is embedded in network equipment. But even on ordinary computers, the presence of a large amount of RAM DRAM does not always help, because access to DRAM can take an unacceptably long time. In order to achieve the required performance, it is necessary to compromise with the accuracy of measurement and limit the size of the memory used so that it fits in the processor cache. In this implementation, with meaningful values ​​of the parameters, the table is placed in the L2 cache of the processor. As a result, it was possible to ensure that the processing time for one event was several dozen processor cycles.

Methods for assessing the intensity of flow


To estimate the intensity of repetitions of events of a given type, it is necessary to count the number of events over time. To do this, you need to fix the time of occurrence of the event and somehow save it. Usually, a counter is used for this purpose, which is incremented when the corresponding event occurs. Then the intensity is estimated as the ratio of the counter value to the time interval during which the measurement is performed. More accurate methods of measuring the current intensity values ​​use different variants of the moving average . For example, exponential moving average (exponential moving average, EMA) is a kind of weighted moving average with exponentially decreasing weights.

The paper proposes a new method for calculating the EMA. It uses the model of exponential decay, which describes such a physical phenomenon as radioactive decay. This model has several advantages compared with the traditional approach. First, data presentation is more economical in memory. Secondly, the decay model allows for an efficient implementation of the update counter operation — it requires a small number of integer arithmetic operations and one cached memory access.

Multiple Threading Methods


In the task of searching for heavy elements, the problem is not only the high intensity of the input traffic, but also a large number of different flows (types of events) that have to be monitored. A naive implementation involves setting up a separate counter for each thread, which leads to a significant memory consumption. In this case, there is a danger not only in the exhaustion of all the available memory, but in a significant slowdown in the speed of work due to cache misses. Therefore, as a rule, they refuse from the exact solution of the problem of searching for heavy elements, discarding a part of the accumulated information about flows. Many methods are known for limiting the amount of memory used and reducing the event processing time, some of which are listed below:


Task formalization


Before proceeding to the description of the algorithm, it is necessary to formulate with sufficient mathematical rigor the problem of measuring the flow of events, which we will do in this section.

The sequence of events of the same type is defined as a set of elementary events. \ {event_k \} _ {k \ in I}\ {event_k \} _ {k \ in I} where index k runs over a finite or infinite discrete interval I s u b s e t m a t h b b Z   .

In the simplest case, the event is determined only by the time of its occurrence: e v e n t k = t k , with the events ordered in time: tk1<tk2 for k1<k2 . In most implementations of event recording systems, time is considered to be a discrete value: tk in mathbbZ However, for theoretical reasoning it is convenient to generalize and consider time as continuous: tk in mathbbR .

The main question that event recording systems must answer is an estimate of the intensity of the flow. Intensity can be strictly formalized for a uniform flow of events. Uniform flow \ {t_k \}\ {t_k \} is defined as follows:

tk= varphi+pk, quadk in mathbbZ,


Where p>0 ,  varphi in[0,p) - flow parameters - period and phase, respectively. That is, events occur at regular intervals. Then the intensity of such a flow, by definition, is expressed as r=1/p .

For non-uniform flows, formal determination of intensity r = r (\ {t_k \})r = r (\ {t_k \}) may differ depending on the problem statement. The decay model remains operable and useful in this case, however, the assessment of the quality of the calculation of the true intensity is given below only for the case of uniform flows.

In some models, it is required to additionally take into account some characteristic of the event, for example, its weight ck - the value of the contribution of this event to the measured intensity. Then the event is determined not only by the time of its occurrence, but also by weight: eventk=(tk,ck) .

In typical implementations of event recording systems, a counter is started. s which in some way accumulates information about the flow of events, and at any time it is possible to obtain an estimate of the flow intensity from its current value  hatr= hatr(s) such that r approx hatr . Counter update occurs on arrival of the next event. eventk :

s leftarrowupdate(s,eventk),


Where update() - some function that is defined by the implementation. Below are a few examples with options for implementing the function update() :


In some tasks, it is required to distinguish event streams. Suppose there are many different types of events, numbered index i=1, dotsN , and it is required to consider separately message flows of each type. Then the event is described as eventk=(tk,ik) (or, given the weight of the event, eventk=(tk,ik,ck) ) where ik - type of event. Many indexes k related to this type of event i denote I_i = \ {k \ mid i_k = i \}I_i = \ {k \ mid i_k = i \} .

Decay model


The decay model is described as follows:

v(t)=v0e lambdat,


Where v0 - “amount of substance” at time zero, v(t) - quantity at time t ,  lambda - model parameter (the so-called decay constant). The name of this model reflects the fact that it describes the physical phenomenon of radioactive decay .

Additionally we introduce the following notation:

 tau=1/ lambda, quad alpha=e lambda.


In our case, as a model parameter instead of  lambda more convenient to use  tau , insofar as  tau will have the dimension of time and informally can be defined as a characteristic time interval during which the history of events is collected.

We will by definition assume that each type of event corresponds to a value v , which has the physical meaning of “amount of substance” and depends on time, which increases suddenly by one when an event occurs, and decreases at the rest of the time according to the above exponential law.

In fig. 1 shows how the value changes over time. v for uniform flows with different intensity and for uneven flow.



Figure 1: Value v for different threads

Magnitude v not explicitly stored in memory, but can be computed at any time. Stored same value s such that at time tnow magnitude v is expressed as follows:

v= alphastnow.


This representation corresponds to the decay model.

Calculating a counter in a decay model is equivalent to calculating an exponential
moving average with an accuracy of the normalizing factor (constant). An important feature of this view is that the only value stored in the counter replaces the pair (the time of the last event, the accumulated value) that is stored with the traditional approach.

Counter update


Nontrivial operation is to update the counter value. s upon the occurrence of the next event. In this case, you need to replace s on a new meaning s which is determined from the following equality:

 alphastnow= alphastnow+1.


Here on the left is the value v immediately after the event, and on the right - the value v immediately before the event, increased by the contribution of the event (equal to one). Below will be proposed effective ways to calculate the result of the update.

In terms of the function update() from the definition, the update operation is expressed as:

update1(s,eventk)=tk+ log alpha(1+ alphastk).


Here is the execution time of the operation. tnow coincides with time tk the arrival of the message.

Intensity determination


Let there be a uniform flow of intensity events. r that is, events occur with a period p=1/r . Uniform flow is defined according to the definition.

If the measurement is made immediately after the occurrence of the next event, that is tnow=tn for some n , then accumulated over a long time counter value s will be expressed as follows:

 alphastnow= sumnk= infty alphatktnow= sum k=0infty alphakp,


from where

 alpha Deltas+ alphap=1,


Where  Deltas=stnow - the relative value of the counter.

In a more general case, the moment of measurement may be between events:
tnow=tn+ psi ,  psi in[0,p) . In this case, the value of the counter will differ in the lower side by the value  psi :

 alpha( Deltas+ psi)+ alphap=1.


The task of measuring the intensity is to evaluate the intensity by the counter value. Assuming that the flow is uniform, it is possible to obtain estimates of the true intensity of the uniform flow from above and below, substituting extreme values ​​into the previous equation  psi=0 and  psi=p :

r ler<r+,


Where

\ begin {align} r ^ + & = r ^ + (\ Delta s) = \ frac1 {\ log_ \ alpha (1+ \ alpha ^ {- \ Delta s})} \\ r ^ - & = r ^ - (\ Delta s) = \ left \ {\ begin {array} {ll} \ frac1 {- \ log_ \ alpha (1- \ alpha ^ {- \ Delta s})} & \ mbox {for} \ Delta s > 0 \\ 0 & \ mbox {otherwise} \ end {array} \ right. \ End {align}


Both scores r+ and r monotonously depend on the value of the counter s (see Fig. 2), therefore, for example, comparing the current value of the counter with the threshold value does not require additional calculations.



Figure 2: Function Graph r and r+ for  tau=$1

In the decay model, the intensity of arbitrary (not necessarily uniform) fluxes, by definition, is given by the above estimates. r+ and r . Moreover, if the measurement occurs immediately after processing the event, then the intensity is considered to be exactly equal to the lower estimate.

Limits of applicability of the decay model


There are restrictions on the intensity value, which can be correctly measured in the decay model.

First, if the time of occurrence of events is measured as a discrete value, then the flow period cannot be less than one. That is, the flow rate should not exceed rmax=1 . Hence the restriction on the relative value of the counter.  Deltas=stnow - it must not exceed the value  sigmamax which is determined from the formula:

r( sigmamax)=rmax.


Magnitude  sigmamax estimated as follows:

 sigmamax= tau ln tau+ omega( tau),


Where 0 le omega( tau) le1/2 .

Secondly, the assessment of the intensity of weak (low-intensity) flows is difficult: for small relative values ​​of the counter  Deltas upper and lower estimate accuracy r+ and r worsens, and with negative values  Deltas the lower intensity estimate degenerates to zero.

There is also a limitation on the operating time of the decay model implementations associated with overflow of counters. Since the value of the counter cannot escape from tnow more than  sigmamax , the system operation time is actually determined by the digit capacity of the counter and the duration of one clock cycle. If a 64-bit register is used to store the counter, it will not overflow for 100 years. But in implementations with low-digit registers, a reset mechanism must be provided for.

Algorithms for multiple flow accounting


Features of the decay model


The peculiarity of using EMA as a counter value is that when the flow of events stops, the accumulated value quickly (exponentially in time) degrades and becomes indistinguishable from zero. In the decay model, this fact is used to automatically reset the counter: although the counter value s will increase all the time with the arrival of events, some time after the cessation of the flow of events v= alphastnow can be considered equal to zero, without obviously changing the value of the counter. Time Tvanish , for which any previously accumulated value falls to a conditional zero, depends on the parameter  tau and the required accuracy. It is expressed as Tvanish=Tmin+ sigmamax where Tmin - time of decay of the value accumulated as a result of a single event. In the section "Tabular implementation" is given the exact expression Tmin , but here it is enough to bear in mind the following fact: Tvanish=O( tau ln tau) with fixed accuracy.

This implies an upper estimate for the size of the storage of counters when taking into account a multitude of flows for the case when information will not be lost at all - it is enough Tvanish cdotrmax cells There are many applications of the task of searching for heavy elements, where large values ​​are not required.  tau , and all storage is placed in RAM or even in the cache of the L3 or L2 processor. In this case, a small access time to the storage is provided, so that the task becomes feasible at high values ​​of the input stream intensity.

A hash table is suitable for implementing storage, where the key is the type of event. In this case, the cells whose counter value is s satisfies the condition stnow leTmin .

Decay-based count numeric implementation


Value update operation


We introduce the following notation:

 rho tau(x)= tau ln(1+ex/ tau)= log alpha(1+ alphax).


Then the update operation is expressed as follows:

s=tnow+ rho tau(stnow).


Thus, the task is reduced to the effective calculation of the approximation of the function  rho tau . Time is convenient to represent an integer, for example, measure it in processor cycles, so you need to build an integer mapping R: mathbbZ to mathbbZ such that:

R tau(T) approx rho tau(T) quad mboxforT in mathbbZ.


For this problem, the accuracy is not relative, but absolute, because mainly time operations are used with time values. Obviously, due to the integer representation of time, an error of less than 0.5 clock cycle is unattainable.
In addition, the operation of measuring the current time, as a rule, gives some error. For example, if there is a way to measure time with an accuracy of 10 clocks, it is enough to require that R tau gave an approximation of about the same accuracy:  left|R tau(T) rho tau(T) right| le10.

You can suggest several different ways to calculate R tau different complexity and degree of effectiveness.

Calculation R tau : FPU


The easiest way to use floating point arithmetic and directly calculate  rho tau by definition. The execution time of this operation is about 100 processor cycles, which is quite a lot compared to the method proposed below.

Calculation R tau : tabular implementation


The idea of ​​the table implementation is to save the pre-calculated function values ​​in the table.

First, using identity

 rho tau(x)=x+ rho tau(x),


enough to build R  t a u ( t ) only for T l e 0  .

Secondly, because

 r h o  t a u ( x ) t o 0 m b o x f o r x t o - i n f t y ,    


exists T m i n > 0 , such thatR τ ( T ) = 0 at T - T m i n . Where T m i n can be found from the following considerations:

T m i n = t m i n,ρ τ ( t m i n ) = 1 / 2 ,

from where
T m i n = - log α ( α 1 / 2 - 1 ) = - τ ln ( e 1 / 2 τ - 1 ) . Thus, it suffices to determine

R τ ( T ) forT m i n < T 0 .

Function graph ρ τ ( x ) is presented in fig. 3. The peculiarity of this function is that with a change in the parameter t a u will be the same scaling on both axes. Figure 3: Function Graph



ρ τ ( x )

Obvious implementationR τ is to build a table ofT m i n cells where the precomputed values ​​will be stored. Since all the values ​​of the function on this interval are between 0 andρ τ ( 0 ) = τ ln 2 , the final size of the table isT m i n log 2 ( τ ln 2 ) bits. The algorithm for updating the value is as follows:


t n o wg e t T i m e ( ) δmin { | s - t n o w | , T m i n } δ R ( - δ ) s δ + max { s , t n o w }



The absolute accuracy of this method is 1/2, since it gives the best integer approximation of a real-valued function.

Performance Measurement Results


The following three implementations of the exponential moving average were compared with each other:
1. naive implementation of EMA;
2. model of decay through FPU (that is, with a call to the functions exp () and log () of the mathematical library);
3. model of decay by tabular method.

The source code of the test for C: pastebin.com/N1Xg9dud .

Execution time of one call to update () functionτ = 100000 (in this case, the table for calculatingR τ is placed in the cache L1) in implementations 1, 2 and 3 is100,125,11processor cycles, respectively. Using an exponential moving average is a convenient way to count events and estimate intensity. However, this operation is computationally expensive, which greatly limits the possibilities of its application. Our implementation of the algorithm based on the decay model is, firstly, beautiful, and secondly, more efficient than the naive implementation of the EMA calculation. Efficiency is ensured by two factors: tabular calculation of the transcendental function and more economical representation of the counter in memory.



Thanks


This publication was prepared by us in a trial mode as part of a project to highlight the mechanisms of the Qrator filtering network. Thanks to Anton Orlov, Artem Gavrichenkov, Evgeny Naradov and Nikita nikitadanilov Danilov for discussions and ideas.

Source: https://habr.com/ru/post/334354/


All Articles