
Kalman filter



On the Internet, including on Habré, you can find a lot of information about the Kalman filter, but it is hard to find an easily digestible derivation of the formulas themselves. Without a derivation, all this science is perceived as a kind of shamanism, the formulas look like a faceless set of symbols, and, most importantly, many simple statements lying on the surface of the theory remain beyond comprehension. The purpose of this article is to explain this filter in the most accessible language possible.
The Kalman filter is a very powerful data-filtering tool. Its main principle is that the filtering uses information about the physics of the phenomenon itself. For example, if you are filtering data from a car's speedometer, the inertia of the car gives you the right to treat overly fast speed jumps as measurement errors. The Kalman filter is interesting because, in a certain sense, it is the best possible filter; we will discuss below what exactly the words "the best" mean. At the end of the article I will show that in many cases the formulas can be simplified to such an extent that almost nothing remains of them.

A quick primer


Before getting acquainted with the Kalman filter, I propose to recall some simple definitions and facts from probability theory.

Random variables


When we say that a random variable ξ is given, we mean that this quantity can take random values, and it takes different values with different probabilities. When you roll, say, a die, a discrete set of values falls out: 1, 2, 3, 4, 5, 6. When it comes to, for example, the speed of a wandering particle, we obviously have to deal with a continuous set of values. The "fallen-out" values of a random variable we will denote by x, but sometimes we will use the same letter that denotes the random variable itself: ξ.
In the case of a continuous set of values, a random variable is characterized by its probability density ρ(x), which dictates that the probability of the random variable "falling out" in a small neighborhood of the point x of length dx equals ρ(x)·dx. Graphically, this probability is equal to the area of a narrow rectangle of width dx under the graph of the density.

Quite often in life, random variables are distributed according to Gauss, with the probability density

ρ(x) = (1/(σ√(2π))) · e^(−(x − μ)²/(2σ²)).

We see that the function ρ(x) has a bell shape centered at μ, with a characteristic width of order σ.
Once we have started talking about the Gaussian distribution, it would be a sin not to mention where it came from. Just as the numbers e and π are firmly established in mathematics and turn up in the most unexpected places, the Gaussian distribution has taken deep roots in probability theory. One remarkable statement that partially explains this Gaussian omnipresence is the following:
Let there be a random variable ξ with an arbitrary distribution (in fact, there are some restrictions on this arbitrariness, but they are not rigid at all). Let us carry out n experiments and compute the sum ξ₁ + … + ξₙ of the "fallen-out" values. We will do many such experiments. It is clear that each time we will obtain a different value of the sum; in other words, this sum is itself a random variable with its own specific distribution law. It turns out that for sufficiently large n the distribution law of this sum tends to the Gaussian distribution (by the way, the characteristic width of the "bell" grows as √n). Read more on Wikipedia: the central limit theorem. In real life there are very often quantities composed of a large number of identically distributed independent random variables, and such quantities are therefore distributed according to Gauss.
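
To see the central limit theorem in action, here is a minimal MATLAB sketch (my own illustration, not part of the article): we sum n uniform random variables many times and look at the histogram of the sums, which comes out bell-shaped.

% A small sketch illustrating the central limit theorem:
% sums of many independent uniform random variables look Gaussian.
n = 50;                  % number of summands in each sum
M = 10000;               % number of experiments
s = sum(rand(n, M), 1);  % each column is one experiment; s holds M sums
hist(s, 50);             % the histogram of the sums has a bell shape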

Average value


The mean value of a random variable is what we get in the limit if we carry out many experiments and compute the arithmetic average of the fallen-out values. The mean value is denoted in different ways: mathematicians like to write Mξ (mathematical expectation or mean value), foreign authors write Eξ (expectation), and physicists write ⟨ξ⟩ or ξ̄. We will denote it in the foreign way: Eξ.
For example, for the Gaussian distribution the mean value is Eξ = μ.

Variance


In the case of the Gaussian distribution, we clearly see that the random variable prefers to fall out in a certain neighborhood of its mean value μ.

Let's admire the Gaussian distribution once again


As can be seen from the graph, the characteristic spread of the values is of order σ. How do we estimate this spread for an arbitrary random variable if we know its distribution? We could draw the graph of its probability density and estimate the characteristic width by eye, but we prefer to proceed algebraically. One can take the average length (modulus) of the deviation from the mean: E|ξ − Eξ|. This value is a good estimate of the characteristic spread of the values. But we all know very well that using moduli in formulas is one big headache, so this formula is rarely used for estimating the characteristic spread.
A simpler way (simpler in terms of calculations) is to find E(ξ − Eξ)². This value is called the variance and is often denoted σ²_ξ. The square root of the variance is a good estimate of the spread of the random variable; it is called the standard deviation.
For example, for the Gaussian distribution one can calculate that the variance defined above is exactly equal to σ², and therefore the standard deviation equals σ, which agrees very well with our geometrical intuition.
In fact, a small fraud is hidden here. The fact is that in the definition of the Gaussian distribution the exponent contains the expression −(x − μ)²/(2σ²). This two in the denominator is there precisely so that the standard deviation would equal the coefficient σ. That is, the Gaussian distribution formula itself is written in a form specially tailored for the fact that we will compute its standard deviation.
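
As a quick numerical sanity check (my own sketch, with arbitrarily chosen parameters), we can draw Gaussian samples with known μ and σ and verify that the sample mean and the root of the sample variance recover them:

% Estimate the mean and the standard deviation from samples.
mu = 5; sigma = 2;
xi = normrnd(mu, sigma, 1, 100000);              % 100000 "experiments"
meanEstimate = mean(xi)                          % close to mu = 5
sigmaEstimate = sqrt(mean((xi - mean(xi)).^2))   % close to sigma = 2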

Independent random variables


Random variables may or may not be dependent. Imagine that you throw a needle onto a plane and record the coordinates of both its ends. These two coordinates are dependent: they are connected by the condition that the distance between them always equals the length of the needle, even though both are random variables.
Random variables are independent if the outcome of the first one is completely independent of the outcome of the second one. If the random variables ξ and η are independent, then the mean value of their product equals the product of their mean values:

E(ξη) = Eξ · Eη

Proof
For example, having blue eyes and graduating from school with a gold medal are independent random variables. If, say, the fraction of blue-eyed people is p₁ and the fraction of gold medalists is p₂, then the fraction of blue-eyed medalists is p₁·p₂. This example tells us that if the random variables ξ and η are given by their probability densities ρ(x) and ρ(y), then the independence of these quantities is expressed in the fact that the probability density ρ(x, y) (the first value falls out at x and the second at y) is given by the formula:

ρ(x, y) = ρ(x) · ρ(y)

It immediately follows that:

E(ξη) = ∫∫ x y ρ(x) ρ(y) dx dy = ∫ x ρ(x) dx · ∫ y ρ(y) dy = Eξ · Eη

As you can see, the proof is carried out for random variables that have a continuous spectrum of values and are given by their probability density. In other cases the idea of the proof is similar.
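
The same fact is easy to check numerically. A minimal sketch (my own, with arbitrarily chosen distributions): for independent samples, the mean of the product matches the product of the means.

% Check E(xi*eta) = E(xi)*E(eta) for independent random variables.
N = 1000000;
xi = normrnd(2, 1, 1, N);    % Gaussian samples with mean 2
eta = rand(1, N);            % uniform samples on [0,1] with mean 0.5
mean(xi .* eta)              % approximately 1
mean(xi) * mean(eta)         % approximately 1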

Kalman filter


Formulation of the problem


Let us denote by x the quantity that we will measure and then filter. It may be a coordinate, a velocity, an acceleration, humidity, an odour level, temperature, pressure, etc.
Let's start with a simple example, which will lead us to the formulation of the general problem. Imagine that we have a radio-controlled toy car that can only go forwards and backwards. Knowing the weight of the car, its shape, the road surface, etc., we have calculated how the controlling joystick affects the speed of motion v_k = v(t_k).



Then the coordinate of the car will change according to the law:

x_{k+1} = x_k + v_k · dt

In real life we cannot take into account in our calculations the small perturbations acting on the car (wind, bumps, pebbles on the road), so the real speed of the car will differ from the calculated one. A random variable ξ_k is therefore added to the right-hand side of the written equation:

x_{k+1} = x_k + v_k · dt + ξ_k

We have a GPS sensor mounted on the car that tries to measure the true coordinate x_k of the car and, of course, cannot measure it exactly, measuring instead with an error η_k, which is also a random variable. As a result, we receive erroneous data from the sensor:

z_k = x_k + η_k

The problem is, knowing the wrong sensor readings z_k, to find a good approximation to the true coordinate of the car x_k. This good approximation we will denote by x^opt_k.
In the formulation of the general problem, the coordinate x_k can stand for anything (temperature, humidity, ...), and the term responsible for controlling the system from the outside we denote by u_k (in the example with the car, u_k = v_k · dt). The equations for the coordinate and the sensor readings look like this:

x_{k+1} = x_k + u_k + ξ_k
z_{k+1} = x_{k+1} + η_{k+1}        (1)

Let's discuss in detail what we know:

- u_k is a known quantity that controls the evolution of the system, and we know it at every step;
- the model error ξ_k and the sensor error η_k are random variables, and their distribution laws do not depend on the step number k;
- the mean values of the errors are zero: Eξ_k = Eη_k = 0;
- the distribution laws of the errors themselves may be unknown, but their variances σ²_ξ and σ²_η are known;
- the errors are independent of each other and of everything that happened at previous steps.

It is worth noting that the filtering problem is not a smoothing problem. We do not strive to smooth the data from the sensor; we strive to obtain the value closest to the real coordinate x_k.
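
To make the setup concrete, here is a minimal MATLAB sketch (my own, with arbitrarily chosen numbers) that simulates equations (1): the true coordinate x_k and the noisy sensor readings z_k. The filtering itself comes below.

% Simulate the model (1) with a constant control u and Gaussian errors.
N = 100; u = 0.5; sigmaXi = 1; sigmaEta = 10;
x(1) = 0;
z(1) = x(1) + normrnd(0, sigmaEta);
for k = 1:(N-1)
    x(k+1) = x(k) + u + normrnd(0, sigmaXi);   % true coordinate
    z(k+1) = x(k+1) + normrnd(0, sigmaEta);    % noisy sensor reading
end
plot(1:N, x, 1:N, z)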

Kalman algorithm


We will reason by induction. Imagine that at step k we have already found the filtered sensor value x^opt_k, which approximates the true coordinate of the system x_k well. Do not forget that we know the equation controlling the change of the coordinate unknown to us:

x_{k+1} = x_k + u_k + ξ_k

Therefore, before receiving the value from the sensor, we may assume that at step k+1 the system evolves according to this law, and the sensor will show something close to x^opt_k + u_k. Unfortunately, so far we cannot say anything more precise. On the other hand, at step k+1 we will have the inexact sensor reading z_{k+1} in our hands.
The idea of Kalman is the following: to get the best approximation to the true coordinate x_{k+1}, we must choose a middle ground between the reading z_{k+1} of the inaccurate sensor and x^opt_k + u_k, our prediction of what we expected the sensor to show. We give the sensor reading the weight K and the predicted value the weight (1 − K):

x^opt_{k+1} = K · z_{k+1} + (1 − K) · (x^opt_k + u_k)

The coefficient K is called the Kalman coefficient (Kalman gain). It depends on the iteration step, so strictly it should be written K_{k+1}, but for now, in order not to clutter the formulas, we will omit its index.
We must choose the Kalman coefficient so that the resulting optimal coordinate value x^opt_{k+1} would be closest to the true coordinate x_{k+1}. For example, if we know that our sensor is very accurate, then we trust its readings more and give the value z_{k+1} more weight (K close to one). If, on the contrary, the sensor is not at all accurate, then we rely more on the theoretically predicted value x^opt_k + u_k.
In general, to find the exact value of the Kalman coefficient K, we just need to minimize the error:

e_{k+1} = x_{k+1} − x^opt_{k+1}

We use equations (1) to rewrite the expression for the error:

e_{k+1} = (1 − K)(e_k + ξ_k) − K · η_{k+1}

Proof

e_{k+1} = x_{k+1} − x^opt_{k+1}
        = x_{k+1} − K·z_{k+1} − (1 − K)(x^opt_k + u_k)
        = x_{k+1} − K(x_{k+1} + η_{k+1}) − (1 − K)(x^opt_k + u_k)
        = (1 − K)(x_k + u_k + ξ_k − x^opt_k − u_k) − K·η_{k+1}
        = (1 − K)(e_k + ξ_k) − K·η_{k+1}

Now it is time to discuss what it means to minimize the error. After all, the error, as we have seen, is itself a random variable and takes different values each time. In fact, there is no unambiguous approach to defining what it means for the error to be minimal. Just as with the variance of a random variable, when we tried to estimate the characteristic width of its spread, here we choose the criterion that is simplest for calculations: we will minimize the mean of the squared error,

E(e²_{k+1}) → min

Let us expand the last expression:

E(e²_{k+1}) = (1 − K)² (E(e²_k) + σ²_ξ) + K² σ²_η

Key to the proof

Since all the random variables in the expression for e_{k+1} are independent, and the mean values of the sensor and model errors are zero, Eη_{k+1} = Eξ_k = 0, all the "cross" terms vanish:

E(e_k ξ_k) = E(e_k η_{k+1}) = E(ξ_k η_{k+1}) = 0.

Besides, the formulas for the variances look much simpler: σ²(η_{k+1}) = E(η²_{k+1}) = σ²_η and σ²(ξ_k) = E(ξ²_k) = σ²_ξ (because Eη_{k+1} = Eξ_k = 0).


This expression reaches its minimum when (setting the derivative with respect to K to zero):

K_{k+1} = (E(e²_k) + σ²_ξ) / (E(e²_k) + σ²_ξ + σ²_η)

Here we already write the Kalman coefficient with the step index k+1, emphasizing that it depends on the iteration step.
Substituting into the expression for the mean squared error the value of the Kalman coefficient that minimizes it, we get:

E(e²_{k+1}) = σ²_η (E(e²_k) + σ²_ξ) / (E(e²_k) + σ²_ξ + σ²_η) = K_{k+1} · σ²_η.

Our problem is solved. We obtained an iterative formula for calculating the Kalman coefficient.

All formulas in one place

The model:

x_{k+1} = x_k + u_k + ξ_k
z_{k+1} = x_{k+1} + η_{k+1}

The filter iteration:

x^opt_{k+1} = K_{k+1} · z_{k+1} + (1 − K_{k+1}) · (x^opt_k + u_k)
K_{k+1} = (E(e²_k) + σ²_ξ) / (E(e²_k) + σ²_ξ + σ²_η)
E(e²_{k+1}) = K_{k+1} · σ²_η

starting from x^opt_1 = z_1 and E(e²_1) = σ²_η.

Example


The advertising picture at the beginning of the article shows filtered data from a fictional GPS sensor mounted on a fictional car that moves with uniform acceleration, with a known fictional acceleration a.



Look again at the result of filtering.


MATLAB code

clear all;
N = 100;         % number of samples
a = 0.1;         % acceleration
sigmaPsi = 1;    % standard deviation of the model error
sigmaEta = 50;   % standard deviation of the sensor error
k = 1:N;
x = k;           % preallocate the array of true coordinates
x(1) = 0;
z(1) = x(1) + normrnd(0, sigmaEta);
for t = 1:(N-1)
    x(t+1) = x(t) + a*t + normrnd(0, sigmaPsi);   % true coordinate
    z(t+1) = x(t+1) + normrnd(0, sigmaEta);       % sensor reading
end
% Kalman filter
xOpt(1) = z(1);
eOpt(1) = sigmaEta;  % eOpt(t) is the square root of the error variance; it is not a random variable
for t = 1:(N-1)
    eOpt(t+1) = sqrt((sigmaEta^2)*(eOpt(t)^2 + sigmaPsi^2)/(sigmaEta^2 + eOpt(t)^2 + sigmaPsi^2));
    K(t+1) = (eOpt(t+1))^2/sigmaEta^2;
    xOpt(t+1) = (xOpt(t) + a*t)*(1 - K(t+1)) + K(t+1)*z(t+1);
end
plot(k, xOpt, k, z, k, x)



Analysis


If you follow how the Kalman coefficient K_k changes with the iteration step k, you can show that it always stabilizes to a certain value K_stab. For example, when the root-mean-square errors of the sensor and of the model relate to each other as ten to one, the graph of the Kalman coefficient as a function of the iteration step looks like this:



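This stabilized value is simply the fixed point of the recursion for E(e²_k) derived above, so it can be computed directly. A minimal sketch (my own, using the same σ values as in the example code):

% Find the stabilized Kalman coefficient by iterating the
% error-variance recursion to its fixed point.
sigmaPsi = 1; sigmaEta = 50;   % model and sensor standard deviations
e2 = sigmaEta^2;               % E(e_1^2) = sigmaEta^2, as in the main code
for k = 1:1000
    e2 = sigmaEta^2*(e2 + sigmaPsi^2)/(sigmaEta^2 + e2 + sigmaPsi^2);
end
Kstab = e2/sigmaEta^2          % the value that K stabilizes to
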
In the following example, we will discuss how this will help to significantly ease our lives.

Second example


In practice it often happens that we know nothing at all about the physical model of what we are filtering. For example, you want to filter the readings of your favourite accelerometer, and you do not know in advance by what law you will be turning it. The maximum information you can extract is the variance of the sensor error σ²_η. In such a difficult situation, all our ignorance of the motion model can be driven into the random variable ξ_k:

x_{k+1} = x_k + ξ_k

Frankly speaking, such a system no longer satisfies the conditions that we imposed on the random variable ξ_k, because it now hides all the physics of motion unknown to us, and therefore we cannot claim that at different moments of time the model errors are independent of each other and that their mean values are zero. In this case, by and large, the Kalman filter theory is not applicable. But we will ignore this fact and bluntly apply the whole colossus of formulas, choosing the coefficients σ_ξ and σ_η by eye so that the filtered data looks pretty.
But one can take a different, much simpler path. As we saw above, the Kalman coefficient always stabilizes with increasing step number to the value K_stab. Therefore, instead of picking the coefficients σ_ξ and σ_η and computing the Kalman coefficient K_{k+1} by the complicated formulas, we can assume this coefficient to be constant and pick only this constant. This assumption spoils almost nothing: first, we are already using the Kalman theory illegally, and second, the Kalman coefficient quickly stabilizes to a constant anyway. In the end everything is greatly simplified. We do not need any formulas from the Kalman theory at all; we just need to find an acceptable value of K_stab and insert it into the iterative formula:

x^opt_{k+1} = K_stab · z_{k+1} + (1 − K_stab) · x^opt_k

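In code this simplified filter is a one-liner inside the loop; it is nothing other than exponential smoothing. A minimal sketch (my own, with K_stab chosen by eye), applied to sensor readings z produced, for example, by the code above:

% The simplified filter with a constant coefficient Kstab.
Kstab = 0.1;                   % a constant chosen by eye
xSimple(1) = z(1);
for t = 1:(length(z)-1)
    xSimple(t+1) = Kstab*z(t+1) + (1 - Kstab)*xSimple(t);
end
plot(1:length(z), z, 1:length(z), xSimple)
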
The following graph shows data from a fictional sensor, filtered in two different ways, under the assumption that we know nothing about the physics of the phenomenon. The first way is the honest one, with all the formulas of the Kalman theory. The second is the simplified one, without the formulas.



As we see, the methods are almost the same. A small difference is observed only at the beginning, when the Kalman coefficient has not yet stabilized.

Discussion


As we have seen, the main idea of the Kalman filter is to find a coefficient K such that the filtered value

x^opt_{k+1} = K · z_{k+1} + (1 − K) · (x^opt_k + u_k)

would differ least, on average, from the real coordinate value x_{k+1}. We see that the filtered value x^opt_{k+1} is a linear function of the sensor reading z_{k+1} and of the previous filtered value x^opt_k. And the previous filtered value x^opt_k is, in turn, a linear function of the sensor reading z_k and of the filtered value before it, x^opt_{k−1}. And so on, until the chain fully unfolds. That is, the filtered value depends linearly on all the previous sensor readings:

x^opt_{k+1} = λ_1 z_1 + λ_2 z_2 + … + λ_{k+1} z_{k+1} + (known terms containing the controls u_l)

Therefore, the Kalman filter is called a linear filter.
It is possible to prove that of all linear filters, the Kalman filter is the best. Best in the sense that the mean square error of the filter is minimal.

Multidimensional case


The whole theory of the Kalman filter can be generalized to the multidimensional case. The formulas there look a little worse, but the very idea of ​​their derivation is the same as in the one-dimensional case. In this beautiful article you can see them: http://habrahabr.ru/post/140274/ .
And in this wonderful video, an example of how to use them is analyzed.

Literature


You can download the original Kalman article here: http://www.cs.unc.edu/~welch/kalman/media/pdf/Kalman1960.pdf .
This post can be read in English: http://david.wf/kalmanfilter

Source: https://habr.com/ru/post/166693/

