Let's try to solve the problem of detecting anomalies in sound.
Microphones are currently among the most common general-purpose sensors. They are small, cheap and reliable, and every cell phone has one by default, so they can be used almost anywhere. The task of processing sound in general, not just speech, is therefore right in front of us. It is a classic example of low-hanging fruit. :)
Sound is a time sequence. In its simplest form it is the signal amplitude along the time axis. The signal can also be converted to the frequency domain, where it becomes a frequency spectrum that varies over time. Sound is a continuous stream of data. This is very important, because we cannot look [far] ahead in the data and then build or adjust our algorithm.
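To make the two representations concrete, here is a minimal sketch that cuts an amplitude sequence into overlapping frames and computes a magnitude spectrum for each frame. The function name `frame_spectra` and the parameters `frame_size` and `hop` are my own illustrative choices, and the naive DFT is used only to keep the example dependency-free; in practice an FFT routine such as `numpy.fft.rfft` would replace the inner loop.

```python
import cmath
import math


def frame_spectra(samples, frame_size, hop):
    """Split samples into frames and return the magnitude spectrum of each frame."""
    spectra = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        magnitudes = []
        # For real input only the first frame_size // 2 + 1 bins are unique.
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            magnitudes.append(abs(acc) / frame_size)
        spectra.append(magnitudes)
    return spectra


# A pure tone that completes exactly 4 periods per 32-sample frame.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(64)]
spectra = frame_spectra(tone, frame_size=32, hop=16)
peak_bin = max(range(len(spectra[0])), key=lambda k: spectra[0][k])
```

Each element of `spectra` is one time step of the "spectrum that varies over time"; for this test tone the energy sits in a single frequency bin.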
In addition, the signal may change over time, and once such changes become regular they are no longer anomalies. Ideally the system adapts to signal changes by itself, so that it does not treat recurring changes as anomalies. In general, our task extends to any time sequence, not only sound.
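One possible way to get this kind of adaptation, sketched here under my own assumptions, is to compare each sample with an exponentially weighted running mean and variance. The class name `AdaptiveDetector` and the values of `alpha`, `z_threshold` and `warmup` are hypothetical and would need tuning on real data.

```python
class AdaptiveDetector:
    """Flags samples that deviate from an exponentially weighted running baseline."""

    def __init__(self, alpha=0.05, z_threshold=4.0, warmup=20):
        self.alpha = alpha              # how quickly the baseline follows the signal
        self.z_threshold = z_threshold  # deviation, in standard deviations, that counts as an anomaly
        self.warmup = warmup            # samples to observe before flagging anything
        self.mean = 0.0
        self.var = 0.0
        self.count = 0

    def update(self, value):
        """Return True if value is anomalous, then fold it into the baseline."""
        self.count += 1
        if self.count == 1:
            self.mean = value
            return False
        deviation = value - self.mean
        std = self.var ** 0.5
        is_anomaly = (
            self.count > self.warmup
            and std > 0
            and abs(deviation) > self.z_threshold * std
        )
        # Exponentially weighted updates of the mean and the variance.
        self.mean += self.alpha * deviation
        self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return is_anomaly


# A noisy level around 1.0 that jumps to a new, equally noisy level around 5.0.
signal = [1.0 + (0.1 if i % 2 else -0.1) for i in range(100)]
signal += [5.0 + (0.1 if i % 2 else -0.1) for i in range(300)]
detector = AdaptiveDetector()
flags = [detector.update(x) for x in signal]
```

In this toy signal the jump itself is flagged, but because the baseline keeps following the data, the new level stops being reported once the detector has adapted to it.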
In one case, we process the signal without any a priori knowledge of which anomalies are possible. Any deviation from the usual behavior is considered an anomaly. These are unsupervised models.
In another case, we know what anomalies might look like. We have samples of sound anomalies and use these samples in the anomaly search algorithms. These are supervised models.
In the simplest case, we only detect the fact of an anomaly and the time at which it occurs. In a more complicated case, we also classify anomalies: for example, an anomaly may be weak, strong or catastrophic. We can determine only the time at which the anomaly begins, or the whole time window from its beginning to its end.
Methods for detecting anomalies in sound are a subclass of general anomaly detection methods. The difference is that the input data, sound, arrives as a continuous stream.
The simplest methods are signal change detectors. A threshold detector responds to changes in signal amplitude or in the spectrum. Such methods are very simple and in many cases the most reliable. They work when the input signal is stable and its behavior does not change over time, so any change is considered an anomaly.
The next group of methods consists of signal processing techniques. These methods have been studied in depth in radio signal processing; radar and mobile communications are examples. They work great when we know what we are looking for, which means we must know the sound or the spectrum of the anomaly in advance. Signal processing techniques offer the most powerful apparatus for filtering out the desired signals. It is not entirely clear, however, how to apply these methods to searching for anomalies.
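As an illustration of "knowing what we are looking for", here is a small sketch of template matching with normalized cross-correlation, one of the basic signal processing tools. The function names, the `min_score` value and the toy template are my assumptions, not something taken from a specific library.

```python
import math


def normalized_correlation(window, template):
    """Pearson-style correlation between two equally long sequences, in [-1, 1]."""
    n = len(template)
    mean_w = sum(window) / n
    mean_t = sum(template) / n
    num = sum((w - mean_w) * (t - mean_t) for w, t in zip(window, template))
    den = math.sqrt(
        sum((w - mean_w) ** 2 for w in window)
        * sum((t - mean_t) ** 2 for t in template)
    )
    return num / den if den else 0.0


def find_template(signal, template, min_score=0.9):
    """Return start offsets where the signal closely matches the known template."""
    n = len(template)
    return [
        start
        for start in range(len(signal) - n + 1)
        if normalized_correlation(signal[start:start + n], template) >= min_score
    ]


# Known anomaly shape embedded in low-level background noise at offset 20.
template = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]
signal = [0.1, -0.1] * 10 + template + [0.1, -0.1] * 10
matches = find_template(signal, template)
```

The search finds the template reliably, but only because its shape was given in advance, which is exactly the limitation described above.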
And the last group is Machine Learning methods. It includes both classical methods, for example linear regression or classification methods, and the more recent neural networks. Machine learning methods are good where the system has to adapt to changing behavior of the input signal. The input signal can vary according to rather complex dependencies, and the methods are trained along the way and adjust to these dependencies. Regular patterns in the signal are then not considered anomalies. It is difficult and sometimes impossible to program something like this using conventional methods, because we would have to retune the system by hand every time the signal changes. Machine Learning methods do not require manual adjustment; they are trained on the available data. The main problem with Machine Learning methods is that the input data, time sequences, arrive as a continuous stream. Specialized models have been developed for such data; recurrent neural networks (RNN) are one example. A good description of them can be read here.
I will first make a brief survey of the available examples in order to choose the most suitable one for my task.
Data preparation: KDNuggets has a good example of working with a spectral sequence.
Now I will try to look for code samples.
Microsoft Anomaly Detection Service is one of the simplest services to use. Unfortunately, it cannot be fed a continuous stream of data. Its documentation is quite simple and describes typical anomalies in time sequences in detail.
Numenta has created a test platform for evaluating systems and algorithms that detect anomalies in time sequences. Numenta is a very interesting company that works on Machine Learning algorithms and tries to get closer to understanding how the brain works. I recommend that those interested watch their fascinating videos, in particular the talk on anomaly detection. Numenta's algorithms are very well suited to our task, and Numenta is in the lead on this test platform. Unfortunately, their implementation is quite complicated. A couple of algorithms in second and third place (as of November 2016) are not far behind Numenta but are much easier to implement.
On GitHub you can find a lot of Open Source systems, in particular the system from Twitter.
If we turn to the theory of anomaly detection using classical Machine Learning, one of the most popular and complete sources of information on this topic is the scikit-learn project.
Azure Machine Learning has many example implementations. Naturally, these require using Microsoft Azure ML. You can experiment and build models for free, but then you have to pay for hosting the models on Azure servers. This could be a plus or a minus.
Microsoft developed one of the best neural network packages, CNTK, and made it publicly available. There are a few examples of using the package to process time sequences.
One of the most popular neural network packages is Google TensorFlow. Keras is a fairly popular package that runs on top of TensorFlow or Theano and greatly simplifies model creation. Here is one example of its use for RNN.
To start, I will define the interface. Regardless of how the internals of the service are designed, the interface to it will not differ much, since it depends primarily on the task and not on its implementation.
So, the input is either a large sound file or a port for a stream through which data is fed continuously.
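Both kinds of input can be hidden behind the same "iterate over chunks" interface. The sketch below does this for a WAV file using the standard `wave` module; the function name `read_chunks` and the restriction to 16-bit mono audio are simplifying assumptions of mine. A network stream could be wrapped in a generator with the same shape.

```python
import io
import struct
import wave


def read_chunks(source, chunk_frames=1024):
    """Yield lists of 16-bit mono samples from a WAV file path or file-like object."""
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono audio")
        while True:
            raw = wav.readframes(chunk_frames)
            if not raw:
                break
            # Little-endian signed 16-bit integers, one per frame.
            yield list(struct.unpack("<%dh" % (len(raw) // 2), raw))


# Build a tiny in-memory WAV file so the example runs without external data.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(struct.pack("<5h", 0, 100, -100, 200, -200))
buffer.seek(0)
chunks = list(read_chunks(buffer, chunk_frames=2))
```

Because the detector only ever sees one chunk at a time, the same processing code works whether the data comes from a large file or from a live stream.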
The output is either a file or a stream. Output format: the time at which each anomaly occurs. In addition, the following can also be reported: the anomaly completion time; the anomaly class.
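A possible shape for one output record is sketched below, together with a helper that turns per-sample flags into time windows. The names `AnomalyEvent` and `flags_to_events` and the field names are hypothetical; only the three pieces of information (start time, completion time, class) come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AnomalyEvent:
    start_time: float                 # seconds from the start of the recording
    end_time: Optional[float] = None  # None when only the onset is reported
    label: Optional[str] = None       # e.g. "weak", "strong", "catastrophic"


def flags_to_events(flags: List[bool], sample_rate: float) -> List[AnomalyEvent]:
    """Merge consecutive flagged samples into events with start and end times."""
    events = []
    start = None
    for index, flagged in enumerate(flags):
        if flagged and start is None:
            start = index
        elif not flagged and start is not None:
            events.append(AnomalyEvent(start / sample_rate, index / sample_rate))
            start = None
    if start is not None:  # anomaly still running at the end of the data
        events.append(AnomalyEvent(start / sample_rate, len(flags) / sample_rate))
    return events


events = flags_to_events([False, False, True, True, True, False, True], sample_rate=2.0)
```

A service that only reports onsets would fill in `start_time` alone; the optional fields leave room for the completion time and the class.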
Additional settings are parameters that are different for different service designs. For example:
Since I started with three approaches, it is logical to work out exactly three designs. The first one is based on detecting a signal change without using Machine Learning methods. In the simplest case, the input signal is averaged so that random noise does not create too many false positives. The averaged signal is then checked against the rule "if the amplitude is greater than the threshold, this is an anomaly".
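The averaging and threshold rule just described fits in a few lines. This is only a sketch: the function name `threshold_detector` and the `window` and `threshold` values are illustrative assumptions, and a moving average of the absolute amplitude is just one of several possible smoothing choices.

```python
from collections import deque


def threshold_detector(samples, window=5, threshold=0.5):
    """Yield (index, smoothed) whenever the averaged |amplitude| exceeds the threshold."""
    recent = deque(maxlen=window)  # the last `window` absolute amplitudes
    for index, sample in enumerate(samples):
        recent.append(abs(sample))
        smoothed = sum(recent) / len(recent)
        if smoothed > threshold:
            yield index, smoothed


# Quiet background, one isolated spike, then a sustained loud segment.
samples = [0.1] * 10 + [2.0] + [0.1] * 10 + [1.0] * 5 + [0.1] * 5
flagged = [index for index, _ in threshold_detector(samples)]
```

With these settings the isolated spike is averaged away, while the sustained loud segment is reported, with a delay of a couple of samples caused by the averaging window.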
A somewhat more complicated but much more practical design adds logic for detecting signal patterns. For example, anomalies are cached, and each new anomaly is checked against the cache for repetitions. If a similar anomaly is found in the cache and it is marked "not an anomaly", then the new one is also "not an anomaly". We have to add a way of deciding between "anomaly" and "not an anomaly". This will be either an additional user interface or some kind of algorithm.
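The caching logic might look roughly like this. Everything here is an assumption for illustration: the class name `AnomalyCache`, the use of cosine similarity, the `min_similarity` value, and the idea that each anomaly is summarized by a feature vector (for example, its magnitude spectrum).

```python
def similarity(a, b):
    """Cosine similarity between two equally long feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) * sum(y * y for y in b)) ** 0.5
    return dot / norm if norm else 0.0


class AnomalyCache:
    """Remembers reviewed anomalies so that known harmless patterns are not reported again."""

    def __init__(self, min_similarity=0.95):
        self.min_similarity = min_similarity
        self.entries = []  # list of (features, is_anomaly) pairs

    def label(self, features, is_anomaly):
        """Store a verdict supplied by a user or by some other algorithm."""
        self.entries.append((list(features), is_anomaly))

    def check(self, features):
        """Return the verdict of the most similar entry, or None if nothing is close enough."""
        best_score, best_verdict = 0.0, None
        for cached, verdict in self.entries:
            score = similarity(features, cached)
            if score >= self.min_similarity and score > best_score:
                best_score, best_verdict = score, verdict
        return best_verdict


cache = AnomalyCache()
cache.label([1.0, 0.0, 0.5], is_anomaly=False)   # reviewed and marked "not an anomaly"
repeat_verdict = cache.check([1.0, 0.05, 0.5])   # very similar to the cached entry
unknown_verdict = cache.check([0.0, 1.0, 0.0])   # nothing similar in the cache
```

A `None` result means the candidate has not been seen before, so it would be passed to the user interface or algorithm that assigns the "anomaly" or "not an anomaly" verdict.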
The simplest option is to use Azure ML. As an example, I can take this simple experiment, or a more complex one.
In another version, if I want to have complete control over the application, I can take this project and this one as a model. Both are made in Python.
Most likely, a ready-made Keras-based sample is suitable, or CNTK can be used. But I did not find a ready-made example for CNTK, so I will have to write additional code.
These proof-of-concept (POC) projects will, of course, have to be done before starting a real project. The specific data will differ greatly from task to task, and depending on the data, different methods will suit us.
The purpose of this article was to show one way to approach the problem of searching for sound anomalies.
If any readers would like to suggest fresh ideas or point out problems or mistakes, I will be very happy to see them in the comments. Thanks in advance!
Source: https://habr.com/ru/post/315800/