Process optimization is receiving more and more attention, mainly as a way to reduce production costs. Costs can be cut by upgrading equipment, but that route involves significant spending on design, procurement and reconstruction, plus lost profit while the unit being rebuilt stands idle. An alternative is to use a mathematical approach to search for inefficiencies in the technological process, which is the subject of this article.
Neural networks in brief
A neural network is a system of simple processors (neurons) connected and interacting with each other.
Figure 1. Structural diagram of a neural network (green - input layer of neurons, blue - hidden (intermediate) layer of neurons, yellow - output layer of neurons).

A neuron is the basic element of a neural network: a single simple computing processor that can receive, transform and pass on signals. Combining a large number of neurons into one network makes it possible to solve fairly complex tasks.
Figure 2. Diagram of a neuron.
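
As an illustration of what a single neuron does, here is a minimal sketch in Python. It assumes the common scheme of a weighted sum of the inputs plus a bias, passed through a sigmoid activation. The article does not say which activation function the networks used, so the sigmoid, the function name `neuron_output` and the sample values are assumptions made only for this example.

```python
import math


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, squashed by a sigmoid.

    The sigmoid is an assumed activation; the source does not name one.
    """
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative values only, not taken from the facility data.
out = neuron_output([0.5, -1.0, 2.0], [0.4, 0.3, 0.1], bias=0.0)
print(round(out, 4))
```

The weights are exactly the connection "strengths" that training adjusts.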

The neural network approach does not impose a predefined model form, so it suits linear problems, complex non-linear problems and classification problems alike. Training a neural network consists primarily of changing the "strength" of the connections between neurons. Neural networks also scale: they can solve problems for a single piece of equipment or for an entire plant.
Problem statement
The goal is to predict the sulfur content of the product as accurately as possible, which in turn makes it possible to keep the main technological parameters at values that are optimal both for product quality and for the process itself.
Unit of measure: ppm (parts per million).
Input data: historical values of the technological parameters of the unit.
Data for checking the network forecast: daily laboratory analyses of sulfur content.
Network training and testing
A total of 531 observations were used. The sample was split as follows: 70% of the observations were used to train the networks, and the remaining 30% served as a control sample for assessing training quality and for comparing the networks with one another. The average sulfur content across all observations was 316.7 ppm.
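
The 70/30 split can be sketched as follows. This is only an illustration: the article does not say whether the observations were shuffled or split chronologically, so the random shuffle, the fixed seed and the function name `split_observations` are assumptions.

```python
import random


def split_observations(observations, train_fraction=0.7, seed=0):
    """Shuffle the observations and split them into training and control sets.

    Random shuffling is an assumption; a chronological split is equally
    possible and is sometimes preferred for process data.
    """
    indices = list(range(len(observations)))
    random.Random(seed).shuffle(indices)
    n_train = int(round(train_fraction * len(observations)))
    train = [observations[i] for i in indices[:n_train]]
    control = [observations[i] for i in indices[n_train:]]
    return train, control


# Stand-in for the 531 real observations.
train, control = split_observations(list(range(531)))
print(len(train), len(control))  # 372 159
```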
Four networks were selected based on the training results, with the following configurations:
Network No. 1: 20-22-1
Network No. 2: 20-26-1
Network No. 3: 20-27-1
Network No. 4: 20-16-1
The network configuration is presented in the form of AA-BB-C, where AA is the number of neurons in the input layer, BB is the number of neurons in the hidden layer, C is the number of neurons in the output layer.
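
To make the AA-BB-C notation concrete, the sketch below builds a 20-22-1 network with random weights and runs one forward pass. It assumes a standard fully connected layout in which the input layer only passes values on, the hidden layer uses a tanh activation and the output neuron is linear. None of these details are given in the article, and the helper names (`make_layer`, `forward`, `count_parameters`) are hypothetical.

```python
import math
import random


def make_layer(n_inputs, n_neurons, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases


def forward(x, hidden, output):
    """One forward pass: tanh hidden layer, linear output layer (assumed)."""
    hidden_w, hidden_b = hidden
    output_w, output_b = output
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(hidden_w, hidden_b)]
    return [sum(w * hi for w, hi in zip(row, h)) + b
            for row, b in zip(output_w, output_b)]


def count_parameters(n_in, n_hidden, n_out):
    """Number of trainable weights and biases in an n_in-n_hidden-n_out network."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out


rng = random.Random(0)
hidden = make_layer(20, 22, rng)   # 20 inputs -> 22 hidden neurons
output = make_layer(22, 1, rng)    # 22 hidden neurons -> 1 output
x = [rng.random() for _ in range(20)]  # placeholder for 20 process parameters
prediction = forward(x, hidden, output)
print(len(prediction), count_parameters(20, 22, 1))  # 1 485
```

Under these assumptions, training adjusts 485 numbers for the 20-22-1 configuration.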
The networks were trained in a specialized package; a great many such packages are available today (SPSS, Statistica, etc.). Below are histograms of the error distribution of the trained networks over the entire set of observations:
Figure 3. Error distribution histogram for network No. 1.

Figure 4. Error distribution histogram for network No. 2.

Figure 5. Error distribution histogram for network No. 3.

Figure 6. Error distribution histogram for network No. 4.

From the histograms we can conclude that the network error follows a normal distribution, so the size of the error can be divided into three regions (for simplicity, the distribution is treated as normalized):

±σ1 (1-sigma region: in 68% of forecasts the error falls within this range);
±σ2 (2-sigma region: in 95% of forecasts the error falls within this range);
±σ3 (3-sigma region: gross errors, or misses; in fewer than 5% of cases the error is larger than the ±σ2 region).
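
The regions above can be checked on any set of forecast errors with a sketch like the one below. It uses synthetic, normally distributed errors, not the article's data, and the function name `sigma_bands` is made up for this example. For errors that really are normal, the two shares should come out close to 68% and 95%.

```python
import random
import statistics


def sigma_bands(errors):
    """Return sigma and the share of errors within 1 and 2 sigma of the mean."""
    mean = statistics.mean(errors)
    sigma = statistics.pstdev(errors)

    def share_within(k):
        inside = sum(1 for e in errors if abs(e - mean) <= k * sigma)
        return inside / len(errors)

    return sigma, share_within(1), share_within(2)


# Synthetic errors in ppm: an assumption used only to exercise the function.
rng = random.Random(1)
errors = [rng.gauss(0.0, 17.0) for _ in range(531)]
sigma, share_1, share_2 = sigma_bands(errors)
print(f"sigma={sigma:.1f} ppm, within 1 sigma: {share_1:.0%}, within 2 sigma: {share_2:.0%}")
```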
Errors by distribution region:
±σ1 (68% of forecasts):
Network No. 1: ±16.4 ppm
Network No. 2: ±18.3 ppm
Network No. 3: ±19 ppm
Network No. 4: ±18.6 ppm
±σ2 (95% of forecasts):
Network No. 1: ±43.9 ppm
Network No. 2: ±47.6 ppm
Network No. 3: ±42.8 ppm
Network No. 4: ±41 ppm
Gross errors (misses) in the ±σ3 region occur when the network works with data that differ greatly from the data present in the training set.
Another important indicator of the quality of network training is the mean absolute error.
Mean absolute error:
Network No. 1: 14.4 ppm
Network No. 2: 13.4 ppm
Network No. 3: 14.3 ppm
Network No. 4: 13.6 ppm
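
For reference, the mean absolute error is the average of the absolute differences between the laboratory values and the network forecasts. A minimal sketch, with invented sample values rather than real measurements:

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all paired observations."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented values in ppm, for illustration only.
lab = [310.0, 325.0, 298.0, 340.0]
forecast = [300.0, 330.0, 310.0, 335.0]
mae = mean_absolute_error(lab, forecast)
print(mae)  # (10 + 5 + 12 + 5) / 4 = 8.0
```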
Below are graphs of the sulfur content in the product (laboratory analysis) and the absolute error:
Figure 7. Sulfur content and absolute error for network No. 1.

Figure 8. Sulfur content and absolute error for network No. 2.

Figure 9. Sulfur content and absolute error for network No. 3.

Figure 10. Sulfur content and absolute error for network No. 4.

Software implementation
To view forecasts in real time we used our own application written in C#, with data received from an OPC server. The first version was built with a minimal feature set: graphs, XML import, graph export, and adding an arbitrary parameter to a graph. In the future we plan to add saving history to a database, comparing network forecasts with real historical values for given time stamps, training networks inside the application itself, comparing networks with one another, and more.
Figure 11. Screenshot of the first version

Findings
Favorable working conditions for the network:
The network produces the smallest error when the sulfur content of the final product lies between roughly 240-250 ppm and 400-410 ppm (as measured by laboratory analysis, not as forecast by the network). This is because most of the measurements fell in this range, so this is the data the network was actually trained on. Neural networks can generalize, that is, they can use the patterns of the training sample to produce a forecast for data they have not worked with before. Even so, for such data the result is unpredictable, and the one thing that can be said with confidence is that the error will grow.
In case of major changes at the facility, the network must be retrained.
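
One simple way to notice that the network is being asked to work outside its training data is to compare each input parameter with the range seen during training. The sketch below is an assumption, not part of the original application: a per-parameter range check is only a rough proxy, and the function names and sample values are hypothetical.

```python
def training_ranges(rows):
    """Per-parameter (min, max) over the training rows."""
    return [(min(column), max(column)) for column in zip(*rows)]


def out_of_range(row, ranges):
    """Indices of parameters that fall outside the range seen in training."""
    return [i for i, (value, (low, high)) in enumerate(zip(row, ranges))
            if not low <= value <= high]


# Two made-up process parameters per observation.
train_rows = [[1.0, 50.0], [2.0, 60.0], [1.5, 55.0]]
ranges = training_ranges(train_rows)
flagged = out_of_range([1.2, 70.0], ranges)
print(flagged)  # [1]
```

A forecast made for a flagged observation deserves less trust, and frequent flags may be a sign that retraining is due.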
Ways to improve:
- Reaction time
The reaction cycle (from the point where the characteristics of the raw material are measured, through its passage across the entire unit, to the point where the characteristics of the final product are measured) has a certain duration. For the data to correlate better, the raw material parameters must be matched accurately in time to the product parameters, which will increase prediction accuracy.
- Noise filtering
The readings measured by the sensors contain noise in addition to the useful signal. The noise is small, but it distorts network training and, as a result, the network's subsequent predictions. The noise component therefore has to be taken into account by adding filters in front of the inputs of the neural network. The output of the network can also be filtered so that the forecast changes more smoothly. The range of filters available today is wide, from the simplest median and exponential filters to wavelets.
- Increased frequency of analysis
Measuring sulfur content more often during the day will increase the amount of data for training and, in turn, produce a better network.
- Improving the accuracy of laboratory analysis
If technically possible, increasing the accuracy of the analysis by an order of magnitude will make the data easier for the network to work with: at present the same sulfur value corresponds to a wide scatter of the independent parameters, which increases the error of the neural network.
- Increasing the number of input variables
Even a weak correlation between a parameter and the target value is significant, so as many parameters of the unit as possible should be used, possibly together with data from the unit that precedes it in the technological chain.
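
The first two improvements lend themselves to a short sketch: shifting the product measurements by the reaction lag, and the two simplest filters named above. Everything here is illustrative. The lag is expressed in sampling steps and would have to be determined for the real unit, the smoothing factor and window size are arbitrary, and the function names are made up for this example.

```python
import statistics


def align_with_lag(inputs, outputs, lag):
    """Pair the input at step t with the output measured `lag` steps later."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    # zip stops at the shorter sequence, so trailing inputs without a match are dropped.
    return list(zip(inputs, outputs[lag:]))


def exponential_filter(values, alpha):
    """Exponential smoothing: y[t] = alpha * x[t] + (1 - alpha) * y[t-1]."""
    smoothed = []
    for value in values:
        previous = smoothed[-1] if smoothed else value
        smoothed.append(alpha * value + (1 - alpha) * previous)
    return smoothed


def median_filter(values, window=3):
    """Sliding median; the window is truncated at the edges of the series."""
    half = window // 2
    return [statistics.median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


pairs = align_with_lag([1, 2, 3, 4], [10, 20, 30, 40], lag=1)
spike_removed = median_filter([1, 1, 9, 1, 1], window=3)
smoothed = exponential_filter([0, 10], alpha=0.5)
print(pairs, spike_removed, smoothed)
```

The median filter removes isolated spikes without blurring step changes, while the exponential filter smooths continuously at the cost of some delay, so the choice depends on the kind of noise each sensor shows.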
Summary
To sum up, I can say that the work proved productive and extremely interesting. Although many comparisons and analyses of the results have been made, it is too early to draw final conclusions. Still, testing neural networks at a real facility allows us to conclude that the approach is viable and has real potential.