Hello, Habr! This article is intended for those who already have at least a rough grasp of the mathematical principles behind neural networks and of what they essentially are, so I advise getting familiar with that before reading. To get some idea of what is going on, you can start here, and then continue here.
Recently I had to build a neural network for recognizing handwritten digits (today's code is not quite that network's) as part of a school project, and naturally I started digging into this murky topic. Having read what felt like enough about it on the Internet, I understood little more than nothing. But then (as usually happens) I stumbled upon a book by Simon Haykin (I don't know why I hadn't googled it earlier). And so began the sweaty tasting of the theory of neural networks, consisting of pure calculus.
In fact, despite the abundance of mathematics, it is not prohibitively complex. An average 11th-grader from a physics-and-math school, or a first- or second-year student at a technical college, can understand the satanic scribbles and letters of this manual. Moreover, even though the book is voluminous and hard to read, the things written in it really do explain what is going on under the hood. So I highly recommend (and in no way advertise) "Neural Networks: A Comprehensive Foundation" by Simon Haykin for reading, in case you ever have to deal with using / writing / developing neural networks and similar stuff. Although it has no material on the fashionable convolutional networks, nobody stops you from googling lectures by some charismatic employee of Yandex / Mail.ru / etc.
Of course, having understood how the networks are built, I could not just stop there, because the next step was writing code. Since in parallel I was learning to make games in Unity, the implementation language was the cozy and adorable C# version 7 (the latest at the time). It was at this point, browsing the Internet, that I realized the number of intelligible tutorials on writing neural networks from scratch (without any frameworks) in C# is vanishingly small. Fine. I could have used some Theano or TensorFlow, BUT under the hood of the death machine that is my laptop sits a "red" (AMD) video card with no support for the API through which those frameworks access the power of the GPU.
My video card is an ATI Radeon HD Mobility 4570. If anyone knows how to use its capacity to parallelize neural network computations, please write in the comments. You would help me out, and perhaps this article will get a sequel. Suggestions of other programming languages are welcome too.
As far as I understand, the card is simply so old that it doesn't support any of that. I may be wrong.
What I did see (a third of it being some kind of esoterica with hideous code) may well shock you too: the things passed off there as neural networks have about as much to do with them as Yanix does with quality rap. Soon I realized that I could rely only on myself, and decided to write this article so that nobody else gets misled.
Here I will not go over the digit-recognition network's code (as mentioned earlier): I left it on a flash drive after wiping it from the laptop, and I'm too lazy to look for that storage medium. Instead, I will help you build a multilayer fully connected perceptron that solves the XOR and XAND (XNOR, or whatever else it's called) problems.
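For reference, here is what the network will have to learn (XAND is simply XNOR, the negation of XOR):

a b | XOR | XNOR
0 0 |  0  |  1
0 1 |  1  |  0
1 0 |  1  |  0
1 1 |  0  |  1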
Before you start programming it, it may help to sketch it on paper, to make the structure and operation of the network easier to picture. My imagination produced the following picture. And yes, by the way, this is a console application in Visual Studio 2017, targeting .NET Framework 4.7.
Multilayer fully connected perceptron.
One hidden layer.
4 neurons in the hidden layer (the perceptron converged with this number).
Learning algorithm: backpropagation.
Stopping criterion: the mean squared error over an epoch drops below a threshold value (0.001).
Learning rate: 0.1.
Activation function: logistic sigmoid.
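For reference (this merely restates what the code below implements), the logistic sigmoid and its derivative, which backpropagation will need later:

f(s) = 1 / (1 + e^(-s))
f'(s) = f(s) * (1 - f(s))

where s is the weighted sum of a neuron's inputs.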
Next, we need to realize that we have to store the weights somewhere, perform calculations, debug a little, and of course use tuples (no using directive is needed for those, though). Accordingly, our usings look like this.
The release || debug folder of this project contains files (one per layer) named (fieldname)_memory.xml — you can guess what for. They are created in advance, sized to the total number of weights of each layer. I know XML is not the best choice to parse; I simply had little time for this part.
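The post does not show how these files are created, so here is a minimal sketch of a helper that could pre-create one with small random weights. The root and element names ("weights", "weight") are my assumptions: as you will see in WeightInitialize below, the code only indexes ChildNodes and ignores element names.

// Hypothetical helper, not from the original post: pre-creates a memory file
// filled with random weights in [-0.5, 0.5). Put it in any (static) class.
static void CreateMemoryFile(string type, int neurons, int prevNeurons)
{
    var rand = new System.Random();
    var doc = new XmlDocument();
    var root = doc.CreateElement("weights"); // element names are arbitrary here
    doc.AppendChild(root);
    for (int i = 0; i < neurons * prevNeurons; ++i)
    {
        var w = doc.CreateElement("weight");
        w.InnerText = (rand.NextDouble() - 0.5).ToString();
        root.AppendChild(w);
    }
    doc.Save($"{type}_memory.xml");
}
// Usage for this network's two layers:
// CreateMemoryFile("hidden_layer", 4, 2);
// CreateMemoryFile("output_layer", 2, 4);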
using System.Xml;
using static System.Math;
using static System.Console;
We also have two types of computational neurons: hidden and output. And the weights can either be read from memory or written to it. We capture these two facts with two enums.
enum MemoryMode { GET, SET }
enum NeuronType { Hidden, Output }
Everything else will live inside a namespace, which I will simply call NeuralNetwork.
namespace NeuralNetwork
{
    // all of the classes below go here
}
First of all, it is important to understand why I drew the input-layer neurons as squares. The answer is simple: they do not compute anything, they only capture information from the outside world, i.e. they receive the signal that will be passed through the network. Because of this, the input layer has little in common with the other layers, and the question arises: does it deserve a separate class at all? In fact, when processing images, video or sound it is worth having one, if only to hold the logic that transforms and normalizes the data into the form fed to the network's input. That is why I still write an InputLayer class. It contains the training set, organized in a slightly unusual structure: the first array in each tuple is a combination of the input signals 0 and 1, and the second array holds the results of applying XOR and XAND to that combination (first XOR, then XAND).
class InputLayer
{
    // training set: (inputs, expected outputs), outputs are (XOR, XNOR)
    private (double[], double[])[] _trainset = new (double[], double[])[]
    {
        (new double[]{ 0, 0 }, new double[]{ 0, 1 }),
        (new double[]{ 0, 1 }, new double[]{ 1, 0 }),
        (new double[]{ 1, 0 }, new double[]{ 1, 0 }),
        (new double[]{ 1, 1 }, new double[]{ 0, 1 })
    };
    public (double[], double[])[] Trainset { get => _trainset; } // C# 7 expression-bodied property
}
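A quick way to eyeball this training set (my own snippet, not from the original post) is to deconstruct the tuples, another C# 7 feature:

var input = new InputLayer();
foreach (var (signals, targets) in input.Trainset)
    WriteLine($"({signals[0]}, {signals[1]}) -> XOR: {targets[0]}, XNOR: {targets[1]}");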
Now we implement the most important thing, without which no neural network will ever become a Terminator: the neuron. I will not use biases, simply because I don't want to. The neuron resembles the McCulloch-Pitts model, but with a different (non-threshold) activation function, with methods for computing the gradient and the derivative, with its own type, and with the linear and nonlinear transformations combined in one method. Naturally, it can't do without a constructor.
class Neuron
{
    public Neuron(double[] inputs, double[] weights, NeuronType type)
    {
        _type = type;
        _weights = weights;
        _inputs = inputs;
    }
    private NeuronType _type;
    private double[] _weights;
    private double[] _inputs;
    public double[] Weights { get => _weights; set => _weights = value; }
    public double[] Inputs { get => _inputs; set => _inputs = value; }
    public double Output { get => Activator(_inputs, _weights); }
    private double Activator(double[] i, double[] w) // combined linear and nonlinear transformation
    {
        double sum = 0;
        for (int l = 0; l < i.Length; ++l)
            sum += i[l] * w[l]; // weighted sum of the inputs (linear part)
        return Pow(1 + Exp(-sum), -1); // logistic sigmoid (nonlinear part)
    }
    public double Derivativator(double outsignal) => outsignal * (1 - outsignal); // derivative of the sigmoid
    public double Gradientor(double error, double dif, double g_sum) => (_type == NeuronType.Output) ? error * dif : g_sum * dif; // g_sum is the weighted sum of the next layer's gradients
}
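A minimal sanity check of the forward computation (hypothetical usage, not from the post): a neuron with inputs (1, 0) and weights (0.5, -0.5) has the weighted sum 1 * 0.5 + 0 * (-0.5) = 0.5, so its output should be sigmoid(0.5) ≈ 0.622.

var n = new Neuron(new double[] { 1, 0 }, new double[] { 0.5, -0.5 }, NeuronType.Hidden);
WriteLine(n.Output); // prints roughly 0.6224593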
Okay, we have neurons, but they need to be combined into layers for the computation. Returning to my diagram above, I want to explain the black dotted line on it: it separates the layers to show what each one contains. That is, one computational layer contains its neurons and the weights connecting them to the neurons of the previous layer. The neurons are combined into an array rather than a list, since that is less resource-intensive. The weights are organized as a matrix (a two-dimensional array) of size — it is not hard to guess — [number of neurons in the current layer X number of neurons in the previous layer]; for this network that means 4x2 for the hidden layer and 2x4 for the output layer. Naturally, the layer also initializes its neurons, otherwise we'd get a null reference. These layers are very similar to each other, yet differ in their logic, so the hidden and output layers should be implemented as heirs of a common base class, which, as it turns out, is abstract.
abstract class Layer // cannot be instantiated, hence the protected constructor
{
    // type is the field name of the layer, used to pick its memory file in WeightInitialize
    protected Layer(int non, int nopn, NeuronType nt, string type)
    {
        numofneurons = non;
        numofprevneurons = nopn;
        Neurons = new Neuron[non];
        double[,] Weights = WeightInitialize(MemoryMode.GET, type);
        for (int i = 0; i < non; ++i)
        {
            double[] temp_weights = new double[nopn];
            for (int j = 0; j < nopn; ++j)
                temp_weights[j] = Weights[i, j];
            Neurons[i] = new Neuron(null, temp_weights, nt); // inputs are null until data is fed in
        }
    }
    protected int numofneurons; // number of neurons in the current layer
    protected int numofprevneurons; // number of neurons in the previous layer
    protected const double learningrate = 0.1d; // learning rate
    Neuron[] _neurons;
    public Neuron[] Neurons { get => _neurons; set => _neurons = value; }
    public double[] Data // set-only property that hands the data to the neurons' inputs;
    { // for real data (images, sound, etc.) the transformation and normalization
        set // logic would live here
        {
            for (int i = 0; i < Neurons.Length; ++i)
                Neurons[i].Inputs = value;
        }
    }
    public double[,] WeightInitialize(MemoryMode mm, string type)
    {
        double[,] _weights = new double[numofneurons, numofprevneurons];
        WriteLine($"{type} weights are being initialized...");
        XmlDocument memory_doc = new XmlDocument();
        memory_doc.Load($"{type}_memory.xml");
        XmlElement memory_el = memory_doc.DocumentElement;
        switch (mm)
        {
            case MemoryMode.GET:
                for (int l = 0; l < _weights.GetLength(0); ++l)
                    for (int k = 0; k < _weights.GetLength(1); ++k)
                        _weights[l, k] = double.Parse(memory_el.ChildNodes.Item(k + _weights.GetLength(1) * l).InnerText.Replace(',', '.'), System.Globalization.CultureInfo.InvariantCulture); // parsing stuff
                break;
            case MemoryMode.SET:
                for (int l = 0; l < Neurons.Length; ++l)
                    for (int k = 0; k < numofprevneurons; ++k)
                        memory_el.ChildNodes.Item(k + numofprevneurons * l).InnerText = Neurons[l].Weights[k].ToString();
                break;
        }
        memory_doc.Save($"{type}_memory.xml");
        WriteLine($"{type} weights have been initialized...");
        return _weights;
    }
    abstract public void Recognize(Network net, Layer nextLayer); // forward pass
    abstract public double[] BackwardPass(double[] stuff); // backward pass
}
The Layer class is abstract, so you cannot create instances of it. This means that our desire to preserve the properties of a "layer" is fulfilled by inheriting the parent constructor through the keyword base, so each heir's constructor fits in one line (all the constructor logic is defined in the base class and does not need to be rewritten).
Now the heir classes proper: HiddenLayer and OutputLayer. Both classes at once, in a single piece of code.
class HiddenLayer : Layer
{
    public HiddenLayer(int non, int nopn, NeuronType nt, string type) : base(non, nopn, nt, type) { }
    public override void Recognize(Network net, Layer nextLayer)
    {
        double[] hidden_out = new double[Neurons.Length];
        for (int i = 0; i < Neurons.Length; ++i)
            hidden_out[i] = Neurons[i].Output;
        nextLayer.Data = hidden_out; // pass the outputs on to the next layer
    }
    public override double[] BackwardPass(double[] gr_sums)
    {
        double[] gr_sum = null;
        // gr_sum would hold the gradient sums for a preceding computational layer;
        // this network has none before the hidden layer, so it stays null
        for (int i = 0; i < numofneurons; ++i)
            for (int n = 0; n < numofprevneurons; ++n)
                Neurons[i].Weights[n] += learningrate * Neurons[i].Inputs[n] * Neurons[i].Gradientor(0, Neurons[i].Derivativator(Neurons[i].Output), gr_sums[i]); // weight update
        return gr_sum;
    }
}

class OutputLayer : Layer
{
    public OutputLayer(int non, int nopn, NeuronType nt, string type) : base(non, nopn, nt, type) { }
    public override void Recognize(Network net, Layer nextLayer)
    {
        for (int i = 0; i < Neurons.Length; ++i)
            net.fact[i] = Neurons[i].Output; // the network's actual outputs
    }
    public override double[] BackwardPass(double[] errors)
    {
        double[] gr_sum = new double[numofprevneurons];
        for (int j = 0; j < gr_sum.Length; ++j) // gradient sums for the hidden layer
        {
            double sum = 0;
            for (int k = 0; k < Neurons.Length; ++k)
                sum += Neurons[k].Weights[j] * Neurons[k].Gradientor(errors[k], Neurons[k].Derivativator(Neurons[k].Output), 0);
            gr_sum[j] = sum;
        }
        for (int i = 0; i < numofneurons; ++i)
            for (int n = 0; n < numofprevneurons; ++n)
                Neurons[i].Weights[n] += learningrate * Neurons[i].Inputs[n] * Neurons[i].Gradientor(errors[i], Neurons[i].Derivativator(Neurons[i].Output), 0); // weight update
        return gr_sum;
    }
}
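Written out as formulas, the two BackwardPass methods implement plain backpropagation without biases. This merely restates the code above: with out a neuron's output, t its target value, and the learning rate of 0.1,

output-layer gradient: delta = (t - fact) * out * (1 - out)
hidden-layer gradient: delta = (sum over output neurons k of w_k * delta_k) * out * (1 - out)
weight update: w += 0.1 * input * delta

Gradientor's ternary operator is exactly the switch between the two gradient formulas.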
In principle, I have described all the important details in the comments. We have all the components: training and test data, computational elements, and their "conglomerates". Now it is time to tie everything together with learning. The learning algorithm is backpropagation, so the stopping criterion is mine to choose, and my choice is that the mean squared error over an epoch must fall below a threshold of 0.001. For this purpose I wrote the Network class, which describes the state of the network and which, as you may have noticed, is passed as a parameter to many methods.
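Written out, the two error measures the Network class below computes are:

per-sample error (GetMSE): E = 0.5 * sum over outputs x of (t_x - fact_x)^2
epoch cost (GetCost): C = (E_1 + E_2 + E_3 + E_4) / 4, averaged over the 4 training samples

Training loops over the whole training set until C drops below 0.001.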
class Network
{
    // layers of the network
    InputLayer input_layer = new InputLayer();
    public HiddenLayer hidden_layer = new HiddenLayer(4, 2, NeuronType.Hidden, nameof(hidden_layer));
    public OutputLayer output_layer = new OutputLayer(2, 4, NeuronType.Output, nameof(output_layer));
    // the network's actual output: 2 values, XOR and XNOR
    public double[] fact = new double[2];
    // error over one training sample
    double GetMSE(double[] errors)
    {
        double sum = 0;
        for (int i = 0; i < errors.Length; ++i)
            sum += Pow(errors[i], 2);
        return 0.5d * sum;
    }
    // cost over an epoch
    double GetCost(double[] mses)
    {
        double sum = 0;
        for (int i = 0; i < mses.Length; ++i)
            sum += mses[i];
        return (sum / mses.Length);
    }
    static void Train(Network net) // backpropagation method
    {
        const double threshold = 0.001d; // stopping criterion
        double[] temp_mses = new double[4]; // per-sample errors of the epoch
        double temp_cost = 0; // cost of the epoch
        do
        {
            for (int i = 0; i < net.input_layer.Trainset.Length; ++i)
            {
                // forward pass
                net.hidden_layer.Data = net.input_layer.Trainset[i].Item1;
                net.hidden_layer.Recognize(null, net.output_layer);
                net.output_layer.Recognize(net, null);
                // errors of the sample
                double[] errors = new double[net.input_layer.Trainset[i].Item2.Length];
                for (int x = 0; x < errors.Length; ++x)
                    errors[x] = net.input_layer.Trainset[i].Item2[x] - net.fact[x];
                temp_mses[i] = net.GetMSE(errors);
                // backward pass
                double[] temp_gsums = net.output_layer.BackwardPass(errors);
                net.hidden_layer.BackwardPass(temp_gsums);
            }
            temp_cost = net.GetCost(temp_mses); // cost of the epoch
            // debugging
            WriteLine($"{temp_cost}");
        } while (temp_cost > threshold);
        // save the trained weights to "memory"
        net.hidden_layer.WeightInitialize(MemoryMode.SET, nameof(hidden_layer));
        net.output_layer.WeightInitialize(MemoryMode.SET, nameof(output_layer));
    }
    // test run over the training set
    static void Test(Network net)
    {
        for (int i = 0; i < net.input_layer.Trainset.Length; ++i)
        {
            net.hidden_layer.Data = net.input_layer.Trainset[i].Item1;
            net.hidden_layer.Recognize(null, net.output_layer);
            net.output_layer.Recognize(net, null);
            for (int j = 0; j < net.fact.Length; ++j)
                WriteLine($"{net.fact[j]}");
            WriteLine();
        }
    }
    // entry point
    static void Main(string[] args)
    {
        Network net = new Network();
        Train(net);
        Test(net);
        ReadKey(); // so the console does not close :)
    }
}
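One small note on reading the output: the sigmoid never reaches exactly 0 or 1, so Test prints values merely close to the targets. A hypothetical post-processing step (my addition, not in the original post) that rounds them to binary answers could look like this:

// threshold the real-valued outputs to binary answers
for (int j = 0; j < net.fact.Length; ++j)
    WriteLine(net.fact[j] >= 0.5 ? 1 : 0);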
The result of training.
In sum, after these simple (if brain-torturing) manipulations, we have the basis of a working neural network. To make it do something else, it is enough to change the InputLayer class and pick the network's parameters for the new task. After a while (I don't know exactly when) I will write a continuation of this article with a guide to creating a convolutional neural network in C# from scratch, and I will update this one with links to the MLP recognizer for MNIST pictures (but that's not certain) and to the article's code in Python (that is certain, but you'll wait longer).
That's all for now. I will be glad to answer questions in the comments, but for the moment, if you'll excuse me, new things await.
UPD1: the second part
PS: For those who want to poke at the code.
PPS: The network at the link above is untrained, the shy little thing.
Source: https://habr.com/ru/post/335052/