
Optical fibers have become an integral part of very different areas of life, from home Internet to endoscopy. Their popularity rests on several advantages: transmission speed, physical robustness, throughput, information security, and so on.
To increase bandwidth, multimode fiber (MMF) was created, in which information is carried over several parallel channels (modes). For all its advantages, MMF has shortcomings, and the researchers set out to eliminate one of them in order to improve image transmission. The problem is this: when a sample is projected onto the proximal face of the MMF, the image that arrives at the distal face is a speckle pattern, because the input is spread over many modes that propagate differently along the length of the fiber. The scientists propose combining multimode fiber with deep-learning neural networks to recover accurate images, including in endoscopy. Let's dig into the researchers' report and try to understand how this works and what results it gives. Let's go.
The basis of the study
Techniques for using artificial neural networks to decode images transmitted through MMF have been in development for a long time. Early works described a two-layer network capable of recognizing about 10 images that had passed through 10 meters of step-index fiber.
The system in this study is much more complex but, according to the scientists, also much more efficient. The first stage was to collect a large number of speckle samples obtained by passing images through the MMF. These became the training data for the DNN (an artificial neural network based on deep learning*).
Sample speckle image
Deep learning* is a family of machine learning methods based on learning representations of the data rather than on algorithms specialized for a specific task.
The DNN architecture is quite complex and has about 14 hidden layers*.
Hidden layer* - an artificial neural network consists of computing units (neurons) divided into three categories: input, hidden and output. Input neurons accept information, hidden neurons perform the calculations, and output neurons pass the result on.
For the DNN experiments, a database of 20,000 handwritten digits was assembled. The database was then split at random into three groups:
- 16,000 digits - training;
- 2,000 digits - validation;
- 2,000 digits - test.
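A random split of this kind can be sketched in a few lines. This is only an illustration: the function name, the fixed seed and the use of NumPy are our assumptions, not details taken from the report.

```python
import numpy as np

def split_indices(n_samples, n_train, n_val, seed=0):
    """Shuffle sample indices and cut them into train / validation / test parts."""
    rng = np.random.default_rng(seed)      # seed is an arbitrary choice
    order = rng.permutation(n_samples)     # random order of all sample indices
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]         # whatever remains goes to the test group
    return train, val, test

train_idx, val_idx, test_idx = split_indices(20_000, 16_000, 2_000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because the three groups are slices of a single permutation, no digit can end up in two groups at once.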
Preparing for the experiment
The image below shows the layout of the optical system used to collect the data.
Image No. 1: diagram of the setup:
Laser source - a source of laser radiation (beam);
HWP - half-wave plate;
M1 - mirror;
SLM - spatial light modulator;
P - linear polarizer;
L - lens;
BS - beam splitter;
OBJ - microscope objective;
OF - optical fiber;
CCD - CCD camera.
And now, step by step. A laser with a wavelength of 560 nm sends light through a graded-index optical fiber* with a core diameter of 62.5 μm and a numerical aperture* of 0.275.
Graded-index MMF* is a fiber with a non-uniform refractive index profile, in which the refractive index decreases smoothly from the fiber axis toward the edge of the core.
Comparison of fiber types: step-index multimode, graded-index multimode and single-mode (top to bottom).
Numerical aperture* - the sine of the maximum angle between an incoming ray and the fiber axis at which the light is still guided along the fiber by total internal reflection.
At this wavelength, the fiber can support about 4,500 spatial modes. The input samples (images) are displayed on the spatial light modulator and then relayed by a 4f system onto the proximal (near, input) face of the MMF. At the far end of the fiber, another 4f system images the speckle emerging from the distal (far, output) face onto a CCD camera.
CCD* is a charge-coupled device, which implements controlled charge transfer within the volume of a semiconductor.
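To get a feel for where a mode count of this order comes from, we can compute the normalized frequency (V-number) of the fiber from the parameters given above. The two mode-count formulas below are common textbook rules of thumb that we bring in as an assumption. The report does not say how its figure of about 4,500 was obtained.

```python
import math

def v_number(core_diameter_m, numerical_aperture, wavelength_m):
    """Normalized frequency V = 2 * pi * a * NA / wavelength, with a = core radius."""
    return math.pi * core_diameter_m * numerical_aperture / wavelength_m

V = v_number(62.5e-6, 0.275, 560e-9)   # fiber parameters quoted in the article

# Textbook approximations (assumptions, not taken from the report):
modes_step_index = V ** 2 / 2          # rule of thumb for a step-index profile
modes_graded_index = V ** 2 / 4        # rule of thumb for a parabolic graded-index profile

print(round(V, 1), round(modes_step_index), round(modes_graded_index))
```

The V²/2 estimate comes out in the same range as the quoted figure, while the V²/4 estimate is half of that, so these numbers should be read as orders of magnitude only.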
To test both phase and amplitude modulation of the input to the graded-index MMF, a half-wave plate was installed before the SLM and a linear polarizer after it.
As mentioned earlier, handwritten digits were used as samples. They were taken from the MNIST database.
Before being processed by the DNN, each image recorded on CCD1 or CCD2 was cropped to 1024 × 1024 pixels. The resulting speckle images were then downsampled to 32 × 32 pixels and used as input for the DNN.
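The crop-and-shrink step described above might look roughly like this. The report does not specify where the crop is taken or how the image is reduced, so the center crop, the block averaging and the 1200 × 1600 frame size are our assumptions.

```python
import numpy as np

def center_crop(image, size):
    """Cut a size x size square out of the middle of a 2D image."""
    h, w = image.shape
    top = (h - size) // 2
    left = (w - size) // 2
    return image[top:top + size, left:left + size]

def block_average(image, out_size):
    """Shrink a square image by averaging non-overlapping square blocks."""
    factor = image.shape[0] // out_size            # 1024 // 32 = 32 pixels per block side
    blocks = image.reshape(out_size, factor, out_size, factor)
    return blocks.mean(axis=(1, 3))                # average inside each block

rng = np.random.default_rng(0)
frame = rng.random((1200, 1600))                   # hypothetical camera frame
cropped = center_crop(frame, 1024)
small = block_average(cropped, 32)
print(cropped.shape, small.shape)
```

Block averaging keeps the overall brightness of the speckle while discarding fine detail. Other downsampling methods would serve the same purpose.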
Image No. 2
Images 2a and 2b show sample digits (0 and 4). 2c and 2d are the same digits after amplitude modulation, in which the amplitude of the transmitted signal is changed. 2e and 2f are the same digits after phase modulation, in which the phase of the carrier wave changes in direct proportion to the signal. We also see the speckles themselves, recorded at the distal face of the fiber after a distance of 2 cm.
The speckles (2g and 2h) are quite hard to tell apart. However, if we compare images 2d and 2h (taking the sample "4" as an example), we can isolate a difference (2i) that the DNN is able to detect. These distinguishing features are what let the system tell "0" from "4", "2" from "9", and so on.
Data processing
The system for classifying speckles and reconstructed input images was based on a convolutional neural network* of the "Visual Geometry Group (VGG)" type (3a).
Convolutional neural network* - a neural network architecture characterized by the convolution operation: each image fragment is multiplied element by element by a convolution matrix (kernel), and the result is summed and written to the corresponding position of the output image.

An example of a convolutional neural network architecture.
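The convolution operation from the definition above can be written out directly. This is a deliberately naive sketch with hypothetical names. Like deep learning frameworks, it does not flip the kernel, and it produces output only where the kernel fits entirely inside the image.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image; multiply element by element and sum."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]     # image fragment under the kernel
            out[i, j] = np.sum(patch * kernel)    # element-wise product, then sum
    return out

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4 x 4 "image"
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])                  # toy 2 x 2 kernel
result = conv2d_valid(image, kernel)
print(result)
```

In a real network the kernel values are not chosen by hand but learned during training, and many kernels are applied in every layer.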
Introducing this system made it possible to decode the images more accurately. For image reconstruction, a "U-net" type convolutional neural network with 14 hidden layers was used (3b).
Image No. 3
Recall that the base of 20,000 digits was divided into three groups (16,000 for training, 2,000 for validation and 2,000 for testing).
The training group was processed in batches of 50 samples for the reconstruction network and 500 for the classification network. The batches were shuffled to avoid overfitting*.
Overfitting* is the case when the system handles examples from the training set well but performs poorly on examples from the test set.
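One common way to change the batches between passes is to reshuffle the training indices before every pass. The sketch below assumes that reading; the generator name and the seed are ours.

```python
import numpy as np

def shuffled_batches(n_samples, batch_size, rng):
    """Yield index batches covering all samples once, in a fresh random order."""
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]

rng = np.random.default_rng(0)
# Two passes over the 16,000 training samples in batches of 50.
epoch_1 = list(shuffled_batches(16_000, 50, rng))
epoch_2 = list(shuffled_batches(16_000, 50, rng))
print(len(epoch_1), len(epoch_1[0]))
```

Each pass still visits every training sample exactly once, but the composition of the batches differs from pass to pass.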
To minimize the root-mean-square error, an optimization algorithm with a learning rate of 1 × 10⁻⁴ was used.
The networks were trained for no more than 50 epochs (full passes through the training set). For each case, training was repeated 10 times to collect statistics on the accuracy of the trained system.
All DNNs were implemented on a single NVIDIA GeForce GTX 1080Ti GPU using the TensorFlow 1.5 Python library.
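To show what a learning rate of this size means, here is a toy example that minimizes a squared-error loss with plain gradient descent on a small linear model. The report does not name the optimizer and used deep networks in TensorFlow, so the model, the data and the update rule here are all stand-ins of our own.

```python
import numpy as np

LEARNING_RATE = 1e-4   # the value quoted in the article

def mse(pred, target):
    """Mean squared error between predictions and targets."""
    return float(np.mean((pred - target) ** 2))

def gradient_step(weights, inputs, targets, lr=LEARNING_RATE):
    """One plain gradient-descent update for a linear model under MSE loss."""
    pred = inputs @ weights
    grad = 2.0 * inputs.T @ (pred - targets) / len(targets)
    return weights - lr * grad

rng = np.random.default_rng(0)
inputs = rng.normal(size=(50, 8))          # one hypothetical batch of 50 samples
targets = inputs @ rng.normal(size=8)      # synthetic targets from a hidden linear rule
weights = np.zeros(8)

loss_before = mse(inputs @ weights, targets)
for _ in range(50):
    weights = gradient_step(weights, inputs, targets)
loss_after = mse(inputs @ weights, targets)
print(loss_after < loss_before)
```

With such a small step the loss falls slowly but steadily, which is the usual trade-off behind choosing a conservative learning rate.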
Research results
Reconstruction
The first thing the scientists decided to examine in more detail was the system's ability to reconstruct the input data.

The image above shows the reconstruction results for the digits (0 ... 9) after the data passed through fibers 0.1 m, 10 m and 1000 m long.
As we can see, the reconstruction is very accurate, which confirms the ability of the U-net to extract the distinguishing features of the image to be recovered.
The reconstruction accuracy was also measured. It decreases as the fiber gets longer, from 96.9% (0.1 m) to 90.0% (1000 m).
The drop in accuracy arises because, in a 1 km fiber, temperature variations (thermal expansion of the material and/or changes in the refractive index) alter the optical path of the signal. As a result, the speckle pattern at the distal end becomes unstable, which makes it harder to reconstruct the intended image.
The researchers note that external disturbance of the fiber also reduces reconstruction accuracy. When the system is developed further, the fiber should therefore be thermally insulated and kept in an isothermal environment to achieve the highest reconstruction accuracy.
The reconstruction procedure also does an excellent job of removing artifacts from the processed image.

For example, the system recovers the image (2a) from the distal speckle (2g) while removing defects projected onto the proximal fiber face (2c and 2e). In addition, the system tries to eliminate artifacts caused by contamination, sample defects or structural imperfections of the fiber itself.
Classification of digit samples
The system can recreate the image, and the accuracy of this process is impressive. We now turn to how accurately the system can determine which digit an image shows, that is, to classify the data after reconstruction.

The graph and table above show that classification accuracy decreases as the length of the fiber increases, the same trend as with reconstruction accuracy. The accuracy drops for both the amplitude and the phase model. With 2 cm of fiber the accuracy is 90%, a good figure, but the fiber is too short. With a length of 1 km, the accuracy drops to 30%. The researchers attribute this to increased scattering losses, mode coupling and drift of the distal speckle, all of which grow with fiber length.
Distal speckle changes
The recording was made at a frame rate of 83 fps. In this experiment, a blank image was transmitted through a 1 km fiber.
(a) and (b) - 2 frames taken from the recording above, (c) - their comparison.
These frames were recorded 2 seconds apart. As we see in image (c), the difference between them is quite significant. Such sharp changes in the speckle may be caused by ambient temperature fluctuations or air flow over the setup (image No. 1), which cause small disturbances in the fiber. As the fiber length increases, the effect of these disturbances becomes noticeable.
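A comparison of two frames like the one above can be put into numbers in several ways. The sketch below uses a pixel-wise absolute difference and a correlation coefficient. Both are our own choice of measure, and the frames are synthetic, since the report's recording is not available here.

```python
import numpy as np

def frame_correlation(frame_a, frame_b):
    """Pearson correlation between two frames: 1.0 means identical patterns."""
    a = frame_a.astype(float).ravel()
    b = frame_b.astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

rng = np.random.default_rng(0)
frame_a = rng.random((64, 64))             # synthetic stand-in for frame (a)
other = rng.random((64, 64))               # unrelated pattern
frame_b = 0.5 * frame_a + 0.5 * other      # stand-in for frame (b): a partly drifted speckle

difference = np.abs(frame_a - frame_b)     # a per-pixel comparison image, in the spirit of (c)
same = frame_correlation(frame_a, frame_a)
drifted = frame_correlation(frame_a, frame_b)
print(round(same, 3), round(drifted, 3))
```

Tracking such a coefficient frame by frame would show how quickly the speckle decorrelates over time.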
It might seem that all the work of the system is in vain because of this "interference". However, difficulties like these do not stop scientists; rather, they prompt them to think.
It was decided to study speckle drift and how it affects classification accuracy. For this, the VGG network was trained on 10,000 samples (half of those available) and then tested on the other half. The process was repeated with the two groups of samples swapped. The results showed no significant change in classification accuracy: the speckle drift is not random, so the neural network is able to learn, remember and account for it.
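The train-on-one-half, test-on-the-other procedure with the halves then swapped is essentially two-fold cross-validation. A minimal skeleton, with a placeholder where the real training and scoring would go, might look like this (all names are hypothetical, and whether the halves were chosen at random is our assumption):

```python
import numpy as np

def two_fold(fit_and_score, n_samples, seed=0):
    """Run fit_and_score(train_idx, test_idx) twice, swapping the two halves."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    half = n_samples // 2
    first, second = order[:half], order[half:]
    return fit_and_score(first, second), fit_and_score(second, first)

def dummy_fit_and_score(train_idx, test_idx):
    # Placeholder: a real version would train a network on train_idx
    # and return its accuracy on test_idx.
    return len(train_idx), len(test_idx)

scores = two_fold(dummy_fit_and_score, 20_000)
print(scores)
```

If the two runs give similar accuracy, the result does not depend on which half of the recording the network happened to see during training.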
The difference between amplitude and phase modulation was small. With a 10 m fiber, classification was slightly better with phase modulation than with amplitude modulation. This is due to a more uniform distribution of light across the modes of the fiber. With amplitude modulation, the number of modes involved in the transmission is limited by selective spatial excitation of the fiber.
With a 1 km fiber, amplitude modulation already outperforms phase modulation. When light passes through a long optical fiber, all modes take part in carrying the information at once.
Error matrices (confusion matrices)
To improve classification accuracy, the neural network was also trained on already reconstructed samples. Error matrices were also computed, and the classification accuracy improved significantly.
For example, with a 1 km fiber there is confusion between the digits 4 and 9, as well as among 3, 5, 6 and 8.
To confirm this, just look at the reconstruction results.
Digits 4 and 9
Digits 3, 5, 6 and 8
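An error matrix of the kind mentioned above counts how often each true digit is assigned to each predicted digit. Here is a small sketch with made-up labels. They are not data from the study, only an illustration of a 4/9 and 3/8 mix-up.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, n_classes=10):
    """Rows are true digits, columns are predicted digits, cells are counts."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(matrix, (true_labels, predicted_labels), 1)   # count each (true, predicted) pair
    return matrix

# Hypothetical labels for eight test images.
true_labels = np.array([4, 4, 9, 9, 3, 8, 5, 6])
predicted = np.array([4, 9, 9, 4, 8, 8, 5, 6])

cm = confusion_matrix(true_labels, predicted)
accuracy = np.trace(cm) / cm.sum()          # correct predictions lie on the diagonal
print(cm[4, 9], cm[9, 4], cm[3, 8], accuracy)
```

Off-diagonal cells show exactly which pairs of digits the classifier mixes up, which is how a pattern such as the one between 4 and 9 becomes visible.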
The graphs above show how image classification accuracy changes over time:
a - 10 m of fiber and distal speckles;
b - 10 m of fiber and reconstructed images;
c - 1 km of fiber and distal speckles;
d - 1 km of fiber and reconstructed images.
For a detailed look at the nuances of the study, I strongly recommend reading the scientists' report. A PDF version is also available on this page ("Get PDF" button).
Epilogue
This study showed excellent results, which bodes well for its further development and practical use. The techniques described can be applied in telecommunications (decoding in multiplexing) and even in medicine (endoscopy).
By measuring the time costs, the scientists found that most of the time is spent on preparing the system, specifically on training it. This suggests that an already trained system can do its job extremely quickly, within milliseconds. The only limitation is the power of the hardware.
Of course, there is still a lot to be studied in the field of artificial neural networks based on deep learning. But their usefulness is already visible. Improving existing systems, whatever their application, is just as important as creating new ones. It is not always necessary to reinvent the wheel if you can simply improve it. The main thing, as practice has shown, is to think outside the box, to learn from our own and others' mistakes, to set ourselves tasks that sometimes seem impossible, and to believe in our own strength. If an idea can benefit humanity, it must be implemented.