
Adjusted sliding exam, co-classifiers, fractal classifiers and local error probability

This paper presents elements of an introduction to classification with training on small samples, from a convenient notation to special reliability estimates. The steadily growing speed of computing devices, together with the small sample sizes themselves, makes it possible to neglect the considerable amount of computation needed to obtain some of these estimates.

Definitions and notation

Let an initial partition of a set $\Omega$ of objects $\omega$ into two subsets (classes) $\Omega_1$ and $\Omega_2$ be given, such that

$\Omega_1 \cup \Omega_2 = \Omega, \qquad \Omega_1 \cap \Omega_2 = \emptyset$.
(1)

We will identify a two-class classifier with a binary function of the form

$d = d(S_1, S_2;\ x) \in \{1, 2\},$
(2)

where $S_1, S_2$ are random training subsets sampled from the classes $\Omega_1, \Omega_2$, and $x$ is the examined object that must be assigned to one of the classes. The values of this function will be interpreted as "decisions" according to the rule

if $d(S_1, S_2;\ x) = k$, then decide that $x \in \Omega_k$, $k = 1, 2$.
(3)
Depending on whether or not the classifier's decisions agree with the original partition of $\Omega$ into classes, we will call them "correct" or "erroneous", respectively. We also agree to denote the elements of the samples $S_1, S_2$ so that, respectively,

$S_1 = \{x_1^{(1)}, \ldots, x_{m_1}^{(1)}\}, \qquad S_2 = \{x_1^{(2)}, \ldots, x_{m_2}^{(2)}\}$,
(4)

where $m_1, m_2$ are the volumes of the training samples. Let the set $\Omega$ be "immersed" in the $n$-dimensional real Euclidean space $R^n$. Then all elements of the classes $\Omega_1, \Omega_2$, including, of course, the elements of the training samples and the object under study, can be regarded as its points. Coordinates (features) of points from the set $\Omega$ will be marked with a right subscript $j = 1, \ldots, n$: the coordinates of training-sample objects will be written as $x_{ij}^{(k)}$, those of the object under study $x$ as $x_j$. Depending on the context, $x$ is understood either as an object name or as a radius vector.
We proceed from the absence of a test sequence, so that estimation of the classification error probability $P$ is feasible by the sliding exam (leave-one-out) method:

$\tilde P = \dfrac{\tilde m_1 + \tilde m_2}{m_1 + m_2}$,
(5)

where

$\tilde m_1 = \sum\limits_{i=1}^{m_1} \mathbf{1}\!\left\{ d\!\left(S_1 \setminus \{x_i^{(1)}\},\ S_2;\ x_i^{(1)}\right) = 2 \right\}$,
(6)

$\tilde m_2 = \sum\limits_{i=1}^{m_2} \mathbf{1}\!\left\{ d\!\left(S_1,\ S_2 \setminus \{x_i^{(2)}\};\ x_i^{(2)}\right) = 1 \right\}$,
(7)

and $\mathbf{1}\{\cdot\}$ denotes the indicator function.

The objects $x_i^{(k)}$ of the training samples classified in the sliding-exam mode will be called quasi-replaceable in what follows.
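
To make the procedure concrete, here is a minimal Python sketch of the sliding exam (5)-(7). The nearest-mean rule standing in for the classifier (2) is purely our illustrative assumption; any classifier with the same signature can be substituted.

```python
import numpy as np

def d(S1, S2, x):
    # Illustrative stand-in for the classifier (2): decide by the nearest
    # class mean. Not the paper's classifier; any rule d(S1, S2, x) -> {1, 2}
    # can be plugged in instead.
    mu1, mu2 = S1.mean(axis=0), S2.mean(axis=0)
    return 1 if np.linalg.norm(x - mu1) <= np.linalg.norm(x - mu2) else 2

def sliding_exam(S1, S2):
    # Equations (5)-(7): each quasi-replaceable object is removed from its
    # own sample and classified using the remaining training material.
    m1_err = sum(d(np.delete(S1, i, axis=0), S2, S1[i]) == 2
                 for i in range(len(S1)))
    m2_err = sum(d(S1, np.delete(S2, i, axis=0), S2[i]) == 1
                 for i in range(len(S2)))
    return (m1_err + m2_err) / (len(S1) + len(S2))
```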

Adjusted sliding exam

The sliding exam, as is well known, has several drawbacks, which can be eliminated to some extent by correcting it. Marking the adjusted estimates with a left prime, we write them as follows:

${}'\tilde m_1 = \dfrac{1}{m_2} \sum\limits_{i=1}^{m_1} \sum\limits_{l=1}^{m_2} \mathbf{1}\!\left\{ d\!\left(S_1 \setminus \{x_i^{(1)}\},\ S_2 \setminus \{x_l^{(2)}\};\ x_i^{(1)}\right) = 2 \right\}$,
(8)

${}'\tilde m_2 = \dfrac{1}{m_1} \sum\limits_{i=1}^{m_2} \sum\limits_{l=1}^{m_1} \mathbf{1}\!\left\{ d\!\left(S_1 \setminus \{x_l^{(1)}\},\ S_2 \setminus \{x_i^{(2)}\};\ x_i^{(2)}\right) = 1 \right\}$,
(9)

${}'\tilde P = \dfrac{{}'\tilde m_1 + {}'\tilde m_2}{m_1 + m_2}$.
(10)

The disadvantages of the adjusted sliding exam are the increased number of operations and the fact that the estimate is computed with both samples smaller by one unit. For small samples the error-probability estimate is therefore somewhat overestimated; as the sample size grows, however, this effect loses its significance.
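
Under the reading of (8)-(10) given above, where both samples are reduced by one object and the result is averaged over the object removed from the opposite sample (our assumption, since the original formulas were lost), the correction might be sketched as follows, reusing d from the previous sketch:

```python
import numpy as np

def adjusted_sliding_exam(S1, S2):
    # Both samples are one unit smaller: besides the quasi-replaceable object
    # itself, one object of the opposite sample is removed, and the error
    # indicator is averaged over all such removals.
    err1 = sum(d(np.delete(S1, i, axis=0), np.delete(S2, l, axis=0), S1[i]) == 2
               for i in range(len(S1)) for l in range(len(S2))) / len(S2)
    err2 = sum(d(np.delete(S1, l, axis=0), np.delete(S2, i, axis=0), S2[i]) == 1
               for i in range(len(S2)) for l in range(len(S1))) / len(S1)
    return (err1 + err2) / (len(S1) + len(S2))
```

The nested loops make the cost roughly $m_1 m_2$ classifier evaluations instead of $m_1 + m_2$, which is the increase in operations the text mentions.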

Co-classifier

Because of the considerable computational cost of the adjusted sliding exam, a method of binary assessment of classification reliability, the co-classifier, is of interest. Like the sliding exam, it can be based solely on information from the training samples, but it can also be used when test sequences are available.
Let us introduce the samples $\tilde S_+$ and $\tilde S_-$ of objects correctly and erroneously classified, respectively, by the classifier (2) in the sliding-exam mode:

$\tilde S_+ = \left\{\, x \in S_1 \cup S_2 : \text{the decision on } x \text{ is correct} \,\right\}$,
(11)

$\tilde S_- = \left\{\, x \in S_1 \cup S_2 : \text{the decision on } x \text{ is erroneous} \,\right\}$.
(12)

Then the decision of the first-order co-classifier of the classifier (2), defined as

$d^{(1)} = d^{(1)}(\tilde S_+, \tilde S_-;\ x) \in \{+,\ -\}$,
(13)

is interpreted as follows:

if $d^{(1)} = +$, then the classifier (2) made a correct decision regarding $x$;
if $d^{(1)} = -$, then the classifier (2) made an erroneous decision regarding $x$.
(14)

Here we assume that the samples (11) and (12) are drawn from the classes of objects potentially classified correctly or erroneously by the classifier (2).

In the definition (13) it is assumed that the sample size $|\tilde S_-|$ is not too small. Thus, if we have only the material of the training samples and no test sequences, the co-classifier is recommended for use when the classifier (2) makes a significant number of errors. If the co-classifier is nevertheless used on a small sample $\tilde S_-$, it should be chosen in a fairly simple form. For example, if the co-classifier is of the Fisher type, one can assume diagonality of the covariance matrix or even take it to be the identity matrix.
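
A minimal sketch of constructing the samples (11)-(12) and of a simple Fisher-type co-classifier with a diagonal covariance estimate, the simple form recommended above; the helper d is the stand-in classifier from the earlier sketch:

```python
import numpy as np

def co_classifier_samples(S1, S2):
    # Split the quasi-replaceable objects into those classified correctly
    # (S_plus) and erroneously (S_minus) in sliding-exam mode, eq. (11)-(12).
    S_plus, S_minus = [], []
    for i in range(len(S1)):
        ok = d(np.delete(S1, i, axis=0), S2, S1[i]) == 1
        (S_plus if ok else S_minus).append(S1[i])
    for i in range(len(S2)):
        ok = d(S1, np.delete(S2, i, axis=0), S2[i]) == 2
        (S_plus if ok else S_minus).append(S2[i])
    return np.array(S_plus), np.array(S_minus)

def fisher_diag(Sa, Sb):
    # Fisher-type linear rule with pooled *diagonal* within-class variances,
    # suitable when the error sample is small. Returns a decision function
    # mapping x to '+' (first sample) or '-' (second sample).
    mu_a, mu_b = Sa.mean(axis=0), Sb.mean(axis=0)
    var = (len(Sa) * Sa.var(axis=0) + len(Sb) * Sb.var(axis=0)) \
          / (len(Sa) + len(Sb)) + 1e-9
    w = (mu_a - mu_b) / var              # diagonal-covariance Fisher weights
    c = w @ (mu_a + mu_b) / 2            # threshold midway between the means
    return lambda x: '+' if w @ x > c else '-'
```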

Like adaptive boosting, the composition of a classifier with its co-classifiers can be regarded as a collective classifier, organized far more nonlinearly than those proposed in [1].

Let us dwell on the choice of a particular form of the co-classifier. Suppose, for example, that the samples $\tilde S_+, \tilde S_-$ are drawn from classes with distribution densities $f_+, f_-$ and that the classes overlap strongly. In this case it can often turn out that the sample densities have close mean values. The co-classifier can then be chosen, for example, in the form of a linear Fisher classifier modified using the Peterson-Mattson procedure [2, 3].
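
As a sketch of that modification: our reading of the Peterson-Mattson procedure in [2, 3] is that the weight vector is taken from the family $w(s) = \left(s\Sigma_+ + (1-s)\Sigma_-\right)^{-1}(\mu_+ - \mu_-)$ and the mixing parameter $s \in [0, 1]$ is scanned for the smallest empirical error. Treat the following as an assumption rather than the authors' exact recipe:

```python
import numpy as np

def fisher_peterson_mattson(Sa, Sb, steps=21):
    # Scan the mixing parameter s of the two class covariance matrices and
    # keep the linear rule with the fewest training errors (our reading of
    # the Peterson-Mattson procedure in [2,3]).
    mu_a, mu_b = Sa.mean(axis=0), Sb.mean(axis=0)
    Ca, Cb = np.cov(Sa, rowvar=False), np.cov(Sb, rowvar=False)
    best = None
    for s in np.linspace(0.0, 1.0, steps):
        w = np.linalg.solve(s * Ca + (1 - s) * Cb + 1e-9 * np.eye(len(mu_a)),
                            mu_a - mu_b)
        c = w @ (mu_a + mu_b) / 2
        errors = int(np.sum(Sa @ w <= c) + np.sum(Sb @ w > c))
        if best is None or errors < best[0]:
            best = (errors, w, c)
    _, w, c = best
    return lambda x: '+' if w @ x > c else '-'
```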

Fractal classifier

The synthesis of higher-order co-classifiers can be continued within a recurrent procedure. First the substitution

$S_1 \to \tilde S_+, \qquad S_2 \to \tilde S_-$
(15)

is performed; then, repeating the algorithm above, we obtain at the output the second-order co-classifier

$d^{(2)} = d^{(2)}(\tilde S_+^{(1)}, \tilde S_-^{(1)};\ x)$,
(16)

and the procedure continues. An imperative stop occurs when building the co-classifier of the order at which the error sample $\tilde S_-$ becomes too small or even empty. As a result we obtain an iterated system of classifiers, the fractal classifier. This collective classifier should not, of course, be confused with image classifiers that use fractal and wavelet transforms to process images.
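
Reusing co_classifier_samples and fisher_diag from the sketches above, the recurrent construction with its imperative stop might look like this (min_size and max_order are our illustrative parameters):

```python
def fractal_classifier(S1, S2, min_size=3, max_order=5):
    # Recursively build co-classifiers of increasing order via the
    # substitution (15), stopping when the error sample becomes too
    # small or empty -- the imperative stop of the text.
    levels = []
    Sa, Sb = S1, S2
    for order in range(max_order):
        S_plus, S_minus = co_classifier_samples(Sa, Sb)
        if len(S_minus) < min_size or len(S_plus) < min_size:
            break
        levels.append(fisher_diag(S_plus, S_minus))
        Sa, Sb = S_plus, S_minus       # substitution (15) for the next order
    return levels                      # one co-classifier per built order
```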

In practice we have had occasion to use only first-order co-classifiers. We developed them many years ago, and they have proven to be useful tools in various practical problems, in particular in the analysis of reflected radio signals for installations that search for anti-personnel plastic mines [4], as well as in the LEKTON system. That system verified, fully automatically, the authenticity of signatures on cheques, bills and other documents, and was the first system of this type actually used at a bank.

Local error probability

In practical studies, local estimates of the classification error probability have proven themselves well. Let the classifier (2) have the form

$d(S_1, S_2;\ x) = \begin{cases} 1, & \hat f_1(x) \ge \hat f_2(x), \\ 2, & \hat f_1(x) < \hat f_2(x), \end{cases}$
(17)

where $\hat f_1, \hat f_2$ are estimates of the class densities $f_1, f_2$ built from the samples $S_1, S_2$, respectively. Then the local estimate of the error probability of this classifier can be defined as

$\hat P(x) = \dfrac{\min\left(\hat f_1(x),\ \hat f_2(x)\right)}{\hat f_1(x) + \hat f_2(x)}$,
(18)

so that $0 \le \hat P(x) \le 1/2$.
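
A minimal sketch of (17)-(18); the Gaussian kernel density estimator is our assumption, since the text does not fix the form of the density estimates:

```python
from scipy.stats import gaussian_kde

def local_error_estimate(S1, S2, x):
    # Density estimates from the training samples (any estimator will do;
    # Gaussian KDE is an illustrative choice). gaussian_kde expects the
    # data as (dimensions, observations).
    f1 = gaussian_kde(S1.T)(x.reshape(-1, 1))[0]
    f2 = gaussian_kde(S2.T)(x.reshape(-1, 1))[0]
    decision = 1 if f1 >= f2 else 2               # the classifier (17)
    p_err = min(f1, f2) / (f1 + f2)               # local estimate (18)
    return decision, p_err
```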
We introduce a special estimate that can be regarded as a "blurred" (fuzzy) classifier:

$\hat\mu(x) = \dfrac{\hat f_1(x)}{\hat f_1(x) + \hat f_2(x)}$,
(19)

where $0 \le \hat\mu(x) \le 1$. We will treat $\hat\mu(x) > 1/2$ as the fuzzy classifier's decision that $x \in \Omega_1$, and $\hat\mu(x) < 1/2$ as the decision that $x \in \Omega_2$. The closer $\hat\mu(x)$ is to zero or to one, the more trustworthy the corresponding decision of the fuzzy classifier.

Based on the estimate (19), one can generalize the definition of the classifier (2) by introducing a reject zone (a zone of refusal to decide). Denoting the widths of this zone on either side of $1/2$ by $\delta_1, \delta_2$, we present the rule in the following form:

$d(x) = \begin{cases} 1, & \hat\mu(x) \ge 1/2 + \delta_1, \\ \text{reject}, & 1/2 - \delta_2 < \hat\mu(x) < 1/2 + \delta_1, \\ 2, & \hat\mu(x) \le 1/2 - \delta_2, \end{cases}$
(20)

where $1/2 + \delta_1$ and $1/2 - \delta_2$ are the boundaries of the zone. In the absence of asymmetry in the requirements, the zone is chosen with $\delta_1 = \delta_2$.
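
A short sketch of the fuzzy classifier (19) with the symmetric reject zone of (20), $\delta_1 = \delta_2$; the density estimator is again an assumed Gaussian KDE:

```python
from scipy.stats import gaussian_kde

def fuzzy_decision(S1, S2, x, delta=0.1):
    # Blurred classifier (19) with a symmetric reject zone (20).
    f1 = gaussian_kde(S1.T)(x.reshape(-1, 1))[0]
    f2 = gaussian_kde(S2.T)(x.reshape(-1, 1))[0]
    mu = f1 / (f1 + f2)
    if mu >= 0.5 + delta:
        return 1                       # decide x in Omega_1
    if mu <= 0.5 - delta:
        return 2                       # decide x in Omega_2
    return 'reject'                    # mu falls inside the reject zone
```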

Literature:

1. Archipov G.F. Collectives of decision rules. In: Statistical Problems of Control, Vilnius, 1983, vol. 61, pp. 130-145.
2. Myasnikov V.V. Modifications of the method for constructing a linear discriminant function based on the Peterson-Mattson procedure. computeroptics.smr.ru/KO/PDF/KO26/KO26211.pdf.
3. Fukunaga K. Introduction to the Statistical Theory of Pattern Recognition. Moscow: Nauka, 1979, pp. 105-130.
4. Archipov G., Klyshko G., Stasaitis D., Levitas B., Alenkowicz H., Jefremov S. MIKON-2000, XII International Conference on Microwaves, Radar and Wireless Communications, vol. 2, pp. 495-498.

Source: https://habr.com/ru/post/276017/

