ASN.1 in simple terms (part 3, final)

I continue to publish my article "ASN.1 in simple words." The previous parts of the article, placed on Habré, can be found here: ASN.1 in simple words (encoding type REAL) and ASN.1 in simple words (part 2) .

Chapter 7. Coding Bit Sequences (Bit Strings)

In the discharge of types encoding bit strings, I attribute two types - BIT STRING and OCTET STRING.

The BIT STRING type is intended for storing sequences of the smallest blocks of information - bits. In fact, no coding occurs — a block of the coded value is placed in the value block for the encoded value. With only one exception - the number of unused bits is stored in the first octet of the encoded value. Values of the number of unused bits can vary from 0 to 7. Unused bits are located in the bit string to the extreme right. That is, if the bit sequence is coded 0000 1111 = 0F, and the number of unused bits is 4, then when decoding this bit string should be interpreted as 0000.
')
When encoding type BIT STRING can be used as a primitive encoding method, and constructive. A constructive method of encoding bit strings allows you to logically split one bit string into many smaller ones. When decoding such a bit string encoded by the constructive method, the values from all the nested bit strings are combined into one large bit string. In the case of using a constructive method for specifying bit strings in a block, only bit strings should be encoded, and for all bit substrings except the last, the octet with the value of unused bits should be zero.

Let's give an example of constructive coding of bit strings. Suppose the initial bit string contains the values 0B 0B 0F, and the number of unused bits is 4. Now divide this string into 3 embedded bit substrings and obtain the following sequence of octets encoding the initial bit string in a constructive form:

23 0C
03 02 00 0B
03 02 00 0B
03 02 04 0F

Here, in the first two octets, the identification octet for the constructive form of the bit-string coding and the total length of the constructive-form value (12) are given. Next, set the coding of the bit string with a value of 0B (the number of unused bits is zero!). Then again the bit string is encoded with the value 0B (again, the number of unused bits must be equal to zero). The last coded bit string is 0F, which has the number of unused bits equal to 4. Thus, by connecting successively coded bit substrings, we get the 01 01 0 bit coded bit string (the last 4 bits are not used, and therefore F = 1111 is cut off from the final value) .

The OCTET STRING type is used to store simple sequences of bytes (octets). That is, anything can be encoded into this type! That is, in this type, you can even encode the contents of the operating system files. One “but” in the case of using the “implicit length setting” (see the chapter on ASN.1 encoding basics) for the OCTET STRING type, you should independently monitor the appearance of two zero octets in a row (00 00) in the encoded stream and create additional nested substrings dividing null octets in the input octet stream.

Chapter 8. Coding Prefix Types

It is often necessary to distinguish between elements encoded in the same type, one after the other (for example, when encoding type SET, where the order of nested elements can be arbitrary). For this purpose, additional coding of elements in blocks with new, specific tags is used. For example, the type REAL has the UNIVERSAL (00 ₂ ) tag class and the tag number 9. In order to logically distinguish two consecutive REAL values, additional “wrapping” value coding is used. To do this, use tag classes other than the UNIVERSAL (00 ₂ ) class. That is, a new type is introduced, with its own distinctive tag number and a tag class different from UNIVERSAL (00 ₂ ), and the entire standard block of the encoded type REAL is encoded as the value for this new type (the new type encapsulates all the coded value of another type, together with blocks of the identification octet and blocks of length).

For example, “wrap” the encoded value for the REAL type for a value of 0.15625 (coded in binary form, the base is 2, see the chapter on REAL encoding). This value is fully encoded by five octets 09 03 80 FB 05. For the external “wrapper” we use the PRIVATE tag class (11 ₂ ) and choose the tag number equal to 2. Since the value for the indoor unit will be used not the primitive type, but the coded standard value for the REAL type, the outer “wrapper” will have a constructive encoding form. Therefore, the complete REAL type together with the “wrapper” will be encoded with the help of E2 05 09 03 80 FB 05 octets.

In the case when there is a pre-agreed and well-known ASN.1 message scheme between the recipient of the encoded information and the sender, then using the “wrapping coding” you can omit the information octet value for the internal, “wrapped” value. In this case, the primitive type will be used as the value for the “wrapper” and the encoded value can be written as C2 03 80 FB 05 (here C2 is an indication of the use of the PRIVATE + tag class and the application of the primitive encoding form + tag number is 2; 03 is the length value block; the remaining octets are a value block, taken directly from the value block for standard encoding of the type REAL). Thus, it can be said that the use of pre-agreed ASN.1 schemes allows you to encode messages in selected places of common ASN.1 with selected types using selected values of both tag classes and tag class numbers (complete freedom of action!).

In conclusion, I will say a few words about the notation, which refers to “prefix types”. Consider a few examples:

Type1 :: = [0] BOOLEAN

It describes Type1, which has a Context-specific (10) ₂ tag class, a tag number of 0 and a constructive encoding form. In the value block for Type1, the fully encoded value of the standard type BOOLEAN is transmitted.

Type2 :: = [PRIVATE 2] IMPLICIT BOOLEAN

It describes the type Type2, which has the class of the tag "Private" (11) ₂ , the tag number is 2 and the primitive form of coding. That is, in the value block for Type2, only the corresponding value block is transmitted from the standardly encoded type BOOLEAN (the identification block and the length block are now taken from Type2).

That is, by default in ASN.1, Context-specific (10) ₂ is used as a tag class, and a constructive encoding form is used. That is, the first example can be written in an equivalent form (although such a record cannot be directly present in the file describing the types of ASN.1):

Type1 :: = [CONTEXT 0] EXPLICIT BOOLEAN

By the way, the formally full equivalent of the standard type, for example BOOLEAN, can be written as:

BOOLEAN_Eq :: = [UNIVERSAL 1] IMPLICIT BOOLEAN

Another interesting situation arises when the IMPLICIT notation is applied to the initially constructive types, for example, the SEQUENCE type. In this case, the new type will also have a constructive form, and the value block for the new type will still be taken from the value block of the encoded type SEQUENCE. Thus, for the IMPLICIT notation, the following rules can be defined:

The value block for a new type is always taken from the value block of the type being encoded;
The use of constructive or primitive coding depends on the type being encoded - the type of coding of a new type is equivalent to the type of coding used in the type being encoded;

For example, in the following notation, the PrimType type, despite the use of IMPLICIT, will have a constructive type:

ConstType :: = [0] REAL
PrimType :: = [PRIVATE 2] IMPLICIT ConstType

In conclusion, for the type of CHOICE, the ASN.1 encoding rules should always apply the EXPLICIT (constructive coding) notation.

Chapter 9. Coding Type SEQUENCE

The SEQUENCE type serves for the logical grouping (union) of coded values for different types. In fact, the name itself SEQUENCE (in translation “sequence”) indicates the scope of this type. The sequence of types in a sequence is predefined and cannot be changed.
The coding for this type is “constructive,” that is, the block with the coded value contains additional subblocks encoding individual values. For a more detailed description I refer the reader to chapter 1 of this article with a more detailed description.

An example of coding type SEQUENCE. Suppose that two numbers must be encoded in the sequence: one integer (INTEGER), equal to -128, and one floating-point number (REAL) equal to 0.15625 (represented in the base 2 decomposition). From the corresponding previous chapters, you can find out that an integer is encoded as a sequence of octets (02 01 80), and a floating-point number is encoded as a sequence of octets (09 03 80 FB 05). Then the SEQUENCE type containing these two coded numbers will be encoded in the form:

30 08 02 01 80 09 03 80 FB 05

Here, the first octet is an identification octet and informs that a SEQUENCE value is encoded and that a constructive encoding method is used. The second octet stores the number of octets in which the SEQUENCE value is encoded. The next three octets represent the encoded integer value (-128), and the last 5 octets represent the value of the encoded floating point number (0.15625).

Chapter 10. SET Type Coding

The SET type encoding is actually fully consistent with the SEQUENCE type encoding, with the exception that the order of the types encoded with SET may change. That is, if, for example, we take the numbers from the chapter on the type of SEQUENCE, then in fact we get two possible (absolutely equivalent) variants of encoding them for the type SET:

Option 1: 31 08 02 01 80 09 03 80 FB 05
Option 2: 30 08 09 03 80 FB 05 02 01 80

The second option simply places the coded integer after the coded floating point number.

In the SET type, two (or more) values that have the same type can also be encoded. When decoding, their values are distinguished by the use of “prefix types”. About them it was already told in chapter 8.

Chapter 11. Coding Type BOOLEAN

The type BOOLEAN can encode only two values - either TRUE (true) or FALSE (false). If the value is FALSE, then there should be only one octet equal to 00 in the value block. If the value is TRUE, then there should be only one octet in the value block, the value of which is non-zero. That is, the following two encoding options for TRUE for type BOOLEAN are equivalent:

Option 1: 01 01 01
Option 2: 01 01 FF

Chapter 12. NULL Encoding

A value of type NULL is always constant and always encoded with only two octets of 05 00, where the first octet is an information octet, and the second octet is an octet of length, which always encodes a zero length.

Conclusion for Habr

The full text in PDF format (along with additional examples enclosed in PDF) can be downloaded for a direct link . It is better to use the latest Acrobat Reader for viewing.

Source: https://habr.com/ru/post/150888/

All Articles