Generating a list of IPv4 addresses in TCL, and a few words about number systems

Not long ago I had to solve the problem of updating the configuration of many devices at once. This is a standard system administration task whenever more than one device of the same type is in service. Solutions range from universal products, for example the freely available redmine.nocproject.org, to the many scripts published on thematic forums and portals. For exactly such a case a home-made script should have been at hand, but it was not. Since there was time to spare, the script was written anew, run, and put on the shelf, where it will no doubt get lost again.
For writing it I used expect (expect.sourceforge.net), an extension for TCL that can process and react to the output of interactive console utilities, telnet in particular. Since I had not written anything in TCL before, the code deserved a second look. The key part of the script is the generator of the list of IPv4 addresses to process. After a careful review of this piece I managed, in my opinion, to optimize it significantly: at the very least the number of lines dropped by a third, and new functionality was added without serious consequences. Moreover, these reductions had little to do with the specifics of TCL and instead concerned fundamental approaches to constructing the algorithm as a whole.
I extracted this code into a separate utility, which I will take apart in detail below: how it was "before", what it became "after", and why it could not be written as "after" right away. I still do not like everything in it: algorithmic problems are mixed with TCL problems, for example the use of lists instead of arrays (which is faster? safer? ideologically more correct?). All these doubts are also present in the text, in the hope of constructive comments.
The logic of the utility (cipl.tl) is as follows. Two parameters are given on the command line: the IPv4 address from which the list starts, and either the IPv4 address at which the list ends or a number saying how many addresses follow the first one. The list is built from the smaller address (the first parameter) to the larger one. If the second parameter is omitted, the list consists only of the initial IPv4 address:

 > cipl.tl 192.0.2.1 1
 192.0.2.1
 192.0.2.2
 > cipl.tl 192.0.2.1 192.0.2.2
 192.0.2.1
 192.0.2.2
 > cipl.tl 192.0.2.1
 192.0.2.1

On Windows the script is run through the tclsh interpreter; the same works on *nix:

 > tclsh cipl.tl 192.0.2.1
 192.0.2.1

Below I quote the code, labelled with version and line numbers, and then comment on it. The final version is available via the links at the end of the article.

Ver.1, №1-12

 #!/usr/bin/tclsh8.5

 set exip {^(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d{0,1})(\.(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d|\d)){3}$}
 set exdg {^(0?|([1-9]\d{0,5}))$}

 if {$argc == 0 || $argc > 2} then {
     puts "cipl.tl -   IP    "
     puts ": cipl.tl <start IP> \[<count>|<finish IP>\]"
     puts ": <start IP> - IPv4     <finish IP>"
     puts "\t <finish IP> - IPv4 "
     puts "\t <count> -   0  999999"
 } else {

The first line specifies the interpreter to use; this is the line for Linux. In general you must give the full path to tclsh; on FreeBSD, for example, the line looks like this:

 #!/usr/local/bin/tclsh8.5

Next we set the variables exip and exdg, regular expressions that are used later in the program. The first one verifies the input IPv4 address. It accepts valid addresses in decimal notation from 1.0.0.0 to 255.255.255.255; an address such as 192.0.02.010 is rejected. The second one matches a number without leading zeros: the allowed list length is from 0 to 999999, and the empty string also matches. The upper limit of 999999 seemed reasonable to me, and besides, I did not want to spend time looking for a regular expression matching numbers up to 2 to the power of 32. These regular expressions did not appear at once but were added as the solution required, which explains why they look the way they do; this will become clearer a little further on.
Then the if condition checks the number of parameters passed on the command line: if there are none or more than 2, a short help text is printed. Alternatively, one could print nothing at all and simply take the first two parameters, which suits batch use of the utility better because errors do not clutter the output.
The last line opens the else block in which the main processing takes place.
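The two patterns can be tried in isolation before reading further. The snippet below is only a sketch: the patterns are copied from the listing, while the sample inputs are hypothetical and chosen for illustration. It prints 1 for a match and 0 otherwise.

```tcl
# Patterns copied from the listing above.
set exip {^(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d{0,1})(\.(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d|\d)){3}$}
set exdg {^(0?|([1-9]\d{0,5}))$}

# Hypothetical sample addresses: two valid ones, then leading zeros,
# a zero first octet and an octet above 255.
foreach candidate {192.0.2.1 255.255.255.255 192.0.02.010 0.0.0.1 256.1.1.1} {
    puts [format "%-16s address: %d" $candidate [regexp $exip $candidate]]
}

# Hypothetical sample counts: in range, too long, and with leading zeros.
foreach candidate {0 42 999999 1000000 007} {
    puts [format "%-16s count:   %d" $candidate [regexp $exdg $candidate]]
}
```

Only the first two addresses and the first three counts should be accepted.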

Ver.1, №12-17

 } else {
     set startip [lindex $argv 0]
     set countip [lindex $argv 1]
     set getcountip $countip
     if {[regexp $exip $startip]} then {
         set octsip [split $startip {.}]

In this part we first save the input parameters: the first one in startip and the second one in countip; if there is no second parameter, lindex returns an empty string. The second parameter is also saved in an additional variable, getcountip.
Next, regexp checks the first parameter against the exip pattern defined earlier. It is important here that the IPv4 address matches exactly what we expect, because the next line uses split, with the dot as separator, to create the octsip list. The resulting list must contain only decimal numbers from 0 to 255, in the right places and without leading zeros, so that we can operate on them without further checks. Leading zeros matter because, for example, the number 011 would be interpreted on substitution as octal, that is, as 9 in decimal.
Note that a search engine query quite often leads to regular expressions that do not check all these conditions; often it is just a test for 4 groups of up to 3 digits. For example, the expression from habrahabr.ru/blogs/webdev/123845 accepts 000.100.1.010, which is of course an IP address, but it is unclear whether its octets are octal or decimal; this introduces ambiguity and requires further checks.
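The leading-zero pitfall can be observed directly. The sketch below is not part of cipl.tl; the helper name decimalOctet is hypothetical. It relies on scan with %d, which always reads decimal digits, whereas how expr treats a literal such as 011 depends on the interpreter version (the 8.x series reads it as octal), so the first line prints whatever the local interpreter does.

```tcl
# Hypothetical helper: read an octet as decimal regardless of leading zeros.
proc decimalOctet {text} {
    # scan returns the number of successful conversions (1 on success).
    if {[scan $text %d value] != 1} {
        error "not a number: \"$text\""
    }
    return $value
}

# The result of the first line depends on the Tcl version in use.
puts "Tcl [info patchlevel]: expr {011} gives [expr {011}]"
puts "scan reads 011 as [decimalOctet 011]"
```

The article avoids the problem differently, by rejecting leading zeros in the regular expression; normalizing with scan is an alternative, not the author's method.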
Ver.1, №18-35

 if {[regexp $exip $countip]} then {
     set octfip [split $countip {.}]
     set octsub {0}
     set countip {0}
     for {set i 3} {$i>=0} {incr i -1} {
         if {[set octsub [expr [lindex $octfip $i] - [lindex $octsip $i]]] < 0} then {
             if {$i > 0} then {
                 set si [expr $i - 1]
                 set octfip [lreplace $octfip $si $si [expr [lindex $octfip $si] - 1]]
                 set octsub [expr 256 + $octsub]
             } else {
                 break
             }
         }
         set ni [expr 3 - $i]
         set countip [expr $countip + ($octsub * (1 << ($ni * 8)))]
     }
 }

Here we check whether the second parameter is a valid IPv4 address. If so, we try to calculate the difference between this address and the one given in the first parameter; at the end of this block the countip variable must hold the correct list length. This if is nested inside the outer check (line 16), so if that check fails (the first parameter is not an IPv4 address), the program never reaches this point.
This subtask (subtracting IPv4 addresses) is solved as if we were subtracting two numbers in a column:

  192.168.3.1
 -192.168.2.10
 =  0.  0.0.247

Of course, these digits are not decimal but base 256. So when we borrow from the next higher digit, we must add not 10 but 256 (0x100). In the hexadecimal representation of the octets this looks clearer:

   0xC0.0xA8.0x03.0x01
 - 0xC0.0xA8.0x02.0x0A
 ---------------------- (borrow: take one from the higher digit)
 = 0xC0.0xA8.0x03.0x01
           -.0x01
 ---------------------- (continue the borrow: add the base of the number system)
 = 0xC0.0xA8.0x02.0x001
                +.0x100
 ---------------------- (the result of the operation)
 = 0xC0.0xA8.0x02.0x101
 - 0xC0.0xA8.0x02.0x00A
 = 0x00.0x00.0x00.0x0F7

To implement this we create, as with the first parameter, the octfip list; then we create the variable octsub, which will hold the difference between octets, and zero countip, which from now on stores the size of the address list rather than the IPv4 address it held on entry to this block.
We organize a for loop over i in reverse order, from 3 down to 0. In this loop we go through all octets of the IPv4 address starting with the least significant, that is, all elements of the octsip and octfip lists starting with the last.
We calculate the difference of the current octets, octfip minus octsip, and store it in octsub. Here I really wanted to use arrays, because the lindex construct is very cumbersome, but I did not see a simple way to build an array (in TCL arrays are only associative) from a list, so lists are used everywhere. The calculated difference is immediately compared with 0: do we need to borrow from the higher octet or not?
If a borrow is needed and this is not the most significant octet (i greater than 0), we add 256 to the difference octsub and subtract 1 from the next higher octet of the minuend (octfip).
If this is the most significant octet, then the minuend octfip is smaller than the subtrahend octsip, that is, the difference is negative, which the problem statement does not allow; in this case we leave the loop with break.
If no borrow is needed, the result suits us as it is. In any case, however, the result is expressed in base 256, which is inconvenient, since further calculations must be done in a form the interpreter understands. So we convert the result in the standard way for a positional number system:
$N_B = \dots + A_i B^i + \dots + A_3 B^3 + A_2 B^2 + A_1 B^1 + A_0 B^0$, where $B$ is the base of the number system.
In our case $\mathit{countip} = \sum_{ni=0}^{3} \mathit{octsub}_{ni} \cdot 256^{ni}$, where ni runs from 0 to 3, or ni = 3 - i with i running from 3 down to 0, which lets us fold the conversion into the existing loop. Since the base is a power of 2, we compute its powers with a shift by a multiple of 8: 256 is 100000000 in binary, that is, a one shifted 8 bits to the left. So by shifting first by 0, then by 8 (the last significant line of this code fragment), then by 16 and 24, we multiply by 1 ($256^0$), 256, $256^2$ and $256^3$.
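The borrow procedure and the shift-based conversion described above can be packed into one small proc. This is a sketch rather than the code of cipl.tl: the name subOctets is hypothetical, lset replaces the lreplace calls, and a negative difference is reported as -1 instead of leaving the loop with break.

```tcl
# Subtract two dotted addresses octet by octet, borrowing 256 from the next
# higher octet, and accumulate the difference as an ordinary integer.
proc subOctets {finish start} {
    set f [split $finish .]
    set s [split $start .]
    set count 0
    for {set i 3} {$i >= 0} {incr i -1} {
        set diff [expr {[lindex $f $i] - [lindex $s $i]}]
        if {$diff < 0} {
            if {$i == 0} {
                # Nothing left to borrow from: finish is below start.
                return -1
            }
            # Borrow: take 1 from the higher octet, add the base here.
            set hi [expr {$i - 1}]
            lset f $hi [expr {[lindex $f $hi] - 1}]
            incr diff 256
        }
        # The weight of this digit is 256**(3 - i), a left shift by 8*(3 - i).
        set count [expr {$count + ($diff << (8 * (3 - $i)))}]
    }
    return $count
}

puts [subOctets 192.168.3.1 192.168.2.10]
```

For the column example above this prints 247.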
Coming back to this fragment a second time, I was puzzled: what had seemed simple and clear during implementation now looked overcomplicated. Even the amount of text needed to describe this code shows it.
What is wrong? Why did I need to reinvent subtraction, and for base-256 numbers at that, instead of converting the values into a numeric form the programming language understands and subtracting with standard tools, especially since the conversion is done anyway? In the end I concluded that subjective human perception had played a cruel joke once again. What could be easier than column arithmetic, which is taught in first grade? Nothing, because it is the first thing everyone learns after the numbers themselves. Conversions and shifts seem complicated next to that simplest of operations. The realization that this is not the decimal system comes a little later, and the realization that the approach is wrong came only when I had to look at the written code a second time. The result is the second version of the same fragment.

Ver.2, №18-25

 if {[regexp $exip $countip]} then {
     set octfip [split $countip {.}]
     set nfip [expr ([lindex $octfip 0] * 0x1000000) + ([lindex $octfip 1] * 0x10000)\
         + ([lindex $octfip 2] * 0x100) + ([lindex $octfip 3])]
     set nsip [expr ([lindex $octsip 0] * 0x1000000) + ([lindex $octsip 1] * 0x10000)\
         + ([lindex $octsip 2] * 0x100) + ([lindex $octsip 3])]
     if {$nfip >= $nsip} then {set countip [expr $nfip - $nsip]}
 }

After forming the octfip list (as in the first version), we compute the numbers corresponding to the IPv4 addresses (which is what they really are): nfip for the address in the second argument and nsip for the address in the first argument. The conversion is the same as before, only without loops, written as a single expression: $\mathit{nfip} = \mathit{listitem}_0 \cdot 256^3 + \mathit{listitem}_1 \cdot 256^2 + \mathit{listitem}_2 \cdot 256 + \mathit{listitem}_3$, where $\mathit{listitem}_n$ is the corresponding list element obtained with lindex directly in the expression. For readability, the powers of 256 are written in the code as round hexadecimal values (0x100, 0x10000, 0x1000000). Then we check that the second address is not smaller than the first and subtract the first from the second, storing the result in countip.
The result is a little simpler than before, in fact much simpler. The only thing that bothers me in this version is the hypothetical possibility of overflowing nfip and nsip when expr evaluates them, although with current C compilers this should not be a concern. The documentation on calculations and overflow is at http://www.tcl.tk/man/tcl8.5/TclCmd/expr.htm#M23 . For version 8.4 (www.tcl.tk/man/tcl8.4/TclCmd/expr.htm#M5) it was stated explicitly that numeric constants are 32-bit signed numbers which are treated as 64-bit signed when necessary; the 8.5 documentation no longer says this. The previous version could hypothetically overflow too, but there we processed an already computed difference, which in real use would be far smaller than even a 16-bit number.
Then the second part of the utility begins, in which the list of IPv4 addresses is generated and printed.
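As a sketch of the same conversion, and one possible answer to the complaint about cumbersome lindex calls: Tcl 8.5 has lassign, which unpacks a list into plain variables. The proc name ipToInt is hypothetical, and the input is assumed to have passed the exip check already. Regarding the overflow worry: from 8.5 on, expr works with arbitrary-precision integers, so on such an interpreter the largest address stays positive.

```tcl
# Hypothetical helper: convert a validated dotted-quad string to an integer.
proc ipToInt {ip} {
    # lassign (Tcl 8.5+) unpacks the four octets into a, b, c and d.
    lassign [split $ip .] a b c d
    # Shifting by 24, 16 and 8 bits multiplies by 256**3, 256**2 and 256.
    return [expr {($a << 24) | ($b << 16) | ($c << 8) | $d}]
}

set nfip [ipToInt 192.0.2.10]
set nsip [ipToInt 192.0.2.1]
puts [expr {$nfip - $nsip}]
puts [ipToInt 255.255.255.255]
```

On Tcl 8.5 or later this prints 9 and 4294967295.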

Ver.2, №26-27

 if {[regexp $exdg $countip]} then {
     puts $startip

We check that countip is a number from 0 to 999999. The value may have come directly from the second argument (in which case the previous check for an IPv4 address failed), or it may be the already calculated difference between the two addresses. If the value is too large or is not a number at all (which can happen after our calculations, for example when the difference between the IPv4 addresses is negative), no further processing is done. If everything is in order, we print the first element of the list (the IPv4 address given in the first argument). From here on I will call the resulting list of IPv4 addresses a sequence, to avoid confusion with TCL's own notion of a list.

Ver.2, №28-29

 for {set i 0} {$i<$countip} {incr i 1} {
     set octsip [lreplace $octsip {3} {3} [expr [lindex $octsip {3}] + 1]]

We form the remaining elements of the sequence. Again I would very much like to use arrays, but converting a list to an array seems worse to me than using lists in this form (what is the correct and simple way to do it?). The for loop runs i from 0 up to countip, the calculated (or given) number of elements. Inside the loop the last element of the octsip list (the least significant octet of our address) is increased by 1 ...

Ver.2, №30-36

 for {set j 3} {$j>=0} {incr j -1} {
     if {[lindex $octsip $j] > 255 && $j > 0} then {
         set sj [expr $j - 1]
         set octsip [lreplace $octsip $j $j {0}]
         set octsip [lreplace $octsip $sj $sj [expr [lindex $octsip $sj] + 1]]
     }
 }

... and we check whether the other digits need adjusting. For this we organize another for loop, with j running from 3 down to 0. The if condition checks that the current octet is greater than 255 (an overflow occurred) and that it is not the leading octet (j greater than 0). If an overflow occurred, the current octet is set to zero and 1 is added to the next more significant octet (the element of octsip closer to the beginning of the list). If the overflow occurs in the most significant octet, no adjustment is made, so we end up with an invalid IPv4 address.
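The increment with carry can be tried on its own. In this sketch the proc name incrOctets is hypothetical and lset stands in for the lreplace calls of the listing; the logic is otherwise the same, including the deliberate absence of a carry out of the leading octet.

```tcl
# Add 1 to a list of four octets, carrying into the higher octets.
proc incrOctets {octets} {
    lset octets 3 [expr {[lindex $octets 3] + 1}]
    # Walk from the least significant octet towards the leading one.
    for {set j 3} {$j > 0} {incr j -1} {
        if {[lindex $octets $j] > 255} {
            lset octets $j 0
            set hi [expr {$j - 1}]
            lset octets $hi [expr {[lindex $octets $hi] + 1}]
        }
    }
    # The leading octet is never wrapped, so it may end up as 256.
    return $octets
}

puts [join [incrOctets {192 0 2 255}] .]
puts [join [incrOctets {10 255 255 255}] .]
puts [join [incrOctets {255 255 255 255}] .]
```

The last call yields 256.0.0.0, which is exactly the invalid address that the later exip check catches.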

Ver.2, №37-44

 set oip [join $octsip {.}]
 if {[regexp $exip $oip]} then {
     puts $oip
 } else {
     puts ":    "
     exit 3
 }
 }

We merge the list of octets back together with join, using the dot as separator, and store the result in oip. Then we check the result against the IPv4 regular expression defined at the very beginning. If everything is correct, we print it; if not, an overflow or some other error occurred while forming the sequence, and we leave the program with exit. This is not quite elegant either, since we get several exit points, which can be inconvenient if, for example, we want to perform the same actions at the end in every case.
The last closing brace ends the for loop that forms the output sequence, opened on line 28.

Ver.2, №45-51

 } else {
     puts "    \"$getcountip\""
 }
 } else {
     puts "   IP  \"$startip\""
 }
 }

The final lines print error messages in the else branches of the conditions on lines 26 and 16, where the arguments given at program start are checked. This is the only place where getcountip, which keeps the second argument unchanged, is used. That looks odd and heavy-handed, but I could not find an obvious (simple) alternative here.
Looking at this part of the program (where the output sequence is formed) for the second time, my first thought was that it would be nice to implement a full adder for 4-digit base-256 numbers, plus a converter of those numbers into complement form, so that subtraction could be done on the same adder. At that point I had not yet changed the first part and was still under the spell of simple column arithmetic. The wish to implement this (wild) idea has not gone away, since it is interesting in itself, though perhaps not in TCL. It was already clear that the second part had to change in the same way as the first, that is, by converting from the representation we compute with to the one we need (and this time it is a conversion into base 256).
The approach to enumeration has also changed. If we can iterate over IPv4 addresses in a for loop, we no longer need to pre-calculate the size of the sequence; we simply move from one address to the next. With this approach it also turned out to be very easy to move not only forward, from the smaller address to the larger, but also backward. This takes no extra effort; we only need to set the loop increment correctly (which opens the way to additional functionality: building the sequence in either direction).

Ver.3, №31-32

 if {$nfip > $minip && $nfip < $maxip} then {
     if {[set d [expr $nfip >= $nsip]]} then {set di {1}} else {set di {-1}}

We check that nfip, which, I recall, holds the IPv4 address from the second argument as a number, lies within the given range (minip and maxip are defined at the beginning of the program). If it does, we set the direction of enumeration: if the second address nfip is not smaller than the first address nsip (both are already numbers), we go in forward order, di = 1; otherwise we go in reverse order, di = -1. The result of the comparison is also remembered in d.

Ver.3, №33-37

 for {set i $nsip} {($i<=$nfip && $d) || ($i>=$nfip && !$d)} {incr i $di} {
     set octip [list [expr ($i & 0xFF000000) >> 24] [expr ($i & 0xFF0000) >> 16]\
         [expr ($i & 0xFF00) >> 8] [expr ($i & 0xFF)]]
     puts [join $octip {.}]
 }

We organize a for loop over i, whose initial value is nsip. The loop condition depends on the comparison nfip >= nsip, whose result is stored in d: i <= nfip if we approach nfip from below, otherwise i >= nfip. The increment for i has already been calculated and stored in di.
In the loop body we form octip, the list of octets of the IPv4 address. That is, we need to produce the dotted-decimal form of the address from its numeric value: convert it to base 256. In the general case, following the theory, we divide the number repeatedly by the base of the target number system and build the new number from the remainders:

   3221225985 | 256
  -3221225984 |-----------
  ----------- | 12582914 | 256
            1 |-12582912 |--------
                -------- | 49152 | 256
                       2 |-49152 |-----
                          ------ | 192
                               0 |

Starting from the last quotient, 192, and taking the remainders in reverse order, 0, 2, 1, we get 192.0.2.1. Division is an expensive operation and offers no optimization, but in our very special case, an IPv4 address divided by 256, everything is simple. We shift by multiples of 8 (divide by powers of 256) and mask off the bits we do not need (the bitwise AND operation). In hexadecimal:

    0xC0000201       |   0xC0000201       |   0xC0000201      |   0xC0000201
  & 0xFF000000       | & 0x00FF0000       | & 0x0000FF00      | & 0x000000FF
  ---------------------------------------------------------------------------
    0xC0000000 >> 24 |   0x00000000 >> 16 |   0x00000200 >> 8 |   0x00000001
  ---------------------------------------------------------------------------
  = 0xC0 (192)       | = 0x00 (0)         | = 0x02 (2)        | = 0x01 (1)

All this is done in one statement, with each digit placed in its own list element. The second statement in the loop body prints the joined list.
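Both routes, masking with shifts and repeated division, can be compared side by side. This is a sketch with hypothetical proc names (intToIp and intToIpByDivision); the first proc uses the same masks and shifts as the listing.

```tcl
# Conversion by masking and shifting, as in the listing above.
proc intToIp {n} {
    return [join [list \
        [expr {($n & 0xFF000000) >> 24}] \
        [expr {($n & 0xFF0000) >> 16}] \
        [expr {($n & 0xFF00) >> 8}] \
        [expr {$n & 0xFF}]] .]
}

# The same conversion by repeated division: each remainder is one base-256
# digit, produced from the least significant end, so it is put at the front.
proc intToIpByDivision {n} {
    set octets {}
    for {set k 0} {$k < 4} {incr k} {
        set octets [linsert $octets 0 [expr {$n % 256}]]
        set n [expr {$n / 256}]
    }
    return [join $octets .]
}

puts [intToIp 3221225985]
puts [intToIpByDivision 3221225985]
```

Both calls print 192.0.2.1, the address from the division example.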

Ver.3, №38-45

 } else {
     puts " IP      "
     exit 3
 }
 } else {
     puts "   IP  \"$startip\""
 }
 }

The final lines are almost unchanged, except that the error message about the second argument has moved slightly higher in the program. The first part of the program has also changed a little compared with the second version.

Ver.3, №1-7

 #!/usr/bin/tclsh8.5

 set exip {^(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d{0,1})(\.(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d|\d)){3}$}
 set exdg {^-?(0?|([1-9]\d*))$}
 set maxip {0xFFFFFFFF}
 set minip {0xFFFFFF}

The regular expression for the numeric parameter now accepts numbers of any length, without a leading 0 but with an optional minus sign in front. The check is simpler and the bounds wider because the resulting sequence is now checked numerically using the maxip and minip variables that follow. These values do not duplicate the exip regular expression, since exip now checks only the correctness of user input, not the results of calculations.

Ver.3, №15-20

 set startip [lindex $argv 0]
 if {![string length [set finiship [lindex $argv 1]]]} then {set finiship {0}}
 if {[regexp $exip $startip]} then {
     set octsip [split $startip {.}]
     set nsip [expr ([lindex $octsip 0] * 0x1000000) + ([lindex $octsip 1] * 0x10000)\
         + ([lindex $octsip 2] * 0x100) + ([lindex $octsip 3])]

Lines 8-14 repeat lines 6-12 of the first version almost exactly; only the messages are adjusted to the new functionality. After that we do almost the same as in the second version. The one addition is that finiship is forced to 0 if the second argument was not given, so that the variable always has a value. finiship has the same meaning as countip in the second version and was renamed to fit the new concept: ultimately this variable is meant to hold not the size of the sequence of IPv4 addresses but its last address. nsip is calculated immediately after the first argument is split into octets.

Ver.3, №21-30

 if {[regexp $exip $finiship]} then {
     set octfip [split $finiship {.}]
     set nfip [expr ([lindex $octfip 0] * 0x1000000) + ([lindex $octfip 1] * 0x10000)\
         + ([lindex $octfip 2] * 0x100) + ([lindex $octfip 3])]
 } elseif {[regexp $exdg $finiship] && [expr abs($finiship)] < $maxip} then {
     set nfip [expr $nsip + $finiship]
 } else {
     puts "    \"$finiship\""
     exit 5
 }

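Here we check whether the second argument is an IPv4 address and, if so, convert it to the number nfip.
In the elseif branch, when the second argument is not an IPv4 address, we check that it is a number whose absolute value is less than maxip; in that case nfip is calculated as the sum of nsip and this number. Whether the result is a valid IPv4 address is checked afterwards (the range check on line 31).
If the second argument is neither an IPv4 address nor a number, we print an error and leave with exit. Again we get multiple exit points, three of them in this version. This can be overcome by wrapping the whole program in an endless loop and leaving it with break where necessary, so that the program ends in one place, but here it was not needed at all. As mentioned earlier, all the error checks could be dropped and replaced with default actions, which is more appropriate for batch use. There is no getcountip variable in this version: the second argument, finiship, appears directly in the error message, since it is never modified but only read in the course of work.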
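One way to keep a single exit point, offered here only as a sketch and not as something cipl.tl does, is to move the checks into a proc that returns a status code and to call exit once with that code. The proc name checkArguments, the numeric pattern and the status values 1 and 2 are assumptions made for illustration; 5 mirrors the exit code of the listing.

```tcl
set exip {^(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d{0,1})(\.(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d|\d)){3}$}

# Hypothetical checker: returns 0 when the arguments are usable, otherwise a
# status code that the caller can pass to a single exit call.
proc checkArguments {arguments} {
    global exip
    set n [llength $arguments]
    if {$n == 0 || $n > 2} {
        return 1
    }
    if {![regexp $exip [lindex $arguments 0]]} {
        return 2
    }
    set second [lindex $arguments 1]
    # Assumed pattern for the count: an optional minus sign, no leading zeros.
    if {$n == 2 && ![regexp $exip $second] && ![regexp {^-?(0|[1-9]\d*)$} $second]} {
        return 5
    }
    return 0
}

foreach sample {{192.0.2.1 -1} {192.0.2.1 abc} {} {300.0.0.1}} {
    puts "[list $sample] -> [checkArguments $sample]"
}
```

In a real script the last statement would then be something like exit [checkArguments $argv], after the sequence has been printed for status 0.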
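The resulting utility (cipl.tl) takes an IPv4 address and either a second IPv4 address or a number, which may now be negative, so the sequence can be built in either direction: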

> cipl.tl 192.0.2.1 -1
192.0.2.1
192.0.2.0
> cipl.tl 192.0.2.1 192.0.2.0
192.0.2.1
192.0.2.0
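
Putting the pieces of the third version together gives roughly the following. This is a condensed sketch under stated assumptions, not the contents of cipl.zip: the proc names ipToInt, intToIp and ipRange are hypothetical, errors are raised with error instead of exit, the count pattern is a stricter stand-in for exdg, and the sequence is returned as a list rather than printed.

```tcl
set exip {^(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d{0,1})(\.(2(5[0-5]|[0-4]\d)|1\d{2}|[1-9]\d|\d)){3}$}
set minip 0xFFFFFF
set maxip 0xFFFFFFFF

# Dotted quad to integer (input assumed to match exip).
proc ipToInt {ip} {
    lassign [split $ip .] a b c d
    return [expr {($a << 24) | ($b << 16) | ($c << 8) | $d}]
}

# Integer back to dotted quad.
proc intToIp {n} {
    return [format "%d.%d.%d.%d" \
        [expr {($n >> 24) & 0xFF}] [expr {($n >> 16) & 0xFF}] \
        [expr {($n >> 8) & 0xFF}] [expr {$n & 0xFF}]]
}

# Build the sequence from start to either a final address or start + count.
proc ipRange {start {second 0}} {
    global exip minip maxip
    if {![regexp $exip $start]} {
        error "invalid start address \"$start\""
    }
    set nsip [ipToInt $start]
    if {[regexp $exip $second]} {
        set nfip [ipToInt $second]
    } elseif {[regexp {^-?(0|[1-9]\d*)$} $second]} {
        set nfip [expr {$nsip + $second}]
    } else {
        error "invalid second argument \"$second\""
    }
    # Same bounds as the listing: strictly between minip and maxip.
    if {$nfip <= $minip || $nfip >= $maxip} {
        error "sequence leaves the allowed address range"
    }
    # Step forward or backward depending on which end is larger.
    set di [expr {$nfip >= $nsip ? 1 : -1}]
    set result {}
    for {set i $nsip} {$i != $nfip + $di} {incr i $di} {
        lappend result [intToIp $i]
    }
    return $result
}

puts [join [ipRange 192.0.2.1 -1] \n]
```

Called as in the example above, it prints 192.0.2.1 followed by 192.0.2.0.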

, , habrahabr.ru/blogs/complete_code/135340 , , : « , », .

: cipl.zip
wikibooks — ru.wikibooks.org/wiki/_
TCL www.tcl.tk/doc wiki.tcl.tk

Source: https://habr.com/ru/post/135920/
