AWK: a C-like text processor

This article presents the main features of the AWK language, with examples, and touches on how it is used in practice. It is meant as an introduction only. Let's get started.

A bit of history


The first version of AWK was created at Bell Laboratories in 1977. The name AWK is formed from the initials of its creators: Alfred Aho, Peter Weinberger and Brian Kernighan. The language shows the influence of C, SNOBOL4 and the Bourne shell. It is intended for processing character and numeric fields in structured text records.

The structure of programs written in AWK


An AWK program usually consists of three kinds of blocks: the BEGIN block, the body block and the END block. Not all of them have to be present. Each is described in more detail below.

BEGIN block

Syntax:
BEGIN {awk-command}
This block is executed only once, before any input is read. Its main purpose is to initialize variables. As noted above, it may be omitted.
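As a minimal sketch of the point above, the command below uses only a BEGIN block, so it runs without any input file. The variable names greeting and count are illustrative and do not come from the article.

```shell
# BEGIN runs before any input is read, so no input file is needed here.
# "greeting" and "count" are hypothetical variable names.
result=$(awk 'BEGIN { greeting = "records seen so far:"; count = 0; print greeting, count }')
echo "$result"
```

print separates its comma-separated arguments with a single space.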
Body block

Syntax:
/pattern/ {action}
The awk interpreter executes this block once for each record (line) of the input file. If the file contains 100 records, the block is executed 100 times, once per record. /pattern/ is optional. If it is omitted, awk processes every record of the input file. If it is given, only the records that match the pattern are passed to the action. {action} is the command, or sequence of commands, applied to each matching record; print is a typical example.
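The effect of a pattern can be sketched with three made-up records fed on standard input. The sample words and the pattern /an/ are assumptions chosen only for illustration.

```shell
# Three records arrive on standard input; only those matching /an/ reach the action.
matched=$(printf 'apple\nbanana\ncherry\n' | awk '/an/ {print}')
echo "$matched"
```

Only banana contains the text "an", so it is the only record printed.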

END block

Syntax:
END {awk-command}
This block is executed only once, at the very end, after the body block has processed all of the input. The END block is usually used to generate a report.
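A small sketch of the body block and the END block working together: the body counts records, and END prints a one-line report. The counter name n and the sample input are illustrative assumptions.

```shell
# The body block runs once per record; the END block runs once afterwards.
report=$(printf 'one\ntwo\nthree\n' | awk '
    { n++ }                                  # body: count each record
    END { print "records processed:", n }    # END: report after the last record
')
echo "$report"
```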

It's time to give examples


Let the input file be employee.txt, with the following contents:
Jane Li, IT Manager, 3000
Kate Moon, Nurse, 2000
Steve Zima, Writer, 4250
Andrew Sky, Policeman, 4000

Note that the information in this file is structured: in every record, the first field is the employee's name, the second is the position, and the third is the salary.

First, let's run a simple command in the Linux console to display all the content:

$ awk '{print}' employee.txt

Since no pattern (/pattern/) was specified, the interpreter printed the entire contents of employee.txt.
Now let's display the employees whose salary is greater than or equal to 3000 (conventional units):

$ awk 'BEGIN {FS = ","} {if ($3 >= 3000) print $0}' employee.txt

FS (field separator) is a built-in variable whose default value is a single space, which makes awk split fields on runs of spaces and tabs. In this example, setting FS to "," divides each record into three fields. Fields are accessed through the variables $1, $2 and $3, while $0 refers to the whole record with all of its fields.
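To make the field variables concrete, here is a short sketch that prints only the name and the salary of two records in the article's format. The records are passed on standard input so that the example is self-contained.

```shell
# With FS set to ",", $1 is the name and $3 is the salary.
# $3 keeps the space that follows the comma in the input.
fields=$(printf '%s\n' 'Jane Li, IT Manager, 3000' 'Kate Moon, Nurse, 2000' |
    awk 'BEGIN {FS = ","} {print $1 ":" $3}')
echo "$fields"
```

Because the separator is the comma alone, the leading space of each salary field is preserved, which is why the output reads "Jane Li: 3000".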
In the next example, we calculate the company's total direct salary costs:

$ awk 'BEGIN {FS = ","; total = 0} {total += $3; print} END {printf("total = $%d\n", total)}' employee.txt

Note that print with no arguments prints the whole record, i.e. $0. Unlike C, AWK does not require variables to be declared or initialized: an unset variable behaves as 0 in numeric contexts and as an empty string in string contexts. That is, as far as initialization is concerned, the following two commands are equivalent to the previous one:

$ awk 'BEGIN {FS = ","; total} {total += $3; print} END {printf("total = $%d\n", total)}' employee.txt
$ awk 'BEGIN {FS = ","} {total += $3; print} END {printf("total = $%d\n", total)}' employee.txt
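The behaviour of variables that were never assigned can be checked directly. In this sketch the names n, s and x are illustrative; none of them is initialized before use.

```shell
# n, s and x are never initialized before their first use.
# (x == 0) and (x == "") each print 1 when the comparison is true.
result=$(awk 'BEGIN { n += 5; s = s "abc"; print n, s, (x == 0), (x == "") }')
echo "$result"
```

An unset variable compares equal to both 0 and the empty string, so both comparisons yield 1.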

Loops



Loops hold no surprises. AWK supports while, do-while and for loops, with the traditional C-like syntax. For example:

$ awk 'BEGIN {while (i < 17) {str = str "#"; i++} print str}'

str = str "#" is simply string concatenation.
awk also provides the break, continue and exit statements.
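A short sketch of a for loop that uses continue and break. The variable name out and the loop bounds are assumptions made for this example.

```shell
# Collect the odd numbers from 1 to 10, stopping once the value passes 7.
odds=$(awk 'BEGIN {
    for (i = 1; i <= 10; i++) {
        if (i % 2 == 0) continue                 # skip even numbers
        if (i > 7) break                         # leave the loop at i = 9
        out = (out == "" ? i : out " " i)        # join values with spaces
    }
    print out
}')
echo "$odds"
```

continue jumps to the next iteration, while break ends the loop entirely.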
Sometimes it takes many lines of code to get the desired result, and the command line is no longer a convenient place to type dozens of lines. This is easy to work around: save the AWK program to a file and refer to that file when needed. The last example could be done as follows.
Save the following in the file script.awk:

BEGIN {while (i < 17) {str = str "#"; i++} print str}

To run the last example:

$ awk -f script.awk
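One benefit of a script file is that the program can be spread over several lines and commented. Here is a sketch of the same loop in that style; the file name bar.awk is hypothetical and is used only to avoid overwriting script.awk.

```shell
# Write the program to a file; the quoted 'EOF' keeps the shell from expanding anything.
cat > bar.awk <<'EOF'
# bar.awk: build a string of 17 "#" characters and print it.
BEGIN {
    while (i < 17) {        # i is unset, so it starts out as 0
        str = str "#"       # append one character per iteration
        i++
    }
    print str
}
EOF

bar=$(awk -f bar.awk)
echo "$bar"
```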

Associative arrays

Associative arrays in awk are a good topic for a separate post. For now, I will confine myself to a simple example that demonstrates the feature:

$ awk 'BEGIN {arr[1] = "a"; arr[2] = 2; arr["n"] = 777; for (i in arr) print arr[i]}'
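A common use of associative arrays is counting how often each value occurs. The sketch below counts made-up job titles read from standard input; the array name count and the input are assumptions. The order in which for (key in array) visits keys is not specified, so the output is piped through sort to make it predictable.

```shell
# count[$1]++ creates the element on first use and increments it afterwards.
counts=$(printf 'Nurse\nWriter\nNurse\n' |
    awk '{ count[$1]++ } END { for (k in count) print k, count[k] }' |
    LC_ALL=C sort)
echo "$counts"
```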

Conclusion
In this article, I tried to show the basic capabilities of the powerful AWK text processor. Important topics such as working with strings and associative arrays deserve more attention and are good subjects for future publications, so I deliberately left them for later. I hope that this first article of mine has brought some of the capabilities of the not-so-famous AWK closer to everyone who read it.


Source: https://habr.com/ru/post/150185/