
cut and grep or awk?

Often in scripts you will find something like foobar | awk '{print $1}' (and "often" really does mean often).

Such an awk call merely keeps the first (or nth) column of the previous command's output. That is clear overkill: awk is a fairly powerful stream-processing language, and using it as a simple field extractor is wasteful.

To cut a specific field out of a line, the cut command is a better fit. It does less, and is therefore both simpler to use and faster.
In modern Linux distributions, invoking awk is considerably heavier than invoking cut. In Debian, for example, awk is a symlink to /etc/alternatives/awk, which usually points to gawk, a binary almost 10 times larger than cut. Naturally, cut loads faster.

cut can extract not only byte ranges but also fields (the -f option). A field is the text between delimiters. The default delimiter is the tab character, but it is easily changed with the -d option.
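A minimal sketch of -d and -f in action, using a self-contained passwd-style line rather than reading a real /etc/passwd:

```shell
# Extract field 1 (the user name) from a colon-delimited line.
printf 'root:x:0:0:root:/root:/bin/bash\n' | cut -d : -f 1

# -f also accepts lists and ranges: fields 1 and 7 (name and shell).
# cut keeps the delimiter between the selected fields.
printf 'root:x:0:0:root:/root:/bin/bash\n' | cut -d : -f 1,7
```

The first command prints `root`, the second `root:/bin/bash`.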

The second approach is grep's -o option. It prints not the whole line, but only the part of it that matches the pattern. Obviously useless when searching for an exact substring, but very handy with regular expressions.
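A quick self-contained illustration (the pid=… log line is made up for the example; -E selects extended regular expressions):

```shell
# -o prints only the matching fragments, one per line,
# so a single input line can yield several output lines.
printf 'pid=123 status=ok pid=456\n' | grep -oE 'pid=[0-9]+'
# prints:
# pid=123
# pid=456
```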

For example,
grep -v '#' /etc/inittab | cut -s -f 4 -d :
displays the list of programs started by init (the fourth field; fields are separated by colons).

grep -o 'http://\S\+' /var/log/apache2/error.log
will print the list of URLs found in the error log (one match per line).

... and no awk.

UPD: The comments suggest an even more interesting construction that launches no external program at all (read is a bash builtin):
foobar | (read p1 p2; echo "$p1")
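A runnable sketch of that construction (foobar replaced with echo for illustration). Note that read splits on whitespace by default, the last variable swallows the rest of the line, and the subshell parentheses mean p1/p2 do not survive past the pipeline:

```shell
# read assigns "first" to p1 and the remainder to p2,
# without forking a single external program.
echo "first second third" | (read -r p1 p2; echo "$p1 / $p2")
# prints: first / second third
```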

PS: This is not about a single call (for one call there is no real difference between awk, grep, or even python/perl). It is about many calls in a loop inside a script: all the examples should be compared in a loop with hundreds (preferably thousands) of iterations.
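A rough benchmark sketch of that idea, assuming bash (for the time keyword); the iteration count and the sample line are arbitrary, and absolute numbers will vary per system:

```shell
# Both variants extract the same field; run each 500 times and
# compare the wall-clock times reported by the time keyword.
line='a:b:c:d'
time for i in $(seq 1 500); do echo "$line" | cut -d : -f 2; done >/dev/null
time for i in $(seq 1 500); do echo "$line" | awk -F : '{print $2}'; done >/dev/null
```

Both loops print the same output (the field "b" 500 times); the difference that matters here is the cumulative startup cost.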

Source: https://habr.com/ru/post/104546/
