
What grep is and what you eat it with

This article was inspired by two topics that recently flashed by on Habr: "interesting unix commands" and "how I hired a programmer". The commands described there are interesting in places, of course, but rarely of practical use, and it turns out that we do not really know how to use the genuinely useful tools.
A small lyrical digression:
About three years ago I was asked to interview applicants for a unix sysadmin position. On the two largest freelance exchanges of the time, eight applicants responded to the vacancy, two of whom were in the TOP-5 rating of those exchanges. I never require administrators to know configs by heart, and I believe any required software can be mastered given a willingness to read, logical thinking, and the ability to use the system's tools properly. So, for a start, the applicants were given two tasks, something like this:
- add a cron job that runs at every even hour and at 3 o'clock;
- print the processor information from the /var/run/dmesg.boot file.

To my surprise, none of the applicants managed both questions. Two of them did not even know that grep exists.

So... Summer... Friday... Before the kebabs, let's talk a little about grep.

Knowing the local audience, and to head off needless arguments, I should say that everything below holds for:
 # grep --version | grep grep
 grep (GNU grep) 2.5.1-FreeBSD

This matters because:
 # man grep | grep -iB 2 freebsd
 -P, --perl-regexp
 Interpret PATTERN as a Perl regular expression. This option is not supported in FreeBSD.


First of all, about how we usually grep files.
Using cat:
 root@nm3:/ # cat /var/run/dmesg.boot | grep CPU:
 CPU: Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz (2833.07-MHz K8-class CPU)

But why? You can just do this:
 root@nm3:/ # grep CPU: /var/run/dmesg.boot
 CPU: Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz (2833.07-MHz K8-class CPU)

Or like this (I hate this construction):
 root@nm3:/ # </var/run/dmesg.boot grep CPU:
 CPU: Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz (2833.07-MHz K8-class CPU)

For some reason, we count the matching lines with wc:
 root@nm3:/ # grep WARNING /var/run/dmesg.boot | wc -l
 3

Although grep can count them itself:
 root@nm3:/ # grep WARNING /var/run/dmesg.boot -c
 3
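A minimal sketch of the two ways to count, run against a made-up log rather than a real dmesg.boot. The file contents and the variable names log, piped and direct are assumptions for illustration only:

```shell
# Hypothetical sample log; the contents are invented for this example.
log=$(mktemp)
printf '%s\n' 'WARNING: first' 'info: nothing to see' \
              'WARNING: second' 'WARNING: third' > "$log"

# Two processes: grep prints every matching line, wc counts them.
# (tr strips the padding that some wc implementations add.)
piped=$(grep WARNING "$log" | wc -l | tr -d ' ')

# One process: grep counts the matching lines itself.
direct=$(grep -c WARNING "$log")

echo "wc -l: $piped, grep -c: $direct"
rm -f "$log"
```

Keep in mind that -c counts matching lines, not individual matches, which is exactly what wc -l counts in the pipeline.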

Let's make a test file, test.txt:
 root@nm3:/ # grep ".*" test.txt
 one two three
 seven eight one eight three
 thirteen fourteen fifteen
 sixteen seventeen eighteen seven
 sixteen seventeen eighteen
 twenty seven
 one 504 one
 one 503 one
 one 504 one
 one 504 one
 #comment UP
 twentyseven
 #comment down
 twenty1
 twenty3
 twenty5
 twenty7


And start searching.
The -w option matches whole words only:
 root@nm3:/ # grep -w 'seven' test.txt
 seven eight one eight three
 sixteen seventeen eighteen seven
 twenty seven

And what if the pattern must sit at the beginning or the end of a word?
 root@nm3:/ # grep '\<seven' test.txt
 seven eight one eight three
 sixteen seventeen eighteen seven
 sixteen seventeen eighteen
 twenty seven
 root@nm3:/ # grep 'seven\>' test.txt
 seven eight one eight three
 sixteen seventeen eighteen seven
 twenty seven
 twentyseven

At the beginning or the end of the line?
 root@nm3:/ # grep '^seven' test.txt
 seven eight one eight three
 root@nm3:/ # grep 'seven$' test.txt
 sixteen seventeen eighteen seven
 twenty seven
 twentyseven
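Here is a small sketch that puts the word and line anchors side by side. The four-line sample and the variable names are made up for the example; \< and \> are extensions rather than POSIX syntax, so their availability outside GNU grep is an assumption:

```shell
# Hypothetical four-line sample, smaller than test.txt.
sample=$(mktemp)
printf '%s\n' 'seven eight' 'sixteen seventeen' \
              'twentyseven' 'twenty seven' > "$sample"

whole=$(grep -c -w 'seven' "$sample")   # "seven" as a whole word
starts=$(grep -c '\<seven' "$sample")   # a word that starts with "seven"
ends=$(grep -c 'seven\>' "$sample")     # a word that ends with "seven"
bol=$(grep -c '^seven' "$sample")       # "seven" at the start of the line
eol=$(grep -c 'seven$' "$sample")       # "seven" at the end of the line

echo "whole=$whole starts=$starts ends=$ends bol=$bol eol=$eol"
# prints: whole=2 starts=3 ends=3 bol=1 eol=2
rm -f "$sample"
```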

Want to see the lines around the one you are looking for?
 root@nm3:/ # grep -C 1 twentyseven test.txt
 #comment UP
 twentyseven
 #comment down

Only the line below, or only the line above?
 root@nm3:/ # grep -A 1 twentyseven test.txt
 twentyseven
 #comment down
 root@nm3:/ # grep -B 1 twentyseven test.txt
 #comment UP
 twentyseven
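The three context options can be compared in one go. This is only a sketch on an invented five-line file; the names sample, around, after and before are assumptions:

```shell
# Hypothetical sample with one line of padding on each side.
sample=$(mktemp)
printf '%s\n' 'one' '#comment UP' 'twentyseven' '#comment down' 'two' > "$sample"

around=$(grep -C 1 twentyseven "$sample")   # one line before and one after
after=$(grep -A 1 twentyseven "$sample")    # the match plus one line after
before=$(grep -B 1 twentyseven "$sample")   # one line before plus the match

printf -- '-C 1:\n%s\n-A 1:\n%s\n-B 1:\n%s\n' "$around" "$after" "$before"
rm -f "$sample"
```

The number after each option is the count of context lines, so -B 2 would show two lines above the match.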

We can also match a range of characters:
 root@nm3:/ # grep "twenty[1-4]" test.txt
 twenty1
 twenty3

Or, conversely, exclude that range:
 root@nm3:/ # grep "twenty[^1-4]" test.txt
 twenty seven
 twentyseven
 twenty5
 twenty7
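One detail of the negated range is easy to miss: [^1-4] still has to match some character, so a bare "twenty" at the end of a line is not selected. A sketch on made-up input (the variable names are assumptions):

```shell
# Hypothetical input; the last line is a bare "twenty".
data=$(printf '%s\n' twenty1 twenty3 twenty5 twenty7 'twenty seven' twenty)

# "twenty" followed by a digit from 1 to 4.
inside=$(printf '%s\n' "$data" | grep 'twenty[1-4]')

# "twenty" followed by any character that is NOT 1 to 4.
# The space in "twenty seven" qualifies; the bare "twenty" does not,
# because nothing follows it.
outside=$(printf '%s\n' "$data" | grep 'twenty[^1-4]')

printf 'inside:\n%s\noutside:\n%s\n' "$inside" "$outside"
```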

Of course, grep supports the other basic quantifiers, metacharacters, and the remaining delights of regular expressions.
A couple of practical examples:
 root@nm3:/ # cat /etc/resolv.conf
 #options edns0
 #nameserver 127.0.0.1
 nameserver 8.8.8.8
 nameserver 77.88.8.8
 nameserver 8.8.4.4

Select only the lines with an IP address:
 root@nm3:/ # grep -E "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" /etc/resolv.conf
 #nameserver 127.0.0.1
 nameserver 8.8.8.8
 nameserver 77.88.8.8
 nameserver 8.8.4.4

It works, but this is prettier:
 root@nm3:/ # grep -E '\b[0-9]{1,3}(\.[0-9]{1,3}){3}\b' /etc/resolv.conf
 #nameserver 127.0.0.1
 nameserver 8.8.8.8
 nameserver 77.88.8.8
 nameserver 8.8.4.4

Remove the commented line?
 root@nm3:/ # grep -E '\b[0-9]{1,3}(\.[0-9]{1,3}){3}\b' /etc/resolv.conf | grep -v '#'
 nameserver 8.8.8.8
 nameserver 77.88.8.8
 nameserver 8.8.4.4

And now let's select only the IP addresses themselves:
 root@nm3:/ # grep -oE '\b[0-9]{1,3}(\.[0-9]{1,3}){3}\b' /etc/resolv.conf | grep -v '#'
 127.0.0.1
 8.8.8.8
 77.88.8.8
 8.8.4.4

Bad luck... The commented line is back. With -o, grep prints only the matched text, so the '#' never reaches the second grep and -v has nothing to filter on. What to do? Filter first:
 root@nm3:/ # grep -v '#' /etc/resolv.conf | grep -oE '\b[0-9]{1,3}(\.[0-9]{1,3}){3}\b'
 8.8.8.8
 77.88.8.8
 8.8.4.4
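The effect of the filter order is easy to reproduce on a throwaway file. This is a sketch under assumptions: the file contents are invented, the names conf, ip, late and early are mine, and the \b anchors are left out to keep the pattern to plain extended-regex syntax:

```shell
# Hypothetical stand-in for /etc/resolv.conf.
conf=$(mktemp)
printf '%s\n' '#nameserver 127.0.0.1' 'nameserver 8.8.8.8' \
              'nameserver 8.8.4.4' > "$conf"

ip='[0-9]{1,3}(\.[0-9]{1,3}){3}'

# -o first: only the addresses go down the pipe, the '#' is already gone,
# so grep -v '#' removes nothing.
late=$(grep -oE "$ip" "$conf" | grep -v '#')

# -v first: the commented line is dropped while the '#' is still visible.
early=$(grep -v '#' "$conf" | grep -oE "$ip")

printf 'late:\n%s\nearly:\n%s\n' "$late" "$early"
rm -f "$conf"
```

Note that grep -v '#' also throws away lines with a trailing comment; if that matters, a pattern such as '^[[:space:]]*#' limits the exclusion to lines that start with a comment.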

While we are here, a closer look at inverting the search with -v.
Suppose we need to run "ps -afx | grep ttyv":
 root@nm3:/ # ps -afx | grep ttyv
 1269 v1 Is+ 0:00.00 /usr/libexec/getty Pc ttyv1
 1270 v2 Is+ 0:00.00 /usr/libexec/getty Pc ttyv2
 1271 v3 Is+ 0:00.00 /usr/libexec/getty Pc ttyv3
 1272 v4 Is+ 0:00.00 /usr/libexec/getty Pc ttyv4
 1273 v5 Is+ 0:00.00 /usr/libexec/getty Pc ttyv5
 1274 v6 Is+ 0:00.00 /usr/libexec/getty Pc ttyv6
 1275 v7 Is+ 0:00.00 /usr/libexec/getty Pc ttyv7
 48798 2 S+ 0:00.00 grep ttyv

All fine, except that we do not need the line "48798 2 S+ 0:00.00 grep ttyv". Use -v:
 root@nm3:/ # ps -afx | grep ttyv | grep -v grep
 1269 v1 Is+ 0:00.00 /usr/libexec/getty Pc ttyv1
 1270 v2 Is+ 0:00.00 /usr/libexec/getty Pc ttyv2
 1271 v3 Is+ 0:00.00 /usr/libexec/getty Pc ttyv3
 1272 v4 Is+ 0:00.00 /usr/libexec/getty Pc ttyv4
 1273 v5 Is+ 0:00.00 /usr/libexec/getty Pc ttyv5
 1274 v6 Is+ 0:00.00 /usr/libexec/getty Pc ttyv6
 1275 v7 Is+ 0:00.00 /usr/libexec/getty Pc ttyv7

An ugly construction? Let's cheat a little. The pattern [t]tyv still matches "ttyv", but grep's own command line now contains the literal text "[t]tyv", which the pattern does not match:
 root@nm3:/ # ps -afx | grep "[t]tyv"
 1269 v1 Is+ 0:00.00 /usr/libexec/getty Pc ttyv1
 1270 v2 Is+ 0:00.00 /usr/libexec/getty Pc ttyv2
 1271 v3 Is+ 0:00.00 /usr/libexec/getty Pc ttyv3
 1272 v4 Is+ 0:00.00 /usr/libexec/getty Pc ttyv4
 1273 v5 Is+ 0:00.00 /usr/libexec/getty Pc ttyv5
 1274 v6 Is+ 0:00.00 /usr/libexec/getty Pc ttyv6
 1275 v7 Is+ 0:00.00 /usr/libexec/getty Pc ttyv7
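Since real ps output differs from run to run, here is a sketch of the bracket trick on a simulated process list. The fake_ps function, its PIDs and its command lines are all hypothetical; its last line imitates the entry that grep itself would add to the listing:

```shell
# fake_ps PATTERN: prints an invented process list whose last entry
# mimics the grep process that is searching for PATTERN.
fake_ps() {
  printf '%s\n' \
    '1269 v1 Is+ 0:00.00 /usr/libexec/getty Pc ttyv1' \
    '1270 v2 Is+ 0:00.00 /usr/libexec/getty Pc ttyv2' \
    "48798 2 S+ 0:00.00 grep $1"
}

# Plain pattern: the "grep ttyv" entry contains ttyv, so it matches too.
plain=$(fake_ps 'ttyv' | grep -c 'ttyv')

# Bracket pattern: the entry reads "grep [t]tyv", and the regular
# expression [t]tyv needs the literal text "ttyv", which is not there.
bracket=$(fake_ps '[t]tyv' | grep -c '[t]tyv')

echo "plain=$plain bracket=$bracket"
# prints: plain=3 bracket=2
```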

Also, do not forget about | (OR):
 root@nm3:/ # vmstat -z | grep -E "(sock|ITEM)"
 ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
 socket: 696, 130295, 30, 65, 43764, 0, 0

Or the same thing written differently:
 root@nm3:/ # vmstat -z | grep "sock\|ITEM"
 ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
 socket: 696, 130295, 30, 65, 43825, 0, 0
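For comparison, here are three spellings of the same OR on invented input. The sample lines and variable names are assumptions; the \| form in a basic regular expression is a GNU extension, while repeating -e is the spelling POSIX itself defines:

```shell
# Hypothetical excerpt in the spirit of vmstat -z output.
data=$(printf '%s\n' 'ITEM SIZE LIMIT' 'socket: 696' 'pipe: 728')

# Extended regular expression: | is the alternation operator.
ere=$(printf '%s\n' "$data" | grep -E 'sock|ITEM')

# Basic regular expression: \| works in GNU grep (an extension).
bre=$(printf '%s\n' "$data" | grep 'sock\|ITEM')

# Several -e patterns: a line matches if any one of them matches.
multi=$(printf '%s\n' "$data" | grep -e sock -e ITEM)

printf '%s\n' "$ere"
```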

Many people remember that grep handles regular expressions, yet somehow forget about POSIX character classes, which are also handy at times.
POSIX character classes
[:alpha:] Any alphabetic character, regardless of case
[:digit:] Any digit
[:alnum:] Any alphabetic character or digit
[:blank:] Space or tab
[:xdigit:] Hexadecimal digits: 0-9, a-f, A-F
[:punct:] Any punctuation character
[:print:] Any printable character (no control characters)
[:space:] Any whitespace character
[:graph:] Any printable character except whitespace
[:upper:] Any uppercase letter
[:lower:] Any lowercase letter
[:cntrl:] Control characters

Select the lines that contain uppercase characters:
 root@nm3:/ # grep "[[:upper:]]" test.txt
 #comment UP
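A few more classes in action, as a sketch on made-up lines (the data and variable names are assumptions). Note the double brackets: the class name, colons included, goes inside an ordinary bracket expression:

```shell
# Hypothetical sample lines.
data=$(printf '%s\n' 'one' '#comment UP' 'twenty1' 'one 504 one')

# Lines containing at least one uppercase letter.
upper=$(printf '%s\n' "$data" | grep '[[:upper:]]')

# Lines containing at least one digit.
digit=$(printf '%s\n' "$data" | grep '[[:digit:]]')

# Number of lines containing punctuation (only the '#' line here).
punct=$(printf '%s\n' "$data" | grep -c '[[:punct:]]')

printf 'upper:\n%s\ndigit:\n%s\npunct lines: %s\n' "$upper" "$digit" "$punct"
```

Classes can be combined with other characters inside the same brackets, for example '[[:digit:].]' for digits and dots.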

Hard to see what was found? Highlight it:
[image: grep output with the matches highlighted]
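Since the screenshot did not survive, here is a guess at the kind of command behind it: GNU grep's --color option wraps each match in terminal escape sequences. The sample line and the variable name colored are assumptions; --color=always is used instead of the more usual --color=auto so that the escapes are kept even when the output is captured:

```shell
# --color=always forces highlighting even when output is not a terminal,
# which lets us capture the escape sequences in a variable.
colored=$(printf '%s\n' '#comment UP' | grep --color=always '[[:upper:]]')

# On a terminal, "UP" is shown highlighted.
printf '%s\n' "$colored"
```

For everyday use --color=auto is the safer choice, because it highlights on a terminal and stays silent in pipes and files.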

And a couple of tricks to round things off.
The first one is rather academic. In 15 years I have never once used it.
We need to select from our test file the lines containing six, seven, or eight.
So far, everything is simple:
 root@nm3:/ # grep -E "(six|seven|eight)" test.txt
 seven eight one eight three
 sixteen seventeen eighteen seven
 sixteen seventeen eighteen
 twenty seven
 twentyseven

And now only the lines in which the same one of six, seven, or eight occurs more than once. This feature is called backreferences: \1 stands for whatever the first parenthesized group matched.
 root@nm3:/ # grep -E "(six|seven|eight).*\1" test.txt
 seven eight one eight three
 sixteen seventeen eighteen seven
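A sketch of the backreference on a reduced sample (the four lines and the variable names are made up). Backreferences inside -E patterns are a GNU extension; POSIX defines them only for basic regular expressions, so portability beyond GNU grep is an assumption:

```shell
# Hypothetical sample taken from the shape of test.txt.
data=$(printf '%s\n' 'seven eight one eight three' \
                     'sixteen seventeen eighteen' \
                     'sixteen seventeen eighteen seven' \
                     'twenty seven')

# Any of the three words, at least once: every line qualifies.
once=$(printf '%s\n' "$data" | grep -cE '(six|seven|eight)')

# \1 must repeat exactly what the group matched, so the same word
# has to appear twice on the line ("eight ... eight", "seven ... seven").
twice=$(printf '%s\n' "$data" | grep -E '(six|seven|eight).*\1')

printf 'once: %s lines\ntwice:\n%s\n' "$once" "$twice"
```

The line "sixteen seventeen eighteen" drops out because it contains each of the three words only once.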

The second trick is much more useful. We need to print the lines in which 504 is bounded by tabs on both sides.
Oh, how I miss PCRE support here...
POSIX classes do not save us, because [[:blank:]] matches spaces as well as tabs:
 root@nm3:/ # grep "[[:blank:]]504[[:blank:]]" test.txt
 one 504 one
 one 504 one
 one 504 one

The [CTRL+V] [TAB] sequence comes to the rescue: it types a literal tab inside the quotes (the whitespace on each side of 504 below is a tab):
 root@nm3:/ # grep " 504 " test.txt
 one 504 one
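In a script, where nobody is there to press Ctrl+V, the tab can be produced with printf and spliced into the pattern. A sketch on an invented three-line file (the names sample, tab, blank and tabs are assumptions):

```shell
# Hypothetical sample: spaces only, tabs only, and a mix.
sample=$(mktemp)
printf 'one 504 one\none\t504\tone\none\t504 one\n' > "$sample"

# A literal tab character stored in a variable.
tab=$(printf '\t')

# [[:blank:]] accepts a space or a tab on either side: all three lines.
blank=$(grep -c '[[:blank:]]504[[:blank:]]' "$sample")

# A real tab on both sides: only the tabs-only line.
tabs=$(grep -c "${tab}504${tab}" "$sample")

echo "blank=$blank tabs=$tabs"
# prints: blank=3 tabs=1
rm -f "$sample"
```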

What else have I not mentioned? Of course, grep can search through files and directories and, of course, recursively. Let's find where in the source code the Intel driver deals with third-party SFP modules. Is it spelled allow_unsupported_sfp or unsupported_allow_sfp? I do not remember. Never mind, that is grep's problem:
 root@nm3:/ # grep -rni allow /usr/src/sys/dev/ | grep unsupp
 /usr/src/sys/dev/ixgbe/README:75:of unsupported modules by setting the static variable 'allow_unsupported_sfp'
 /usr/src/sys/dev/ixgbe/ixgbe.c:322:static int allow_unsupported_sfp = TRUE;
 /usr/src/sys/dev/ixgbe/ixgbe.c:323:TUNABLE_INT("hw.ixgbe.unsupported_sfp", &allow_unsupported_sfp);
 /usr/src/sys/dev/ixgbe/ixgbe.c:542: hw->allow_unsupported_sfp = allow_unsupported_sfp;
 /usr/src/sys/dev/ixgbe/ixgbe_type.h:3249: bool allow_unsupported_sfp;
 /usr/src/sys/dev/ixgbe/ixgbe_phy.c:1228: if (hw->allow_unsupported_sfp == TRUE) {
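The same "both words, order unknown" search can be tried safely on a scratch directory. Everything below is hypothetical: the directory layout, the file names a.c and b.c, and their contents. The second variant is my own alternative that folds the two greps into one extended pattern:

```shell
# Hypothetical miniature source tree.
src=$(mktemp -d)
mkdir -p "$src/ixgbe"
printf '%s\n' 'static int allow_unsupported_sfp = TRUE;' \
              'static int other = 0;' > "$src/ixgbe/a.c"
printf '%s\n' 'int unsupported_flag;' > "$src/ixgbe/b.c"

# As in the article: recursive (-r), with line numbers (-n),
# case-insensitive (-i), then a second grep narrows the result.
piped=$(grep -rni allow "$src" | grep -c unsupp)

# One pass: accept the two words in either order.
single=$(grep -rniE 'allow.*unsupp|unsupp.*allow' "$src" | wc -l | tr -d ' ')

echo "piped=$piped single=$single"
# prints: piped=1 single=1
rm -rf "$src"
```

One difference to be aware of: in the piped form the second grep also sees the file path and line number, so a directory named "unsupp" would match too, while the single pattern looks only at file contents.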


I hope I have not worn you out. And that was only the tip of the grep iceberg. Enjoy your reading, and I am off to enjoy my kebabs!
And happy grepping to you!

Source: https://habr.com/ru/post/229501/

