Capture filters for network analyzers (tcpdump, Wireshark, Paketyzer)

1. Capture Filters

Traffic analyzers are a useful and effective tool in the life of a network administrator. They let you "see" what is actually transmitted on the network, which makes it easier to diagnose problems or to study how various protocols and technologies work.
However, a network usually carries a great deal of data, and if you display everything that passes through the network interface, it becomes difficult to pick out what you really need.
To solve this problem, traffic analyzers implement filters, which fall into two types: capture filters and display filters. Today we will talk about the first type, capture filters.
A capture filter limits frame capture to only those frames that are needed for analysis, which reduces the load on the computer's resources and simplifies traffic analysis.

2. Capture Filters Syntax

The capture filter expression consists of a set of special primitives, which are built from the so-called classifiers and object identifiers (addresses, names of network objects, port numbers).

Note : all classifiers are case-sensitive and must be written in lowercase only.

Let's deal with them in more detail.
Classifiers can be of the following types:

type - object type
- host - the node (the default type, if the type is not specified, it is assumed that this is the host)
- net - network
- port

For example :
host 192.168.0.1 - traffic capture in which IP 192.168.0.1 is used as the address (sender or recipient)
net 172.16.0.0/16 - captures traffic in which the sender or recipient address belongs to the network 172.16.0.0/16 (more precisely, lies in the range from 172.16.0.0 to 172.16.255.255). The filter only compares addresses, so the mask configured on the interface does not matter. Do not be put off by the fact that 172.16.0.0 is the network number for a /16 mask: we do not know which mask is actually configured on the interface, and formally such a host address is valid.
port 80 - traffic capture in which there is data belonging to port 80 (udp or tcp)
10.0.0.1 - traffic capture in which IP 10.0.0.1 is specified as the address (sender or recipient), the host classifier is not specified, but it is assumed by default.
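
The address matching performed by host and net can be mimicked in a few lines of Python. This is only a sketch of the logic described above, not the way libpcap implements it; the function names matches_host and matches_net are made up for the illustration.

```python
import ipaddress


def matches_host(src: str, dst: str, host: str) -> bool:
    """Mimic `host X`: true if X is the sender or the recipient."""
    target = ipaddress.ip_address(host)
    return target in (ipaddress.ip_address(src), ipaddress.ip_address(dst))


def matches_net(src: str, dst: str, cidr: str) -> bool:
    """Mimic `net X/len`: true if either address lies inside the range."""
    network = ipaddress.ip_network(cidr)
    return any(ipaddress.ip_address(addr) in network for addr in (src, dst))


# The network number itself is a formally valid address for the filter.
print(matches_net("172.16.0.0", "8.8.8.8", "172.16.0.0/16"))
# An address just outside the range does not match.
print(matches_net("172.17.0.1", "8.8.8.8", "172.16.0.0/16"))
```

Note that the interface mask plays no role here: only the range given in the filter is compared with the addresses in the packet.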

dir - direction relative to the object (direction)
- src - the object is the sender
- dst - the object is the recipient

For example :
src host 192.168.0.1 - traffic capturing in which IP 192.168.0.1 is set as the sender (not the receiver) address
dst net 172.16.0.0/16 - traffic capture in which the IP from the network 172.16.0.0/16 is set as the recipient's address (not the sender) (more precisely, it is in the range from 172.16.0.0 to 172.16.255.255).

proto - interaction protocol
- ether - basic Ethernet networking technology, usually indicates that the filter uses the hardware MAC address
- ip - IPv4 protocol
- ip6 - IPv6 protocol
- arp - ARP protocol
- tcp - TCP protocol
- udp - UDP protocol
- if the protocol is not specified, it is considered that all traffic that is compatible with the type of object should be captured

For example :
ether src 00:11:22:33:44:55 - captures traffic in which 00:11:22:33:44:55 is the sender's MAC address.
icmp - captures ICMP packets (the longer form is ip proto \icmp).
tcp port 80 - traffic capture in which there is data belonging to TCP port 80

In addition to object identifiers and classifiers, filters can contain the keywords gateway , broadcast , multicast , less , greater, as well as arithmetic expressions.

For example :
ip multicast - capture of ip packets containing addresses from class D.
less 1000 - captures frames whose size is no more than 1000 bytes.

Several conditions can be combined using logical operations:

"AND" - and (&&)
"OR" - or (||)
"NOT" - not (!) - inverts the value

The priority of these operations is as follows:

negation (not) has the highest priority
the logical "AND" and "OR" have equal priority and are evaluated from left to right (this is how the pcap-filter manual describes it, unlike most programming languages, where "AND" binds more tightly than "OR").

As in ordinary mathematical expressions, the order can be changed using round brackets (), and the expression inside them is evaluated first. When "and" and "or" are mixed in one filter, it is safest to always write the brackets explicitly.

For example :
net 192.168.0.0/24 and tcp port 21 - captures traffic in which the sender or the receiver belongs to the network (range) 192.168.0.0/24 and which is carried over TCP using port 21.
host 192.168.0.1 or host 192.168.0.221 - captures traffic belonging to either host 192.168.0.1 or host 192.168.0.221 (it does not matter which is the sender and which is the receiver; it is enough for one of the two addresses to be present in the frame).
host 192.168.0.1 or (host 192.168.0.2 and tcp port 22) - captures either any traffic belonging to host 192.168.0.1, or TCP traffic on port 22 belonging to host 192.168.0.2. The brackets are required here: without them the expression is read from left to right and means the same as the next example.
(host 192.168.0.1 or host 192.168.0.2) and tcp port 22 - captures TCP traffic on port 22 belonging to host 192.168.0.1 or host 192.168.0.2 (either of them, or both).
(host 192.168.0.1 || host 192.168.0.2) && not tcp port 22 - captures any traffic belonging to host 192.168.0.1 or host 192.168.0.2 (either of them, or both), except TCP traffic on port 22.
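
How much the placement of brackets matters can be checked on a single sample packet. The sketch below models the two groupings of "host A or host B and tcp port 22" as plain Python functions; the packet dictionary and the helper names are assumptions made for the illustration and have nothing to do with how an analyzer stores packets.

```python
A = "192.168.0.1"
B = "192.168.0.2"


def involves(pkt: dict, host: str) -> bool:
    """Mimic `host X`: X is the sender or the receiver."""
    return host in (pkt["src"], pkt["dst"])


def is_tcp_22(pkt: dict) -> bool:
    """Mimic `tcp port 22`."""
    return pkt["proto"] == "tcp" and 22 in (pkt["sport"], pkt["dport"])


def any_a_or_ssh_b(pkt: dict) -> bool:
    # host A or (host B and tcp port 22)
    return involves(pkt, A) or (involves(pkt, B) and is_tcp_22(pkt))


def ssh_of_a_or_b(pkt: dict) -> bool:
    # (host A or host B) and tcp port 22
    return (involves(pkt, A) or involves(pkt, B)) and is_tcp_22(pkt)


# A DNS query sent by host A: UDP, port 53.
dns_from_a = {"src": A, "dst": "8.8.8.8", "proto": "udp",
              "sport": 40000, "dport": 53}

print(any_a_or_ssh_b(dns_from_a))  # first grouping captures it
print(ssh_of_a_or_b(dns_from_a))   # second grouping does not
```

The same packet is captured by one grouping and dropped by the other, which is why explicit brackets are worth the extra typing.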

If the same classifiers are repeated several times in a filter, the repeats can be omitted to shorten it.

For example :
net 192.168.0.0/24 and (tcp port 21 or tcp port 20 or tcp port 25 or tcp port 80 or tcp port 110)
can be reduced to
net 192.168.0.0/24 and (tcp port 21 or 20 or 25 or 80 or 110)

Attention :
The expression that excludes packets containing both addresses 1.1.1.1 and 1.1.1.2:
not (host 1.1.1.1 and host 1.1.1.2)
can be shortened to:
not (host 1.1.1.1 and 1.1.1.2)
But not to:
not host 1.1.1.1 and 1.1.1.2 - in this case the filter matches packets that do not contain the first address but do contain the second.
And not to:
not (host 1.1.1.1 or 1.1.1.2) - in this case packets containing at least one of the two addresses are excluded.
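
The three variants above can be compared on all four kinds of packets. This is a small sketch in which a packet is reduced to a (sender, receiver) pair; the function names are hypothetical and only mirror the three filters.

```python
A = "1.1.1.1"
B = "1.1.1.2"


def not_both(pkt: tuple) -> bool:
    # not (host A and host B)
    return not (A in pkt and B in pkt)


def not_a_and_b(pkt: tuple) -> bool:
    # not host A and host B, read as (not host A) and host B
    return (A not in pkt) and (B in pkt)


def not_either(pkt: tuple) -> bool:
    # not (host A or host B)
    return not (A in pkt or B in pkt)


packets = [
    (A, B),                  # both addresses present
    (A, "9.9.9.9"),          # only the first
    ("9.9.9.9", B),          # only the second
    ("8.8.8.8", "9.9.9.9"),  # neither
]

for pkt in packets:
    print(pkt, not_both(pkt), not_a_and_b(pkt), not_either(pkt))
```

Only the first variant keeps every packet except the one that contains both addresses at once.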

A list of basic primitives that can be used to write capture filters is shown in Table 2-1.

Table 2-1. A list of basic primitives that can be used to write capture filters.

Primitive	Description
dst host ip_address	Capture frames in which the destination address field of the IPv4 / IPv6 header contains the specified node address
src host ip_address	Capture frames in which the source address of the IPv4 / IPv6 header contains the specified node address
host ip_address	Capture frames in which the IPv4 / IPv6 header contains the specified node address in the source or destination address field. Equivalent to the filter: ether proto ip and host ip_address
ether dst mac_address	Capture frames in which the destination address field of the link layer header contains the specified MAC address
ether src mac_address	Capture frames in which the source address field of the link layer header contains the specified MAC address
ether host mac_address	Capture frames in which the source or destination address field of the link layer header contains the specified MAC address
dst net network	Capture frames in which the IPv4 / IPv6 header field in the destination address field contains the specified address belonging to the range of the specified class network
src net network	Capture frames in which the source address of the IPv4 / IPv6 header contains the specified address belonging to the range of the specified class network
net network	Selects all IPv4 / IPv6 packets containing addresses from the specified network in the sender or recipient field
net network mask	Capture frames in which the IPv4 / IPv6 header field contains the specified address in the specified network range in the source or destination address field.
net network / mask_length	Capture frames in which the IPv4 / IPv6 header field contains the specified address in the specified network range in the source or destination address field.
dst port port	Capture frames in which the destination port of the UDP or TCP header contains the specified port number
src port port	Capture frames in which the sending port of the UDP or TCP header contains the specified port number
port port	Capture frames in which the source or destination port of the UDP or TCP header contains the specified port number
less length	Capture frames that are no larger than the specified value.
greater length	Capture frames whose size is not less than the specified value
ip proto protocol	Capture frames in which the Protocol field of the IPv4 header contains the identifier of the specified protocol. You can specify not only the numerical values of the protocols, but also their standard names (icmp, igmp, igrp, pim, ah, esp, vrrp, udp, tcp, and others). Keep in mind, however, that tcp, udp and icmp are also keywords, so a backslash character ("\") must be placed before these names.
ip6 proto protocol	Capture frames in which the Next Header field of the IPv6 header contains the identifier of the specified protocol. You can specify not only the numerical values of the protocols, but also their standard names (icmp6, igmp, igrp, pim, ah, esp, vrrp, udp, tcp, and others). Keep in mind, however, that tcp, udp and icmp6 are also keywords, so a backslash character ("\") must be placed before these names.
ether broadcast	Capture all Ethernet broadcast frames. The ether keyword may be omitted.
ip broadcast	Capture frames containing broadcast addresses in the IPv4 packet header. It also uses a subnet mask for the interface that is used to capture packets to determine if the address is a broadcast. Also captures packets sent to a limited broadcast address.
ether multicast	Capture all Ethernet multicast frames. The ether keyword may be omitted.
ip multicast	Capture frames containing multicast addresses in the IPv4 packet header
ip6 multicast	Capture frames containing multicast addresses in the IPv6 packet header
ether proto protocol_type	Capture Ethernet frames with the specified protocol type. The protocol can be specified by number or name (ip, ip6, arp, rarp, atalk, aarp, decnet, sca, lat, mopdl, moprc, iso, stp, ipx, netbeui)
ip, ip6, arp, rarp, atalk, aarp, decnet, iso, stp, ipx, netbeui, tcp, udp, icmp	Capture frames carrying data of the specified protocol. The link-layer names are abbreviations for ether proto protocol; tcp, udp and icmp are abbreviations for ip proto protocol or ip6 proto protocol
vlan [vlan_id]	Capture frames in accordance with IEEE 802.1Q. If the vlan_id number is indicated, then only frames belonging to the specified VLAN are captured.

3. Advanced examples of capture filters

In addition to simple indications of addresses and protocols, more sophisticated constructions can be used in capture filters, allowing for a more subtle analysis of headers.
For this, expressions returning a logical value of the following format are used:

expression operation expression

Here an expression can be a constant, the result of arithmetic (+, -, *, /) or bitwise operations (& - "AND", | - "OR", << - left shift, >> - right shift), the packet length operator (len), or data taken from the frame or from its headers. The operation can be one of ">" (greater than), "<" (less than), ">=" (greater than or equal), "<=" (less than or equal), "=" (equal), "!=" (not equal). Thus it is possible to check whether certain fields or bytes of a frame match the required values, to compare header fields with each other, and to perform arithmetic and logical operations on them and compare the results with given values.
The simplest example of an advanced filter is "5 = 3 + 1", where "5" and "3 + 1" are expressions and "=" is the operation. Evaluating this string returns a logical value, false in this case.
A proto [ offset : size ] primitive is used to get data or frame headers.

Attention : the square brackets in this case are part of the syntax, not a sign of an optional field.

The proto parameter contains the name of the protocol from whose header the data is to be taken (ether, fddi, tr, wlan, ppp, slip, link, ip, arp, rarp, tcp, udp, icmp, ip6 and others).
The offset parameter gives the offset in bytes from the beginning of the header of the specified protocol, and byte numbering starts from zero: ip[0] is the first byte of the IP packet, tcp[1] is the second byte of the TCP segment, ether[3] is the fourth byte of the Ethernet frame.
The size parameter gives the number of bytes to take, starting from the byte specified by offset. It is optional, and if it is missing, 1 byte is taken. In libpcap the size can only be 1, 2 or 4. For example, ip[2:2] is the third and fourth bytes of the IP packet, tcp[4] is the fifth byte of the TCP segment, and ether[6:4] is bytes seven to ten of the Ethernet frame.
If the offset is negative, bytes of the preceding header are selected, counting back from the start of the header of the protocol given in proto. This still requires that the frame actually contains the header of that protocol. Thus the filters ether[11] = 0x37 (take the 12th byte of the Ethernet frame and compare it with 0x37) and ip[-3] = 0x37 (take the 3rd byte from the end of the header that precedes the IP header and compare it with 0x37) are not identical. The first passes all frames in which the sender's MAC address ends in 37, while the second additionally requires the IP protocol to be present, so frames without it, for example ARP frames, will not be captured.

For example :
The expressions ip[1:1] and ip[1] give the same result: the second byte of the IPv4 header is selected.
The expression tcp[0:2] selects the first and second bytes (the Source Port field) of the TCP header.
The expression ip[-3] = 0x37 selects all IPv4 packets in which the sender's MAC address ends with 0x37.
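
What proto[offset:size] returns can be reproduced on raw bytes. The sketch below assumes that multi-byte fields are read as unsigned integers in network (big-endian) byte order; the helper name field and the sample header bytes are invented for the illustration, and negative offsets are not handled.

```python
def field(header: bytes, offset: int, size: int = 1) -> int:
    """Return `size` bytes starting at `offset` as an unsigned big-endian integer."""
    return int.from_bytes(header[offset:offset + size], "big")


# A made-up 20-byte IPv4 header: version 4, IHL 5, ToS 0x10,
# total length 60, protocol 6 (TCP), 192.168.0.1 -> 192.168.0.2.
ip_header = bytes.fromhex("4510003c1c46400040060000c0a80001c0a80002")

print(hex(field(ip_header, 0)))      # ip[0]: version and IHL
print(field(ip_header, 1, 1))        # ip[1:1], same as ip[1]: ToS
print(field(ip_header, 2, 2))        # ip[2:2]: total length
print(hex(field(ip_header, 12, 4)))  # ip[12:4]: source address
```

Omitting the size gives the same value as a size of one, exactly as with ip[1] and ip[1:1].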

Note that when data is selected with the proto[offset:size] construct for the TCP and UDP protocols, IP fragmentation is taken into account. As a result, tcp[0] always means the first byte of the TCP header and never the first byte of the data carried in a later fragment of a fragmented packet.
For some protocols, certain fields and offset values can be specified not only by numbers, but also by names. For example, ICMP supports the icmptype offset, whose value can be icmp-echoreply, icmp-unreach, icmp-sourcequench, icmp-redirect, icmp-echo, icmp-routeradvert, icmp-routersolicit, icmp-timxceed, icmp-paramprob, icmp-tstamp, icmp-tstampreply, icmp-ireq, icmp-ireqreply, icmp-maskreq, icmp-maskreply. To analyze TCP flags you can use the tcpflags offset and the identifiers tcp-fin, tcp-syn, tcp-rst, tcp-push, tcp-ack and tcp-urg.

For example :
The expression tcp[tcpflags] & (tcp-syn | tcp-fin) != 0 selects all frames containing TCP segments in which a session is opened or closed.
The expression icmp[icmptype] != icmp-echo and icmp[icmptype] != icmp-echoreply selects all frames containing the ICMP protocol, except for echo requests and echo replies.
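
The symbolic names are just numbers: tcpflags is offset 13 in the TCP header, and each tcp-* identifier is the weight of one flag bit. A short sketch of the first filter, with the constants written out as they are defined in libpcap; the helper make_tcp_header is hypothetical and builds a header that is empty apart from the flags byte.

```python
TCPFLAGS = 13  # offset of the flags byte in the TCP header

TCP_FIN = 0x01
TCP_SYN = 0x02
TCP_RST = 0x04
TCP_PUSH = 0x08
TCP_ACK = 0x10
TCP_URG = 0x20


def make_tcp_header(flags: int) -> bytes:
    """Build a 20-byte TCP header that is all zeros except the flags byte."""
    return bytes(13) + bytes([flags]) + bytes(6)


def opens_or_closes(tcp_header: bytes) -> bool:
    """Mimic tcp[tcpflags] & (tcp-syn | tcp-fin) != 0."""
    return (tcp_header[TCPFLAGS] & (TCP_SYN | TCP_FIN)) != 0


print(opens_or_closes(make_tcp_header(TCP_SYN)))             # connection request
print(opens_or_closes(make_tcp_header(TCP_ACK | TCP_PUSH)))  # ordinary data segment
```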

There may be situations in which only some of the bits of a particular byte need to be analyzed. For this, the bitwise "AND" operation (&) is used. With its help you can keep only certain bits of a byte and set the rest to zero.
For example, suppose we need to pick out only those frames that are sent at the data link layer to a broadcast or multicast address. We know that the type of a MAC address can be determined from its high byte:

Address type	High byte, hexadecimal	High byte, binary
Unicast	00	00000000
Multicast (group)	01	00000001
Locally administered	02	00000010
Broadcast	FF	11111111

Based on this information, we can conclude that in broadcast and multicast addresses the least significant bit of the high byte is one, and in all other addresses it is zero. If we take the high byte of the address, clear all its bits except the least significant one, and the result equals one, then the address was either broadcast or multicast; if the result is zero, the address was either unicast or locally administered. So the test is performed with the following expression: ether[0] & 1 = 1, where ether[0] takes the first byte of the Ethernet header, & 1 is the bitwise "AND" that clears every bit of this byte except the least significant one, and = 1 compares the result with one.
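
The same test is easy to try on textual MAC addresses. A minimal sketch, assuming the address is written as colon-separated hexadecimal octets; the function name is_group_address is made up.

```python
def is_group_address(mac: str) -> bool:
    """Mimic ether[0] & 1 = 1 for a MAC address written as aa:bb:cc:dd:ee:ff."""
    first_octet = int(mac.split(":")[0], 16)
    return (first_octet & 1) == 1


for mac in ("ff:ff:ff:ff:ff:ff",   # broadcast
            "01:00:5e:00:00:01",   # multicast
            "00:11:22:33:44:55",   # unicast
            "02:00:00:00:00:01"):  # locally administered unicast
    print(mac, is_group_address(mac))
```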
Let us examine one more example in more detail.
We need to get the contents of the Type Of Service (ToS) field of the IPv4 header. To do this, referring to RFC-791, we will see that this field is a single-byte field, and the second byte of the header:

 3.1.  Internet Header Format
   A summary of the contents of the internet header follows:
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |Version|  IHL  |Type of Service|          Total Length         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         Identification        |Flags|      Fragment Offset    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Time to Live |    Protocol   |         Header Checksum       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Source Address                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Destination Address                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Options                    |    Padding    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

To get its value, we use the following primitive:
ip[1:1] - take one byte of the IP header starting from byte number 1 (byte numbering starts from zero).
Now we can build filters based on the contents of this field.
To capture all frames with an IPv4 header in which the ToS field is zero, write: ip[1:1] = 0.
To capture all frames with an IPv4 header in which the ToS field is not zero, write: ip[1:1] != 0.
But we can go further: according to RFC 791, the ToS field is a composite field with the following structure:

          0     1     2     3     4     5     6     7
       +-----+-----+-----+-----+-----+-----+-----+-----+
       |                 |     |     |     |     |     |
       |   PRECEDENCE    |  D  |  T  |  R  |  0  |  0  |
       |                 |     |     |     |     |     |
       +-----+-----+-----+-----+-----+-----+-----+-----+
       Bits 0-2: Precedence.
       Bit    3: 0 = Normal Delay,       1 = Low Delay.
       Bit    4: 0 = Normal Throughput,  1 = High Throughput.
       Bit    5: 0 = Normal Reliability, 1 = High Reliability.
       Bits 6-7: Reserved for Future Use.

The first three bits are the precedence, the fourth describes the delay requirements, the fifth the throughput requirements, the sixth the reliability requirements of the link, and the seventh and eighth are reserved for future use.
A newer standard (RFC 1349) already assigns a meaning to the seventh bit: it requests minimal monetary cost ("Cost").
So, let's say we want to find out whether there are frames on the network in which the seventh bit of the ToS field is set in the IPv4 header. How do we do it? To solve this problem, we need to remember (or learn :D) the binary number system. In a byte, each bit has its own weight, which starts at one on the right and doubles with every step to the left.

          0     1     2     3     4     5     6     7
       +-----+-----+-----+-----+-----+-----+-----+-----+
       |                 |     |     |     |     |     |
       |   PRECEDENCE    |  D  |  T  |  R  |  C  |  0  |
       |  0     0     0  |  0  |  0  |  0  |  1  |  0  |
       |                 |     |     |     |     |     |
       +-----+-----+-----+-----+-----+-----+-----+-----+
         128   64    32    16     8     4     2     1

It turns out that the weight of the bit of interest is 2.
What if we simply compare the value of the ToS field with two?
ip[1:1] = 2
Will this tell us whether the bit of interest is set in the header? Yes and no.
What if, in addition to the "Cost" bit, other bits of the ToS field are also set to one? Say, the bit responsible for the throughput requirements, "Throughput".

          0     1     2     3     4     5     6     7
       +-----+-----+-----+-----+-----+-----+-----+-----+
       |                 |     |     |     |     |     |
       |   PRECEDENCE    |  D  |  T  |  R  |  C  |  0  |
       |  0     0     0  |  0  |  1  |  0  |  1  |  0  |
       |                 |     |     |     |     |     |
       +-----+-----+-----+-----+-----+-----+-----+-----+
         128   64    32    16     8     4     2     1

As a result, the value of this byte is no longer 2 but 10, and a simple comparison cannot tell us whether a particular bit is set in the field of interest.
What prevents us from getting the answer we want? The values of the other bits, which may also be set to one. So we need to get rid of them. To do this, we use the bitwise logical "AND" (sometimes called logical multiplication), denoted by the symbol "&". As you know, the output of a logical "AND" is one only when both the first operand and the second operand are one. Accordingly, if we bitwise-multiply the value of the ToS field by a special mask in which only the bit at the position of interest is set to one, all other bits are excluded from the result:

 ToS field: 00001010 = 10
 Mask: 00000010 = 2
 Result: 00000010 = 2

Whatever the value of the remaining bits, if you overlay this mask, only the value of the field of interest will fall into the result. Even if we set all the bits to one, it will not affect the result:

 ToS field: 11111111 = 255
 Mask: 00000010 = 2
 Result: 00000010 = 2

Only if the bit of interest is zero will the result of applying the mask also be zero.

 ToS field: 11111101 = 253
 Mask: 00000010 = 2
 Result: 00000000 = 0

Thus, if the bit of interest is one, masking yields the weight of this bit; if it is zero, we get zero.
Based on this, the filter we need is:
ip[1:1] & 2 = 2

It takes the value of the second byte, applies the mask that "cuts out" the bit of interest, and compares the result with the weight of this bit.
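
The three masking examples above can be replayed in Python, where & has the same meaning. This is only an illustration of the arithmetic; the names COST_BIT and cost_bit_set are made up.

```python
COST_BIT = 0b00000010  # weight 2: the seventh bit of the ToS field


def cost_bit_set(tos: int) -> bool:
    """Mimic ip[1:1] & 2 = 2."""
    return (tos & COST_BIT) == COST_BIT


for tos in (0b00001010, 0b11111111, 0b11111101):
    masked = tos & COST_BIT
    print(f"{tos:08b} & {COST_BIT:08b} = {masked:08b} -> {cost_bit_set(tos)}")

# The naive comparison ip[1:1] = 2 misses a ToS value of 10,
# although the bit of interest is set there.
print(0b00001010 == 2)
```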

Here is another example based on the Type Of Service field of the IP header: we need to see all the frames in which the Precedence bits of the ToS field in the IPv4 header are not zero. For this we apply a mask in which the ones mark the bits that are responsible for Precedence:

 ToS field: 10111101 = 189
 Mask: 11100000 = 224
 Result: 10100000 = 160

The result is not zero, which means that the Precedence field is also non-zero.

 ToS field: 00011111 = 31
 Mask: 11100000 = 224
 Result: 00000000 = 0

The result is zero, which means that the Precedence field is also zero.

As a result, the check for a non-zero Precedence field in the IPv4 header looks like this:
ip[1:1] & 224 != 0
or the same, but in hexadecimal:
ip[1:1] & 0xe0 != 0
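
The two worked examples can be verified with the same mask in Python. In addition, shifting the masked value right by five bits gives the Precedence value itself as a number from 0 to 7; this extra step is not needed for the filter and is shown only as an illustration. The names are hypothetical.

```python
PRECEDENCE_MASK = 0xE0  # 11100000 = 224


def has_precedence(tos: int) -> bool:
    """Mimic ip[1:1] & 0xe0 != 0."""
    return (tos & PRECEDENCE_MASK) != 0


def precedence(tos: int) -> int:
    """Return the three Precedence bits as a number from 0 to 7."""
    return (tos & PRECEDENCE_MASK) >> 5


for tos in (189, 31):
    print(tos, tos & PRECEDENCE_MASK, has_precedence(tos), precedence(tos))
```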

Consider an example with a different protocol, TCP.
Suppose we need to capture all frames that carry TCP segments with options. To understand what to look for and where, we turn to RFC 793.

   TCP Header Format
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |          Source Port          |       Destination Port        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Sequence Number                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Acknowledgment Number                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Data |           |U|A|P|R|S|F|                               |
    | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
    |       |           |G|K|H|T|N|N|                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           Checksum            |         Urgent Pointer        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Options                    |    Padding    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                             data                              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

To determine whether a segment has options, we use the "Data Offset" field, which holds the length of the header in four-byte words. The minimum length of a TCP segment header is 20 bytes, which is 5 four-byte words. Accordingly, if there are options in the TCP segment header, the value of this field will be greater than 5.
To get the value of this field we use the primitive tcp[12:1]. However, since the smallest piece we can take is one byte and we need only 4 bits, we will have to think a little.
Applying the primitive tcp[12:1] gives us the following piece of the header:

    +-+-+-+-+-+-+-+-+-
    |  Data |
    | Offset| Reserved
    |       |
    +-+-+-+-+-+-+-+-+-

If the "Data Offset" field were in the lower half of the byte, the number 5 in binary representation would look like this:

         128 64 32 16 8 4 2 1
           0  0  0  0 0 1 0 1 = 5

But the bits we are interested in are not in the right, lower half of the byte, but in the left, higher half, so to obtain the decimal equivalent we move them to the left half of the byte:

         128 64 32 16 8 4 2 1
           0  1  0  1 0 0 0 0 = 80 (0x50 in hex)

To select the most significant bits, you must apply a mask:

 Field Data Offset: 01010000 = 80
 Mask: 11110000 = 240
 Result: 01010000 = 80

If there are options in the header, then the “Data Offset” value will be greater than 5. For example, if there is one eight-byte option in the header, the value of this field will be 7 (5 four-byte words of the fixed part of the header, and 2 four-byte option words):

         128 64 32 16 8 4 2 1
          0 0 0 0 0 1 1 1 = 7

Transferring the corresponding bits in the upper part we get:

         128 64 32 16 8 4 2 1
           0 1 1 1 0 0 0 0 = 112 (0x70 in hex)

Select the upper bits by applying the mask:

 Field Data Offset: 01110000 = 112
 Mask: 11110000 = 240
 Result: 01110000 = 112

Thus, if the result is greater than 80, the TCP header contains options. In principle the mask could be omitted, since the extra bits are reserved and should always be zero. But you never know what may change, and so as not to rewrite the filter if the standard suddenly changes, it is better to cut them off.
The resulting filter, which shows the TCP segments whose header is longer than 5 four-byte words, is as follows:
tcp[12:1] & 240 != 80
or
tcp[12:1] & 240 > 80
or
tcp[12:1] & 0xf0 > 80
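
The Data Offset arithmetic can be checked on the two byte values from the worked examples (0x50 and 0x70). The sketch below uses a hypothetical helper that builds a header containing nothing but byte 12; shifting the masked value right by two bits turns it directly into the header length in bytes (right by four to get words, then left by two to multiply by four).

```python
DATA_OFFSET_MASK = 0xF0


def make_tcp_header(byte12: int) -> bytes:
    """Build a 20-byte TCP header that is all zeros except byte 12."""
    return bytes(12) + bytes([byte12]) + bytes(7)


def has_options(tcp_header: bytes) -> bool:
    """Mimic tcp[12:1] & 0xf0 > 80."""
    return (tcp_header[12] & DATA_OFFSET_MASK) > 80


def header_length(tcp_header: bytes) -> int:
    """Return the TCP header length in bytes."""
    return (tcp_header[12] & DATA_OFFSET_MASK) >> 2


for byte12 in (0x50, 0x70):
    hdr = make_tcp_header(byte12)
    print(hex(byte12), has_options(hdr), header_length(hdr))
```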
Let's also look at working with TCP flags. They can be picked out by the same masking method, but you can also use the symbolic names mentioned above.
For example, to capture frames containing segments with the SYN or FIN flag set, write the following filter:
tcp[tcpflags] & (tcp-syn | tcp-fin) != 0
I think it is quite readable and needs no special explanation.
Solving the same task with raw bits and masks gives the following filter:
tcp[13:1] & 2 != 0 or tcp[13:1] & 1 != 0
To consolidate your understanding of the topic, try to work out for yourself how this variant of the filter works.

For example :
The expression ether[0] & 1 != 0 selects all broadcast and multicast frames.
The expression ether[0] & 1 = 0 and ip[16] >= 224 selects all broadcast or multicast IP packets that were not sent to a broadcast or multicast MAC address at the data link layer.
The expression ip[0] & 0xf = 5 selects all IP packets without options.
The expression ip[6:2] & 0x1fff = 0 selects all unfragmented IP packets and the first fragments of fragmented packets.
The expression ip[-3] & 0xff = 0x37 selects all IP packets in which the sender's MAC address ends with 0x37.

Another interesting set of bit operations is the bit shifts. They are written as a pair of angle brackets: "<<" is a shift to the left and ">>" is a shift to the right.
How do they work?
Take an arbitrary byte (for simplicity, the value one) and write it in binary:

         128 64 32 16 8 4 2 1
           0 0 0 0 0 0 0 1 = 1

Now we shift to the left, moving the values of all bits by one position and placing zero in the freed least significant bit:

         128 64 32 16 8 4 2 1
          0 0 0 0 0 0 1 <<

         128 64 32 16 8 4 2 1
          0 0 0 0 0 0 1 0 = 2

As a result, the value of the byte became equal to two, that is, doubled. And once again apply this operation:

         128 64 32 16 8 4 2 1
          0 0 0 0 0 1 0 <<

         128 64 32 16 8 4 2 1
          0 0 0 0 0 1 0 0 = 4

As a result, the value of the byte became four, that is, it doubled again. Thus we can conclude that a bit shift to the left is equivalent to multiplying the value by two (since we are working in the binary number system).
Let's check the validity of this rule on a more complex number, for example, 100. We write it in binary form:

         128 64 32 16 8 4 2 1
           0 1 1 0 0 1 0 0 = 100

And now we perform the left shift operation:

         128 64 32 16 8 4 2 1
          1 1 0 0 1 0 0 <<

         128 64 32 16 8 4 2 1
          1 1 0 0 1 0 0 0 = 200

As a result, the value of the byte was equal to 200 - doubled.
Accordingly, we can immediately conclude that a bitwise shift to the right is equivalent to dividing a number by two.
For example, the number 240:

         128 64 32 16 8 4 2 1
           1 1 1 1 0 0 0 0 = 240

Perform a right shift operation:

         128 64 32 16 8 4 2 1
          >> 1 1 1 1 0 0 0

         128 64 32 16 8 4 2 1
          0 1 1 1 1 0 0 0 = 120

The value of the byte became 120, that is, it was halved.
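
The shift examples can be repeated in Python, whose << and >> operators behave the same way on non-negative integers. One detail worth noting: a right shift discards the bit that falls off the end, so for odd numbers it corresponds to integer division. The function names are made up for the illustration.

```python
def shift_left(value: int, bits: int = 1) -> int:
    """Each one-bit shift to the left doubles the value."""
    return value << bits


def shift_right(value: int, bits: int = 1) -> int:
    """Each one-bit shift to the right halves the value, dropping the remainder."""
    return value >> bits


for value in (1, 2, 100):
    print(f"{value:08b} << 1 = {shift_left(value):08b} ({shift_left(value)})")

print(f"{240:08b} >> 1 = {shift_right(240):08b} ({shift_right(240)})")
print(shift_right(5))  # the lowest bit is lost
```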
How can we use this operation? Recall that the IHL (Internet Header Length) field of the IP header gives the length of the header not in bytes but in four-byte words, and to check whether a packet contains options we used the following expression:
ip[0] & 0xf = 5
That is, the field was compared not with the real length but with the length divided by four (20 bytes are 5 four-byte words). If for some reason it is more convenient to work with the header length in bytes (for example, if you need to subtract it from the total packet length), the value must be multiplied by 4. To multiply a number by 4, double it twice, that is, shift it left by two bits, and then compare it with the required IP header length in bytes. The version bits still have to be masked off first:
(ip[0] & 0xf) << 2 = 20
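
A quick check of this calculation on the most common first byte of an IPv4 header, 0x45 (version 4, IHL 5). The sketch also shows why the version bits in the upper half of the byte have to be masked off before shifting; the function name is hypothetical.

```python
def ip_header_length(first_byte: int) -> int:
    """Return the IPv4 header length in bytes: mask out IHL, then multiply by 4."""
    return (first_byte & 0x0F) << 2


print(ip_header_length(0x45))  # header without options
print(ip_header_length(0x46))  # header with one four-byte option word
print(0x45 << 2)               # without the mask the version bits spoil the result
```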

And of course, all of this can be combined into compound rules:
(icmp[icmptype] != icmp-echo and icmp[icmptype] != icmp-echoreply) or (udp and not udp port 67 and ip[16] < 224) or (tcp[0:2] < 1024 and tcp[2:2] < 1024)
With this filter, the program will capture only those frames that fit one of the three descriptions:

They contain ICMP messages other than echo and echoreply (used by the ping utility)
They carry UDP datagrams, except those that use port 67 as the sender's or receiver's port and those sent to multicast addresses or the limited broadcast address
They carry TCP segments in which both the sender's port and the receiver's port are in the range of "Well Known Ports"

4. Task for self-test :D

To consolidate your skills with complex capture filters, try to understand what this filter describes and how it works:
tcp port 80 and (((ip[2:2] - ((ip[0] & 0xf) << 2)) - ((tcp[12] & 0xf0) >> 2)) != 0)

Note :
This is one of the standard examples, so if necessary you can easily check yourself with an Internet search, but still try to make an effort and work it out on your own.

Happy sniffing :)

Source: https://habr.com/ru/post/211042/
