Bash scripts: start
Bash scripts, part 2: loops
Bash scripts, part 3: command line options and keys
Bash scripts, part 4: input and output
Bash Scripts, Part 5: Signals, Background Tasks, Script Management
Bash scripts, part 6: functions and library development
Bash scripts, part 7: sed and word processing
Bash scripts, part 8: awk data processing language
Bash scripts, part 9: regular expressions
Bash scripts, part 10: practical examples
Bash scripts, part 11: expect and automate interactive utilities
Last time we talked about functions in bash scripts, in particular, how to call them from the command line. Our topic today is a very useful tool for processing string data - a Linux utility called sed. It is often used to work with texts that look like log files, configuration files, and other files.
If you, in bash-scripts, somehow process the data, you will not interfere with the
sed and gawk tools. Here we will focus on sed and on working with texts, as this is a very important step in our journey through the vast expanses of bash script development.
')
Now we will analyze the basics of working with sed, as well as consider more than three dozen examples of using this tool.
Sed basics
The sed utility is called a stream text editor. In interactive text editors, like nano, they work with texts using the keyboard, editing files, adding, deleting, or changing texts. Sed allows you to edit data streams based on the developer-defined rule sets. Here is the scheme of calling this command:
$ sed options file
By default, sed applies the rules specified when invoked, expressed as a set of commands, to
STDIN
. This allows data to be transmitted directly to sed.
For example:
$ echo | sed 's/test/another test/'
That's what happens when you run this command.
A simple example of calling sed
In this case, sed replaces the word “test” in the line passed for processing with the words “another test”. For the design of the rules for processing text enclosed in quotes, use forward slashes. In our case, a command of the form
s/pattern1/pattern2/
. The letter "s" is an abbreviation of the word "substitute", that is - we have a replacement team. Sed, executing this command, will scan the transferred text and replace the fragments found in it (about which
pattern1
,
pattern2
talk below), corresponding to
pattern1
, with
pattern2
.
The above is a primitive example of using sed, which is needed to get you up to speed. In fact, sed can be used in much more complex word processing scenarios, for example, for working with files.
Below is a file containing a fragment of text, and the results of its processing with such a command:
$ sed 's/test/another test' ./myfile
Text file and its processing results
This is the same approach that we used above, but now sed processes the text stored in the file. At the same time, if the file is large enough, you can see that sed processes the data in chunks and displays what is processed on the screen, without waiting for the entire file to be processed.
Sed does not change the data in the file being processed. The editor reads the file, processes the read, and sends what came out in
STDOUT
. In order to make sure that the source file has not changed, it is enough to open it after it has been transferred to sed. If necessary, the sed output can be redirected to a file, it is possible to overwrite the old file. If you are familiar with one of the previous
materials in this series, where it comes to redirecting input and output streams, you may well be able to do this.
Executing command sets when calling sed
To perform multiple actions on data, use the
-e
switch when invoking sed. For example, here's how to organize the replacement of two pieces of text:
$ sed -e 's/This/That/; s/test/another test/' ./myfile
Using the -e switch when calling sed
Both commands are applied to each line of text from a file. They should be separated by a semicolon, while there should be no space between the end of the command and the semicolon.
To enter several text processing templates when you call sed, you can, after entering the first single quote, press Enter, and then enter each rule from a new line, without forgetting about the closing quote:
$ sed -e ' > s/This/That/ > s/test/another test/' ./myfile
That's what happens after the command presented in this form will be executed.
Another way to work with sed
Reading commands from a file
If there are many sed commands with which you need to process the text, it is usually best to pre-write them to a file. To specify a sed file containing commands, use the
-f
:
Here are the contents of the
mycommands
file:
s/This/That/ s/test/another test/
Call sed, passing the editor a file with commands and a file for processing:
$ sed -f mycommands myfile
The result of calling such a command is similar to that obtained in the previous examples.
Using the command file when calling sed
Replacement team flags
Take a close look at the following example.
$ sed 's/test/another test/' myfile
This is what the file contains and what will be obtained after it is processed by sed.
Source file and its processing results
The replace command normally processes a file consisting of several lines, but only the first occurrences of the required text fragment in each line are replaced. In order to replace all occurrences of a pattern, you need to use the appropriate flag.
The scheme of the replacement command when using flags looks like this:
s/pattern/replacement/flags
The execution of this command can be modified in several ways.
- When transmitting the number, the sequence number of the occurrence of the pattern in the string is taken into account.
- The
g
flag indicates that you need to process all occurrences of the pattern that are in the string.
- The
p
flag indicates that the contents of the source string should be displayed.
- A
w file
flag tells the command to write text processing results to a file.
Consider using the first variant of the replacement command, indicating the position of the replaced occurrence of the desired fragment:
$ sed 's/test/another test/2' myfile
Calling a replacement command indicating the position of the fragment being replaced
Here we indicated the number 2 as the replacement flag. This led to the fact that only the second occurrence of the search pattern in each line was replaced. Now let's test the global replacement flag -
g
:
$ sed 's/test/another test/g' myfile
As can be seen from the output, this command replaced all occurrences of the pattern in the text.
Global replacement
The replace command
p
flag allows you to output lines that match, and the
-n
specified when you call sed suppresses normal output:
$ sed -n 's/test/another test/p' myfile
As a result, when you run sed in such a configuration, only lines (in our case, one line) are displayed on the screen in which the specified text fragment is found.
Using the replace command flag p
We use the
w
flag, which allows you to save the text processing results to a file:
$ sed 's/test/another test/w output' myfile
Saving text processing results to a file
It is clearly seen that during the work of the command, the data is output to
STDOUT , while the processed lines are written to the file whose name is specified after
w
.
Delimiter characters
Imagine replacing
/bin/bash
with
/bin/csh
in the
/etc/passwd
. The task is not so difficult:
$ sed 's/\/bin\/bash/\/bin\/csh/' /etc/passwd
However, this does not look very good. The thing is that since the forward slashes are used as separator characters, the same characters in the transmitted sed lines have to be escaped. As a result, the readability of the command suffers.
Fortunately, sed allows us to set the delimiter characters ourselves for use in the replace command. The separator is the first character that will be encountered after
s
:
$ sed 's!/bin/bash!/bin/csh!' /etc/passwd
In this case, an exclamation point is used as a separator, as a result, the code is easier to read and it looks much tidier than before.
Select text fragments for processing
So far, we have called sed to process the entire data stream passed to the editor. In some cases, with the help of sed, it is necessary to process only some part of the text — some specific line or group of lines. To achieve this goal, you can use two approaches:
- Set a limit on the number of lines to be processed.
- Specify the filter corresponding to the line to be processed.
Consider the first approach. There are two possible options. The first, discussed below, provides for the indication of the number of one line to be processed:
$ sed '2s/test/another test/' myfile
Processing only one line whose number is set when you call sed
The second option is a range of lines:
$ sed '2,3s/test/another test/' myfile
Row range processing
In addition, you can call the replace command so that the file is processed starting from a certain line to the end:
$ sed '2,$s/test/another test/' myfile
File processing starting from the second line to the end
In order to process only the rows corresponding to the specified filter using the replace command, the command must be called as follows:
$ sed '/likegeeks/s/bash/csh/' /etc/passwd
By analogy with what was considered above, the pattern is passed before the command name
s
.
Processing strings matching the filter
Here we used a very simple filter. In order to fully reveal the possibilities of this approach, you can use regular expressions. We will talk about them in one of the following materials in this series.
Delete rows
The sed utility is not only suitable for replacing some character strings in strings with others. With its help, namely, using the
d
command, you can delete lines from the text stream.
The command call looks like this:
$ sed '3d' myfile
We want the third line to be removed from the text. Please note that this is not a file. The file will remain unchanged, the deletion will only affect the output that will form sed.
Delete third row
If the
d
command does not specify the line number to be deleted, all the lines in the stream will be deleted.
Here's how to apply the
d
command to a range of lines:
$ sed '2,3d' myfile
Delete row range
And here is how to delete lines starting from the given one - and to the end of the file:
$ sed '3,$d' myfile
Delete lines to end of file
Lines can also be deleted by pattern:
$ sed '/test/d' myfile
Delete rows by pattern
When you call
d
you can specify a couple of patterns - the lines will be deleted in which the pattern will be found, and the lines that are between them:
$ sed '/second/,/fourth/d' myfile
Delete a range of rows using templates
Insert text to stream
With sed, you can insert data into a text stream using the
i
and
a
commands:
- The
i
command adds a new line before the specified one.
- The
a
command adds a new line after the specified one.
Consider an example of using the
i
command:
$ echo "Another test" | sed 'i\First test '
Command i
Now take a look at the
a
command:
$ echo "Another test" | sed 'a\First test '
Command a
As you can see, these commands add text before or after the data from the stream. What if you need to add a line somewhere in the middle?
Here we will be helped by the indication of the number of the reference line in the stream, or pattern. Note that addressing strings as a range does not work here. Call the
i
command, specifying the line number before which you need to insert a new line:
$ sed '2i\This is the inserted line.' myfile
The i command with the reference line number
Do the same with the
a
command:
$ sed '2a\This is the appended line.' myfile
Command a with reference line number
Notice the difference between the
i
and
a
commands. The first inserts a new line to the specified, the second - after.
Replacing strings
The
c
command allows you to change the contents of a whole line of text in a data stream. When calling it, you need to specify the line number, instead of which new data should be added to the stream:
$ sed '3c\This is a modified line.' myfile
Replacing the entire string
If you use a template in the form of a plain text or a regular expression, all the lines matching the pattern will be replaced:
$ sed '/This is/c This is a changed line of text.' myfile
Replacing strings by pattern
Character replacement
The
y
command works with individual characters, replacing them in accordance with the data passed to it when it is called:
$ sed 'y/123/567/' myfile
Character replacement
Using this command, you need to take into account that it applies to the entire text flow, you cannot limit it to specific occurrences of characters.
Display line numbers
If you call sed using the
=
command, the utility displays line numbers in the data stream:
$ sed '=' myfile
Display line numbers
The stream editor output line numbers in front of their contents.
If you pass a pattern to this command and use the sed
-n
key, only line numbers matching the pattern will be output:
$ sed -n '/test/=' myfile
Display line numbers matching the pattern
Reading data to insert from file
Above, we looked at techniques for inserting data into a stream, indicating what needs to be inserted, right when you call sed. As a data source, you can use the file. To do this, use the
r
command, which allows you to insert data from the specified file into the stream. When you call it, you can specify the line number after which you want to insert the contents of the file, or a template.
Consider an example:
$ sed '3r newfile' myfile
Insert file into stream
Here the contents of the file
newfile
were inserted after the third line of the file
myfile
.
Here's what happens if you apply a pattern when invoking the
r
command:
$ sed '/test/r newfile' myfile
Using a template when calling the r command
The contents of the file will be inserted after each line corresponding to the pattern.
Example
Imagine such a task. There is a file in which there is a certain sequence of characters, in itself meaningless, which must be replaced with data taken from another file. Namely, let it be the file
newfile
, in which the sequence of characters
DATA
plays the role of a
newfile
. Data that needs to be substituted for
DATA
is stored in the
data
file.
This task can be solved by using the
r
and
d
commands of the stream sed editor:
$ Sed '/DATA>/ { r newfile d}' myfile
Replacing the placeholder with real data
As you can see, instead of the placeholder,
DATA
sed added two lines from the
data
file to the output stream.
Results
Today we covered the basics of working with the sed streaming editor. In fact, sed is a huge topic. Its study can be compared with the study of a new programming language, but, having understood the basics, you can master sed at any level you need. As a result, your ability to process texts with its help will be limited only by your imagination.
That's all for today. Next time, let's talk about the awk data processing language.
Dear readers! Do you use sed in your daily work? If so, please share your experience.