
Linux basics from the founder of Gentoo. Part 2 (2/5): Folder Assignments, File Search

This part describes the Filesystem Hierarchy Standard (FHS): why directories have the names they do and what they are for. It also introduces the PATH environment variable and walks through the main commands for finding files on the system, such as whereis, find, and locate (slocate).

Navigating Linux basics from the founder of Gentoo:

Part I
  1. BASH: Basics of Navigation (Intro)
  2. File and Directory Management
  3. Links, and deleting files and directories
  4. Glob substitutions (summary and links)

Part II
  1. Regular Expressions (Intro)
  2. Folder Assignments, File Search
  3. Process management
  4. Text processing and redirection
  5. Kernel modules (summary and links)

FHS and file search

File System Hierarchy Standard

The Filesystem Hierarchy Standard (FHS) is a document that defines the directory layout on Linux systems. The FHS was designed to provide a common layout that simplifies distribution-independent software development, since everything a program needs is in the same place on most distributions. The FHS defines the following directory tree (taken directly from the specification):

Two independent classifications in the FHS

The FHS specification is based on the idea that there are two independent classifications of files: shareable versus unshareable, and variable versus static. Shareable data can be shared between hosts; unshareable data is specific to a given host (for example, configuration files). Variable data can be modified; static data is not modified (except during system installation and maintenance).

The following table summarizes four possible combinations, with examples of directories that fall into these categories. Again, this table is straight from the specification:

 +------------+-----------------+---------------+
 |            | shareable       | unshareable   |
 +------------+-----------------+---------------+
 | static     | /usr            | /etc          |
 |            | /opt            | /boot         |
 +------------+-----------------+---------------+
 | variable   | /var/mail       | /var/run      |
 |            | /var/spool/news | /var/lock     |
 +------------+-----------------+---------------+

Secondary hierarchy in /usr

Inside /usr you will find a secondary hierarchy that looks very similar to the root filesystem. /usr does not have to exist when the machine powers up, so it can be a network share (shareable) or mounted from a CD-ROM (static). Most Linux setups do not make use of the shareability of /usr, but it is still valuable to understand the difference between the primary hierarchy in the root directory and the secondary hierarchy in /usr.

That is all we will say about the Filesystem Hierarchy Standard. The document itself is quite readable, and you should take a look at it. After reading it, you will understand the Linux filesystem much better. You can find the specification here: http://www.pathname.com/fhs/.

File search

Linux systems often contain hundreds of thousands of files. It’s possible that you’re smart enough to never lose sight of any of them, but it’s much more likely that sometimes you’ll need help finding a file. Linux has several different tools for this. This introduction will help you choose the one that suits your needs.


When you run a program from the command line, bash searches through a list of directories for the program you requested. For example, when you type ls, bash does not inherently know that the ls program lives in /usr/bin. Instead, it consults an environment variable called PATH, which holds a colon-separated list of directories. We can inspect the value of PATH:

$ echo $PATH

With this PATH value (yours may differ), bash first checks the /usr/local/bin directory and then /usr/bin when looking for the ls program. Most likely ls is in /usr/bin, so bash stops searching at that directory.
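The lookup described above can be approximated in a few lines of shell. This is only a sketch: find_in_path is a hypothetical helper name, and the real bash lookup also consults aliases, functions, builtins, and its hash table before walking PATH.

```shell
# find_in_path (hypothetical helper): print the first executable named $1
# found in the PATH directories, checking them in order.
# The body runs in a subshell, so the IFS and "set -f" changes do not leak out.
find_in_path() (
    prog=$1
    IFS=:        # split PATH on colons
    set -f       # do not glob-expand directory names
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.     # an empty PATH entry means the current directory
        if [ -f "$dir/$prog" ] && [ -x "$dir/$prog" ]; then
            printf '%s\n' "$dir/$prog"
            exit 0                 # stop at the first hit, as bash does
        fi
    done
    exit 1                         # not found anywhere in PATH
)

find_in_path ls
```

The function stops at the first directory that contains a matching executable, which is why the order of entries in PATH matters.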

PATH change

You can extend the PATH variable by assigning it a new value in the command line:

$ PATH=$PATH:~/bin
$ echo $PATH

You can also remove entries from PATH, although this is less convenient, because there is no equally simple way to take a single entry out of the existing $PATH. The most straightforward option is to set PATH again to exactly the list you need:

$ PATH=/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:~/bin
$ echo $PATH
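If retyping the whole list is inconvenient, a small helper can filter one entry out instead. The sketch below is an illustration only; path_without is a hypothetical function name, not a standard command.

```shell
# path_without (hypothetical helper): print the colon-separated list $1
# with every entry equal to $2 removed.
path_without() (
    list=$1
    drop=$2
    result=
    IFS=:        # split the list on colons
    set -f       # do not glob-expand entries
    for dir in $list; do
        [ "$dir" = "$drop" ] && continue
        result=${result:+$result:}$dir   # re-join the kept entries with ":"
    done
    printf '%s\n' "$result"
)

path_without /usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin /usr/X11R6/bin
# To apply it to the live variable:
#   PATH=$(path_without "$PATH" /usr/X11R6/bin)
```

Because the function only prints the new list, nothing changes until the result is assigned back to PATH.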

To make your PATH changes available to processes that will be launched in the command shell, you need to “export” them using the export command:

$ export PATH
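The effect of export is easy to observe with any variable. In this sketch DEMO_VAR is a made-up name used only for illustration; a child sh process sees the variable only after it has been exported.

```shell
unset DEMO_VAR
DEMO_VAR=hello                              # plain shell variable, not exported yet
before=$(sh -c 'printf %s "$DEMO_VAR"')     # the child process does not see it

export DEMO_VAR                             # mark it for export to child processes
after=$(sh -c 'printf %s "$DEMO_VAR"')      # now the child inherits it

printf 'before export: [%s]\n' "$before"
printf 'after export:  [%s]\n' "$after"
```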

About the "which" command

You can check whether a particular program is in your PATH by using which. In the following example, we see that none of the PATH directories on our system contains a program called sense:

$ which sense
which: no sense in (/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/X11R6/bin)

In this example, ls is successfully found:

$ which ls

which -a

Finally, you should be aware of the -a flag, which tells which to show every instance of the program in PATH:

$ which -a ls


If you need more information about the program than just its location, you can use the whereis command:

$ whereis ls
ls: /bin/ls /usr/bin/ls /usr/share/man/man1/ls.1.gz

Here we see that ls is present in two common directories for executables, /bin and /usr/bin. We are also told that documentation exists in /usr/share/man. This is the man page you see when you type man ls.

The whereis program can also be used to locate source code and to perform non-standard searches (that is, to look for files that lack a man page, sources, or binaries; translator's note). You can also specify alternate paths to search. See the man page for more information.


The find command is another handy tool in your arsenal. With find you are not limited to looking for programs; you can search for any kind of file using a variety of criteria. For example, to search the /usr/share/doc directory for a file called README:

$ find /usr/share/doc -name README

find and patterns

You can use glob patterns in the argument to -name, provided that you escape them with quotes or a backslash (so that they are passed to find intact; otherwise bash expands them first and passes the result to the command). Let's look for all README files that have an extension:

$ find /usr/share/doc -name README\*
[ 578 ]
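The difference between a quoted and an unquoted pattern can be reproduced in a scratch directory. This is a sketch with made-up file names; it assumes mktemp is available.

```shell
demo=$(mktemp -d)
mkdir "$demo/docs"
touch "$demo/docs/README" "$demo/docs/README.txt" "$demo/docs/notes"
touch "$demo/README.local"     # a match in the current directory triggers expansion

# Quoted: find receives the literal pattern README* and matches it itself.
quoted=$(cd "$demo" && find docs -name 'README*' | sort)

# Unquoted: bash expands README* to README.local before find starts,
# so find looks for a file named exactly README.local under docs.
unquoted=$(cd "$demo" && find docs -name README* | sort)

printf 'quoted:\n%s\n' "$quoted"
printf 'unquoted:\n%s\n' "$unquoted"
rm -rf "$demo"
```

When nothing in the current directory matches, bash passes the unquoted pattern through unchanged, which is why the unquoted form sometimes appears to work.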

Ignore case in find

Of course, you can ignore case when searching:

$ find /usr/share/doc -name '[Rr][Ee][Aa][Dd][Mm][Ee]*'

Or, much simpler:

$ find /usr/share/doc -iname readme\*

As you can see, you can use the -iname option for a case-insensitive search.

find and regular expressions

If you are familiar with regular expressions, you can use the -regex option to search for files whose names match a pattern. Just as -name has -iname, there is a corresponding -iregex option that makes find ignore case in the pattern. Example:

$ find /etc -iregex '.*xt.*'

However, unlike most programs, find requires the regular expression to match the entire path, not just part of it. For this reason, it is worth putting .* at the beginning and end of the pattern; using xt alone as the pattern would not be enough.
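The whole-path rule can be checked in a scratch directory. This sketch uses made-up file names and assumes a find that supports -regex (GNU and BSD find do; it is not part of POSIX).

```shell
demo=$(mktemp -d)
touch "$demo/context.conf" "$demo/plain.conf"

# The paths find tests here look like ./context.conf, so a bare "xt"
# cannot match: the expression has to cover the whole path.
partial=$(cd "$demo" && find . -regex 'xt')
whole=$(cd "$demo" && find . -regex '.*xt.*')

printf 'bare xt:  [%s]\n' "$partial"
printf '.*xt.*:   [%s]\n' "$whole"
rm -rf "$demo"
```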

find and file types

The -type option lets you search for filesystem objects of a particular type. The possible arguments to -type are b (block device), c (character device), d (directory), p (named pipe), f (regular file), l (symbolic link), and s (socket). For example, to search /usr/bin for symbolic links whose names contain the string vim:

$ find /usr/bin -name '*vim*' -type l
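To see how -type narrows a name match, the following sketch builds a scratch directory containing a regular file, a symbolic link, and a directory that all match '*vim*'. All names are invented for the illustration.

```shell
demo=$(mktemp -d)
touch "$demo/vim.basic"            # regular file
ln -s vim.basic "$demo/vim"        # symbolic link pointing at it
mkdir "$demo/vimfiles"             # directory

links=$(cd "$demo" && find . -name '*vim*' -type l)
files=$(cd "$demo" && find . -name '*vim*' -type f)
dirs=$(cd "$demo" && find . -name '*vim*' -type d)

printf 'links: %s\nfiles: %s\ndirs:  %s\n' "$links" "$files" "$dirs"
rm -rf "$demo"
```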

find and mtimes

The -mtime option lets you select files based on their last modification time. The argument to -mtime is a number of 24-hour periods, and it is most useful when preceded by a plus (matching files modified more than that many periods ago) or a minus (matching files modified fewer than that many periods ago). For example, consider the following scenario:

$ ls -l ?
-rw------- 1 root root 0 Jan 7 18:00 a
-rw------- 1 root root 0 Jan 6 18:00 b
-rw------- 1 root root 0 Jan 5 18:00 c
-rw------- 1 root root 0 Jan 4 18:00 d

$ date
Tue Jan 7 18:14:52 EST 2003

You can find files that have been modified in the last 24 hours:

$ find . -name \? -mtime -1

Or files that have been modified before the current 24-hour period:

$ find . -name \? -mtime +0
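The same behaviour can be reproduced without waiting several days by backdating files. This sketch assumes GNU touch, whose -d option accepts relative dates such as '30 hours ago'; the file names are invented.

```shell
demo=$(mktemp -d)
touch "$demo/a"                          # modified just now: 0 full periods old
touch -d '30 hours ago' "$demo/b"        # 1 full 24-hour period old
touch -d '54 hours ago' "$demo/c"        # 2 full 24-hour periods old

# find discards the fractional part of the age before comparing.
recent=$(cd "$demo" && find . -type f -mtime -1 | sort)   # fewer than 1 period
older=$(cd "$demo" && find . -type f -mtime +0 | sort)    # more than 0 periods

printf -- '-mtime -1:\n%s\n' "$recent"
printf -- '-mtime +0:\n%s\n' "$older"
rm -rf "$demo"
```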

Option -daystart

If you additionally specify the -daystart option, time periods are measured from the beginning of today rather than from the current time. For example, here are the files that were modified yesterday and the day before:

$ find . -name \? -daystart -mtime +0 -mtime -3
$ ls -l b c
-rw------- 1 root root 0 Jan 6 18:00 b
-rw------- 1 root root 0 Jan 5 18:00 c

Option -size

The -size option lets you search for files by size. By default, the argument to -size is a number of 512-byte blocks, but adding a suffix makes it easier to work with. The available suffixes are b (512-byte blocks), c (bytes), k (kilobytes), and w (2-byte words). You can also put a plus ("larger than") or a minus ("smaller than") before the argument.

For example, to search /usr/bin for regular files smaller than 50 bytes:

$ find /usr/bin -type f -size -50c
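A quick way to try -size is with two files of known size. The file names in this sketch are invented for the illustration.

```shell
demo=$(mktemp -d)
printf 'tiny' > "$demo/small"        # 4 bytes
printf '%0100d' 0 > "$demo/large"    # 100 bytes (one hundred "0" characters)

small=$(cd "$demo" && find . -type f -size -50c)   # smaller than 50 bytes
big=$(cd "$demo" && find . -type f -size +50c)     # larger than 50 bytes

printf 'under 50 bytes: %s\nover 50 bytes:  %s\n' "$small" "$big"
rm -rf "$demo"
```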

Work with found files

You may be wondering what you can do with all the files you find. find can act on the files it finds by means of the -exec option. This option takes a command line to execute, terminated with ;, and replaces every occurrence of {} with the file name. This is best understood with an example:

$ find /usr/bin -type f -size -50c -exec ls -l '{}' ';'
-rwxr-xr-x 1 root root 27 Oct 28 07:13 /usr/bin/krdb
-rwxr-xr-x 1 root root 35 Nov 28 18:26 /usr/bin/run-nautilus
-rwxr-xr-x 1 root root 25 Oct 21 17:51 /usr/bin/sgmlwhich
-rwxr-xr-x 1 root root 26 Sep 26 08:00 /usr/bin/muttbug
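With the ; terminator, find runs the command once per file. As a related aside that the text above does not cover, find also accepts + as a terminator, which passes many file names to a single invocation. The sketch below counts invocations in a scratch directory with invented file names.

```shell
demo=$(mktemp -d)
touch "$demo/one" "$demo/two" "$demo/three"

# ';' terminator: the inner sh runs once for every file found.
per_file=$(find "$demo" -type f -exec sh -c 'echo run' sh '{}' ';' | wc -l | tr -d ' ')

# '+' terminator: the file names are batched into a single invocation.
batched=$(find "$demo" -type f -exec sh -c 'echo run' sh '{}' + | wc -l | tr -d ' ')

printf 'invocations with ";": %s\n' "$per_file"
printf 'invocations with "+": %s\n' "$batched"
rm -rf "$demo"
```

For commands such as ls that accept many arguments, the batched form avoids starting one process per file.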

As you can see, find is a very powerful command. It has matured over the years of UNIX and Linux development, and it has many other useful options. You can read about them in the man page.


We have already covered which, whereis, and find. As you may have noticed, find can take a while to run, since it has to read every directory it searches. The locate command can speed things up by relying on an external database generated by updatedb (we will cover updatedb below).

The locate command matches against any part of a path, not just the file name. Example:

$ locate bin/ls

Using updatedb

Many Linux systems have a “cron job” for periodic database updates. If the call to locate returned the error described below, you need to run updatedb as root to generate the search base:

$ locate bin/ls
locate: /var/spool/locate/locatedb: No such file or directory
$ su -
# updatedb

The updatedb program may take some time. If you have a noisy hard drive, you will hear how it rustles when indexing the file system. :)


In many Linux distributions, the locate utility has been replaced by slocate. There is usually also a symbolic link named locate, so you do not need to remember which one your system has. slocate stands for "secure locate". It stores permission information in the search database, so that ordinary users cannot see directories they would not otherwise be able to see. slocate is used in the same way as locate, but its output may differ depending on which user runs it.

Translated by Dmitry Minsky (Dmitry.Minsky@gmail.com)

Continued ...

About the authors

Daniel Robbins

Daniel Robbins is the founder of the Gentoo community and the creator of the Gentoo Linux operating system. Daniel lives in New Mexico with his wife, Mary, and two energetic daughters. He is also the founder and head of Funtoo and has written many technical articles for IBM developerWorks, Intel Developer Services, and the C/C++ Users Journal.

Chris Houser

Chris Houser has been a UNIX proponent since 1994, when he joined the administration team at Taylor University (Indiana, USA), where he earned a bachelor's degree in computer science and mathematics. Since then he has worked in many areas, including web applications, video editing, UNIX drivers, and cryptography. He currently works at Sentry Data Systems. Chris has also contributed to many free software projects, such as Gentoo Linux and Clojure, and co-authored The Joy of Clojure.

Aron Griffis

Aron Griffis lives in Boston, where he has spent the last decade working with Hewlett-Packard on projects such as UNIX network drivers for Tru64, Linux security certification, Xen and KVM virtualization, and most recently the HP ePrint platform. In his spare time, Aron prefers to ponder programming problems while riding his bike, juggling bats, or cheering on the Boston Red Sox baseball team.

Source: https://habr.com/ru/post/105495/
