\ Linux Find Command - Use, examples, explanations and manpage

Linux Find Command

blank

Free Office Software


  • NAME

    find − search for files in a directory hierarchy

    SYNOPSIS

    find [−H] [−L] [−P] [−D debugopts] [−Olevel] [starting-point...] [expression]

    DESCRIPTION

    This manual page documents the GNU version of find. GNU find searches the directory tree rooted at each given starting-point by evaluating the given expression from left to right, according to the rules of precedence (see section OPERATORS), until the outcome is known (the left hand side is false for and operations, true for or), at which point find moves on to the next file name. If no starting-point is specified, ‘.’ is assumed.

    If you are using find in an environment where security is important (for example if you are using it to search directories that are writable by other users), you should read the "Security Considerations" chapter of the findutils documentation, which is called Finding Files and comes with findutils. That document also includes a lot more detail and discussion than this manual page, so you may find it a more useful source of information.

    OPTIONS

    The −H, −L and −P options control the treatment of symbolic links. Command-line arguments following these are taken to be names of files or directories to be examined, up to the first argument that begins with ‘−’, or the argument ‘(’ or ‘!’. That argument and any following arguments are taken to be the expression describing what is to be searched for. If no paths are given, the current directory is used. If no expression is given, the expression −print is used (but you should probably consider using −print0 instead, anyway).

    This manual page talks about ‘options’ within the expression list. These options control the behaviour of find but are specified immediately after the last path name. The five ‘real’ options −H, −L, −P, −D and −O must appear before the first path name, if at all. A double dash −− can also be used to signal that any remaining arguments are not options (though ensuring that all start points begin with either ‘./’ or ‘/’ is generally safer if you use wildcards in the list of start points).

    −P

    Never follow symbolic links. This is the default behaviour. When find examines or prints information a file, and the file is a symbolic link, the information used shall be taken from the properties of the symbolic link itself.

    −L

    Follow symbolic links. When find examines or prints information about files, the information used shall be taken from the properties of the file to which the link points, not from the link itself (unless it is a broken symbolic link or find is unable to examine the file to which the link points). Use of this option implies −noleaf. If you later use the −P option, −noleaf will still be in effect. If −L is in effect and find discovers a symbolic link to a subdirectory during its search, the subdirectory pointed to by the symbolic link will be searched.

    When the −L option is in effect, the −type predicate will always match against the type of the file that a symbolic link points to rather than the link itself (unless the symbolic link is broken). Using −L causes the −lname and −ilname predicates always to return false.

    −H

    Do not follow symbolic links, except while processing the command line arguments. When find examines or prints information about files, the information used shall be taken from the properties of the symbolic link itself. The only exception to this behaviour is when a file specified on the command line is a symbolic link, and the link can be resolved. For that situation, the information used is taken from whatever the link points to (that is, the link is followed). The information about the link itself is used as a fallback if the file pointed to by the symbolic link cannot be examined. If −H is in effect and one of the paths specified on the command line is a symbolic link to a directory, the contents of that directory will be examined (though of course −maxdepth 0 would prevent this).

    If more than one of −H, −L and −P is specified, each overrides the others; the last one appearing on the command line takes effect. Since it is the default, the −P option should be considered to be in effect unless either −H or −L is specified.

    GNU find frequently stats files during the processing of the command line itself, before any searching has begun. These options also affect how those arguments are processed. Specifically, there are a number of tests that compare files listed on the command line against a file we are currently considering. In each case, the file specified on the command line will have been examined and some of its properties will have been saved. If the named file is in fact a symbolic link, and the −P option is in effect (or if neither −H nor −L were specified), the information used for the comparison will be taken from the properties of the symbolic link. Otherwise, it will be taken from the properties of the file the link points to. If find cannot follow the link (for example because it has insufficient privileges or the link points to a nonexistent file) the properties of the link itself will be used.

    When the −H or −L options are in effect, any symbolic links listed as the argument of −newer will be dereferenced, and the timestamp will be taken from the file to which the symbolic link points. The same consideration applies to −newerXY, −anewer and −cnewer.

    The −follow option has a similar effect to −L, though it takes effect at the point where it appears (that is, if −L is not used but −follow is, any symbolic links appearing after −follow on the command line will be dereferenced, and those before it will not).
    −D debugoptions

    Print diagnostic information; this can be helpful to diagnose problems with why find is not doing what you want. The list of debug options should be comma separated. Compatibility of the debug options is not guaranteed between releases of findutils. For a complete list of valid debug options, see the output of find −D help. Valid debug options include

    help

    Explain the debugging options

    tree

    Show the expression tree in its original and optimised form.

    stat

    Print messages as files are examined with the stat and lstat system calls. The find program tries to minimise such calls.

    opt

    Prints diagnostic information relating to the optimisation of the expression tree; see the −O option.

    rates

    Prints a summary indicating how often each predicate succeeded or failed.

    −Olevel

    Enables query optimisation. The find program reorders tests to speed up execution while preserving the overall effect; that is, predicates with side effects are not reordered relative to each other. The optimisations performed at each optimisation level are as follows.

    0

    Equivalent to optimisation level 1.

    1

    This is the default optimisation level and corresponds to the traditional behaviour. Expressions are reordered so that tests based only on the names of files (for example −name and −regex) are performed first.

    2

    Any −type or −xtype tests are performed after any tests based only on the names of files, but before any tests that require information from the inode. On many modern versions of Unix, file types are returned by readdir() and so these predicates are faster to evaluate than predicates which need to stat the file first. If you use the −fstype FOO predicate and specify a filsystem type FOO which is not known (that is, present in ‘/etc/mtab’) at the time find starts, that predicate is equivalent to −false.

    3

    At this optimisation level, the full cost-based query optimiser is enabled. The order of tests is modified so that cheap (i.e. fast) tests are performed first and more expensive ones are performed later, if necessary. Within each cost band, predicates are evaluated earlier or later according to whether they are likely to succeed or not. For −o, predicates which are likely to succeed are evaluated earlier, and for −a, predicates which are likely to fail are evaluated earlier.

    The cost-based optimiser has a fixed idea of how likely any given test is to succeed. In some cases the probability takes account of the specific nature of the test (for example, −type f is assumed to be more likely to succeed than −type c). The cost-based optimiser is currently being evaluated. If it does not actually improve the performance of find, it will be removed again. Conversely, optimisations that prove to be reliable, robust and effective may be enabled at lower optimisation levels over time. However, the default behaviour (i.e. optimisation level 1) will not be changed in the 4.3.x release series. The findutils test suite runs all the tests on find at each optimisation level and ensures that the result is the same.

    EXPRESSIONS

    The expression is made up of options (which affect overall operation rather than the processing of a specific file, and always return true), tests (which return a true or false value), and actions (which have side effects and return a true or false value), all separated by operators. −and is assumed where the operator is omitted.

    If the expression contains no actions other than −prune, −print is performed on all files for which the expression is true.

    OPTIONS
    All options always return true. Except for −daystart, −follow and −regextype, the options affect all tests, including tests specified before the option. This is because the options are processed when the command line is parsed, while the tests don’t do anything until files are examined. The −daystart, −follow and −regextype options are different in this respect, and have an effect only on tests which appear later in the command line. Therefore, for clarity, it is best to place them at the beginning of the expression. A warning is issued if you don’t do this.

    −d

    A synonym for −depth, for compatibility with FreeBSD, NetBSD, MacOS X and OpenBSD.

    −daystart

    Measure times (for −amin, −atime, −cmin, −ctime, −mmin, and −mtime) from the beginning of today rather than from 24 hours ago. This option only affects tests which appear later on the command line.

    −depth

    Process each directory’s contents before the directory itself. The −delete action also implies −depth.

    −follow

    Deprecated; use the −L option instead. Dereference symbolic links. Implies −noleaf. The −follow option affects only those tests which appear after it on the command line. Unless the −H or −L option has been specified, the position of the −follow option changes the behaviour of the −newer predicate; any files listed as the argument of −newer will be dereferenced if they are symbolic links. The same consideration applies to −newerXY, −anewer and −cnewer. Similarly, the −type predicate will always match against the type of the file that a symbolic link points to rather than the link itself. Using −follow causes the −lname and −ilname predicates always to return false.

    −help, −−help

    Print a summary of the command-line usage of find and exit.

    −ignore_readdir_race

    Normally, find will emit an error message when it fails to stat a file. If you give this option and a file is deleted between the time find reads the name of the file from the directory and the time it tries to stat the file, no error message will be issued. This also applies to files or directories whose names are given on the command line. This option takes effect at the time the command line is read, which means that you cannot search one part of the filesystem with this option on and part of it with this option off (if you need to do that, you will need to issue two find commands instead, one with the option and one without it).

    −maxdepth levels

    Descend at most levels (a non-negative integer) levels of directories below the starting-points. −maxdepth 0
    means only apply the tests and actions to the starting-points themselves.

    −mindepth levels

    Do not apply any tests or actions at levels less than levels (a non-negative integer). −mindepth 1 means process all files except the starting-points.

    −mount

    Don’t descend directories on other filesystems. An alternate name for −xdev, for compatibility with some other versions of find.

    −noignore_readdir_race

    Turns off the effect of −ignore_readdir_race.

    −noleaf

    Do not optimize by assuming that directories contain 2 fewer subdirectories than their hard link count. This option is needed when searching filesystems that do not follow the Unix directory-link convention, such as CD-ROM or MS-DOS filesystems or AFS volume mount points. Each directory on a normal Unix filesystem has at least 2 hard links: its name and its ‘.’ entry. Additionally, its subdirectories (if any) each have a ‘..’ entry linked to that directory. When find is examining a directory, after it has statted 2 fewer subdirectories than the directory’s link count, it knows that the rest of the entries in the directory are non-directories (‘leaf’ files in the directory tree). If only the files’ names need to be examined, there is no need to stat them; this gives a significant increase in search speed.

    −regextype type

    Changes the regular expression syntax understood by −regex and −iregex tests which occur later on the command line. Currently-implemented types are emacs (this is the default), posix-awk, posix-basic, posix-egrep and posix-extended.

    −version, −−version

    Print the find version number and exit.

    −warn, −nowarn

    Turn warning messages on or off. These warnings apply only to the command line usage, not to any conditions that find might encounter when it searches directories. The default behaviour corresponds to −warn if standard input is a tty, and to −nowarn otherwise.

    −xautofs

    Don’t descend directories on autofs filesystems.

    −xdev

    Don’t descend directories on other filesystems.

    TESTS
    Some tests, for example −newerXY and -samefile, allow comparison between the file currently being examined and some reference file specified on the command line. When these tests are used, the interpretation of the reference file is determined by the options −H, −L and −P and any previous −follow, but the reference file is only examined once, at the time the command line is parsed. If the reference file cannot be examined (for example, the stat(2) system call fails for it), an error message is issued, and find exits with a nonzero status.

    Numeric arguments can be specified as

    +n

    for greater than n,

    −n

    for less than n,

    n

    for exactly n.

    −amin n

    File was last accessed n minutes ago.

    −anewer file

    File was last accessed more recently than file was modified. If file is a symbolic link and the −H option or the −L option is in effect, the access time of the file it points to is always used.

    −atime n

    File was last accessed n*24 hours ago. When find figures out how many 24-hour periods ago the file was last accessed, any fractional part is ignored, so to match −atime +1, a file has to have been accessed at least two days ago.

    −cmin n

    File’s status was last changed n minutes ago.

    −cnewer file

    File’s status was last changed more recently than file was modified. If file is a symbolic link and the −H option or the −L option is in effect, the status-change time of the file it points to is always used.

    −ctime n

    File’s status was last changed n*24 hours ago. See the comments for −atime to understand how rounding affects the interpretation of file status change times.

    −empty

    File is empty and is either a regular file or a directory.

    −executable

    Matches files which are executable and directories which are searchable (in a file name resolution sense). This takes into account access control lists and other permissions artefacts which the −perm test ignores. This test makes use of the access(2) system call, and so can be fooled by NFS servers which do UID mapping (or root-squashing), since many systems implement access(2) in the client’s kernel and so cannot make use of the UID mapping information held on the server. Because this test is based only on the result of the access(2) system call, there is no guarantee that a file for which this test succeeds can actually be executed.

    −false

    Always false.

    −fstype type

    File is on a filesystem of type type. The valid filesystem types vary among different versions of Unix; an incomplete list of filesystem types that are accepted on some version of Unix or another is: ufs, 4.2, 4.3, nfs, tmp, mfs, S51K, S52K. You can use −printf with the %F directive to see the types of your filesystems.

    −gid n

    File’s numeric group ID is n.

    −group gname

    File belongs to group gname (numeric group ID allowed).

    −ilname pattern

    Like −lname, but the match is case insensitive. If the −L option or the −follow option is in effect, this test returns false unless the symbolic link is broken.

    −iname pattern

    Like −name, but the match is case insensitive. For example, the patterns ‘fo*’ and ‘F??’ match the file names ‘Foo’, ‘FOO’, ‘foo’, ‘fOo’, etc. The pattern ‘*foo*‘ will also match a file called ’.foobar’.

    −inum n

    File has inode number n. It is normally easier to use the −samefile test instead.

    −ipath pattern

    Like −path. but the match is case insensitive.

    −iregex pattern

    Like −regex, but the match is case insensitive.

    −iwholename pattern

    See −ipath. This alternative is less portable than −ipath.

    −links n

    File has n links.

    −lname pattern

    File is a symbolic link whose contents match shell pattern pattern. The metacharacters do not treat ‘/’ or ‘.’ specially. If the −L option or the −follow option is in effect, this test returns false unless the symbolic link is broken.

    −mmin n

    File’s data was last modified n minutes ago.

    −mtime n

    File’s data was last modified n*24 hours ago. See the comments for −atime to understand how rounding affects the interpretation of file modification times.

    −name pattern

    Base of file name (the path with the leading directories removed) matches shell pattern pattern. Because the leading directories are removed, the file names considered for a match with −name will never include a slash, so ‘−name a/b’ will never match anything (you probably need to use −path instead). The metacharacters (‘*’, ‘?’, and ‘[]’) match a ‘.’ at the start of the base name (this is a change in findutils-4.2.2; see section STANDARDS CONFORMANCE below). To ignore a directory and the files under it, use −prune; see an example in the description of −path. Braces are not recognised as being special, despite the fact that some shells including Bash imbue braces with a special meaning in shell patterns. The filename matching is performed with the use of the fnmatch(3) library function. Don’t forget to enclose the pattern in quotes in order to protect it from expansion by the shell.

    −newer file

    File was modified more recently than file. If file is a symbolic link and the −H option or the −L option is in effect, the modification time of the file it points to is always used.

    −newerXY reference

    Compares the timestamp of the current file with reference. The reference argument is normally the name of a file (and one of its timestamps is used for the comparison) but it may also be a string describing an absolute time. X and Y are placeholders for other letters, and these letters select which time belonging to how reference is used for the comparison.

    Some combinations are invalid; for example, it is invalid for X to be t. Some combinations are not implemented on all systems; for example B is not supported on all systems. If an invalid or unsupported combination of XY is specified, a fatal error results. Time specifications are interpreted as for the argument to the −d option of GNU date. If you try to use the birth time of a reference file, and the birth time cannot be determined, a fatal error message results. If you specify a test which refers to the birth time of files being examined, this test will fail for any files where the birth time is unknown.

    −nogroup

    No group corresponds to file’s numeric group ID.

    −nouser

    No user corresponds to file’s numeric user ID.

    −path pattern

    File name matches shell pattern pattern. The metacharacters do not treat ‘/’ or ‘.’ specially; so, for example,

    find . −path "./sr*sc"

    will print an entry for a directory called ‘./src/misc’ (if one exists). To ignore a whole directory tree, use −prune rather than checking every file in the tree. For example, to skip the directory ‘src/emacs’ and all files and directories under it, and print the names of the other files found, do something like this:

    find . −path ./src/emacs −prune −o −print

    Note that the pattern match test applies to the whole file name, starting from one of the start points named on the command line. It would only make sense to use an absolute path name here if the relevant start point is also an absolute path. This means that this command will never match anything:

    find bar −path /foo/bar/myfile −print

    Find compares the −path argument with the concatenation of a directory name and the base name of the file it’s examining. Since the concatenation will never end with a slash, −path arguments ending in a slash will match nothing (except perhaps a start point specified on the command line). The predicate −path is also supported by HP-UX find and will be in a forthcoming version of the POSIX standard.

    −perm mode

    File’s permission bits are exactly mode (octal or symbolic). Since an exact match is required, if you want to use this form for symbolic modes, you may have to specify a rather complex mode string. For example ‘−perm g=w’ will only match files which have mode 0020 (that is, ones for which group write permission is the only permission set). It is more likely that you will want to use the ‘/’ or ‘-’ forms, for example ‘−perm −g=w’, which matches any file with group write permission. See the EXAMPLES section for some illustrative examples.

    −perm −mode

    All of the permission bits mode are set for the file. Symbolic modes are accepted in this form, and this is usually the way in which would want to use them. You must specify ‘u’, ‘g’ or ‘o’ if you use a symbolic mode. See the EXAMPLES section for some illustrative examples.

    −perm /mode

    Any of the permission bits mode are set for the file. Symbolic modes are accepted in this form. You must specify ‘u’, ‘g’ or ‘o’ if you use a symbolic mode. See the EXAMPLES section for some illustrative examples. If no permission bits in mode are set, this test matches any file (the idea here is to be consistent with the behaviour of −perm −000).

    −perm +mode

    This is no longer supported (and has been deprecated since 2005). Use −perm /mode instead.

    −readable

    Matches files which are readable. This takes into account access control lists and other permissions artefacts which the −perm test ignores. This test makes use of the access(2) system call, and so can be fooled by NFS servers which do UID mapping (or root-squashing), since many systems implement access(2) in the client’s kernel and so cannot make use of the UID mapping information held on the server.

    −regex pattern

    File name matches regular expression pattern. This is a match on the whole path, not a search. For example, to match a file named ‘./fubar3’, you can use the regular expression ‘.*bar.’ or ‘.*b.*3’, but not ‘f.*r3’. The regular expressions understood by find are by default Emacs Regular Expressions, but this can be changed with the −regextype option.

    −samefile name

    File refers to the same inode as name. When −L is in effect, this can include symbolic links.

    −size n[cwbkMG]

    File uses n units of space, rounding up. The following suffixes can be used:

    ‘b’

    for 512-byte blocks (this is the default if no suffix is used)

    ‘c’

    for bytes

    ‘w’

    for two-byte words

    ‘k’

    for Kilobytes (units of 1024 bytes)

    ‘M’

    for Megabytes (units of 1048576 bytes)

    ‘G’

    for Gigabytes (units of 1073741824 bytes)

    The size does not count indirect blocks, but it does count blocks in sparse files that are not actually allocated. Bear in mind that the ‘%k’ and ‘%b’ format specifiers of −printf handle sparse files differently. The ‘b’ suffix always denotes 512-byte blocks and never 1 Kilobyte blocks, which is different to the behaviour of −ls. The + and - prefixes signifiy greater than and less than, as usual, but bear in mind that the size is rounded up to the next unit (so a 1-byte file is not matched by -size -1M).

    −true

    Always true.

    −type c

    File is of type c:

    b

    block (buffered) special

    c

    character (unbuffered) special

    d

    directory

    p

    named pipe (FIFO)

    f

    regular file

    l

    symbolic link; this is never true if the −L option or the −follow option is in effect, unless the symbolic link is broken. If you want to search for symbolic links when −L is in effect, use −xtype.

    s

    socket

    D

    door (Solaris)

    −uid n

    File’s numeric user ID is n.

    −used n

    File was last accessed n days after its status was last changed.

    −user uname

    File is owned by user uname (numeric user ID allowed).

    −wholename pattern

    See −path. This alternative is less portable than −path.

    −writable

    Matches files which are writable. This takes into account access control lists and other permissions artefacts which the −perm test ignores. This test makes use of the access(2) system call, and so can be fooled by NFS servers which do UID mapping (or root-squashing), since many systems implement access(2) in the client’s kernel and so cannot make use of the UID mapping information held on the server.

    −xtype c

    The same as −type unless the file is a symbolic link. For symbolic links: if the −H or −P option was specified, true if the file is a link to a file of type c; if the −L option has been given, true if c is ‘l’. In other words, for symbolic links, −xtype checks the type of the file that −type does not check.

    −context pattern

    (SELinux only) Security context of the file matches glob pattern.

    ACTIONS
    −delete

    Delete files; true if removal succeeded. If the removal failed, an error message is issued. If −delete fails, find’s exit status will be nonzero (when it eventually exits). Use of −delete automatically turns on the ‘−depth’ option.

    Warnings: Don’t forget that the find command line is evaluated as an expression, so putting −delete first will make find try to delete everything below the starting points you specified. When testing a find command line that you later intend to use with −delete, you should explicitly specify −depth in order to avoid later surprises. Because −delete implies −depth, you cannot usefully use −prune and −delete together.

    −exec command ;

    Execute command; true if 0 status is returned. All following arguments to find are taken to be arguments to the command until an argument consisting of ‘;’ is encountered. The string ‘{}’ is replaced by the current file name being processed everywhere it occurs in the arguments to the command, not just in arguments where it is alone, as in some versions of find. Both of these constructions might need to be escaped (with a ‘\’) or quoted to protect them from expansion by the shell. See the EXAMPLES section for examples of the use of the −exec option. The specified command is run once for each matched file. The command is executed in the starting directory. There are unavoidable security problems surrounding use of the −exec action; you should use the −execdir option instead.

    −exec command {} +

    This variant of the −exec action runs the specified command on the selected files, but the command line is built by appending each selected file name at the end; the total number of invocations of the command will be much less than the number of matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of ‘{}’ is allowed within the command. The command is executed in the starting directory. If find encounters an error, this can sometimes cause an immediate exit, so some pending commands may not be run at all. This variant of −exec always returns true.

    −execdir command ;
    −execdir command {} +

    Like −exec, but the specified command is run from the subdirectory containing the matched file, which is not normally the directory in which you started find. This a much more secure method for invoking commands, as it avoids race conditions during resolution of the paths to the matched files. As with the −exec action, the ‘+’ form of −execdir will build a command line to process more than one matched file, but any given invocation of command will only list files that exist in the same subdirectory. If you use this option, you must ensure that your $PATH environment variable does not reference ‘.’; otherwise, an attacker can run any commands they like by leaving an appropriately-named file in a directory in which you will run −execdir. The same applies to having entries in $PATH which are empty or which are not absolute directory names. If find encounters an error, this can sometimes cause an immediate exit, so some pending commands may not be run at all. The result of the action depends on whether the + or the ; variant is being used; −execdir command {} + always returns true, while −execdir command {} ; returns true only if command returns 0.

    −fls file

    True; like −ls but write to file like −fprint. The output file is always created, even if the predicate is never matched. See the UNUSUAL FILENAMES section for information about how unusual characters in filenames are handled.

    −fprint file

    True; print the full file name into file file. If file does not exist when find is run, it is created; if it does exist, it is truncated. The file names ‘/dev/stdout’ and ‘/dev/stderr’ are handled specially; they refer to the standard output and standard error output, respectively. The output file is always created, even if the predicate is never matched. See the UNUSUAL FILENAMES section for information about how unusual characters in filenames are handled.

    −fprint0 file

    True; like −print0 but write to file like −fprint. The output file is always created, even if the predicate is never matched. See the UNUSUAL FILENAMES section for information about how unusual characters in filenames are handled.

    −fprintf file format

    True; like −printf but write to file like −fprint. The output file is always created, even if the predicate is never matched. See the UNUSUAL FILENAMES section for information about how unusual characters in filenames are handled.

    −ls

    True; list current file in ls −dils format on standard output. The block counts are of 1K blocks, unless the environment variable POSIXLY_CORRECT is set, in which case 512-byte blocks are used. See the UNUSUAL FILENAMES section for information about how unusual characters in filenames are handled.

    −ok command ;

    Like −exec but ask the user first. If the user agrees, run the command. Otherwise just return false. If the command is run, its standard input is redirected from /dev/null.

    The response to the prompt is matched against a pair of regular expressions to determine if it is an affirmative or negative response. This regular expression is obtained from the system if the ‘POSIXLY_CORRECT’ environment variable is set, or otherwise from find’s message translations. If the system has no suitable definition, find’s own definition will be used. In either case, the interpretation of the regular expression itself will be affected by the environment variables ’LC_CTYPE’ (character classes) and ’LC_COLLATE’ (character ranges and equivalence classes).

    −okdir command ;

    Like −execdir but ask the user first in the same way as for −ok. If the user does not agree, just return false. If the command is run, its standard input is redirected from /dev/null.

    −print

    True; print the full file name on the standard output, followed by a newline. If you are piping the output of find into another program and there is the faintest possibility that the files which you are searching for might contain a newline, then you should seriously consider using the −print0 option instead of −print. See the UNUSUAL FILENAMES section for information about how unusual characters in filenames are handled.

    −print0

    True; print the full file name on the standard output, followed by a null character (instead of the newline character that −print uses). This allows file names that contain newlines or other types of white space to be correctly interpreted by programs that process the find output. This option corresponds to the −0 option of xargs.

    −printf format

    True; print format on the standard output, interpreting ‘\’ escapes and ‘%’ directives. Field widths and precisions can be specified as with the ‘printf’ C function. Please note that many of the fields are printed as %s rather than %d, and this may mean that flags don’t work as you might expect. This also means that the ‘−’ flag does work (it forces fields to be left-aligned). Unlike −print, −printf does not add a newline at the end of the string. The escapes and directives are:

    \a

    Alarm bell.

    \b

    Backspace.

    \c

    Stop printing from this format immediately and flush the output.

    \f

    Form feed.

    \n

    Newline.

    \r

    Carriage return.

    \t

    Horizontal tab.

    \v

    Vertical tab.

    \0

    ASCII NUL.

    \\

    A literal backslash (‘\’).

    \NNN

    The character whose ASCII code is NNN (octal).

    A ‘\’ character followed by any other character is treated as an ordinary character, so they both are printed.

    %%

    A literal percent sign.

    %a

    File’s last access time in the format returned by the C ‘ctime’ function.

    %Ak

    File’s last access time in the format specified by k, which is either ‘@’ or a directive for the C ‘strftime’ function. The possible values for k are listed below; some of them might not be available on all systems, due to differences in ‘strftime’ between systems.

    @

    seconds since Jan. 1, 1970, 00:00 GMT, with fractional part.

    Time fields:

    H

    hour (00..23)

    I

    hour (01..12)

    k

    hour ( 0..23)

    l

    hour ( 1..12)

    M

    minute (00..59)

    p

    locale’s AM or PM

    r

    time, 12-hour (hh:mm:ss [AP]M)

    S

    Second (00.00 .. 61.00). There is a fractional part.

    T

    time, 24-hour (hh:mm:ss)

    +

    Date and time, separated by ‘+’, for example ‘2004−04−28+22:22:05.0’. This is a GNU extension. The time is given in the current timezone (which may be affected by setting the TZ environment variable). The seconds field includes a fractional part.

    X

    locale’s time representation (H:M:S)

    Z

    time zone (e.g., EDT), or nothing if no time zone is determinable

    Date fields:

    a

    locale’s abbreviated weekday name (Sun..Sat)

    A

    locale’s full weekday name, variable length (Sunday..Saturday)

    b

    locale’s abbreviated month name (Jan..Dec)

    B

    locale’s full month name, variable length (January..December)

    c

    locale’s date and time (Sat Nov 04 12:02:33 EST 1989). The format is the same as for ctime(3) and so to preserve compatibility with that format, there is no fractional part in the seconds field.

    d

    day of month (01..31)

    D

    date (mm/dd/yy)

    h

    same as b

    j

    day of year (001..366)

    m

    month (01..12)

    U

    week number of year with Sunday as first day of week (00..53)

    w

    day of week (0..6)

    W

    week number of year with Monday as first day of week (00..53)

    x

    locale’s date representation (mm/dd/yy)

    y

    last two digits of year (00..99)

    Y

    year (1970...)

    %b

    The amount of disk space used for this file in 512-byte blocks. Since disk space is allocated in multiples of the filesystem block size this is usually greater than %s/512, but it can also be smaller if the file is a sparse file.

    %c

    File’s last status change time in the format returned by the C ‘ctime’ function.

    %Ck

    File’s last status change time in the format specified by k, which is the same as for %A.

    %d

    File’s depth in the directory tree; 0 means the file is a starting-point.

    %D

    The device number on which the file exists (the st_dev field of struct stat), in decimal.

    %f

    File’s name with any leading directories removed (only the last element).

    %F

    Type of the filesystem the file is on; this value can be used for −fstype.

    %g

    File’s group name, or numeric group ID if the group has no name.

    %G

    File’s numeric group ID.

    %h

    Leading directories of file’s name (all but the last element). If the file name contains no slashes (since it is in the current directory) the %h specifier expands to ".".

    %H

    Starting-point under which file was found.

    %i

    File’s inode number (in decimal).

    %k

    The amount of disk space used for this file in 1K blocks. Since disk space is allocated in multiples of the filesystem block size this is usually greater than %s/1024, but it can also be smaller if the file is a sparse file.

    %l

    Object of symbolic link (empty string if file is not a symbolic link).

    %m

    File’s permission bits (in octal). This option uses the ‘traditional’ numbers which most Unix implementations use, but if your particular implementation uses an unusual ordering of octal permissions bits, you will see a difference between the actual value of the file’s mode and the output of %m. Normally you will want to have a leading zero on this number, and to do this, you should use the # flag (as in, for example, ‘%#m’).

    %M

    File’s permissions (in symbolic form, as for ls). This directive is supported in findutils 4.2.5 and later.

    %n

    Number of hard links to file.

    %p

    File’s name.

    %P

    File’s name with the name of the starting-point under which it was found removed.

    %s

    File’s size in bytes.

    %S

    File’s sparseness. This is calculated as (BLOCKSIZE*st_blocks / st_size). The exact value you will get for an ordinary file of a certain length is system-dependent. However, normally sparse files will have values less than 1.0, and files which use indirect blocks may have a value which is greater than 1.0. The value used for BLOCKSIZE is system-dependent, but is usually 512 bytes. If the file size is zero, the value printed is undefined. On systems which lack support for st_blocks, a file’s sparseness is assumed to be 1.0.

    %t

    File’s last modification time in the format returned by the C ‘ctime’ function.

    %Tk

    File’s last modification time in the format specified by k, which is the same as for %A.

    %u

    File’s user name, or numeric user ID if the user has no name.

    %U

    File’s numeric user ID.

    %y

    File’s type (like in ls −l), U=unknown type (shouldn’t happen)

    %Y

    File’s type (like %y), plus follow symlinks: L=loop, N=nonexistent

    %Z

    (SELinux only) file’s security context.

    %{ %[ %(

    Reserved for future use.

    A ‘%’ character followed by any other character is discarded, but the other character is printed (don’t rely on this, as further format characters may be introduced). A ‘%’ at the end of the format argument causes undefined behaviour since there is no following character. In some locales, it may hide your door keys, while in others it may remove the final page from the novel you are reading.

    The %m and %d directives support the # , 0 and + flags, but the other directives do not, even if they print numbers. Numeric directives that do not support these flags include G, U, b, D, k and n. The ‘−’ format flag is supported and changes the alignment of a field from right-justified (which is the default) to left-justified.

    See the UNUSUAL FILENAMES section for information about how unusual characters in filenames are handled.

    −prune

    True; if the file is a directory, do not descend into it. If −depth is given, false; no effect. Because −delete implies −depth, you cannot usefully use −prune and −delete together.

    −quit

    Exit immediately. No child processes will be left running, but no more paths specified on the command line will be processed. For example, find /tmp/foo /tmp/bar −print −quit will print only /tmp/foo. Any command lines which have been built up with −execdir ... {} + will be invoked before find exits. The exit status may or may not be zero, depending on whether an error has already occurred.

    UNUSUAL FILENAMES
    Many of the actions of find result in the printing of data which is under the control of other users. This includes file names, sizes, modification times and so forth. File names are a potential problem since they can contain any character except ‘\0’ and ‘/’. Unusual characters in file names can do unexpected and often undesirable things to your terminal (for example, changing the settings of your function keys on some terminals). Unusual characters are handled differently by various actions, as described below.
    −print0, −fprint0

    Always print the exact filename, unchanged, even if the output is going to a terminal.

    −ls, −fls

    Unusual characters are always escaped. White space, backslash, and double quote characters are printed using C-style escaping (for example ‘\f’, ‘\"’). Other unusual characters are printed using an octal escape. Other printable characters (for −ls and −fls these are the characters between octal 041 and 0176) are printed as-is.

    −printf, −fprintf

    If the output is not going to a terminal, it is printed as-is. Otherwise, the result depends on which directive is in use. The directives %D, %F, %g, %G, %H, %Y, and %y expand to values which are not under control of files’ owners, and so are printed as-is. The directives %a, %b, %c, %d, %i, %k, %m, %M, %n, %s, %t, %u and %U have values which are under the control of files’ owners but which cannot be used to send arbitrary data to the terminal, and so these are printed as-is. The directives %f, %h, %l, %p and %P are quoted. This quoting is performed in the same way as for GNU ls. This is not the same quoting mechanism as the one used for −ls and −fls. If you are able to decide what format to use for the output of find then it is normally better to use ‘\0’ as a terminator than to use newline, as file names can contain white space and newline characters. The setting of the ‘LC_CTYPE’ environment variable is used to determine which characters need to be quoted.

    −print, −fprint

    Quoting is handled in the same way as for −printf and −fprintf. If you are using find in a script or in a situation where the matched files might have arbitrary names, you should consider using −print0 instead of −print.

    The −ok and −okdir actions print the current filename as-is. This may change in a future release.

    OPERATORS
    Listed in order of decreasing precedence:
    ( expr )

    Force precedence. Since parentheses are special to the shell, you will normally need to quote them. Many of the examples in this manual page use backslashes for this purpose: ‘\(...\)’ instead of ‘(...)’.

    ! expr

    True if expr is false. This character will also usually need protection from interpretation by the shell.

    −not expr

    Same as ! expr, but not POSIX compliant.

    expr1 expr2

    Two expressions in a row are taken to be joined with an implied "and"; expr2 is not evaluated if expr1 is false.

    expr1 −a expr2

    Same as expr1 expr2.

    expr1 −and expr2

    Same as expr1 expr2, but not POSIX compliant.

    expr1 −o expr2

    Or; expr2 is not evaluated if expr1 is true.

    expr1 −or expr2

    Same as expr1 −o expr2, but not POSIX compliant.

    expr1 , expr2

    List; both expr1 and expr2 are always evaluated. The value of expr1 is discarded; the value of the list is the value of expr2. The comma operator can be useful for searching for several different types of thing, but traversing the filesystem hierarchy only once. The −fprintf action can be used to list the various matched items into several different output files.

    STANDARDS CONFORMANCE

    For closest compliance to the POSIX standard, you should set the POSIXLY_CORRECT environment variable. The following options are specified in the POSIX standard (IEEE Std 1003.1, 2003 Edition):

    −H

    This option is supported.

    −L

    This option is supported.

    −name

    This option is supported, but POSIX conformance depends on the POSIX conformance of the system’s fnmatch(3) library function. As of findutils-4.2.2, shell metacharacters (‘*’, ‘?’ or ‘[]’ for example) will match a leading ‘.’, because IEEE PASC interpretation 126 requires this. This is a change from previous versions of findutils.

    −type

    Supported. POSIX specifies ‘b’, ‘c’, ‘d’, ‘l’, ‘p’, ‘f’ and ‘s’. GNU find also supports ‘D’, representing a Door, where the OS provides these.

    −ok

    Supported. Interpretation of the response is according to the "yes" and "no" patterns selected by setting the ‘LC_MESSAGES’ environment variable. When the ‘POSIXLY_CORRECT’ environment variable is set, these patterns are taken system’s definition of a positive (yes) or negative (no) response. See the system’s documentation for nl_langinfo(3), in particular YESEXPR and NOEXPR. When ‘POSIXLY_CORRECT’ is not set, the patterns are instead taken from find’s own message catalogue.

    −newer

    Supported. If the file specified is a symbolic link, it is always dereferenced. This is a change from previous behaviour, which used to take the relevant time from the symbolic link; see the HISTORY section below.

    −perm

    Supported. If the POSIXLY_CORRECT environment variable is not set, some mode arguments (for example +a+x) which are not valid in POSIX are supported for backward-compatibility.

    Other predicates

    The predicates −atime, −ctime, −depth, −group, −links, −mtime, −nogroup, −nouser, −print, −prune, −size, −user and −xdev ‘−atime’, ‘−ctime’, ‘−depth’, ‘−group’, ‘−links’, ‘−mtime’, ‘−nogroup’, ‘−nouser’, ‘−perm’, ‘−print’, ‘−prune’, ‘−size’, ‘−user’ and ‘−xdev’, are all supported.

    The POSIX standard specifies parentheses ‘(’, ‘)’, negation ‘!’ and the ‘and’ and ‘or’ operators ( −a, −o).

    All other options, predicates, expressions and so forth are extensions beyond the POSIX standard. Many of these extensions are not unique to GNU find, however.

    The POSIX standard requires that find detects loops:

    The find utility shall detect infinite loops; that is, entering a previously visited directory that is an ancestor of the last file encountered. When it detects an infinite loop, find shall write a diagnostic message to standard error and shall either recover its position in the hierarchy or terminate.

    GNU find complies with these requirements. The link count of directories which contain entries which are hard links to an ancestor will often be lower than they otherwise should be. This can mean that GNU find will sometimes optimise away the visiting of a subdirectory which is actually a link to an ancestor. Since find does not actually enter such a subdirectory, it is allowed to avoid emitting a diagnostic message. Although this behaviour may be somewhat confusing, it is unlikely that anybody actually depends on this behaviour. If the leaf optimisation has been turned off with −noleaf, the directory entry will always be examined and the diagnostic message will be issued where it is appropriate. Symbolic links cannot be used to create filesystem cycles as such, but if the −L option or the −follow option is in use, a diagnostic message is issued when find encounters a loop of symbolic links. As with loops containing hard links, the leaf optimisation will often mean that find knows that it doesn’t need to call stat() or chdir() on the symbolic link, so this diagnostic is frequently not necessary.

    The −d option is supported for compatibility with various BSD systems, but you should use the POSIX-compliant option −depth instead.

    The POSIXLY_CORRECT environment variable does not affect the behaviour of the −regex or −iregex tests because those tests aren’t specified in the POSIX standard.

    ENVIRONMENT VARIABLES

    LANG

    Provides a default value for the internationalization variables that are unset or null.

    LC_ALL

    If set to a non-empty string value, override the values of all the other internationalization variables.

    LC_COLLATE

    The POSIX standard specifies that this variable affects the pattern matching to be used for the −name option. GNU find uses the fnmatch(3) library function, and so support for ‘LC_COLLATE’ depends on the system library. This variable also affects the interpretation of the response to −ok; while the ‘LC_MESSAGES’ variable selects the actual pattern used to interpret the response to −ok, the interpretation of any bracket expressions in the pattern will be affected by ‘LC_COLLATE’.

    LC_CTYPE

    This variable affects the treatment of character classes used in regular expressions and also with the −name test, if the system’s fnmatch(3) library function supports this. This variable also affects the interpretation of any character classes in the regular expressions used to interpret the response to the prompt issued by −ok. The ‘LC_CTYPE’ environment variable will also affect which characters are considered to be unprintable when filenames are printed; see the section UNUSUAL FILENAMES.

    LC_MESSAGES

    Determines the locale to be used for internationalised messages. If the ‘POSIXLY_CORRECT’ environment variable is set, this also determines the interpretation of the response to the prompt made by the −ok action.

    NLSPATH

    Determines the location of the internationalisation message catalogues.

    PATH

    Affects the directories which are searched to find the executables invoked by −exec, −execdir, −ok and −okdir.

    POSIXLY_CORRECT

    Determines the block size used by −ls and −fls. If POSIXLY_CORRECT is set, blocks are units of 512 bytes. Otherwise they are units of 1024 bytes.

    Setting this variable also turns off warning messages (that is, implies −nowarn) by default, because POSIX requires that apart from the output for −ok, all messages printed on stderr are diagnostics and must result in a non-zero exit status.

    When POSIXLY_CORRECT is not set, −perm +zzz is treated just like −perm /zzz if +zzz is not a valid symbolic mode. When POSIXLY_CORRECT is set, such constructs are treated as an error.

    When POSIXLY_CORRECT is set, the response to the prompt made by the −ok action is interpreted according to the system’s message catalogue, as opposed to according to find’s own message translations.

    TZ

    Affects the time zone used for some of the time-related format directives of −printf and −fprintf.

    BINARIES

    The findutils source distribution contains two different implementations of find. The older implementation descends the file system recursively, while the newer one uses fts(3). Both are normally installed.

    If the option −−without−fts was passed to configure, the recursive implementation is installed as find and the fts-based implementation is installed as ftsfind. Otherwise, the fts-based implementation is installed as find and the recursive implementation is installed as oldfind.

    EXAMPLES

    find /tmp −name core −type f −print | xargs /bin/rm −f

    Find files named core in or below the directory /tmp and delete them. Note that this will work incorrectly if there are any filenames containing newlines, single or double quotes, or spaces.

    find /tmp −name core −type f −print0 | xargs −0 /bin/rm −f

    Find files named core in or below the directory /tmp and delete them, processing filenames in such a way that file or directory names containing single or double quotes, spaces or newlines are correctly handled. The −name test comes before the −type test in order to avoid having to call stat(1) on every file.

    find . −type f −exec file '{}' \;

    Runs ‘file’ on every file in or below the current directory. Notice that the braces are enclosed in single quote marks to protect them from interpretation as shell script punctuation. The semicolon is similarly protected by the use of a backslash, though single quotes could have been used in that case also.

    find / \( −perm −4000 −fprintf /root/suid.txt '%#m %u %p\n' \) , \
    \( −size +100M −fprintf /root/big.txt '%−10s %p\n' \)

    Traverse the filesystem just once, listing setuid files and directories into /root/suid.txt and large files into /root/big.txt.

    find $HOME −mtime 0

    Search for files in your home directory which have been modified in the last twenty-four hours. This command works this way because the time since each file was last modified is divided by 24 hours and any remainder is discarded. That means that to match −mtime 0, a file will have to have a modification in the past which is less than 24 hours ago.

    find /sbin /usr/sbin -executable \! -readable −print

    Search for files which are executable but not readable.

    find . −perm 664

    Search for files which have read and write permission for their owner, and group, but which other users can read but not write to. Files which meet these criteria but have other permissions bits set (for example if someone can execute the file) will not be matched.

    find . −perm −664

    Search for files which have read and write permission for their owner and group, and which other users can read, without regard to the presence of any extra permission bits (for example the executable bit). This will match a file which has mode 0777, for example.

    find . −perm /222

    Search for files which are writable by somebody (their owner, or their group, or anybody else).

    find . −perm /220
    find . −perm /u+w,g+w
    find . −perm /u=w,g=w

    All three of these commands do the same thing, but the first one uses the octal representation of the file mode, and the other two use the symbolic form. These commands all search for files which are writable by either their owner or their group. The files don’t have to be writable by both the owner and group to be matched; either will do.

    find . −perm −220
    find . −perm −g+w,u+w

    Both these commands do the same thing; search for files which are writable by both their owner and their group.

    find . −perm −444 −perm /222 ! −perm /111
    find . −perm −a+r −perm /a+w ! −perm /a+x

    These two commands both search for files that are readable for everybody ( −perm −444 or −perm −a+r), have at least one write bit set ( −perm /222 or −perm /a+w) but are not executable for anybody ( ! −perm /111 and ! −perm /a+x respectively).

    cd /source-dir
    find . −name .snapshot −prune −o \( \! −name *~ −print0 \)|
    cpio −pmd0 /dest-dir

    This command copies the contents of /source-dir to /dest-dir, but omits files and directories named .snapshot (and anything in them). It also omits files or directories whose name ends in ~, but not their contents. The construct −prune −o \( ... −print0 \) is quite common. The idea here is that the expression before −prune matches things which are to be pruned. However, the −prune action itself returns true, so the following −o ensures that the right hand side is evaluated only for those directories which didn’t get pruned (the contents of the pruned directories are not even visited, so their contents are irrelevant). The expression on the right hand side of the −o is in parentheses only for clarity. It emphasises that the −print0 action takes place only for things that didn’t have −prune applied to them. Because the default ‘and’ condition between tests binds more tightly than −o, this is the default anyway, but the parentheses help to show what is going on.

    find repo/ -exec test -d {}/.svn \; -or \
    -exec test -d {}/.git \; -or -exec test -d {}/CVS \; \
    -print -prune

    Given the following directory of projects and their associated SCM administrative directories, perform an efficient search for the projects’ roots:

    repo/project1/CVS
    repo/gnu/project2/.svn
    repo/gnu/project3/.svn
    repo/gnu/project3/src/.svn
    repo/project4/.git

    In this example, −prune prevents unnecessary descent into directories that have already been discovered (for example we do not search project3/src because we already found project3/.svn), but ensures sibling directories (project2 and project3) are found.

    EXIT STATUS

    find exits with status 0 if all files are processed successfully, greater than 0 if errors occur. This is deliberately a very broad description, but if the return value is non-zero, you should not rely on the correctness of the results of find.

    When some error occurs, find may stop immeidately, without completing all the actions specified. For example, some starting points may not have been examined or some pending program invocations for −exec ... {} + or −execdir ... {} + may not have been performed.

    SEE ALSO

    locate(1), locatedb(5), updatedb(1), xargs(1), chmod(1), fnmatch(3), regex(7), stat(2), lstat(2), ls(1), printf(3), strftime(3), ctime(3)

    The full documentation for find is maintained as a Texinfo manual. If the info and find programs are properly installed at your site, the command info find should give you access to the complete manual.

    HISTORY

    As of findutils-4.2.2, shell metacharacters (‘*’, ‘?’ or ‘[]’ for example) used in filename patterns will match a leading ‘.’, because IEEE POSIX interpretation 126 requires this.

    As of findutils-4.3.3, −perm /000 now matches all files instead of none.

    Nanosecond-resolution timestamps were implemented in findutils-4.3.3.

    As of findutils-4.3.11, the −delete action sets find’s exit status to a nonzero value when it fails. However, find will not exit immediately. Previously, find’s exit status was unaffected by the failure of −delete.

    The syntax −perm +MODE was removed in findutils-4.5.12, in favour of −perm /MODE. The +MODE syntax had been deprecated since findutils-4.2.21 which was released in 2005.

    NON-BUGS

    $ find . −name *.c −print
    find: paths must precede expression
    Usage: find [−H] [−L] [−P] [−Olevel] [−D help|tree|search|stat|rates|opt|exec] [path...] [expression]

    This happens because *.c has been expanded by the shell resulting in find actually receiving a command line like this:

    find . −name bigram.c code.c frcode.c locate.c −print

    That command is of course not going to work. Instead of doing things this way, you should enclose the pattern in quotes or escape the wildcard:
    $ find . −name '*.c' −print
    $ find . −name \*.c −print

    BUGS

    There are security problems inherent in the behaviour that the POSIX standard specifies for find, which therefore cannot be fixed. For example, the −exec action is inherently insecure, and −execdir should be used instead. Please see Finding Files for more information.

    The environment variable LC_COLLATE has no effect on the −ok action.

    The best way to report a bug is to use the form at http://savannah.gnu.org/bugs/?group=findutils. The reason for this is that you will then be able to track progress in fixing the problem. Other comments about find(1) and about the findutils package in general can be sent to the bug−findutils mailing list. To join the list, send email to bug−findutils−request@gnu.org.


"Free software is primarily a matter of liberty, not price: users, individually or collectively, are free to do whatever they want with it - this includes the freedom ... to sell it "



"Our goal is to increase awareness of open source and to assist everyone in leveraging its economic and social value. Learn how to use open source to create opportunity and wealth."



blank

Learn



Products and Services



Free Software

  • Viewing And Editing (find -xdev)

    Below is an example using find to locate and view all files that end in .c that are under the current directory hierarchy. In this example, the viewing command less is utilized to view the list of file names output by the find command. The find command is enclosed in backquotes so that the shell will replace the enclosed programs(s) with their output.

    less `find . -name '*.c'`

    To restrict the program to only finding files on the current filesystem, use find -xdev to prevents the command from decending to directories on other file systems.

    less `find . -name '*.c' -xdev`

    To further refine the filtering process, the output of find could have been piped to xargs and filtered by grep as shown below:

    less `find . -name '*.c'| xargs grep -l llo `

    For the command above, only the files ending in .c and containing the string "llo" would be displayed.

    It should be noted that the command line has a maximum length. Therefore, if there is a need to process a large number of files or items an alternate method is needed. The example below illustrates a method of using xargs to direct input to an output file and then execute a command for each element in the output file:

    find /usr/include -name '*.h' | xargs grep -l mode_t > todo
    xargs --arg-file=todo emacs

    In the example above `xargs' will run `emacs' once for each file listed in the file `todo'.

    The next example illustrates how to find and edit the list of files with a single invocation:

    find /usr/include -name '*.h' | xargs grep -l mode_t | xargs sh -c 'emacs "$@" < /dev/tty' Emacs

    In this example 'xargs' invokes a shell command using `sh -c'. The `$@' in the command line is expanded by the shell to a list of arguments as provided by `xargs'. The single quotes in the command line protect the `$@' against expansion by your interactive shell (which will normally have no arguments and thus expand `$@' to nothing). The capitalised `Emacs' on the command line is used as `$0' by the shell that `xargs' launches.

    Archiving (find -print -depth)

    You can pass a list of files produced by `find' to a file archiving program. GNU `tar' and `cpio' can both read lists of file names from the standard input - either delimited by nulls (the safe way) or by blanks (the lazy, risky default way). To use null-delimited names, give them the `--null' option. You can store a file archive in a file, write it on a tape, or send it over a network to extract on another machine.

    One common use of `find' to archive files is to send a list of the files in a directory tree to `cpio'. Use `-depth' so if a directory does not have write permission for its owner, its contents can still be restored from the archive since the directory's permissions are restored after its contents. Here is an example of doing this using `cpio'; you could use a more complex `find' expression to archive only certain files.

    find . -depth -print0 | cpio --create --null --format=crc --file=/dev/nrst0

    You could restore that archive using this command:

    cpio --extract --null --make-dir --unconditional \ --preserve --file=/dev/nrst0

    Here are the commands to do the same things using `tar':

    find . -depth -print0 | tar --create --null --files-from=- --file=/dev/nrst0
    tar --extract --null --preserve-perm --same-owner \ --file=/dev/nrst0

    Here is an example of copying a directory from one machine to another:

    find . -depth -print0 | cpio -0o -Hnewc | rsh OTHER-MACHINE "cd `pwd` && cpio -i0dum"

    rsh OTHER-MACHINE "cd `pwd` && cpio -i0dum"

    Cleaning Up (find -prune -name -print)

    This section gives examples of removing unwanted files in various situations. Here is a command to remove the CVS backup files created when an update requires a merge:

    find . -name '.#*' -print0 | xargs -0r rm -f

    If your `find' command removes directories, you may find that you get a spurious error message when `find' tries to recurse into a directory that has now been removed. Using the `-depth' option will normally resolve this problem.

    It is also possible to use the `-delete' action:

    find . -depth -name '.#*' -delete

    You can run this command to clean out your clutter in `/tmp'. You might place it in the file your shell runs when you log out (`.bash_logout', `.logout', or `.zlogout', depending on which shell you use).

    find /tmp -depth -user "$LOGNAME" -type f -delete

    To remove old Emacs backup and auto-save files, you can use a command like the following. It is especially important in this case to use null-terminated file names because Emacs packages like the VM mailer often create temporary file names with spaces in them, like `#reply to David J. MacKenzie<1>#'.

    find ~ \( -name '*~' -o -name '#*#' \) -print0 | xargs --no-run-if-empty --null rm -vf

    Removing old files from `/tmp' is commonly done from `cron':

    find /tmp /var/tmp -depth -not -type d -mtime +3 -delete find /tmp /var/tmp -depth -mindepth 1 -type d -empty -delete

    The second `find' command above cleans out empty directories depth-first (`-delete' implies `-depth' anyway), hoping that the parents become empty and can be removed too. It uses `-mindepth' to avoid removing `/tmp' itself if it becomes totally empty. arents become empty and can be removed too. It uses `-mindepth' to avoid removing `/tmp' itself if it becomes totally empty.

    Lastly, an example of a program that almost certainly does not do what the user intended:

    find dirname -delete -name quux

    If the user hoped to delete only files named `quux' they will get an unpleasant surprise; this command will attempt to delete everything at or below the starting point `dirname'. This is because `find' evaluates the items on the command line as an expression. The `find' program will normally execute an action if the preceding action succeeds. Here, there is no action or test before the `-delete' so it will always be executed. The `-name quux' test will be performed for files we successfully deleted, but that test has no effect since `-delete' also disables the default `-print' operation. So the above example will probably delete a lot of files the user didn't want to delete.

    This command is also likely to do something you did not intend:

    find dirname -path dirname/foo -prune -o -delete

    Because `-delete' turns on `-depth', the `-prune' action has no effect and files in `dirname/foo' will be deleted too.

    Strange File Names (find -okdir -maxdepth -inum)

    `find' can help you remove or rename a file with strange characters in its name. People are sometimes stymied by files whose names contain characters such as spaces, tabs, control characters, or characters with the high bit set. The simplest way to remove such files is:

    rm -i SOME*PATTERN*THAT*MATCHES*THE*PROBLEM*FILE

    `rm' asks you whether to remove each file matching the given pattern. If you are using an old shell, this approach might not work if the file name contains a character with the high bit set; the shell may strip it off. A more reliable way is:

    find . -maxdepth 1 TESTS -okdir rm '{}' \;

    where TESTS uniquely identify the file. The `-maxdepth 1' option prevents `find' from wasting time searching for the file in any subdirectories; if there are no subdirectories, you may omit it. A good way to uniquely identify the problem file is to figure out its inode number; use

    ls -i

    Suppose you have a file whose name contains control characters, and you have found that its inode number is 12345. This command prompts you for whether to remove it:

    find . -maxdepth 1 -inum 12345 -okdir rm -f '{}' \;

    If you don't want to be asked, perhaps because the file name may contain a strange character sequence that will mess up your screen when printed, then use `-execdir' instead of `-okdir'.

    If you want to rename the file instead, you can use `mv' instead of `rm':

    find . -maxdepth 1 -inum 12345 -okdir mv '{}' NEW-FILE-NAME \;

    Fixing Permissions (find -type -perm)

    Suppose you want to make sure that everyone can write to the directories in a certain directory tree. Here is a way to find directories lacking either user or group write permission (or both), and fix their permissions:

    find . -type d -not -perm -ug=w | xargs chmod ug+w

    You could also reverse the operations, if you want to make sure that directories do _not_ have world write permission.

    Classifying Files (find -perm)

    If you want to classify a set of files into several groups based on different criteria, you can use the comma operator to perform multiple independent tests on the files. Here is an example:

    find / -type d \( -perm -o=w -fprint allwrite , \
    -perm -o=x -fprint allexec \)

    echo "Directories that can be written to by everyone:"
    cat allwrite
    echo ""
    echo "Directories with search permissions for everyone:"
    cat allexec

    `find' has only to make one scan through the directory tree (which is one of the most time consuming parts of its work).

    The tools in the findutils package, and in particular `find', have a large number of options. This means that quite often, there is more than one way to do things. Some of the options and facilities only exist for compatibility with other tools, and findutils provides improved ways of doing things.

    Deleting Files (find -mtime -exec)

    One of the most common tasks that `find' is used for is locating files that can be deleted. This might include:

    * Files last modified more than 3 years ago which haven't been accessed for at least 2 years

    * Files belonging to a certain user

    * Temporary files which are no longer required

    This example concentrates on the actual deletion task rather than on sophisticated ways of locating the files that need to be deleted. We'll assume that the files we want to delete are old files underneath `/var/tmp/stuff'.

    The traditional way to delete files in `/var/tmp/stuff' that have not been modified in over 90 days would have been:

    find /var/tmp/stuff -mtime +90 -exec /bin/rm {} \;

    The above command uses `-exec' to run the `/bin/rm' command to remove each file. This approach works and in fact would have worked in Version 7 Unix in 1979. However, there are a number of problems with this approach.

    The most obvious problem with the approach above is that it causes `find' to fork every time it finds a file that needs to delete, and the child process then has to use the `exec' system call to launch `/bin/rm'. All this is quite inefficient. If we are going to use `/bin/rm' to do this job, it is better to make it delete more than one file at a time.

    The most obvious way of doing this is to use the shell's command expansion feature:

    /bin/rm `find /var/tmp/stuff -mtime +90 -print`

    or you could use the more modern form

    /bin/rm `find /var/tmp/stuff -mtime +90 -print`

    or you could use the more modern form

    /bin/rm $(find /var/tmp/stuff -mtime +90 -print)

    The commands above are much more efficient than the first attempt. However, there is a problem with them. The shell has a maximum command length which is imposed by the operating system (the actual limit varies between systems). This means that while the command expansion technique will usually work, it will suddenly fail when there are lots of files to delete. Since the task is to delete unwanted files, this is precisely the time we don't want things to go wrong.

    Making Use of `xargs' (find xargs)

    So, is there a way to be more efficient in the use of `fork()' and `exec()' without running up against this limit? Yes, we can be almost optimally efficient by making use of the `xargs' command. The `xargs' command reads arguments from its standard input and builds them into command lines. We can use it like this:

    find /var/tmp/stuff -mtime +90 -print | xargs /bin/rm

    For example if the files found by `find' are `/var/tmp/stuff/A', `/var/tmp/stuff/B' and `/var/tmp/stuff/C' then `xargs' might issue the commands

    /bin/rm /var/tmp/stuff/A /var/tmp/stuff/B
    /bin/rm /var/tmp/stuff/C

    The above assumes that `xargs' has a very small maximum command line length. The real limit is much larger but the idea is that `xargs' will run `/bin/rm' as many times as necessary to get the job done, given the limits on command line length.

    This usage of `xargs' is pretty efficient, and the `xargs' command is widely implemented (all modern versions of Unix offer it). So far then, the news is all good. However, there is bad news too.

    Unusual characters in filenames (find -mtime -print xargs)

    Unix-like systems allow any characters to appear in file names with the exception of the ASCII NUL character and the slash. Slashes can occur in path names (as the directory separator) but not in the names of exception of the ASCII NUL character and the slash. Slashes can occur in path names (as the directory separator) but not in the names of actual directory entries. This means that the list of files that `xargs' reads could in fact contain white space characters - spaces, tabs and newline characters. Since by default, `xargs' assumes that the list of files it is reading uses white space as an argument separator, it cannot correctly handle the case where a filename actually includes white space. This makes the default behaviour of `xargs' almost useless for handling arbitrary data.

    To solve this problem, GNU findutils introduced the `-print0' action for `find'. This uses the ASCII NUL character to separate the entries in the file list that it produces. This is the ideal choice of separator since it is the only character that cannot appear within a path name. The `-0' option to `xargs' makes it assume that arguments are separated with ASCII NUL instead of white space. It also turns off another misfeature in the default behaviour of `xargs', which is that it pays attention to quote characters in its input. Some versions of `xargs' also terminate when they see a lone `_' in the input, but GNU `find' no longer does that (since it has become an optional behaviour in the Unix standard).

    So, putting `find -print0' together with `xargs -0' we get this command:

    find /var/tmp/stuff -mtime +90 -print0 | xargs -0 /bin/rm

    The result is an efficient way of proceeding that correctly handles all the possible characters that could appear in the list of files to delete. This is good news. However, there is, as I'm sure you're expecting, also more bad news. The problem is that this is not a portable construct; although other versions of Unix (notably BSD-derived ones) support `-print0', it's not universal. So, is there a more universal mechanism?

    Using `-exec' (find -exec)

    There is indeed a more universal mechanism, which is a slight modification to the `-exec' action. The normal `-exec' action assumes that the command to run is terminated with a semicolon (the semicolon normally has to be quoted in order to protect it from interpretation as the shell command separator). The SVR4 edition of Unix introduced a slight variation, which involves terminating the command with `+' instead: slight variation, which involves terminating the command with `+' instead:

    find /var/tmp/stuff -mtime +90 -exec /bin/rm {} \+

    The above use of `-exec' causes `find' to build up a long command line and then issue it. This can be less efficient than some uses of `xargs'; for example `xargs' allows new command lines to be built up while the previous command is still executing, and allows you to specify a number of commands to run in parallel. However, the `find ... -exec ... +' construct has the advantage of wide portability. GNU findutils did not support `-exec ... +' until version 4.2.12; one of the reasons for this is that it already had the `-print0' action in any case.

    A more secure version of `-exec' (find -exec)

    The command above seems to be efficient and portable. However, within it lurks a security problem. The problem is shared with all the commands we've tried in this worked example so far, too. The security problem is a race condition; that is, if it is possible for somebody to manipulate the filesystem that you are searching while you are searching it, it is possible for them to persuade your `find' command to cause the deletion of a file that you can delete but they normally cannot.

    The problem occurs because the `-exec' action is defined by the POSIX standard to invoke its command with the same working directory as `find' had when it was started. This means that the arguments which replace the {} include a relative path from `find''s starting point down the file that needs to be deleted. For example,

    find /var/tmp/stuff -mtime +90 -exec /bin/rm {} \+

    might actually issue the command:

    /bin/rm /var/tmp/stuff/A /var/tmp/stuff/B /var/tmp/stuff/passwd

    Notice the file `/var/tmp/stuff/passwd'. Likewise, the command:

    cd /var/tmp && find stuff -mtime +90 -exec /bin/rm {} \+

    might actually issue the command:

    /bin/rm stuff/A stuff/B stuff/passwd

    If an attacker can rename `stuff' to something else (making use of their write permissions in `/var/tmp') they can replace it with a symbolic link to `/etc'. That means that the `/bin/rm' command will be invoked on `/etc/passwd'. If you are running your `find' command as root, the attacker has just managed to delete a vital file. All they needed to do to achieve this was replace a subdirectory with a symbolic link at the vital moment.

    There is however, a simple solution to the problem. This is an action which works a lot like `-exec' but doesn't need to traverse a chain of directories to reach the file that it needs to work on. This is the `-execdir' action, which was introduced by the BSD family of operating systems. The command,

    find /var/tmp/stuff -mtime +90 -execdir /bin/rm {} \+

    might delete a set of files by performing these actions:

    1. Change directory to /var/tmp/stuff/foo

    2. Invoke `/bin/rm ./file1 ./file2 ./file3'

    3. Change directory to /var/tmp/stuff/bar

    4. Invoke `/bin/rm ./file99 ./file100 ./file101'

    This is a much more secure method. We are no longer exposed to a race condition. For many typical uses of `find', this is the best strategy. It's reasonably efficient, but the length of the command line is limited not just by the operating system limits, but also by how many files we actually need to delete from each directory.

    Is it possible to do any better? In the case of general file processing, no. However, in the specific case of deleting files it is indeed possible to do better.

    Using the `-delete' action

    The most efficient and secure method of solving this problem is to use the `-delete' action: he most efficient and secure method of solving this problem is to use the `-delete' action:

    find /var/tmp/stuff -mtime +90 -delete

    This alternative is more efficient than any of the `-exec' or `-execdir' actions, since it entirely avoids the overhead of forking a new process and using `exec' to run `/bin/rm'. It is also normally more efficient than `xargs' for the same reason. The file deletion is performed from the directory containing the entry to be deleted, so the `-delete' action has the same security advantages as the `-execdir' action has.

    The `-delete' action was introduced by the BSD family of operating systems.

    Improving things still further

    Is it possible to improve things still further? Not without either modifying the system library to the operating system or having more specific knowledge of the layout of the filesystem and disk I/O subsystem, or both.

    The `find' command traverses the filesystem, reading directories. It then issues a separate system call for each file to be deleted. If we could modify the operating system, there are potential gains that could be made:

    * We could have a system call to which we pass more than one filename for deletion

    * Alternatively, we could pass in a list of inode numbers (on GNU/Linux systems, `readdir()' also returns the inode number of each directory entry) to be deleted.

    The above possibilities sound interesting, but from the kernel's point of view it is difficult to enforce standard Unix access controls for such processing by inode number. Such a facility would probably need to be restricted to the superuser.

    Another way of improving performance would be to increase the parallelism of the process. For example if the directory hierarchy we are searching is actually spread across a number of disks, we might somehow be able to arrange for `find' to process each disk in parallel. re searching is actually spread across a number of disks, we might somehow be able to arrange for `find' to process each disk in parallel. In practice GNU `find' doesn't have such an intimate understanding of the system's filesystem layout and disk I/O subsystem.

    However, since the system administrator can have such an understanding they can take advantage of it like so:

    find /var/tmp/stuff1 -mtime +90 -delete &
    find /var/tmp/stuff2 -mtime +90 -delete &
    find /var/tmp/stuff3 -mtime +90 -delete &
    find /var/tmp/stuff4 -mtime +90 -delete &
    wait

    In the example above, four separate instances of `find' are used to search four subdirectories in parallel. The `wait' command simply waits for all of these to complete. Whether this approach is more or less efficient than a single instance of `find' depends on a number of things:

    * Are the directories being searched in parallel actually on separate disks? If not, this parallel search might just result in a lot of disk head movement and so the speed might even be slower.

    * Other activity - are other programs also doing things on those disks?

    Conclusion

    The fastest and most secure way to delete files with the help of `find' is to use `-delete'. Using `xargs -0 -P N' can also make effective use of the disk, but it is not as secure.

    In the case where we're doing things other than deleting files, the most secure alternative is `-execdir ... +', but this is not as portable as the insecure action `-exec ... +'.

    The `-delete' action is not completely portable, but the only other possibility which is as secure (`-execdir') is no more portable. The most efficient portable alternative is `-exec ...+', but this is insecure and isn't supported by versions of GNU findutils prior to 4.2.12.

    Copying A Subset of Files (find -name -prune -o)

    Suppose you want to copy some files from `/source-dir' to `/dest-dir', but there are a small number of files in `/source-dir' you don't want to copy.

    One option of course is `cp /source-dir /dest-dir' followed by deletion of the unwanted material under `/dest-dir'. But often that can be inconvenient, because for example we would have copied a large amount of extraneous material, or because `/dest-dir' is too small. Naturally there are many other possible reasons why this strategy may be unsuitable.

    So we need to have some way of identifying which files we want to copy, and we need to have a way of copying that file list. The second part of this condition is met by `cpio -p'. Of course, we can identify the files we wish to copy by using `find'. Here is a command that solves our problem:

    cd /source-dir
    find . -name '.snapshot' -prune -o \( \! -name '*~' -print0 \) |
    cpio -pmd0 /dest-dir

    The first part of the `find' command here identifies files or directories named `.snapshot' and tells `find' not to recurse into them (since they do not need to be copied). The combination `-name '.snapshot' -prune' yields false for anything that didn't get pruned, but it is exactly those files we want to copy. Therefore we need to use an OR (`-o') condition to introduce the rest of our expression. The remainder of the expression simply arranges for the name of any file not ending in `~' to be printed.

    Using `-print0' ensures that white space characters in file names do not pose a problem. The `cpio' command does the actual work of copying files. The program as a whole fails if the `cpio' program returns nonzero. If the `find' command returns non-zero on the other hand, the Unix shell will not diagnose a problem (since `find' is not the last command in the pipeline).

    Updating A Timestamp File (find -newer -exec)

    Suppose we have a directory full of files which is maintained with a set of automated tools; perhaps one set of tools updates them and another set of tools uses the result. In this situation, it might be useful for the second set of tools to know if the files have recently been changed. It might be useful, for example, to have a 'timestamp' file which gives the timestamp on the newest file in the collection.

    We can use `find' to achieve this, but there are several different ways to do it.

    Updating the Timestamp The Wrong Way

    The obvious but wrong answer is just to use `-newer':

    find subdir -newer timestamp -exec touch -r {} timestamp \;

    This does the right sort of thing but has a bug. Suppose that two files in the subdirectory have been updated, and that these are called `file1' and `file2'. The command above will update `timestamp' with the modification time of `file1' or that of `file2', but we don't know which one. Since the timestamps on `file1' and `file2' will in general be different, this could well be the wrong value.

    One solution to this problem is to modify `find' to recheck the modification time of `timestamp' every time a file is to be compared against it, but that will reduce the performance of `find'.

    Using the test utility to compare timestamps

    The `test' command can be used to compare timestamps:

    find subdir -exec test {} -nt timestamp \; -exec touch -r {} timestamp \;

    This will ensure that any changes made to the modification time of `timestamp' that take place during the execution of `find' are taken into account. This resolves our earlier problem, but unfortunately this runs much more slowly.

    A combined approach

    We can of course still use `-newer' to cut down on the number of calls to `test':

    find subdir -newer timestamp -and \
         -exec test {} -nt timestamp \; -and \
         -exec touch -r {} timestamp \;

    Here, the `-newer' test excludes all the files which are definitely older than the timestamp, but all the files which are newer than the old value of the timestamp are compared against the current updated timestamp.

    This is indeed faster in general, but the speed difference will depend on how many updated files there are.

    Using `-printf' and `sort' to compare timestamps

    It is possible to use the `-printf' action to abandon the use of `test' entirely:

    newest=$(find subdir -newer timestamp -printf "%A%p\n" |
         sort -n |
         tail -1 |
         cut -d: -f2- )
    touch -r "${newest:-timestamp}" timestamp

    The command above works by generating a list of the timestamps and names of all the files which are newer than the timestamp. The `sort', `tail' and `cut' commands simply pull out the name of the file with the largest timestamp value (that is, the latest file). The `touch' command is then used to update the timestamp,

    The `"${newest:-timestamp}"' expression simply expands to the value of `$newest' if that variable is set, but to `timestamp' otherwise. This ensures that an argument is always given to the `-r' option of the `touch' command.

    This approach seems quite efficient, but unfortunately it has a roblem. Many operating systems now keep file modification time information at a granularity which is finer than one second. Findutils version 4.3.3 and later will print a fractional part with %A@, but older versions will not.

    Solving the problem with `make'

    Another tool which often works with timestamps is `make'. We can use `find' to generate a `Makefile' file on the fly and then use `make' to update the timestamps:

    makefile=$(mktemp)
    find subdir \
         \( \! -xtype l \) \
         -newer timestamp \
         -printf "timestamp:: %p\n\ttouch -r %p timestamp\n\n" > "$makefile"
    make -f "$makefile"
    rm -f "$makefile"

    Unfortunately although the solution above is quite elegant, it fails to cope with white space within file names, and adjusting it to do so would require a rather complex shell script.

    Coping files with odd filenames

    We can fix both of these problems (looping and problems with white space), and do things more efficiently too. The following command works with newlines and doesn't need to sort the list of filenames.

    find subdir -newer timestamp -printf "%A@:%p\0" |

         perl -0 newest.pl |
         xargs --no-run-if-empty --null -i \
    find {} -maxdepth 0 -newer timestamp -exec touch -r {} timestamp \;

    The first `find' command generates a list of files which are newer than the original timestamp file, and prints a list of them with their timestamps. The `newest.pl' script simply filters out all the filenames which have timestamps which are older than whatever the newest file is:

         #! /usr/bin/perl -0
         my @newest = ();
         my $latest_stamp = undef;
         while (<>) {
           my ($stamp, $name) = split(/:/);
           if (!defined($latest_stamp) || ($tstamp > $latest_stamp)) {
         $latest_stamp = $stamp;
         @newest = ();
         }
         if ($tstamp >= $latest_stamp) {
           push @newest, $name;
           }
         }
         print join("\0", @newest);

    This prints a list of zero or more files, all of which are newer than the original timestamp file, and which have the same timestamp as each other, to the nearest second. The second `find' command takes each resulting file one at a time, and if that is newer than the timestamp file, the timestamp is updated.

"Our goal is to increase awareness of open source and to assist everyone in leveraging its economic and social value. Learn how to use open source to create opportunity and wealth."

Website: sarata.com

Telephone: (877) 822-1208

Email: support@sarata.com