file(1posix)


NAME

   file — determine file type

SYNOPSIS

   file [−dh] [−M file] [−m file] file...

   file −i [−h] file...

DESCRIPTION

   The file utility shall perform a series of tests in  sequence  on  each
   specified file in an attempt to classify it:

    1. If  file  does  not exist, cannot be read, or its file status could
       not be determined, the output shall  indicate  that  the  file  was
       processed, but that its type could not be determined.

    2. If  the  file  is  not  a  regular  file,  its  file  type shall be
       identified.  The file types directory, FIFO, socket, block special,
       and   character   special   shall  be  identified  as  such.  Other
       implementation-defined file types may also be identified.  If  file
       is  a symbolic link, by default the link shall be resolved and file
       shall test the type of file referenced by the symbolic  link.  (See
       the −h and −i options below.)

    3. If  the  length of file is zero, it shall be identified as an empty
       file.

    4. The file utility shall examine an initial segment of file and shall
       make  a  guess  at  identifying  its  contents  based  on position-
       sensitive tests. (The answer is not guaranteed to be  correct;  see
       the −d, −M, and −m options below.)

    5. The file utility shall examine file and make a guess at identifying
       its contents based on context-sensitive default system tests.  (The
       answer is not guaranteed to be correct.)

    6. The file shall be identified as a data file.

   If file does not exist, cannot be read, or its file status could not be
   determined, the output shall indicate that the file was processed,  but
   that its type could not be determined.

   If  file  is a symbolic link, by default the link shall be resolved and
   file shall test the type of file referenced by the symbolic link.
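
   The first three steps of the sequence above can be approximated with
   the test utility alone. The following is a minimal sketch, not the
   method of any particular implementation. The function name classify
   and its output strings for regular files are hypothetical, and steps
   4 to 6 (the content tests) are reduced to a placeholder.

```shell
# classify: approximate steps 1-3 of the sequence using only test(1).
# Hypothetical helper for illustration; not part of the file utility.
classify() {
    f=$1
    if [ -L "$f" ] && [ ! -e "$f" ]; then
        # Dangling link: reported as a symbolic link, as if -h were given.
        echo "symbolic link"
    elif [ ! -e "$f" ] || [ ! -r "$f" ]; then
        echo "cannot open"            # step 1
    elif [ -d "$f" ]; then
        echo "directory"              # step 2: non-regular file types
    elif [ -p "$f" ]; then
        echo "fifo"
    elif [ -S "$f" ]; then
        echo "socket"
    elif [ -b "$f" ]; then
        echo "block special"
    elif [ -c "$f" ]; then
        echo "character special"
    elif [ ! -s "$f" ]; then
        echo "empty"                  # step 3
    else
        echo "regular, non-empty"     # steps 4-6 would examine the contents
    fi
}
```

   Because test follows symbolic links for every operator except −L,
   the sketch resolves links by default, as the text above requires.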

OPTIONS

   The file utility shall  conform  to  the  Base  Definitions  volume  of
   POSIX.1‐2008,  Section 12.2, Utility Syntax Guidelines, except that the
   order of the −m, −d, and −M options shall be significant.

   The following options shall be supported by the implementation:

   −d        Apply  any  position-sensitive  default  system   tests   and
             context-sensitive  default  system tests to the file. This is
             the default if no −M or −m option is specified.

   −h        When a symbolic link is encountered, identify the file  as  a
             symbolic  link. If −h is not specified and file is a symbolic
             link that refers to a nonexistent file, file  shall  identify
             the file as a symbolic link, as if −h had been specified.

   −i        If  a  file is a regular file, do not attempt to classify the
             type of the file further, but identify the file as  specified
             in the STDOUT section.

   −M file   Specify  the  name  of  a  file containing position-sensitive
             tests that shall be applied to a file in order to classify it
             (see the EXTENDED DESCRIPTION). No position-sensitive default
             system tests nor context-sensitive default system tests shall
             be applied unless the −d option is also specified.

   −m file   Specify  the  name  of  a  file containing position-sensitive
             tests that shall be applied to a file in order to classify it
             (see the EXTENDED DESCRIPTION).

   If  the  −m option is specified without specifying the −d option or the
   −M option, position-sensitive default system  tests  shall  be  applied
   after  the  position-sensitive tests specified by the −m option. If the
   −M option is specified with the −d option, the −m option, or  both,  or
   the −m option is specified with the −d option, the concatenation of the
   position-sensitive tests specified by these options shall be applied in
   the  order  specified by the appearance of these options. If a −M or −m
   file option-argument is −, the results are unspecified.
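
   The ordering rules above can be modelled with a short shell function.
   This is a sketch under stated assumptions: test_order is a
   hypothetical name, the word "default" stands for the position-
   sensitive default system tests, and the model prints only the order
   of test sources without running any test.

```shell
# test_order: print the order in which position-sensitive test sources
# would be applied for a given option list. Hypothetical model only.
# "default" stands for the position-sensitive default system tests.
test_order() {
    order=""
    saw_d=no
    saw_M=no
    while [ $# -gt 0 ]; do
        case $1 in
            -d) order="$order default"; saw_d=yes ;;
            -m) order="$order $2"; shift ;;
            -M) order="$order $2"; saw_M=yes; shift ;;
            *)  break ;;
        esac
        shift
    done
    # With neither -d nor -M, the default tests follow any -m tests.
    if [ "$saw_d" = no ] && [ "$saw_M" = no ]; then
        order="$order default"
    fi
    printf '%s\n' "${order# }"
}

test_order -m mine           # mine default
test_order -M mine           # mine
test_order -d -m mine        # default mine
test_order -m a -d -M b      # a default b
```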

OPERANDS

   The following operand shall be supported:

   file      A pathname of a file to be tested.

STDIN

   The standard input shall be used if a  file  operand  is  '−'  and  the
   implementation  treats  the  '−' as meaning standard input.  Otherwise,
   the standard input shall not be used.

INPUT FILES

   The file can be any file type.

ENVIRONMENT VARIABLES

   The following environment variables shall affect the execution of file:

   LANG      Provide  a  default  value   for   the   internationalization
             variables  that  are unset or null. (See the Base Definitions
             volume of  POSIX.1‐2008,  Section  8.2,  Internationalization
             Variables   for   the   precedence   of  internationalization
             variables used to determine the values of locale categories.)

   LC_ALL    If set to a non-empty string value, override  the  values  of
             all the other internationalization variables.

   LC_CTYPE  Determine  the  locale for the interpretation of sequences of
             bytes of text data as characters (for example, single-byte as
             opposed  to  multi-byte  characters  in  arguments  and input
             files).

   LC_MESSAGES
             Determine the locale that should be used to affect the format
             and contents of diagnostic messages written to standard error
             and informative messages written to standard output.

   NLSPATH   Determine the location of message catalogs for the processing
             of LC_MESSAGES.

ASYNCHRONOUS EVENTS

   Default.

STDOUT

   In  the  POSIX  locale,  the following format shall be used to identify
   each operand, file specified:

       "%s: %s\n", <file>, <type>

   The values for <type> are unspecified, except that in the POSIX locale,
   if  file  is  identified  as  one  of the types listed in the following
   table, <type> shall contain (but is not limited to)  the  corresponding
   string,  unless  the  file  is  identified by a position-sensitive test
   specified by a −M or −m option. Each <space> shown in the strings shall
   be exactly one <space>.

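                   Table 4-9: File Utility Output Strings

   ┌─────────────────────────────────────────────┬──────────────────────────────────┬───────┐
   │If file is:                                  │ <type> shall contain the string: │ Notes │
   ├─────────────────────────────────────────────┼──────────────────────────────────┼───────┤
   │Nonexistent                                  │ cannot open                      │       │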
   │                                             │                                  │       │
   │Block special                                │ block special                    │ 1     │
   │Character special                            │ character special                │ 1     │
   │Directory                                    │ directory                        │ 1     │
   │FIFO                                         │ fifo                             │ 1     │
   │Socket                                       │ socket                           │ 1     │
   │Symbolic link                                │ symbolic link to                 │ 1     │
   │Regular file                                 │ regular file                     │ 1,2   │
   │Empty regular file                           │ empty                            │ 3     │
   │Regular file that cannot be read             │ cannot open                      │ 3     │
   │                                             │                                  │       │
   │Executable binary                            │ executable                       │ 3,4,6 │
   │ar archive library (see ar)                  │ archive                          │ 3,4,6 │
   │Extended cpio format (see pax)               │ cpio archive                     │ 3,4,6 │
   │Extended tar format (see ustar in pax)       │ tar archive                      │ 3,4,6 │
   │                                             │                                  │       │
   │Shell script                                 │ commands text                    │ 3,5,6 │
   │C-language source                            │ c program text                   │ 3,5,6 │
   │FORTRAN source                               │ fortran program text             │ 3,5,6 │
   │                                             │                                  │       │
   │Regular file whose type cannot be determined │ data                             │ 3     │
   └─────────────────────────────────────────────┴──────────────────────────────────┴───────┘
   Notes:

              1. This is a file type test.

              2. This test is applied only if the −i option is specified.

              3. This  test  is  applied  only  if  the  −i  option is not
                 specified.

              4. This is a position-sensitive default system test.

              5. This is a context-sensitive default system test.

              6. Position-sensitive  default  system  tests  and  context-
                 sensitive  default system tests are not applied if the −M
                 option  is  specified  unless  the  −d  option  is   also
                 specified.

   In  the POSIX locale, if file is identified as a symbolic link (see the
   −h option), the following alternative output format shall be used:

       "%s: %s %s\n", <file>, <type>, <contents of link>

   If the file named by the file operand does not exist, cannot  be  read,
   or the type of the file named by the file operand cannot be determined,
   this shall not be considered an error that affects the exit status.
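
   Because a pathname may itself contain a <colon> followed by a
   <space>, splitting an output line at the first ": " is unreliable. A
   script that knows the operand it passed can strip that prefix
   instead. The sketch below assumes the POSIX locale format shown
   above with exactly one <space> after the <colon>; type_field is a
   hypothetical helper name.

```shell
# type_field: strip the "<file>: " prefix from one line of file output.
# $1 is the operand exactly as it was passed to file; $2 is the output line.
# Hypothetical helper; assumes the POSIX-locale format "%s: %s\n".
type_field() {
    printf '%s\n' "${2#"$1: "}"
}

type_field 'a: b' 'a: b: directory'     # prints "directory"
```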

STDERR

   The standard error shall be used only for diagnostic messages.

OUTPUT FILES

   None.

EXTENDED DESCRIPTION

   A file specified as an option-argument to the −m or  −M  options  shall
   contain one position-sensitive test per line, which shall be applied to
   the file. If the test succeeds, the message field of the line shall  be
   printed  and no further tests shall be applied, with the exception that
   tests on immediately  following  lines  beginning  with  a  single  '>'
   character shall be applied.

   Each  line  shall  be  composed  of  the following four <tab>-separated
   fields. (Implementations may allow  any  combination  of  one  or  more
   white-space   characters   other   than   <newline>  to  act  as  field
   separators.)

   offset    An unsigned number  (optionally  preceded  by  a  single  '>'
             character)  specifying  the offset, in bytes, of the value in
             the file that is to be compared against the  value  field  of
             the  line.  If the file is shorter than the specified offset,
             the test shall fail.

             If the  offset  begins  with  the  character  '>',  the  test
             contained in the line shall not be applied to the file unless
             the test on the last line for which the offset did not  begin
             with  a  '>'  was successful. By default, the offset shall be
             interpreted as an unsigned decimal number. With a leading  0x
             or  0X,  the  offset  shall  be  interpreted as a hexadecimal
             number; otherwise, with a leading  0,  the  offset  shall  be
             interpreted as an octal number.
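
             The radix rules for the offset match those of shell
             arithmetic constants, so they can be sketched as follows.
             The name parse_offset is hypothetical, and the sketch
             performs no validation of its argument.

```shell
# parse_offset: print the numeric value of a magic-file offset field.
# Hypothetical helper; relies on shell arithmetic accepting 0x and
# leading-0 constants, which matches the radix rules described above.
parse_offset() {
    off=${1#">"}            # drop the optional leading '>'
    echo $(( $off ))
}

parse_offset 12             # 12
parse_offset 0x10           # 16
parse_offset '>010'         # 8
```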

   type      The  type  of  the  value  in the file to be tested. The type
             shall consist of the type specification characters d, s,  and
             u,  specifying  signed decimal, string, and unsigned decimal,
             respectively.

             The type string shall be interpreted as the  bytes  from  the
             file  starting at the specified offset and including the same
             number of bytes specified by the value field. If insufficient
             bytes  remain  in the file past the offset to match the value
             field, the test shall fail.

             The type specification characters d and u can be followed  by
             an  optional  unsigned  decimal  integer  that  specifies the
             number  of  bytes  represented  by   the   type.   The   type
             specification  characters  d  and  u  can  be  followed by an
             optional C, S, I, or L, indicating that the value is of  type
             char, short, int, or long, respectively.

             The   default   number  of  bytes  represented  by  the  type
             specifiers d, f, and u shall correspond to  their  respective
             C-language types as follows. If the system claims conformance
             to  the  C-Language  Development  Utilities   option,   those
             specifiers  shall correspond to the default sizes used in the
             c99  utility.  Otherwise,  the   default   sizes   shall   be
             implementation-defined.

             For the type specifier characters d and u, the default number
             of bytes shall correspond to the size of a basic integer type
             of  the  implementation.  For these specifier characters, the
             implementation shall support values of the optional number of
             bytes to be converted corresponding to the number of bytes in
             the C-language  types  char,  short,  int,  or  long.   These
             numbers  can  also  be  specified  by  an  application as the
             characters C, S, I, and L, respectively. The byte order  used
             when  interpreting  numeric values is implementation-defined,
             but shall correspond to the order in which a constant of  the
             corresponding type is stored in memory on the system.

             All  type specifiers, except for s, can be followed by a mask
             specifier of the form &number. The mask value shall be AND'ed
             with  the  value of the input file before the comparison with
             the value field of the line is made.  By  default,  the  mask
             shall  be  interpreted  as an unsigned decimal number. With a
             leading 0x or  0X,  the  mask  shall  be  interpreted  as  an
             unsigned hexadecimal number; otherwise, with a leading 0, the
             mask shall be interpreted as an unsigned octal number.

             The strings byte, short,  long,  and  string  shall  also  be
             supported  as  type  fields, being interpreted as dC, dS, dL,
             and s, respectively.
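
             A one-byte type with a mask, such as byte&0x80, can be
             approximated with od. This is only a sketch: masked_byte
             is a hypothetical name, the byte is read as unsigned
             (which gives the same result as a signed read for these
             masks), and the availability of mktemp is assumed for the
             demonstration.

```shell
# masked_byte: print (byte at offset $2 of file $1) AND mask $3.
# Hypothetical helper modelling a "byte&mask" type field with od(1).
# Returns non-zero when the file is too short, in which case the test fails.
masked_byte() {
    b=$(od -An -tu1 -j "$2" -N 1 "$1" 2>/dev/null | tr -d ' \n')
    [ -n "$b" ] || return 1
    echo $(( $b & $3 ))
}

tmp=$(mktemp)
printf '\037\235\220' > "$tmp"      # third byte is octal 220 (decimal 144)
masked_byte "$tmp" 2 0x80           # 128
masked_byte "$tmp" 2 0x1f           # 16
rm -f "$tmp"
```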

   value     The value to be compared with the value from the file.

             If the specifier from the type field is  s  or  string,  then
             interpret the value as a string. Otherwise, interpret it as a
             number. If the value is a string, then the test shall succeed
             only  when  a string value exactly matches the bytes from the
             file.

             If the value is  a  string,  it  can  contain  the  following
             sequences:

             \character  The  <backslash>-escape sequences as specified in
                         the  Base  Definitions  volume  of  POSIX.1‐2008,
                         Table   5-1,   Escape  Sequences  and  Associated
                         Actions ('\\', '\a', '\b', '\f', '\n', '\r',
                         '\t',  '\v').   In  addition, the escape sequence
                         '\ ' (the <backslash>  character  followed  by  a
                         <space>   character)   shall   be  recognized  to
                         represent a <space>  character.  The  results  of
                         using  any  other  character, other than an octal
                         digit, following the <backslash> are unspecified.

             \octal      Octal sequences that can  be  used  to  represent
                         characters  with  specific coded values. An octal
                         sequence shall consist of a <backslash>  followed
                         by  the  longest  sequence  of one, two, or three
                         octal-digit characters (01234567).

             By  default,  any  value  that  is  not  a  string  shall  be
             interpreted  as a signed decimal number. Any such value, with
             a leading 0x or 0X,  shall  be  interpreted  as  an  unsigned
             hexadecimal number; otherwise, with a leading zero, the value
             shall be interpreted as an unsigned octal number.

             If the value is not  a  string,  it  can  be  preceded  by  a
             character   indicating   the   comparison  to  be  performed.
             Permissible characters and the comparisons they  specify  are
             as follows:

             =     The  test  shall  succeed  if  the  value from the file
                   equals the value field.

             <     The test shall succeed if the value from  the  file  is
                   less than the value field.

             >     The  test  shall  succeed if the value from the file is
                   greater than the value field.

             &     The test shall succeed if all of the set  bits  in  the
                   value field are set in the value from the file.

             ^     The  test shall succeed if at least one of the set bits
                   in the value field is not set in  the  value  from  the
                   file.

             x     The  test  shall succeed if the file is large enough to
                   contain a value of the type specified starting  at  the
                   offset specified.
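
             The comparison characters can be modelled as a shell
             function whose exit status reports whether the test
             succeeds. This is a sketch, not an implementation:
             magic_compare is a hypothetical name, and both values are
             assumed to be decimal integers already read and converted.

```shell
# magic_compare OP FILE_VALUE FIELD_VALUE
# Hypothetical model of the comparison characters; exit status 0 means
# the test succeeds. Both values are taken as decimal integers here.
magic_compare() {
    case $1 in
        '=') [ "$2" -eq "$3" ] ;;
        '<') [ "$2" -lt "$3" ] ;;
        '>') [ "$2" -gt "$3" ] ;;
        '&') [ $(( $2 & $3 )) -eq "$3" ] ;;   # all set bits of field set in file
        '^') [ $(( $2 & $3 )) -ne "$3" ] ;;   # some set bit of field clear in file
        x)   : ;;                             # size was checked when reading
        *)   return 2 ;;
    esac
}

magic_compare '>' 128 0   && echo "Block compressed"
magic_compare '&' 144 128 && echo "bit 7 set"
```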

   message   The  message  to be printed if the test succeeds. The message
             shall be  interpreted  using  the  notation  for  the  printf
             formatting specification; see printf.  Whether the value field
             was a string or a number, the value from the file shall be the
             argument for the printf formatting specification.
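
             A string test at offset 0 and the printing of a message
             can be sketched with od and printf. The names string_test
             and emit_message are hypothetical. The sketch assumes the
             expected bytes are given in printf format notation and
             contain no '%', and it passes the value to the message
             only when the message contains a conversion.

```shell
# string_test FILE BYTES
# Hypothetical sketch of the test part of a "0 string <value>" line.
# BYTES uses printf(1) format notation (for example '\037\235') and must
# not contain '%'. Returns 0 when FILE starts with exactly those bytes.
string_test() {
    want=$(printf "$2" | od -v -An -tx1 | tr -d ' \n')
    n=$(( $(printf "$2" | wc -c) ))
    got=$(od -v -An -tx1 -N "$n" "$1" 2>/dev/null | tr -d ' \n')
    [ -n "$want" ] && [ "$got" = "$want" ]
}

# emit_message MESSAGE [VALUE]
# Print MESSAGE as a printf format, with VALUE as its argument when the
# message contains a conversion specification.
emit_message() {
    case $1 in
        *%*) printf "$1\n" "$2" ;;
        *)   printf "$1\n" ;;
    esac
}

tmp=$(mktemp)
printf '\037\235\220' > "$tmp"
string_test "$tmp" '\037\235' && emit_message 'Compressed data'
emit_message '%d bits' 16           # 16 bits
rm -f "$tmp"
```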

EXIT STATUS

   The following exit values shall be returned:

    0    Successful completion.

   >0    An error occurred.

CONSEQUENCES OF ERRORS

   Default.

   The following sections are informative.

APPLICATION USAGE

   The file utility can only be required to guess  at  many  of  the  file
   types  because  only  exhaustive  testing can determine some types with
   certainty. For example, binary data on some implementations might match
   the initial segment of an executable or a tar archive.

   Note  that  the  table  indicates  that  the output contains the stated
   string.  Systems  may  add  text  before  or  after  the  string.   For
   executables,  as an example, the machine architecture and various facts
   about how the file was link-edited may be included. Note also  that  on
   systems  that  recognize  shell  script  files  starting  with  "#!" as
   executable files, these may be identified as  executable  binary  files
   rather than as shell scripts.

EXAMPLES

   Determine whether an argument is a binary executable file:

       file -- "$1" | grep -q ':.*executable' &&
           printf "%s is executable.\n" "$1"

RATIONALE

   The  −f  option was omitted because the same effect can (and should) be
   obtained using the xargs utility.
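
   One way to obtain that effect is sketched below. The wrapper name
   file_from_list is hypothetical. Because xargs splits its input on
   blanks and interprets quotes and backslashes, the sketch assumes
   that the listed pathnames contain none of those characters.

```shell
# Classify every pathname listed, one per line, in a list file.
# Hypothetical wrapper; xargs splits on blanks and interprets quotes and
# backslashes, so this assumes the listed names contain none of those.
file_from_list() {
    xargs file -- < "$1"
}
```

   A call such as file_from_list names.txt then writes one output line
   for each pathname in the list.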

   Historical versions  of  the  file  utility  attempt  to  identify  the
   following  types of files: symbolic link, directory, character special,
   block special, socket, tar archive, cpio archive, SCCS archive, archive
   library,  empty,  compress  output, pack output, binary data, C source,
   FORTRAN source, assembler  source,  nroff/troff/eqn/tbl  source,  troff
   output, shell script, C shell script, English text, ASCII text, various
   executables, APL  workspace,  compiled  terminfo  entries,  and  CURSES
   screen  images.  Only those types that are reasonably well specified in
   POSIX or are directly related to POSIX  utilities  are  listed  in  the
   table.

   Historical  systems have used a ``magic file'' named /etc/magic to help
   identify file types. Because it  is  generally  useful  for  users  and
   scripts  to  be  able to identify special file types, the −m flag and a
   portable format for user-created magic files have  been  specified.  No
   requirement  is  made that an implementation of file use this method of
   identifying files, only that  users  be  permitted  to  add  their  own
   classifying tests.

   In  addition, three options have been added to historical practice. The
   −d flag has been added to permit users to cause their tests  to  follow
   any default system tests. The −i flag has been added to permit users to
   test portably for regular files in shell scripts. The −M flag has  been
   added to permit users to ignore any default system tests.

   The   POSIX.1‐2008   description   of  default  system  tests  and  the
   interaction between the −d, −M, and −m options did not clearly indicate
   that  there were two types of ``default system tests''. The ``position-
   sensitive tests'' determine file types by looking for certain string or
   binary  values  at  specific  offsets in the file being examined. These
   position-sensitive tests were implemented in historical  systems  using
   the magic file described above.  Some of these tests are now built into
   the file utility itself on  some  implementations  so  the  output  can
   provide more detail than can be provided by magic files. For example, a
   magic file can easily identify a core file on most implementations, but
   cannot  name the program file that dropped the core. A magic file could
   produce output such as:

       /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1

   but by building the test into the file utility, you  could  get  output
   such as:

       /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1, from 'testprog'

   These  extended  built-in  tests  are  still to be treated as position-
   sensitive  default  system  tests  even  if  they  are  not  listed  in
   /etc/magic or any other magic file.

   The  context-sensitive  default system tests were always built into the
   file utility. These tests looked for language constructs in text  files
   trying  to  identify  shell  scripts,  C,  FORTRAN,  and other computer
   language source files, and even plain text files. With the addition  of
   the  −m  and  −M options the distinction between position-sensitive and
   context-sensitive default system tests  became  important  because  the
   order  of  testing  is  important. The context-sensitive system default
   tests should never be applied before any position-sensitive tests  even
   if  the  −d  option is specified before a −m option or −M option due to
   the high probability that the context-sensitive  system  default  tests
   will  incorrectly  identify  arbitrary  text files as text files before
   position-sensitive tests specified by the −m  or  −M  option  would  be
   applied to give a more accurate identification.

   Leaving  the  meaning  of  −M − and −m − unspecified allows an existing
   prototype of  these  options  to  continue  to  work  in  a  backwards-
   compatible manner. (In that implementation, −M − was roughly equivalent
   to −d in POSIX.1‐2008.)

   The historical −c option was omitted  as  not  particularly  useful  to
   users   or   portable   shell   scripts.   In  addition,  a  reasonable
   implementation of the file utility would report any errors  found  each
   time the magic file is read.

   The  historical format of the magic file was the same as that specified
   by the Rationale in  the  ISO POSIX‐2:1993  standard  for  the  offset,
   value,  and  message  fields; however, it used less precise type fields
   than the format specified by the current normative text. The  new  type
   field values are a superset of the historical ones.

   The following is an example magic file:

       0  short     070707              cpio archive
       0  short     0143561             Byte-swapped cpio archive
       0  string    070707              ASCII cpio archive
       0  long      0177555             Very old archive
       0  short     0177545             Old archive
       0  short     017437              Old packed data
       0  string    \037\036            Packed data
       0  string    \377\037            Compacted data
       0  string    \037\235            Compressed data
       >2 byte&0x80 >0                  Block compressed
       >2 byte&0x1f x                   %d bits
       0  string    \032\001            Compiled Terminfo Entry
       0  short     0433                Curses screen image
       0  short     0434                Curses screen image
       0  string    <ar>                System V Release 1 archive
       0  string    !<arch>\n__.SYMDEF  Archive random library
       0  string    !<arch>             Archive
       0  string    ARF_BEGARF          PHIGS clear text archive
       0  long      0x137A2950          Scalable OpenFont binary
       0  long      0x137A2951          Encrypted scalable OpenFont binary

   The  use  of  a  basic  integer  data  type  is  intended  to allow the
   implementation to choose a word size commonly used by  applications  on
   that architecture.

   Earlier  versions  of  this  standard  allowed for implementations with
   bytes other than eight  bits,  but  this  has  been  modified  in  this
   version.

FUTURE DIRECTIONS

   None.

SEE ALSO

   ar, ls, pax, printf

   The   Base  Definitions  volume  of  POSIX.1‐2008,  Table  5-1,  Escape
   Sequences and Associated Actions,  Chapter  8,  Environment  Variables,
   Section 12.2, Utility Syntax Guidelines

COPYRIGHT

   Portions  of  this text are reprinted and reproduced in electronic form
   from IEEE Std 1003.1, 2013 Edition, Standard for Information Technology
   --  Portable  Operating  System  Interface (POSIX), The Open Group Base
   Specifications  Issue  7,  Copyright  (C)  2013  by  the  Institute  of
   Electrical and Electronics Engineers, Inc and The Open Group.  (This is
   POSIX.1-2008 with the 2013 Technical Corrigendum  1  applied.)  In  the
   event of any discrepancy between this version and the original IEEE and
   The Open Group Standard, the original IEEE and The Open Group  Standard
   is  the  referee document. The original Standard can be obtained online
   at http://www.unix.org/online.html .

   Any typographical or formatting errors that appear  in  this  page  are
   most likely to have been introduced during the conversion of the source
   files   to   man   page   format.   To   report   such   errors,    see
   https://www.kernel.org/doc/man-pages/reporting_bugs.html .




