states - awk alike text processing tool


   states  [-hvV]  [-D  var=val]  [-f  file] [-o outputfile] [-p path] [-s
   startstate] [-W level] [filename ...]


   States is an awk-alike text processing tool with some state machine
   extensions.  It is designed for program source code highlighting and
   for similar tasks where state information helps input processing.

   At any point in time, states is in exactly one state.  Each state is
   quite similar to awk's work environment: it has regular expressions
   which are matched against the input and actions which are executed
   when a match is found.  From the action blocks, states can perform
   state transitions; it can move to another state from which the
   processing is continued.  State transitions are recorded so states can
   return to the calling state once the current state has finished.

   The biggest difference between states and awk,  besides  state  machine
   extensions,  is  that  states is not line-oriented.  It matches regular
   expression tokens from the input and once  a  match  is  processed,  it
   continues  processing from the current position, not from the beginning
   of the next input line.
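
   The difference from line-oriented processing can be illustrated with a
   small Python sketch.  It is not how states is implemented: the function
   name scan and the rule table are hypothetical, and the policy "the
   earliest match wins, unmatched text is copied through" is an assumption
   made for illustration only.

```python
import re

def scan(text, rules):
    """Apply (pattern, action) rules to text, resuming after each match."""
    out = []
    pos = 0
    while pos < len(text):
        # Pick the rule whose (non-empty) match starts earliest at or after pos.
        best = None
        for pattern, action in rules:
            m = pattern.search(text, pos)
            if m and m.end() > m.start():
                if best is None or m.start() < best[0].start():
                    best = (m, action)
        if best is None:
            out.append(text[pos:])  # nothing else matches: copy the rest
            break
        m, action = best
        out.append(text[pos:m.start()])  # copy unmatched text through
        out.append(action(m))
        pos = m.end()  # continue from the current position, not the next line
    return "".join(out)

# Hypothetical rules: bracket numbers, upper-case words.
rules = [
    (re.compile(r"\d+"), lambda m: "<" + m.group(0) + ">"),
    (re.compile(r"[a-z]+"), lambda m: m.group(0).upper()),
]
result = scan("ab 12 cd\n34 ef", rules)
print(result)
```

   Several tokens on the same line are each handled in turn, and the
   newline is just another character between two matches.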


   -D var=val, --define=var=val
           Define variable var to have string value val.  Command line
           definitions override variable definitions found in the config
           file.

   -f file, --file=file
           Read state definitions from file file.  As  a  default,  states
           tries  to  read  state  definitions  from file in the
           current working directory.

   -h, --help
           Print short help message and exit.

   -o file, --output=file
           Save output to file file instead of printing it to stdout.

   -p path, --path=path
           Set the load path to path.  The load path defaults to the
           directory from which the state definitions file is loaded.

   -s state, --state=state
           Start execution from state state.  This definition overrides
           the start state resolved from the start block.

   -v, --verbose
           Increase the program verbosity.

   -V, --version
           Print states version and exit.

   -W level, --warning=level
           Set the warning level to level.  Possible values for level are:

           light   light warnings (default)

           all     all warnings


   States program files can contain one start block, startrules and
   namerules blocks to specify the initial state, state definitions and
   top-level expressions.

   The start block is the main() of the states program.  It is executed on
   script startup for each input file and it can perform any
   initialization the script needs.  It normally also calls the
   check_startrules() and check_namerules() primitives which resolve the
   initial state from the input file name or the data found at the
   beginning of the input file.  Here is a sample start block which
   initializes two variables and does the standard start state resolving:

            start {
              a = 1;
              msg = "Hello, world!";
              check_startrules ();
              check_namerules ();
            }

   Once the start block is processed, the input  processing  is  continued
   from the initial state.

   The initial state is resolved by the information found from the
   startrules and namerules blocks.  Both blocks contain regular
   expression - symbol pairs.  When the regular expression is matched
   from the name or from the beginning of the input file, the initial
   state is named by the corresponding symbol.  For example, the
   following start and name rules can distinguish C and Fortran files:

            namerules {
              /\.(c|h)$/    c;
              /\.[fF]$/     fortran;
            }

            startrules {
              /-\*- [cC] -\*-/      c;
              /-\*- fortran -\*-/   fortran;
            }

   If these rules are used with the previously shown start block, states
   first checks the beginning of the input file.  If it has the string
   -*- c -*-, the file is assumed to contain C code and the processing is
   started from the state called c.  If the beginning of the input file
   has the string -*- fortran -*-, the initial state is fortran.  If none
   of the start rules matched, the name of the input file is matched with
   the namerules.  If the name ends with the suffix .c or .h, we go to
   state c.  If the suffix is .f or .F, the initial state is fortran.

   If  both start and name rules failed to resolve the start state, states
   just copies its input to output unmodified.

   The start state can also be specified from the command line with option
   -s, --state.
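
   The resolution order described above can be summarized in a short
   Python sketch.  The function resolve_start_state and its parameters
   are hypothetical names; the regular expressions are the ones from the
   example rules.  How much of the beginning of the file states actually
   inspects is not stated here, so the sketch simply takes the text to
   examine as an argument.

```python
import re

# Rules copied from the example above; the second item is the state symbol.
STARTRULES = [
    (re.compile(r"-\*- [cC] -\*-"), "c"),
    (re.compile(r"-\*- fortran -\*-"), "fortran"),
]
NAMERULES = [
    (re.compile(r"\.(c|h)$"), "c"),
    (re.compile(r"\.[fF]$"), "fortran"),
]

def resolve_start_state(filename, head, override=None):
    """Return the initial state name, or None to copy input unmodified."""
    if override is not None:
        # Models the -s / --state command line option.
        return override
    for pattern, state in STARTRULES:
        # The beginning of the file is checked first.
        if pattern.search(head):
            return state
    for pattern, state in NAMERULES:
        # Then the file name is matched against the name rules.
        if pattern.search(filename):
            return state
    return None

print(resolve_start_state("prog.F", "/* -*- c -*- */"))
```

   In the last call the start rule wins over the name rule, so a file
   named prog.F that announces itself as C is processed from state c.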

   State definitions have the following syntax:

   state { expr {statements} ... }

   where expr is a regular expression, special expression or symbol, and
   statements is a list of statements.  When the expression expr is
   matched from the input, the statement block is executed.  The
   statement block can call states' primitives, user-defined subroutines,
   call other states, etc.  Once the block is executed, the input
   processing is continued from the current input position (which might
   have been changed if the statement block called other states).

   Special expressions BEGIN and END can be used in the place of expr.
   Expression BEGIN matches the beginning of the state; its block is
   called when the state is entered.  Expression END matches the end of
   the state; its block is executed when states leaves the state.

   If expr is a symbol, its value is looked up from the global environment
   and  if  it  is  a  regular  expression,  it  is  matched to the input,
   otherwise that rule is ignored.

   The states program file can also have top-level expressions.  They are
   evaluated after the program file is parsed but before any input files
   are processed or the start block is evaluated.
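
   How a state's rules, its BEGIN and END blocks, and calls to other
   states interact can be modelled with a toy interpreter in Python.  This
   is only a sketch under stated assumptions: the Machine class, the
   Return exception and the dictionary layout are invented for
   illustration, return values of called states are ignored, and all
   patterns are assumed to match at least one character.

```python
import re

class Return(Exception):
    """Raised by an action to leave the current state."""

class Machine:
    def __init__(self, text, states):
        self.text = text
        self.pos = 0
        self.states = states
        self.out = []

    def call(self, name):
        """Enter state `name`; come back to the caller when it returns."""
        state = self.states[name]
        state.get("BEGIN", lambda m: None)(self)  # run on entering the state
        try:
            while self.pos < len(self.text):
                best = None
                for pattern, action in state["rules"]:
                    m = pattern.search(self.text, self.pos)
                    if m and (best is None or m.start() < best[0].start()):
                        best = (m, action)
                if best is None:
                    self.out.append(self.text[self.pos:])
                    self.pos = len(self.text)
                    break
                m, action = best
                self.out.append(self.text[self.pos:m.start()])
                self.pos = m.end()  # a called state continues from here
                action(self, m)
        except Return:
            pass
        state.get("END", lambda m: None)(self)  # run on leaving the state

def enter_string(machine, match):
    machine.call("string")

def leave_string(machine, match):
    machine.out.append(match.group(0))
    raise Return()

STATES = {
    "main": {"rules": [(re.compile(r'"'), enter_string)]},
    "string": {
        "BEGIN": lambda m: m.out.append('<s>"'),
        "END": lambda m: m.out.append("</s>"),
        "rules": [(re.compile(r'"'), leave_string)],
    },
}

machine = Machine('a "b" c', STATES)
machine.call("main")
result = "".join(machine.out)
print(result)
```

   The nested call consumes the input up to the closing quote, so when
   control returns to main the input position has already moved past the
   string.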


   call (symbol)
           Move to state symbol and continue input  file  processing  from
           that  state.   Function  returns  whatever  the  symbol state's
           terminating return statement returned.

   calln (name)
           Like call but the argument name is evaluated and its value must
           be a string.  For example, this function can be used to call a
           state whose name is stored in a variable.

   check_namerules ()
           Try to resolve start  state  from  namerules  rules.   Function
           returns 1 if start state was resolved or 0 otherwise.

   check_startrules ()
           Try  to  resolve  start  state from startrules rules.  Function
           returns 1 if start state was resolved or 0 otherwise.

   concat (str, ...)
           Concatenate argument strings and return result as a new string.

   float (any)
           Convert argument to a floating point number.

   getenv (str)
           Get value of environment variable str.  Returns an empty string
           if variable str is undefined.

   int (any)
           Convert argument to an integer number.

   length (item, ...)
           Count the length of argument strings or lists.

   list (any, ...)
           Create a new list which contains items any, ...

   panic (any, ...)
           Report   a  non-recoverable  error  and  exit  with  status  1.
           Function never returns.

   print (any, ...)
           Convert arguments to strings and print them to the output.

   range (source, start, end)
           Return a sub-range  of  source  starting  from  position  start
           (inclusively)  to  end  (exclusively).   Argument source can be
           string or list.

   regexp (string)
           Convert string string to a new regular expression.

   regexp_syntax (char, syntax)
           Modify regular expression character syntaxes by assigning new
           syntax syntax for character char.  Possible values for syntax
           are:

           'w'     character is a word constituent

           ' '     character isn't a word constituent

   regmatch (string, regexp)
           Check if string string matches regular expression regexp.
           Function returns a boolean success status and sets
           sub-expression registers $n.

   regsub (string, regexp, subst)
           Search regular expression regexp from string string and replace
           the matching substring with string subst.  Returns the
           resulting string.  The substitution string subst can contain $n
           references to the n:th parenthesized sub-expression.

   regsuball (string, regexp, subst)
           Like  regsub  but  replace  all  matches  of regular expression
           regexp from string string with string subst.

   require_state (symbol)
           Check that the state symbol is defined.  If the required state
           is undefined, the function tries to autoload it.  If the
           loading fails, the program will terminate with an error
           message.

   split (regexp, string)
           Split string string into a list, treating matches of regular
           expression regexp as item separators.

   sprintf (fmt, ...)
           Format arguments according to fmt and return result as a
           string.

   strcmp (str1, str2)
           Perform a case-sensitive comparison for strings str1 and str2.
           Function returns a value that is:

           -1      string str1 is less than str2

           0       strings are equal

           1       string str1 is greater than str2

   string (any)
           Convert argument to string.

   strncmp (str1, str2, num)
           Perform a case-sensitive comparison for strings str1 and str2
           comparing at maximum num characters.

   substring (str, start, end)
           Return  a  substring of string str starting from position start
           (inclusively) to end (exclusively).
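
   The semantics of a few of the primitives above can be approximated
   with Python equivalents.  These are analogies, not the states
   implementation: zero-based positions are an assumption, Python writes
   sub-expression references in the substitution string as \1 rather than
   $1, and Python compares strings by code point, which agrees with a
   byte-wise comparison for ASCII text.  The trailing underscore in
   range_ only avoids shadowing Python's built-in range.

```python
import re

def range_(source, start, end):
    """Like range and substring: start is inclusive, end is exclusive."""
    return source[start:end]

def strcmp(str1, str2):
    """Case-sensitive three-way comparison returning -1, 0 or 1."""
    return (str1 > str2) - (str1 < str2)

def strncmp(str1, str2, num):
    """Like strcmp, but compare at most num characters."""
    return strcmp(str1[:num], str2[:num])

def regsub(string, regexp, subst):
    """Replace only the first match of regexp in string."""
    return re.sub(regexp, subst, string, count=1)

def regsuball(string, regexp, subst):
    """Replace every match of regexp in string."""
    return re.sub(regexp, subst, string)

def split(regexp, string):
    """Split string, treating matches of regexp as separators."""
    return re.split(regexp, string)

print(range_("abcdef", 1, 4))
print(regsub("a-b-c", "-", "+"), regsuball("a-b-c", "-", "+"))
print(split(", *", "a, b,c"))
```

   Note the argument order: split takes the regular expression first,
   while regsub and regsuball take the string first.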


   $.      current input line number

   $n      the n:th parenthesized regular expression sub-expression from
           the latest state regular expression or from the regmatch
           primitive

   $`      everything before the matched regular expression.  This is
           usable when used with the regmatch primitive; the contents of
           this variable are undefined when used in action blocks to refer
           to the data before the block's regular expression.

   $B      an alias for $`

   argv    list of input file names

           name of the current input file

   program name of the program (usually states)

   version program version string


   /usr/share/enscript/hl/*.st             enscript's states definitions


   awk(1), enscript(1)


   Markku Rossi <> <>

   GNU Enscript WWW home page: <>

