hunspell(3)


NAME

   hunspell  -  spell  checking,  stemming,  morphological  generation and
   analysis

SYNOPSIS

   #include <hunspell.hxx> /* or */
   #include <hunspell.h>

   Hunspell(const char *affpath, const char *dpath);

   Hunspell(const char *affpath, const char *dpath, const char * key);

   ~Hunspell();

   int add_dic(const char *dpath);

   int add_dic(const char *dpath, const char *key);

   int spell(const char *word);

   int spell(const char *word, int *info, char **root);

   int suggest(char***slst, const char *word);

   int analyze(char***slst, const char *word);

   int stem(char***slst, const char *word);

   int stem(char***slst, char **morph, int n);

   int generate(char***slst, const char *word, const char *word2);

   int generate(char***slst, const char *word, char **desc, int n);

   void free_list(char ***slst, int n);

   int add(const char *word);

   int add_with_affix(const char *word, const char *example);

   int remove(const char *word);

   char * get_dic_encoding();

   const char * get_wordchars();

   unsigned short * get_wordchars_utf16(int *len);

   struct cs_info * get_csconv();

   const char * get_version();

DESCRIPTION

   The Hunspell library  routines  give  the  user  word-level  linguistic
   functions:  spell  checking  and  correction,  stemming,  morphological
   generation and analysis in item-and-arrangement style.

   The optional C header declares the C interface of the C++ library. It
   provides Hunspell_create and Hunspell_destroy in place of the
   constructor and destructor, and each wrapper function takes an extra
   HunHandle parameter (the allocated object). See the C header file
   hunspell.h.

   The basic spelling functions, spell() and suggest(), can also be used
   for stemming, morphological generation and analysis when they are given
   XML input (see XML API).

   Constructor and destructor
   Hunspell's constructor needs the paths of the affix and dictionary
   files. (In a WIN32 environment, use UTF-8 encoded paths that start with
   the long path prefix \\?\ to handle system-independent character
   encoding and very long path names.) See the hunspell(4) manual page for
   the dictionary format. The optional key parameter is for dictionaries
   encrypted by the hzip tool of the Hunspell distribution.
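
   The WIN32 note above can be illustrated with a small helper. This is a
   minimal sketch: with_long_path_prefix() is a hypothetical name, not part
   of the Hunspell API, and it assumes the caller already has an absolute,
   backslash-separated, UTF-8 encoded path.

```cpp
#include <string>

// Hypothetical helper: prepend the long path prefix \\?\ unless the
// path already starts with it. The path is assumed to be absolute,
// backslash-separated and UTF-8 encoded.
std::string with_long_path_prefix(const std::string &utf8_path) {
    const std::string prefix = "\\\\?\\";
    if (utf8_path.compare(0, prefix.size(), prefix) == 0) {
        return utf8_path;
    }
    return prefix + utf8_path;
}
```

   The resulting strings could then be passed as affpath and dpath to the
   constructor.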

   Extra dictionaries
   The add_dic() function loads an extra dictionary file. The extra
   dictionaries use the affix file of the allocated Hunspell object. The
   maximum number of extra dictionaries is limited in the source code
   (20).

   Spelling and correction
   The spell() function returns non-zero if the input word is recognised
   by the spell checker, and zero if not. Optional reference variables
   return a bit array (info) and the root word of the input word. Info
   bits checked with the SPELL_COMPOUND, SPELL_FORBIDDEN or SPELL_WARN
   macros indicate compound words, explicitly forbidden words and probably
   bad words. From version 1.3, the non-zero return value is 2 for
   dictionary words with the flag "WARN" (probably bad words).
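
   The return value and the info bits might be interpreted as in the
   following sketch. The helper name describe_spell_result() and the three
   constants are assumptions made for illustration; their numeric values
   are placeholders, and real code should test the bits with the
   SPELL_COMPOUND, SPELL_FORBIDDEN and SPELL_WARN macros from the header.
   The sketch also assumes that a forbidden word is reported with a zero
   return value.

```cpp
#include <string>

// Placeholder bit values, assumed for this sketch only. Real code should
// include the Hunspell header and use its SPELL_COMPOUND, SPELL_FORBIDDEN
// and SPELL_WARN macros instead of these constants.
const int kCompound  = 1 << 0;
const int kForbidden = 1 << 1;
const int kWarn      = 1 << 6;

// Turn the return value and info bits of spell() into a readable label.
std::string describe_spell_result(int ret, int info) {
    if (ret == 0) {
        return (info & kForbidden) ? "rejected (forbidden)" : "rejected";
    }
    std::string label =
        (ret == 2 || (info & kWarn)) ? "accepted (warn)" : "accepted";
    if (info & kCompound) {
        label += " (compound)";
    }
    return label;
}
```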

   The suggest() function has two parameters: a reference variable for
   the output suggestion list, and an input word. The function returns
   the number of suggestions. The reference variable will contain the
   address of the newly allocated suggestion list, or NULL if the return
   value of suggest() is zero. The maximum number of suggestions is
   limited in the source code.

   The spell() and suggest() functions also recognize XML input; see the
   XML API section.

   Morphological functions
   The plain stem() and analyze() functions are similar to suggest(), but
   instead of suggestions they return stems and the results of the
   morphological analysis. The plain generate() expects a second word,
   too. This extra word and its affixation serve as the model for the
   morphological generation of the requested forms of the first word.

   The  extended  stem() and generate() use the results of a morphological
   analysis:

          char **result, **result2; // both variables must be char **
          int n1 = analyze(&result, "words");
          int n2 = stem(&result2, result, n1);

   The morphological annotation of the Hunspell library uses fixed field
   identifiers (two letters and a colon); see the hunspell(4) manual page.

          char ** result;
          char affix[] = "is:plural"; // the description depends on the dictionary, too
          char * desc = affix;        // generate() takes an array of char *
          int n = generate(&result, "word", &desc, 1);
          for (int i = 0; i < n; i++) printf("%s\n", result[i]);
          free_list(&result, n);
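
   Because every field starts with a two-letter identifier and a colon, one
   line of analysis output can be split into identifier and value pairs.
   The sketch below assumes that fields are separated by whitespace;
   parse_morph() is a hypothetical helper, not a library function, and the
   field values in the usage note are made up for illustration.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Split one analysis line such as "st:word is:plural" into its fields.
// Tokens that lack the two-letter-and-colon prefix are skipped. A
// multimap is used because an identifier may occur more than once.
std::multimap<std::string, std::string> parse_morph(const std::string &line) {
    std::multimap<std::string, std::string> fields;
    std::istringstream in(line);
    std::string token;
    while (in >> token) {
        if (token.size() >= 3 && token[2] == ':') {
            fields.insert(std::make_pair(token.substr(0, 2), token.substr(3)));
        }
    }
    return fields;
}
```

   For example, parse_morph(" st:word po:noun is:plural") yields three
   entries keyed by "st", "po" and "is".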

   Memory deallocation
   The free_list() function frees the memory allocated by the suggest(),
   analyze(), generate() and stem() functions.

   Other functions
   The add(), add_with_affix() and remove() functions are helpers for
   implementing a personal dictionary: they add words to and remove words
   from the base dictionary at run time. The add_with_affix() function
   uses a second word as a model of the enabled affixation of the new
   word.

   The  get_dic_encoding()  function  returns "ISO8859-1" or the character
   encoding defined in the affix file with the "SET" keyword.

   The get_csconv() function returns the 8-bit character case table of the
   encoding of the dictionary.

   The get_wordchars() and get_wordchars_utf16() functions return the
   extra word characters defined in the affix file for tokenization by the
   "WORDCHARS" keyword.
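
   The role of the extra word characters in tokenization can be sketched as
   follows. This is a simplified illustration for 8-bit encodings, not the
   tokenizer of the Hunspell distribution; tokenize() is a hypothetical
   helper, and its wordchars argument stands for a string such as the one
   returned by get_wordchars().

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split text into words. A byte belongs to a word if it is alphanumeric
// in the current C locale or if it appears in wordchars.
std::vector<std::string> tokenize(const std::string &text,
                                  const std::string &wordchars) {
    std::vector<std::string> words;
    std::string current;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isalnum(c) || wordchars.find(text[i]) != std::string::npos) {
            current += text[i];
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        words.push_back(current);
    }
    return words;
}
```

   With an apostrophe in wordchars, "don't" stays one token; without it,
   the same input is split into "don" and "t".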

   The get_version() function returns the version string of the library.

   XML API
   The   spell()   function  returns  non-zero  for  the  "<?xml?>"  input
   indicating the XML API support.

   The suggest() function stems, analyzes and generates the forms of the
   input word if the input uses one of the following "SPELLML" syntaxes:

          <?xml?>
          <query type="analyze">
          <word>dogs</word>
          </query>

          <?xml?>
          <query type="stem">
          <word>dogs</word>
          </query>

          <?xml?>
          <query type="generate">
          <word>dog</word>
          <word>cats</word>
          </query>

          <?xml?>
          <query type="generate">
          <word>dog</word>
          <code><a>is:pl</a><a>is:poss</a></code>
          </query>

   The output of the type="stem" query is the same as that of the stem()
   library function. The output of the type="analyze" query is a string
   containing a <code><a>result1</a><a>result2</a>...</code> element. This
   element can be used in the second syntax of the type="generate" query.
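
   The queries above are plain strings, so they can be assembled and their
   results unpacked with ordinary string handling. The two helpers below
   are hypothetical and only a sketch: spellml_query() builds the
   single-word form used by the "analyze" and "stem" queries without
   escaping the word, and spellml_results() pulls the <a> elements out of
   an analyze result.

```cpp
#include <string>
#include <vector>

// Build a SPELLML query with a single <word> element, laid out like the
// "analyze" and "stem" examples. The word is inserted without escaping.
std::string spellml_query(const std::string &type, const std::string &word) {
    return "<?xml?>\n<query type=\"" + type + "\">\n<word>" + word +
           "</word>\n</query>";
}

// Extract the text of every <a>...</a> element from an analyze result.
std::vector<std::string> spellml_results(const std::string &xml) {
    std::vector<std::string> out;
    std::string::size_type pos = 0;
    while ((pos = xml.find("<a>", pos)) != std::string::npos) {
        pos += 3;  // skip past "<a>"
        std::string::size_type end = xml.find("</a>", pos);
        if (end == std::string::npos) {
            break;  // unterminated element: stop
        }
        out.push_back(xml.substr(pos, end - pos));
        pos = end + 4;  // skip past "</a>"
    }
    return out;
}
```

   The string from spellml_query("analyze", "dogs") would be passed as the
   word argument of suggest().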

EXAMPLE

   See analyze.cxx in the Hunspell distribution.

AUTHORS

   Hunspell is based on Ispell's spell checking algorithms and
   OpenOffice.org's MySpell source code.

   Author of International Ispell is Geoff Kuenning.

   Author of MySpell is Kevin Hendricks.

   Author of Hunspell is László Németh.

   Author of the original C API is Caolan McNamara.

   Author of the Aspell table-driven phonetic transcription algorithm and
   code is Björn Jacke.

   See also the THANKS and Changelog files of the Hunspell distribution.

                              2014-05-26                       hunspell(3)




