compseq(1e)


NAME

   compseq - Calculate the composition of unique words in sequences

SYNOPSIS

   compseq -sequence seqall [-infile infile] -word integer
           [-frame integer] -ignorebz boolean -reverse boolean
           [-calcfreq boolean] -outfile outfile [-zerocount boolean]

   compseq -help

DESCRIPTION

   compseq is a command line program from EMBOSS ("the European Molecular
   Biology Open Software Suite"). It is part of the
   "Nucleic:Composition,Protein:Composition" command group(s).

OPTIONS

   Input section
   -sequence seqall

   -infile infile
       This is a file previously produced by 'compseq' that can be used to
       set the expected frequencies of words in this analysis. The word
       size in the current run must be the same as the one in this results
       file. Use a file produced from protein sequences if you are
       counting protein sequence word frequencies, and one produced from
       nucleotide sequences if you are analysing a nucleotide sequence.

   Required section
   -word integer
       This is the size of word (n-mer) to count. Thus if you want to
       count codon frequencies for a nucleotide sequence, you should enter
       3 here. Default value: 2
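
       The counting described above can be sketched in Python. This is
       an illustration of the idea only, not the EMBOSS implementation;
       the function name count_words is hypothetical.

```python
from collections import Counter


def count_words(seq: str, word: int = 2) -> Counter:
    """Count every overlapping word of length `word` in `seq`.

    The window advances one position at a time, so consecutive
    words overlap by `word - 1` characters.
    """
    seq = seq.upper()
    # range() is empty when the sequence is shorter than the word size.
    return Counter(seq[i:i + word] for i in range(len(seq) - word + 1))


counts = count_words("ACGTACGT", word=3)
```

       With a word size of 3 the eight-base example yields six
       overlapping words, two of which occur twice.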

   Additional section
   -frame integer
       By default, 'compseq' counts every word found by moving a window
       of length 'word' along the sequence one position at a time. Set
       this option to a number other than zero to count only the words
       in a single frame: the window then moves up by the length of the
       word each time, skipping over the intervening words. A value of 1
       counts only the words in frame 1, 2 counts only the words in
       frame 2, and so on.
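
       A minimal Python sketch of frame-restricted counting follows. It
       assumes that frame f starts at the f-th character of the sequence
       (0-based offset f - 1); that mapping and the function name
       count_words_in_frame are assumptions made for illustration, not
       taken from the EMBOSS source.

```python
from collections import Counter


def count_words_in_frame(seq: str, word: int = 3, frame: int = 0) -> Counter:
    """Count words of length `word`, optionally in a single frame.

    frame == 0: slide the window one position at a time (all frames).
    frame >= 1: start at offset frame - 1 (assumed mapping) and step
                by `word`, so only non-overlapping words are counted.
    """
    seq = seq.upper()
    last_start = len(seq) - word + 1
    if frame == 0:
        starts = range(0, last_start)
    else:
        starts = range(frame - 1, last_start, word)
    return Counter(seq[i:i + word] for i in starts)


frame1 = count_words_in_frame("ATGGCCTAA", word=3, frame=1)
frame2 = count_words_in_frame("ATGGCCTAA", word=3, frame=2)
```

       Under these assumptions frame 1 of the example holds the words
       ATG, GCC and TAA, while frame 2 holds TGG and CCT.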

   -ignorebz boolean
       The amino acid code B represents Asparagine or Aspartic acid and
       the code Z represents Glutamine or Glutamic acid. These are not
       commonly used codes and you may wish not to count words containing
       them, just noting them in the count of 'Other' words. Default
       value: Y
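
       The effect of this option can be sketched in Python as below.
       The sketch only handles the codes B and Z; how 'compseq' treats
       other non-standard characters is not covered here, and the
       function name count_words_ignoring_bz is hypothetical.

```python
from collections import Counter


def count_words_ignoring_bz(seq: str, word: int = 2, ignorebz: bool = True):
    """Return (counts, other) for overlapping words of length `word`.

    When `ignorebz` is true, any word containing B or Z is not counted
    individually; it only increments the `other` tally.
    """
    seq = seq.upper()
    counts = Counter()
    other = 0
    for i in range(len(seq) - word + 1):
        w = seq[i:i + word]
        if ignorebz and ("B" in w or "Z" in w):
            other += 1
        else:
            counts[w] += 1
    return counts, other


counts_bz, other_bz = count_words_ignoring_bz("MKBAL", word=2)
```

       In the example the words KB and BA go to the 'Other' tally and
       only MK and AL are counted individually.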

   -reverse boolean
       Set this to be true if you also wish to count words in the
       reverse complement of a nucleic sequence. Default value: N
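
       As a rough Python illustration, the reverse complement can be
       built and its words added to the forward-strand counts. Merging
       both strands into one table is an assumption of this sketch, and
       the helper names are hypothetical.

```python
from collections import Counter

# Complement table for the four unambiguous bases only; any other
# character (for example N) is left unchanged by str.translate.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the whole sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_both_strands(seq: str, word: int = 2) -> Counter:
    """Count overlapping words on the forward and reverse strands."""
    counts = Counter()
    for strand in (seq.upper(), reverse_complement(seq)):
        counts.update(
            strand[i:i + word] for i in range(len(strand) - word + 1)
        )
    return counts


both = count_both_strands("AACG", word=2)
```

       For the sequence AACG the reverse complement is CGTT, so the
       word CG is seen once on each strand.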

   -calcfreq boolean
       If this is set true then the expected frequencies of words are
       calculated from the observed frequency of single bases or residues
       in the sequences. If you are reporting a word size of 1 (single
       bases or residues) then there is no point in using this option
       because the calculated expected frequency will be equal to the
       observed frequency. Calculating the expected frequencies like this
       will give an approximation of the expected frequencies that you
       might get by using an input file of frequencies produced by a
       previous run of this program. If an input file of expected word
       frequencies has been specified then the values from that file will
       be used instead of this calculation of expected frequency from the
       sequence, even if 'calcfreq' is set to be true. Default value: N
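
       One way to read this calculation is that the expected frequency
       of a word is the product of the observed frequencies of its
       individual bases or residues. The Python sketch below assumes
       that reading; it is not taken from the EMBOSS source, and the
       function name expected_word_frequency is hypothetical.

```python
from collections import Counter
from math import prod


def expected_word_frequency(seq: str, word: str) -> float:
    """Estimate the expected frequency of `word` from single-character
    frequencies observed in `seq`, treating positions as independent."""
    seq = seq.upper()
    total = len(seq)
    single = {ch: n / total for ch, n in Counter(seq).items()}
    # A character never seen in the sequence contributes frequency 0.
    return prod(single.get(ch, 0.0) for ch in word.upper())


expected_ac = expected_word_frequency("AACG", "AC")
expected_a = expected_word_frequency("AACG", "A")
```

       For a word size of 1 the product has a single factor, so the
       expected value equals the observed frequency, which is why the
       option is of no use in that case.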

   Output section
   -outfile outfile
       This is the results file.

   -zerocount boolean
       Set this to false to omit words with a zero count, which can make
       the output results file much smaller. Default value: Y

BUGS

   Bugs can be reported to the Debian Bug Tracking system
   (http://bugs.debian.org/emboss), or directly to the EMBOSS developers
   (http://sourceforge.net/tracker/?group_id=93650&atid=605031).

SEE ALSO

   compseq is fully documented via the tfm(1) system.

AUTHOR

   Debian Med Packaging Team
   <debian-med-packaging@lists.alioth.debian.org>
       Wrote the script used to autogenerate this manual page.

COPYRIGHT

   This manual page was autogenerated from an AJAX Command Definition
   (ACD) of the EMBOSS package. It can be redistributed under the same
   terms as EMBOSS itself.




