spectrum1d(1gmt)


NAME

   spectrum1d  -  Compute  auto-  [and  cross- ] spectra from one [or two]
   time-series

SYNOPSIS

   spectrum1d [ table ] -Ssegment_size [ -C[xycnpago] ] [ -Ddt ]
   [ -L[h|m] ] [ -N[name_stem] ] [ -T ] [ -V[level] ] [ -W ]
   [ -bbinary ] [ -dnodata ] [ -fflags ] [ -ggaps ] [ -hheaders ]
   [ -iflags ]

   Note: No space is allowed between the option flag  and  the  associated
   arguments.

DESCRIPTION

   spectrum1d  reads  X [and Y] values from the first [and second] columns
   on standard input [or x[y]file]. These values are treated as time-series
   X(t) [Y(t)] sampled at equal intervals spaced dt units apart. There may
   be any number  of  lines  of  input.  spectrum1d  will  create  file[s]
   containing  auto-  [and  cross- ] spectral density estimates by Welch's
   method of ensemble averaging  of  multiple  overlapped  windows,  using
   standard error estimates from Bendat and Piersol.

   The  output  files  have  3  columns:  f  or w, p, and e. f or w is the
   frequency or wavelength, p is the spectral density estimate, and  e  is
   the  one standard deviation error bar size. These files are named based
   on name_stem. If the -C option is used, up to eight files are  created;
   otherwise  only  one  (xpower)  is  written. The files (which are ASCII
   unless -bo is set) are as follows:

   name_stem.xpower
          Power spectral density of X(t). Units of X * X * dt.

   name_stem.ypower
          Power spectral density of Y(t). Units of Y * Y * dt.

   name_stem.cpower
          Power spectral density of the coherent  output.  Units  same  as
          ypower.

   name_stem.npower
          Power  spectral  density  of  the  noise  output.  Units same as
          ypower.

   name_stem.gain
          Gain spectrum, or modulus of the transfer function. Units of  (Y
          / X).

   name_stem.phase
          Phase  spectrum,  or  phase  of the transfer function. Units are
          radians.

   name_stem.admit
          Admittance spectrum, or real  part  of  the  transfer  function.
          Units of (Y / X).

   name_stem.coh
          (Squared)  coherency spectrum, or linear correlation coefficient
          as a function of frequency. Dimensionless number in [0, 1].  The
          Signal-to-Noise-Ratio (SNR) is coh / (1 - coh). SNR = 1 when coh
          = 0.5.

   In addition, a single file with all of the above as individual  columns
   will be written to stdout (unless disabled via -T).

REQUIRED ARGUMENTS

    -Ssegment_size
          segment_size  is  a  radix-2  number  of  samples per window for
          ensemble  averaging.  The  smallest   frequency   estimated   is
          1.0/(segment_size  * dt), while the largest is 1.0/(2 * dt). One
          standard error in power spectral density is approximately 1.0  /
          sqrt(n_data  / segment_size), so if segment_size = 256, you need
          25,600  data  to  get  a  one  standard  error   bar   of   10%.
          Cross-spectral error bars are larger and more complicated, being
          a function also of the coherency.
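
   For instance, the numbers quoted above follow directly. A minimal
   Python check (the values are illustrative, not defaults):

           import math

           segment_size = 256   # -S: samples per window (a power of 2)
           dt = 1.0             # -D: sample spacing
           n_data = 25600       # total number of samples in the record

           f_min = 1.0 / (segment_size * dt)  # smallest frequency
           f_max = 1.0 / (2.0 * dt)           # largest (Nyquist) frequency
           err = 1.0 / math.sqrt(n_data / segment_size)

           print(f_min, f_max, err)           # err = 0.1, i.e. 10%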

OPTIONAL ARGUMENTS

   table  One or more ASCII (or binary, see -bi) files holding X(t) [Y(t)]
          samples  in  the  first  1  [or  2]  columns.  If  no  files are
          specified, spectrum1d will read from standard input.

   -C[xycnpago]
          Read  the  first  two  columns  of  input  as  samples  of   two
          time-series,  X(t)  and Y(t). Consider Y(t) to be the output and
          X(t) the input in a  linear  system  with  noise.  Estimate  the
          optimum  frequency response function by least squares, such that
          the noise output is minimized and the coherent  output  and  the
          noise  output  are  uncorrelated.   Optionally  specify  up to 8
          letters from the set { x y c n p a g o } in any order to  create
          only  those  output  files  instead  of  the  default [all]. x =
          xpower, y = ypower, c = cpower, n =  npower,  p  =  phase,  a  =
           admit, g = gain, o = coh. (A sketch of how these quantities
           are formed from the power- and cross-spectra appears after
           this options list.)

    -Ddt   Set the spacing between samples in the time-series [Default =
           1].

   -L     Leave trend alone. By default, a linear trend  will  be  removed
          prior  to  the transform. Alternatively, append m to just remove
          the mean value or h to remove the mid-value.

   -N[name_stem]
          Supply an alternate name  stem  to  be  used  for  output  files
          [Default  =  "spectrum"].   Giving  no argument will disable the
          writing of individual output files.

    -T     Disable the writing of a single composite results file to
           stdout.

    -V[level] (more ...)
           Select verbosity level [c].

    -W     Write wavelength rather than frequency in column 1 of the output
          file[s] [Default = frequency, (cycles / dt)].

    -bi[ncols][type] (more ...)
          Select native binary input. [Default is 2 input columns].

   -bo[ncols][type] (more ...)
          Select native binary output. [Default is 2 output columns].

   -d[i|o]nodata (more ...)
          Replace input columns that equal nodata  with  NaN  and  do  the
          reverse on output.

   -f[i|o]colinfo (more ...)
          Specify data types of input and/or output columns.

   -g[a]x|y|d|X|Y|D|[col]z[+|-]gap[u] (more ...)
          Determine data gaps and line breaks.

    -h[i|o][n][+c][+d][+rremark][+ttitle] (more ...)
          Skip or produce header record(s).

   -icols[l][sscale][ooffset][,...] (more ...)
          Select input columns (0 is first column).

    -^ or just -
           Print a short message about the syntax of the command, then
           exit (NOTE: on Windows use just -).

    -+ or just +
           Print an extensive usage (help) message, including the
           explanation of any module-specific option (but not the GMT
           common options), then exit.

    -? or no arguments
           Print a complete usage (help) message, including the
           explanation of options, then exit.
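
   As promised under -C, here is a sketch of how the cross-spectral
   outputs relate to the power spectra and cross-spectrum. It uses
   SciPy as a stand-in for GMT's internals, with made-up input and
   output series x and y (an illustration of the definitions only):

           import numpy as np
           from scipy import signal

           rng = np.random.default_rng(0)
           fs, n = 1.0, 256                 # assumed sample rate and -S
           x = rng.standard_normal(4096)    # input series X(t)
           y = 0.5 * x + 0.1 * rng.standard_normal(4096)  # output Y(t)

           f, pxx = signal.welch(x, fs=fs, nperseg=n)
           _, pyy = signal.welch(y, fs=fs, nperseg=n)
           _, pxy = signal.csd(x, y, fs=fs, nperseg=n)

           gain = np.abs(pxy) / pxx             # g: transfer modulus
           admit = pxy.real / pxx               # a: admittance
           phase = np.angle(pxy)                # p: phase, radians
           coh = np.abs(pxy) ** 2 / (pxx * pyy) # o: squared coherency
           cpower = coh * pyy                   # c: coherent power
           npower = (1.0 - coh) * pyy           # n: noise power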

ASCII FORMAT PRECISION

   The ASCII output formats of numerical data are controlled by parameters
   in your gmt.conf file. Longitude and latitude are  formatted  according
   to  FORMAT_GEO_OUT,  whereas  other  values  are formatted according to
   FORMAT_FLOAT_OUT. Be aware that the format in effect can lead  to  loss
   of  precision  in  the  output,  which  can  lead  to  various problems
   downstream.  If  you  find  the  output  is  not  written  with  enough
   precision,  consider  switching  to binary output (-bo if available) or
   specify more decimals using the FORMAT_FLOAT_OUT setting.

EXAMPLES

   Suppose data.g is gravity data in mGal, sampled every 1.5 km. To  write
   its power spectrum, in mGal**2-km, to the file data.xpower, use

          gmt spectrum1d data.g -S256 -D1.5 -Ndata

   Suppose  in  addition to data.g you have data.t, which is topography in
   meters sampled at the  same  points  as  data.g.  To  estimate  various
   features  of  the  transfer  function,  considering data.t as input and
   data.g as output, use

          paste data.t data.g | gmt spectrum1d -S256 -D1.5 -Ndata -C > results.txt

TUTORIAL

   The output of spectrum1d is in units of power spectral density, so
   to get units of data-squared you must divide by delta_t, where
   delta_t is the sample spacing. (There may also be a factor of 2 pi
   somewhere. If you want to be sure of the normalization, you can
   determine a scale factor from Parseval's theorem: if you are simply
   trying to obtain a periodogram, the sum of the squares of your input
   data should equal the sum of the squares of the outputs from
   spectrum1d. [See below.])
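
   A quick empirical way to pin down the normalization is to integrate
   the estimated density over frequency and compare the result with the
   variance of the input. A minimal Python sketch, using SciPy's Welch
   estimator as a stand-in for spectrum1d (the sample spacing and sizes
   here are assumed, not mandated):

           import numpy as np
           from scipy import signal

           rng = np.random.default_rng(0)
           dt = 1.5
           x = rng.standard_normal(25600)

           f, pxx = signal.welch(x, fs=1.0/dt, window="hann",
                                 nperseg=256, noverlap=128,
                                 detrend="linear", scaling="density")

           # Integrating the density over frequency should roughly
           # recover the variance of the input (Parseval's theorem).
           print((pxx * (f[1] - f[0])).sum(), x.var())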

   Suppose  we  simply  take  a  data  set, x(t), and compute the discrete
   Fourier transform (DFT) of the entire data set in  one  go.  Call  this
   X(f). Then suppose we form X(f) times the complex conjugate of X(f).

   P_raw(f) = X(f) * X'(f), where the ' indicates complex conjugation.

   P_raw  is  called  the  periodogram.  The  sum  of  the  samples of the
   periodogram equals the sum of the samples of the squares  of  x(t),  by
   Parseval's theorem. (If you use a DFT subroutine on a computer, usually
   the sum of P_raw equals the sum of x-squared, times M, where M  is  the
   number of samples in x(t).)
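
   For instance, with NumPy's unnormalized DFT the extra factor of M
   shows up exactly as described (a small sketch, not GMT code):

           import numpy as np

           rng = np.random.default_rng(1)
           M = 1024
           x = rng.standard_normal(M)

           X = np.fft.fft(x)               # DFT of the whole record
           p_raw = (X * X.conj()).real     # the raw periodogram

           # Parseval in numpy's convention: sum(P_raw) = M * sum(x**2)
           print(p_raw.sum(), M * (x ** 2).sum())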

   Each estimate of X(f) is now formed by a weighted linear combination of
   all of the x(t) values. (The  weights  are  sometimes  called  "twiddle
   factors"  in  the  DFT literature.)  So, no matter what the probability
   distribution for the x(t) values is, the probability  distribution  for
   the  X(f)  values  approaches  [complex] Gaussian, by the Central Limit
   Theorem. This means that  the  probability  distribution  for  P_raw(f)
   approaches  chi-squared with two degrees of freedom. That reduces to an
   exponential distribution, and the variance of the estimate of P_raw  is
   proportional  to the square of the mean, that is, the expected value of
   P_raw.
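
   You can see this numerically: for Gaussian input, a single
   periodogram bin has a standard deviation about equal to its mean,
   as an exponential distribution requires. A small Monte Carlo sketch
   (the sizes are arbitrary):

           import numpy as np

           rng = np.random.default_rng(2)
           trials, M = 5000, 256
           x = rng.standard_normal((trials, M))
           X = np.fft.fft(x, axis=1)
           p = np.abs(X[:, 5]) ** 2       # one interior frequency bin

           # Exponential: the standard deviation equals the mean, so
           # the variance goes as the square of the expected value.
           print(p.mean(), p.std())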

   In practice if we form P_raw, the estimates are hopelessly noisy.  Thus
   P_raw  is  not  useful,  and  we  need  to do some kind of smoothing or
   averaging to get a useful estimate, P_useful(f).

   There are several different ways to do this in the literature.  One  is
   to   form   P_raw   and   then  smooth  it.  Another  is  to  form  the
   auto-covariance function of x(t), smooth, taper and shape it, and  then
   take  the  Fourier  transform  of  the  smoothed,  tapered  and  shaped
   auto-covariance.  Another  is  to  form  a  parametric  model  for  the
   auto-correlation  structure  in x(t), then compute the spectrum of that
   model. This last approach is  what  is  done  in  what  is  called  the
   "maximum  entropy"  or  "Berg"  or  "Box-Jenkins"  or "ARMA" or "ARIMA"
   methods.

   Welch's method is a tried-and-true method. In his method, you choose  a
   segment  length,  -SN,  so that estimates will be made from segments of
   length N. The frequency samples (in cycles per delta_t  unit)  of  your
   P_useful  will  then be at k /(N * delta_t), where k is an integer, and
   you will get N samples (since the spectrum is an even  function  of  f,
   only  N/2 of them are really useful). If the length of your entire data
   set, x(t), is M samples long, then the variance in your  P_useful  will
   decrease  in  proportion  to N/M. Thus you need to choose N << M to get
   very low noise and high confidence in P_useful. There  is  a  trade-off
   here; see below.
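
   A quick way to see the trade-off is to hold the record length M
   fixed and vary the segment length N. A sketch using SciPy's
   implementation of Welch's method (an illustration only, not GMT's
   code):

           import numpy as np
           from scipy import signal

           rng = np.random.default_rng(3)
           dt, M = 1.5, 25600
           x = rng.standard_normal(M)

           for N in (256, 4096):          # N << M versus N nearer M
               f, p = signal.welch(x, fs=1.0/dt, window="hann",
                                   nperseg=N, noverlap=N//2)
               # The fundamental frequency is 1/(N*dt); the scatter of
               # the estimates grows as N approaches M.
               print(N, f[1], p[1:-1].std() / p[1:-1].mean())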

   There  is  an  additional  reduction in variance in that Welch's method
   uses a Von Hann spectral window  on  each  sample  of  length  N.  This
   reduces  side  lobe  leakage  and  has  the  effect of smoothing the (N
   segment) periodogram as if the X(f) had been convolved with [1/4,  1/2,
   1/4]  prior  to forming P_useful. But this slightly widens the spectral
   bandwidth of each estimate, because the estimate at frequency sample  k
   is  now  a little correlated with the estimate at frequency sample k+1.
   (Of course this would also happen if you simply formed P_raw  and  then
   smoothed it.)
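
   The equivalence between time-domain tapering and frequency-domain
   smoothing is easy to verify. A sketch (note the sign convention:
   the exact weights on the complex X(f) are [-1/4, 1/2, -1/4], whose
   magnitudes are the [1/4, 1/2, 1/4] quoted above):

           import numpy as np

           rng = np.random.default_rng(4)
           N = 256
           x = rng.standard_normal(N)
           X = np.fft.fft(x)

           # Von Hann taper in time = circular convolution in frequency
           h = 0.5 * (1.0 - np.cos(2.0 * np.pi * np.arange(N) / N))
           lhs = np.fft.fft(x * h)
           rhs = 0.5 * X - 0.25 * np.roll(X, 1) - 0.25 * np.roll(X, -1)
           print(np.allclose(lhs, rhs))   # True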

   Finally,  Welch's method also uses overlapped processing. Since the Von
   Hann window is large in the middle and tapers to near zero at the ends,
   only  the  middle  of  the  segment of length N contributes much to its
   estimate. Therefore in taking the next segment of data, we  move  ahead
   in  the  x(t)  sequence  only N/2 points. In this way, the next segment
   gets large weight where the segments on either  side  of  it  will  get
   little  weight,  and  vice versa. This doubles the smoothing effect and
   ensures that (if N << M) nearly every point in  x(t)  contributes  with
   nearly equal weight in the final answer.
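
   Put together, the scheme is simple enough to sketch by hand. The
   following bare-bones Python illustrates hann-windowed, 50%
   overlapped ensemble averaging (demeaning and the one-sided scaling
   details are glossed over, and this is not the actual GMT source):

           import numpy as np

           def welch_sketch(x, n, dt):
               w = np.hanning(n)                 # Von Hann taper
               norm = (w ** 2).sum() / dt        # density normalization
               segs = []
               # advance by n/2 samples so segments overlap by half
               for start in range(0, len(x) - n + 1, n // 2):
                   seg = x[start:start + n] * w  # taper each segment
                   segs.append(np.abs(np.fft.rfft(seg)) ** 2 / norm)
               return np.mean(segs, axis=0)      # ensemble average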

   Welch's  method  of spectral estimation has been widely used and widely
   studied. It is very reliable and its statistical  properties  are  well
   understood. It is highly recommended in such textbooks as "Random Data:
   Analysis and Measurement Procedures" by Bendat and Piersol.

   In all problems of estimating parameters from data, there is a  classic
   trade-off  between  resolution  and  variance.  If  you  want to try to
   squeeze more resolution out of your data  set,  then  you  have  to  be
   willing  to  accept  more noise in the estimates. The same trade-off is
   evident here in Welch's method. If you want to have very low  noise  in
   the  spectral estimates, then you have to choose N << M, and this means
   that you get only N samples of the spectrum,  and  the  longest  period
   that you can resolve is only N * delta_t.  So you see that reducing the
   noise lowers the number of spectral  samples  and  lowers  the  longest
   period.  Conversely,  if  you choose N approaching M, then you approach
   the periodogram with its very bad statistical properties, but  you  get
   lots of samples and a large fundamental period.

   The  other  spectral estimation methods also can do a good job. Welch's
   method was selected because the way it works, how one can code it,  and
   its   effects   on  statistical  distributions,  resolution,  side-lobe
   leakage, bias, variance, etc. are all easily understood.  Some  of  the
   other  methods  (e.g. Maximum Entropy) tend to hide where some of these
   trade-offs are happening inside a "black box".

SEE ALSO

   gmt, grdfft

REFERENCES

   Bendat, J. S., and A. G. Piersol, 1986, Random Data, 2nd  revised  ed.,
   John Wiley & Sons.

   Welch, P. D., 1967, The use of the fast Fourier transform for the
   estimation of power spectra: a method based on time averaging over
   short, modified periodograms, IEEE Transactions on Audio and
   Electroacoustics, Vol. AU-15, No. 2.

COPYRIGHT

   2016, P. Wessel, W. H. F. Smith, R. Scharroo, J. Luis, and F. Wobbe




