;+
; NAME:
;   MPFIT4GAUSS
; Has 14 parameters: 4 x 3 gaussian, constant, linear slope
; Use nterms=13 to not fit slope
;
; AUTHOR:
;   Craig B. Markwardt, NASA/GSFC Code 662, Greenbelt, MD 20770
;   craigm@lheamail.gsfc.nasa.gov
;   UPDATED VERSIONs can be found on my WEB PAGE: 
;      http://cow.physics.wisc.edu/~craigm/idl/idl.html
;
; PURPOSE:
;   Fits 4 gaussian models to data
;
; MAJOR TOPICS:
;   Curve and Surface Fitting
;
; CALLING SEQUENCE:
;   yfit = MPFITPEAK(X, Y, A, NTERMS=nterms, ...)
;
; DESCRIPTION:
;
;   Copy of MPFITPEAK for single component.
;
;   MPFITPEAK fits a gaussian model using the
;   non-linear least squares fitter MPFIT.  MPFITPEAK is meant to be a
;   drop-in replacement for IDL's GAUSSFIT function (and requires
;   MPFIT and MPFITFUN).
;
;   By default the gaussian model function is used.  
;
;   The functional form of the baseline is determined by NTERMS and
;   the function to be fitted.  NTERMS represents the total number of
;   parameters, A, to be fitted.  The functional forms and the
;   meanings of the parameters are described in this table:
;
;                 GAUSSIAN#       
;
;   Model     A[0]*exp(-0.5*u^2)  
;
;   A[0]         Peak Value 1     
;   A[1]        Peak Centroid 1    
;   A[2]       Gaussian Sigma 1   
;   A[3]         Peak Value 2     
;   A[4]        Peak Centroid 2   
;   A[5]       Gaussian Sigma 2   
;   A[6]         Peak Value 3     
;   A[7]        Peak Centroid 3   
;   A[8]       Gaussian Sigma 3   
;   A[9]         Peak Value 4     
;   A[10]        Peak Centroid 4   
;   A[11]       Gaussian Sigma 4   
;   A[12]         + A[12]    *      
;   A[13]         + A[13]*x  *      
;
;   Notes: # u = (x - A[1])/A[2]
;          % Half-width at half maximum
;          * Optional depending on NTERMS
;
;   By default the initial starting values for the parameters A are
;   estimated from the data.  However, explicit starting values can be
;   supplied using the ESTIMATES keyword.  Also, error or weighting
;   values can optionally be provided; otherwise the fit is
;   unweighted.
;
;   MPFITPEAK fits the peak value of the curve.  The area under a
;   gaussian peak is A[0]*A[2]*SQRT(2*!DPI); the area under a
;   lorentzian peak is A[0]*A[2]*!DPI.
;
; RESTRICTIONS:
;
;   If no starting parameter ESTIMATES are provided, then MPFITPEAK
;   attempts to estimate them from the data.  This is not a perfect
;   science; however, the author believes that the technique
;   implemented here is more robust than the one used in IDL's
;   GAUSSFIT.  The author has tested cases of strong peaks, noisy
;   peaks and broad peaks, all with success.
;
;   Users should be aware that if the baseline term contains a strong
;   linear component then the automatic estimation may fail.  For
;   automatic estimation to work the peak amplitude should dominate
;   over the the maximum baseline.
;
; INPUTS:
;   X - Array of independent variable values, whose values should
;       monotonically increase.
;
;   Y - Array of "measured" dependent variable values.  Y should have
;       the same data type and dimension as X.
;
;
; OUTPUTS:
;   A - Upon return, an array of NTERMS best fit parameter values.
;       See the table above for the meanings of each parameter
;       element.
;
;
; RETURNS:
;
;   Returns the best fitting model function.
;
; KEYWORDS:
;
;   ** NOTE ** Additional keywords such as PARINFO, BESTNORM, and
;              STATUS are accepted by MPFITPEAK but not documented
;              here.  Please see the documentation for MPFIT for the
;              description of these advanced options.
;
;   CHISQ - the value of the summed squared residuals for the
;           returned parameter values.
;
;   DOF - number of degrees of freedom, computed as
;             DOF = N_ELEMENTS(DEVIATES) - NFREE
;         Note that this doesn't account for pegged parameters (see
;         NPEGGED).
;
;   ERROR - upon input, the measured 1-sigma uncertainties in the "Y"
;           values.  If no ERROR or WEIGHTS are given, then the fit is
;           unweighted.
;
;   ESTIMATES - Array of starting values for each parameter of the
;               model.  The number of parameters should at least be
;               three (four for Moffat), and if less than NTERMS, will
;               be extended with zeroes.
;               Default: parameter values are estimated from data.
;
;   GAUSSIAN - if set, fit a gaussian model function.  The Default.
;   LORENTZIAN - if set, fit a lorentzian model function.
;   MOFFAT - if set, fit a Moffat model function.
;
;   MEASURE_ERRORS - synonym for ERRORS, for consistency with built-in
;                    IDL fitting routines.
;
;   NEGATIVE / POSITIVE - if set, and ESTIMATES is not provided, then
;                         MPFITPEAK will assume that a
;                         negative/positive peak is present.
;                         Default: determined automatically
;
;   NFREE - the number of free parameters in the fit.  This includes
;           parameters which are not FIXED and not TIED, but it does
;           include parameters which are pegged at LIMITS.
;
;   NTERMS - An integer describing the number of fitting terms.
;            NTERMS must have a minimum value, but can optionally be
;            larger depending on the desired baseline.  
;
;            For gaussian and lorentzian models, NTERMS must be three
;            (zero baseline), four (constant baseline) or five (linear
;            baseline).  Default: 4
;
;            For the Moffat model, NTERMS must be four (zero
;            baseline), five (constant baseline), or six (linear
;            baseline).  Default: 5
;
;   PERROR - upon return, the 1-sigma uncertainties of the parameter
;            values A.  These values are only meaningful if the ERRORS
;            or WEIGHTS keywords are specified properly.
;
;            If the fit is unweighted (i.e. no errors were given, or
;            the weights were uniformly set to unity), then PERROR
;            will probably not represent the true parameter
;            uncertainties.  
;
;            *If* you can assume that the true reduced chi-squared
;            value is unity -- meaning that the fit is implicitly
;            assumed to be of good quality -- then the estimated
;            parameter uncertainties can be computed by scaling PERROR
;            by the measured chi-squared value.
;
;              DOF     = N_ELEMENTS(X) - N_ELEMENTS(PARMS) ; deg of freedom
;              PCERROR = PERROR * SQRT(BESTNORM / DOF)   ; scaled uncertainties
;
;   QUIET - if set then diagnostic fitting messages are suppressed.
;           Default: QUIET=1 (i.e., no diagnostics)
;
;   SIGMA - synonym for PERROR (1-sigma parameter uncertainties), for
;           compatibility with GAUSSFIT.  Do not confuse this with the
;           Gaussian "sigma" width parameter.
;
;   WEIGHTS - Array of weights to be used in calculating the
;             chi-squared value.  If WEIGHTS is specified then the ERROR
;             keyword is ignored.  The chi-squared value is computed
;             as follows:
;
;                CHISQ = TOTAL( (Y-MYFUNCT(X,P))^2 * ABS(WEIGHTS) )
;
;             Here are common values of WEIGHTS:
;
;                1D/ERR^2 - Normal weighting (ERR is the measurement error)
;                1D/Y     - Poisson weighting (counting statistics)
;                1D       - Unweighted
;
;             The ERROR keyword takes precedence over any WEIGHTS
;             keyword values.  If no ERROR or WEIGHTS are given, then
;             the fit is unweighted.
;
;   YERROR - upon return, the root-mean-square variance of the
;            residuals.
;
;
; EXAMPLE:
;
;   ; First, generate some synthetic data
;   npts = 200
;   x  = dindgen(npts) * 0.1 - 10.                  ; Independent variable 
;   yi = gauss1(x, [2.2D, 1.4, 3000.]) + 1000       ; "Ideal" Y variable
;   y  = yi + randomn(seed, npts) * sqrt(1000. + yi); Measured, w/ noise
;   sy = sqrt(1000.D + y)                           ; Poisson errors
;
;   ; Now fit a Gaussian to see how well we can recover the original
;   yfit = mpfitpeak(x, y, a, error=sy)
;   print, p
;
;   Generates a synthetic data set with a Gaussian peak, and Poisson
;   statistical uncertainty.  Then the same function is fitted to the
;   data.
;
; REFERENCES:
;
;   MINPACK-1, Jorge More', available from netlib (www.netlib.org).
;   "Optimization Software Guide," Jorge More' and Stephen Wright, 
;     SIAM, *Frontiers in Applied Mathematics*, Number 14.
;
; MODIFICATION HISTORY:
;
;   New algorithm for estimating starting values, CM, 31 Oct 1999
;   Documented, 02 Nov 1999
;   Small documentation fixes, 02 Nov 1999
;   Slight correction to calculation of dx, CM, 02 Nov 1999
;   Documented PERROR for unweighted fits, 03 Nov 1999, CM
;   Copying permission terms have been liberalized, 26 Mar 2000, CM
;   Change requirements on # elements in X and Y, 20 Jul 2000, CM
;      (thanks to David Schlegel <schlegel@astro.princeton.edu>)
;   Added documentation on area under curve, 29 Aug 2000, CM
;   Added POSITIVE and NEGATIVE keywords, 17 Nov 2000, CM
;   Added reference to Moffat paper, 10 Jan 2001, CM
;   Added usage message, 26 Jul 2001, CM
;   Documentation clarification, 05 Sep 2001, CM
;   Make more consistent with comparable IDL routines, 30 Jun 2003, CM
;   Assumption of sorted data was removed, CM, 06 Sep 2003, CM
;   Add some defensive code against divide by zero, 30 Nov 2005, CM
;   Add some defensive code against all Y values equal to each other,
;     17 Apr 2005, CM
;   Convert to IDL 5 array syntax (!), 16 Jul 2006, CM
;   Move STRICTARR compile option inside each function/procedure, 9 Oct 2006
;
;  $Id: mpfitpeak.pro,v 1.11 2006/10/09 19:25:29 craigm Exp $
;-
; Copyright (C) 1997-2001, 2003, 2005, Craig Markwardt
; This software is provided as is without any warranty whatsoever.
; Permission to use, copy, modify, and distribute modified or
; unmodified copies is granted, provided this copyright and disclaimer
; are included unchanged.
;-

;forward_function mpfit, mpfitfun, mpfitpeak, mpfit4gauss_peak, mpfit4gauss_u

function mpfit4gauss_u, x, p
  COMPILE_OPT strictarr
  wid = abs(p[2]) > 1e-20
  return, ((x-p[1])/wid)^2
end


; Gaussian Function
function mpfit4gauss_peak, x, p, _extra=extra
  COMPILE_OPT strictarr
  sz = size(x)
  if sz[sz[0]+1] EQ 5 then smax = 26D else smax = 13.
  u1 = mpfit4gauss_u(x, p[0:2])
  u2 = mpfit4gauss_u(x, p[3:5])
  u3 = mpfit4gauss_u(x, p[6:8])
  u4 = mpfit4gauss_u(x, p[9:11])
  mask1 = u1 LT (smax^2)  ;; Prevents floating underflow
  mask2 = u2 LT (smax^2)  
  mask3 = u3 LT (smax^2)  
  mask4 = u4 LT (smax^2)  
  if n_elements(p) GE 13 then f = p[12] else f = 0.
  if n_elements(p) GE 14 then f = f + p[13]*x
  return,  f + $
           p[0] * mask1 * exp(-0.5 * temporary(u1) * mask1) + $
           p[3] * mask2 * exp(-0.5 * temporary(u2) * mask2) + $
           p[6] * mask3 * exp(-0.5 * temporary(u3) * mask3) + $
           p[9] * mask4 * exp(-0.5 * temporary(u4) * mask4)
end

function mpfit4gauss, x, y, a, estimates=est, nterms=nterms, $
                    perror=perror, sigma=sigma, yerror=yerror, $
                    chisq=chisq, bestnorm=bestnorm, niter=iter, nfev=nfev, $
                    error=dy, weights=weights, measure_errors=dym, $
                    nfree=nfree, dof=dof, $
                    negative=neg, positive=pos, parinfo=parinfo, $
                    errmsg=errmsg, status=status, $
                    query=query, quiet=quiet, _extra=extra

  COMPILE_OPT strictarr
  status = 0L
  errmsg = ''

  if n_params() EQ 0 then begin
      message, 'USAGE: yfit = MPFIT4GAUSS(X, Y, A, ...)', /info
      return, !values.d_nan
  endif

  ;; Detect MPFIT and crash if it was not found
  catch, catcherror
  if catcherror NE 0 then begin
      MPFIT_NOTFOUND:
      catch, /cancel
      message, 'ERROR: the required functions MPFIT and MPFITFUN ' + $
        'must be in your IDL path', /info
      return, !values.d_nan
  endif
  if mpfit(/query)    NE 1 then goto, MPFIT_NOTFOUND
  if mpfitfun(/query) NE 1 then goto, MPFIT_NOTFOUND
  catch, /cancel
  if keyword_set(query) then return, 1

  ;; Check the number of parameter estimates
  if n_elements(quiet) EQ 0 then quiet=0
  if n_elements(nterms) EQ 0 then nterms = 13

  ;; Reject data vectors that are too simple
  if n_elements(x) LT nterms OR n_elements(y) LT nterms then begin
      message, 'ERROR: X and Y must have at least NTERMS elements', /info
      return, !values.d_nan
  endif

  ;; Compute the weighting factors to use
  if (n_elements(dy) EQ 0 AND n_elements(weights) EQ 0 AND $
      n_elements(dym) EQ 0) then begin
      weights = x*0+1        ;; Unweighted by default
  endif else if n_elements(dy) GT 0 then begin
      weights = dy * 0   ;; Avoid division by zero
      wh = where(dy NE 0, ct)
      if ct GT 0 then weights[wh] = 1./dy[wh]^2
  endif else if n_elements(dym) GT 0 then begin
      weights = dym * 0   ;; Avoid division by zero
      wh = where(dym NE 0, ct)
      if ct GT 0 then weights[wh] = 1./dym[wh]^2
  endif 

  if n_elements(est) EQ 0 then begin
      ;; Here is the secret - the width is estimated based on the area
      ;; above/below the average.  Thus, as the signal becomes more
      ;; noisy the width automatically broadens as it should.

      nx = n_elements(x)

      is = sort(x)
      xs = x[is] & ys = y[is]
      maxx = max(xs, min=minx) & maxy = max(ys, min=miny)
      dx = 0.5 * [xs[1]-xs[0], xs[2:*] - xs, xs[nx-1] - xs[nx-2]]
      totarea = total(dx*ys)       ;; Total area under curve
      av = totarea/(maxx - minx)  ;; Average height

      ;; Degenerate case: all flat with no noise
      if miny EQ maxy then begin
          est = ys(0)*0.0 + [0,xs(nx/2),(xs(nx-1)-xs(0))/2, ys(0)]
          guess = 1
          goto, DONE_GUESS
      endif

      ;; Compute the spread in values above and below average... we
      ;; take the narrowest one as the one with the peak
      wh1 = where(y GE av, ct1)
      sd1 = total(x[wh1]^2)/ct1 - (total(x[wh1])/ct1)^2
      wh2 = where(y LE av, ct2)
      sd2 = total(x[wh2]^2)/ct2 - (total(x[wh2])/ct2)^2
      
      ;; Compute area above/below average

      if keyword_set(pos) then goto, POS_PEAK
      if keyword_set(neg) then goto, NEG_PEAK
      if sd1 LT sd2 then begin  ;; This is a positive peak
          POS_PEAK:
          cent  = x[where(y EQ maxy)] & cent = cent[0]
          peak  = maxy - av
      endif else begin          ;; This is a negative peak
          NEG_PEAK:
          cent  = x[where(y EQ miny)] & cent = cent[0]
          peak  = miny - av
      endelse
      peakarea = totarea - total(dx*(ys<av))
      if peak EQ 0 then peak = 0.5*peakarea
      width = peakarea / (2*abs(peak))
      if width EQ 0 OR finite(width) EQ 0 then width = median(dx)

      est = [peak, cent, width, peak/5., cent-width, 2.*width, peak/10., cent+width, 2.*width, peak/20., cent-width/2., width, av]
      guess = 1
  endif
  DONE_GUESS:

  ;; Parameter checking for individual function types
  np = 12
  fun = 'mpfit4gauss_peak'
  if n_elements(est) LT np then begin
      message, 'ERROR: parameter ESTIMATES must have at least '+strtrim(np,2)+$
        ' elements', /info
      return, !values.d_nan
  endif
  if nterms[0] LT np then begin
      message, 'ERROR: NTERMS must be at least '+strtrim(np,2), /info
      return, !values.d_nan
  endif
  p0 = replicate(est[0]*0, nterms[0] > n_elements(est))
  p0[0] = est

  ;; Function call
  a = mpfitfun(fun, x, y, 0, p0[0:nterms[0]-1], weights=weights, $
               bestnorm=bestnorm, nfev=nfev, status=status, $
               nfree=nfree, dof=dof, $
               parinfo=parinfo, perror=perror, niter=iter, yfit=yfit, $
               quiet=quiet, errmsg=errmsg, _EXTRA=extra)

  ;; Print error message if there is one.
  if NOT keyword_set(quiet) AND errmsg NE '' then $
    message, errmsg, /info

  if status NE 0 then begin
      ;; Make sure the width is positive
      a[2] = abs(a[2])

      ;; For compatibility with GAUSSFIT
      if n_elements(perror)   GT 0 then sigma = perror
      if n_elements(bestnorm) GT 0 then chisq = bestnorm

      yerror = a[0]*0
      if n_elements(dof) GT 0 AND dof[0] GT 0 then begin
          yerror[0] = sqrt( total( (y-yfit)^2 ) / dof[0] )
      endif

      return, yfit
  endif

  return, !values.d_nan
end