{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook, we investigate nonlinear least squares, and goodness of fit" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's simulate a data set that is a Gaussian function, with 3 parameters: amplitude, center, and sigma; this could be, for example, an emission line in a spectrum" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# here we define the function. For use with curve_fit, we need each parameter to \n", "# be a separate argument (i.e., not have all arguments as a single list/array)\n", "def func(x,amp,cent,sig) :\n", " \"\"\" Return Gaussian evaluated at x, given amplitude, center, and sigma\n", " \"\"\"\n", " return \n", "\n", "# simulate some data. Note that I've defined a par array, but pass it as\n", "# individual arguments using *par\n", "x= # values of independent variables\n", "par=[amp,cent,sig] # fill in your desired values\n", "sig= # set uncertainties\n", "# simmulate data with uncertainties\n", "y=func(x,*par)+np.random.normal(0.,sig,size=len(x))\n", "plt.errorbar(x,y,sig,fmt='ro')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use [scipy.optimize.curve_fit()](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.htm) to do the nonlinear fit. curve_fit returns both the best fit paramters and the covariance matrix" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy.optimize import curve_fit\n", "# you need to supply an initial guess for parameters. Can you write some code to\n", "# get an estimate of these from the data, e.g., given the peak value and the location of\n", "# the peak?\n", "init= # provide array of initial parameters\n", "# call curve_fit to get the best fit parameters. You'll need to supply a function,\n", "# data with uncertainties, and initial guess.\n", "fit= \n", "print('parameters:' ,fit[0])\n", "print('covariance matrix: \\n',fit[1])\n", "\n", "# plot the data and the best fit\n", "plt.errorbar(x,y,sig,fmt='ro')\n", "x=np.arange(-10,10,0.01)\n", "plt.plot(x,func(x,*fit[0]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now calculate the $X^2$ statistic for the data set and fit:\n", "$$X^2 = \\Sigma {(y_i - f(x_i, parameters))^2 \\over \\sigma_i^2}$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use scipy.stats.chisq() to determine the probability of getting a $\\chi^2$ smaller than and larger than your value." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ " Would you rule this fit out as a good fit?\n", "
ANSWER HERE: " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Change the size of the uncertainties going into the simulation but not the uncertainties that you feed the fit (or vice versa). Redo the fit , calculate $X^2$ and the probability again.\n", "
DISCUSS YOUR RESULTS: " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now add a second Gaussian to your simulated data, with different parameters" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "par1= # parameter list for first Gaussian\n", "par2= # parameter list for second Gaussian\n", "y=func(x,*par1)+func(x,*par2)+np.random.normal(0.,sig,size=len(x))\n", "plt.errorbar(x,y,sig,fmt='ro')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fit it with a single Gaussian" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "init=\n", "fit=curve_fit()\n", "print('parameters:' ,fit[0])\n", "print('covariance matrix: \\n',fit[1])\n", "\n", "# plot the data and the best fit\n", "plt.errorbar(x,y,sig,fmt='ro')\n", "x=np.arange(-10,10,0.01)\n", "plt.plot(x,func(x,*fit[0]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Calculate the $X^2$ statistic and it's probability" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Modify your function to fit the sum of two Gaussians, and\n", "see how well you can recover the parameters. Experiment with different starting guesses." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def func2( ... ) :\n", " \"\"\"function to return the sum of two Gaussians given input parameters\n", " \"\"\"\n", " \n", " return\n", "\n", "# provide initial guess and do the fit\n", "\n", "init=\n", "fit=curve_fit()\n", "print('parameters:' ,fit[0])\n", "print('covariance matrix: \\n',fit[1])\n", "\n", "# plot the data and the best fit\n", "plt.errorbar(x,y,sig,fmt='ro')\n", "x=np.arange(-10,10,0.01)\n", "plt.plot(x,func(x,*fit[0]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " DISCUSS YOUR RESULTS: " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.11" } }, "nbformat": 4, "nbformat_minor": 1 }