{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this notebook, we investigate nonlinear least squares, and goodness of fit"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib\n",
"%matplotlib inline\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's simulate a data set that is a Gaussian function, with 3 parameters: amplitude, center, and sigma; this could be, for example, an emission line in a spectrum"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# here we define the function. For use with curve_fit, we need each parameter to \n",
"# be a separate argument (i.e., not have all arguments as a single list/array)\n",
"def func(x,amp,cent,sig) :\n",
" \"\"\" Return Gaussian evaluated at x, given amplitude, center, and sigma\n",
" \"\"\"\n",
" return \n",
"\n",
"# simulate some data. Note that I've defined a par array, but pass it as\n",
"# individual arguments using *par\n",
"x= # values of independent variables\n",
"par=[amp,cent,sig] # fill in your desired values\n",
"sig= # set uncertainties\n",
"# simmulate data with uncertainties\n",
"y=func(x,*par)+np.random.normal(0.,sig,size=len(x))\n",
"plt.errorbar(x,y,sig,fmt='ro')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use [scipy.optimize.curve_fit()](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.htm) to do the nonlinear fit. curve_fit returns both the best fit paramters and the covariance matrix"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from scipy.optimize import curve_fit\n",
"# you need to supply an initial guess for parameters. Can you write some code to\n",
"# get an estimate of these from the data, e.g., given the peak value and the location of\n",
"# the peak?\n",
"init= # provide array of initial parameters\n",
"# call curve_fit to get the best fit parameters. You'll need to supply a function,\n",
"# data with uncertainties, and initial guess.\n",
"fit= \n",
"print('parameters:' ,fit[0])\n",
"print('covariance matrix: \\n',fit[1])\n",
"\n",
"# plot the data and the best fit\n",
"plt.errorbar(x,y,sig,fmt='ro')\n",
"x=np.arange(-10,10,0.01)\n",
"plt.plot(x,func(x,*fit[0]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now calculate the $X^2$ statistic for the data set and fit:\n",
"$$X^2 = \\Sigma {(y_i - f(x_i, parameters))^2 \\over \\sigma_i^2}$$"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use scipy.stats.chisq() to determine the probability of getting a $\\chi^2$ smaller than and larger than your value."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" Would you rule this fit out as a good fit?\n",
"
ANSWER HERE: "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Change the size of the uncertainties going into the simulation but not the uncertainties that you feed the fit (or vice versa). Redo the fit , calculate $X^2$ and the probability again.\n",
"
DISCUSS YOUR RESULTS: "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now add a second Gaussian to your simulated data, with different parameters"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"par1= # parameter list for first Gaussian\n",
"par2= # parameter list for second Gaussian\n",
"y=func(x,*par1)+func(x,*par2)+np.random.normal(0.,sig,size=len(x))\n",
"plt.errorbar(x,y,sig,fmt='ro')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Fit it with a single Gaussian"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"init=\n",
"fit=curve_fit()\n",
"print('parameters:' ,fit[0])\n",
"print('covariance matrix: \\n',fit[1])\n",
"\n",
"# plot the data and the best fit\n",
"plt.errorbar(x,y,sig,fmt='ro')\n",
"x=np.arange(-10,10,0.01)\n",
"plt.plot(x,func(x,*fit[0]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Calculate the $X^2$ statistic and it's probability"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Modify your function to fit the sum of two Gaussians, and\n",
"see how well you can recover the parameters. Experiment with different starting guesses."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def func2( ... ) :\n",
" \"\"\"function to return the sum of two Gaussians given input parameters\n",
" \"\"\"\n",
" \n",
" return\n",
"\n",
"# provide initial guess and do the fit\n",
"\n",
"init=\n",
"fit=curve_fit()\n",
"print('parameters:' ,fit[0])\n",
"print('covariance matrix: \\n',fit[1])\n",
"\n",
"# plot the data and the best fit\n",
"plt.errorbar(x,y,sig,fmt='ro')\n",
"x=np.arange(-10,10,0.01)\n",
"plt.plot(x,func(x,*fit[0]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" DISCUSS YOUR RESULTS: "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.11"
}
},
"nbformat": 4,
"nbformat_minor": 1
}