{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook, we investigate linear least squares fitting.\n", "
\n", "You are expected to implement the requested code, but also comment on your results and things that look as you expect or different from what you expect: if the latter, discuss why. Please add Markdown cells as needed for any discussion." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's define a linear function that is a polynomial, where linear means linear in the parameters, not the independent variable. Write a function to evaluate a general polynomial, where the order of the polynomial is determined by the length of the parameter array that you pass it.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def func(x,par) :\n", " \"\"\" Function to return a polynomial evaluates at x\n", " y = par[0] + par[1]*x + par[2]*x**2 + par3*x**3 + ....\n", " where the number of terms (order of polynomial) is determined by the lenght\n", " of the parameter array (e.g. if you give it a 2-parameter array, that would\n", " be a straight line)\n", " \"\"\"\n", " return # code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's generate a data set. Start with a straight line" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# set true parameters\n", "npar=2\n", "truepar= # set intercept and slope\n", "\n", "# generate a data set with true underlying relation plus gaussian noise\n", "n= # number of points to generate\n", "x= # independent variable\n", "sig= # uncertainties on the dependent variable\n", "\n", "# generate the data set using your function + noise\n", "y=func(x,truepar)+np.random.normal(0.,sig)\n", "plt.errorbar(x,y,sig,fmt='ro')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Linear least squares using relations specific to a straight line function, as derived in class:\n", "$$a = {S_{xx} S_y - S_xS_{xy} \\over S S_{xx} - S_x^2}$$\n", "$$b = {S S_{xy} - S_xS_y \\over S S_{xx} - S_x^2}$$\n", "where\n", "$$S = \\sum {1\\over \\sigma_i^2}$$\n", "$$S_x = \\sum {x_i\\over \\sigma_i^2}$$\n", "$$S_{xy} = \\sum {x_i y_i\\over \\sigma_i^2}$$\n", "$$S_{xx} = \\sum {x_i x_i\\over \\sigma_i^2}$$\n", "$$S_y = \\sum {y_i\\over \\sigma_i^2}$$\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#calculate sums (I've filled in one)\n", "s=(1./sig**2).sum()\n", "sx=\n", "sxy=\n", "sxx=\n", "sy=\n", "\n", "# derived parameters: implement above equations\n", "par=np.zeros(npar)\n", "par[0]=\n", "par[1]=\n", "print('par',par)\n", "\n", "#plot them\n", "plt.errorbar(x,y,sig,fmt='ro')\n", "plt.plot(x,func(x,par))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Uncertainties in the parameters:\n", "$$\\sigma_a^2 = {S_{xx}\\over S S_{xx} - S_x^2}$$\n", "$$\\sigma_b^2 = {S \\over S S_{xx} - S_x^2}$$\n", "$$ \\sigma_{ab} = -\\sigma_b^2 \\bar{x} $$\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#calculate uncertainties and covariances.\n", "sigpar=np.zeros(npar)\n", "sigpar[0]=\n", "sigpar[1]=\n", "print('sigpar',sigpar)\n", "print('covariance',sigpar[1]**2*x.mean())\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How do the derived parameters compare with the true parameters? 
{ "cell_type": "markdown", "metadata": {}, "source": [ "How do the derived parameters compare with the true parameters? What happens as you change the size of the uncertainties and/or the number of data points? Can you modify your independent variables to make the covariances go to zero?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's use the more general formalism for linear least squares. Start by writing a function that calculates the derivatives of the function with respect to each parameter, which, for a polynomial, is just 1, $x$, $x^2$, ... Note that this special case of derivatives with respect to the parameters of a polynomial is called a Vandermonde matrix, and numpy has a function that calculates it if you are lazy, \n", " numpy.vander() \n", "(note that by default it orders the columns by decreasing power; pass increasing=True to match the ordering used here).\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def deriv(x,npar) :\n", " \"\"\" Return derivatives with respect to each parameter for a polynomial with npar parameters\n", " \"\"\"\n", " return \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To do the linear least squares, you can calculate the $\alpha$ matrix and the $\beta$ vector:\n", "\n", "$$\alpha_{jk} = \sum_{i=0}^{N-1} {f_j(x_i) f_k(x_i)\over \sigma_i^2}$$\n", "$$\beta_j = \sum_{i=0}^{N-1} {f_j(x_i) y_i \over \sigma_i^2}$$\n", "Your deriv() routine supplies the $f_j$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# sum over the data points. Note that this can be done better with \n", "# array arithmetic, but the for loops are written out to be clear; code it as you please\n", "beta=np.zeros(npar)\n", "alpha=np.zeros([npar,npar])\n", "for k in np.arange(npar) :\n", " beta[k] = # code here\n", " for j in np.arange(npar) :\n", " alpha[j,k] = # code here\n", "print('alpha:')\n", "print(alpha)\n", "print('beta')\n", "print(beta)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now solve for the parameters, $p$, given the set of equations\n", "$$ \sum_k \alpha_{jk} p_k = \beta_j $$\n", "by inverting the $\alpha$ matrix and multiplying by $\beta$.\n", "Matrix inversion can be done using numpy.linalg.inv() and matrix multiplication by numpy.dot()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# now invert the matrix alpha, and multiply by beta to get the parameters\n", "err=\n", "p=\n", "print('p: ',p)" ] },
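{ "cell_type": "markdown", "metadata": {}, "source": [ "For reference, below is one possible sketch of the $\alpha$/$\beta$ construction and solution for the straight-line case (npar = 2), reusing the simulated x_ex, y_ex, sig_ex from the earlier sketch; the _ex names are illustrative assumptions." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Hedged sketch: general linear least squares for a straight line,\n", "# reusing x_ex, y_ex, sig_ex from the sketch above (assumptions).\n", "npar_ex = 2\n", "fj = np.vander(x_ex, npar_ex, increasing=True)   # columns are 1, x\n", "\n", "alpha_ex = np.zeros([npar_ex, npar_ex])\n", "beta_ex = np.zeros(npar_ex)\n", "for k in np.arange(npar_ex):\n", "    beta_ex[k] = (fj[:,k]*y_ex/sig_ex**2).sum()\n", "    for j in np.arange(npar_ex):\n", "        alpha_ex[j,k] = (fj[:,j]*fj[:,k]/sig_ex**2).sum()\n", "\n", "err_ex = np.linalg.inv(alpha_ex)   # error matrix\n", "p_ex = np.dot(err_ex, beta_ex)     # fitted parameters\n", "print('p:', p_ex)\n", "print('uncertainties:', np.sqrt(np.diag(err_ex)))" ] },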
{ "cell_type": "markdown", "metadata": {}, "source": [ "How do the parameters compare with those derived specifically for the straight-line fit? (They should be the same!)\n", "\n", "ANSWER HERE :" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can calculate the uncertainties on the parameters from the error matrix (the inverse of the $\alpha$ matrix): the diagonal elements give the variances of the parameters." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# uncertainties from inverse matrix\n", "print('inverse matrix: ')\n", "print(err)\n", "print('uncertainties on individual parameters from sqrt of diagonal')\n", "print(np.sqrt(err[0,0]),np.sqrt(err[1,1]))\n", "plt.errorbar(x,y,sig,fmt='ro')\n", "plt.plot(x,func(x,p))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can simplify the formulation even further, using the design matrix formalism and matrix arithmetic. Here, the design matrix is the matrix of derivatives for each point, i.e. an N (number of points) by M (number of parameters) matrix, divided by the uncertainty on each point. You can use your deriv() routine to supply the design matrix:\n", "\n", "$$A_{ij} = {f_j(x_i) \over \sigma_i}$$\n", "with $N$ rows and $M$ columns. \n", "The data vector is the observed data points divided by the uncertainty:\n", "$$b_i = {y_i \over \sigma_i}$$\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# design array is the derivatives for all points, divided by the uncertainty\n", "# if your deriv routine returns a [npar,npoints] array, transpose the matrix (.T) to get [npoints,npar]\n", "design=\n", "print(design.shape)\n", "print('design')\n", "print(design)\n", "\n", "# b array is data points / sig\n", "b=\n", "print('b:')\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we set up the matrix arithmetic. What we previously called the $\alpha$ matrix is just $A^T A$ (where the multiplication is a matrix dot product), and the $\beta$ array is $A^T b$, where $A$ is the design matrix and $b$ is the vector from above. Confirm that you can reproduce the $\alpha$ matrix and $\beta$ array with the design matrix formalism." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# calculate A^TA and compare with alpha from above (should be the same)\n", "ATA=\n", "print(ATA)\n", "print(alpha)\n", "\n", "# calculate A^T b and compare with beta from above (should be the same)\n", "ATb=\n", "print(ATb)\n", "print(beta)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can solve this set of (2) equations using [np.linalg.solve()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.solve.html), which is a solver that does not invert the matrix. Confirm that you get the same answers as above." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "w = np.linalg.solve(ATA,np.dot(design.T,b))\n", "print(w)" ] },
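{ "cell_type": "markdown", "metadata": {}, "source": [ "For reference, below is one possible sketch of the design-matrix route, again reusing the simulated x_ex, y_ex, sig_ex from the earlier sketches (all _ex names are illustrative assumptions)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Hedged sketch: design-matrix formalism with the same simulated data.\n", "A_ex = np.vander(x_ex, 2, increasing=True)/sig_ex[:,None]   # A_ij = f_j(x_i)/sigma_i\n", "rhs_ex = y_ex/sig_ex                                        # b_i = y_i/sigma_i\n", "\n", "ATA_ex = np.dot(A_ex.T, A_ex)      # should match alpha_ex above\n", "ATb_ex = np.dot(A_ex.T, rhs_ex)    # should match beta_ex above\n", "w_ex = np.linalg.solve(ATA_ex, ATb_ex)\n", "print('solve:', w_ex)\n", "\n", "import scipy.linalg\n", "print('lstsq:', scipy.linalg.lstsq(A_ex, rhs_ex)[0])   # should agree" ] },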
{ "cell_type": "markdown", "metadata": {}, "source": [ "Here we invert the matrix using [numpy.linalg.inv()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.inv.html) to get the uncertainties. If we're going to invert the matrix anyway, we can just use the inverse to get the solution directly, by multiplying it by the $\beta$ array. Again, same answers!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "err = np.linalg.inv(ATA)\n", "print(err)\n", "print(np.sqrt(err[0,0]),np.sqrt(err[1,1]))\n", "\n", "# use the inverse matrix to get the parameters\n", "w = np.dot(err,np.dot(design.T,b))\n", "print(w)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we get the solution using [scipy.linalg.lstsq()](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lstsq.html), just providing the design matrix and the data array. Very simple!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import scipy.linalg\n", "scipy.linalg.lstsq(design,b)[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, let's play with this for general linear functions. Write routines to simulate the function\n", "$$ y = a + bx + c\sin(x)$$\n", "and the routines to solve for the least squares parameters." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "def func(x,par):\n", " \"\"\" Another linear function: a + b x + c sin(x) (linear in the parameters!)\n", " \"\"\"\n", " return \n", "\n", "def deriv(x):\n", " \"\"\" Derivatives with respect to each parameter at input value(s)\n", " \"\"\"\n", " return \n", "\n", "\n", "x= # desired independent variables\n", "truepar= # input parameters\n", "sig= # uncertainties on data points\n", "# simulate a data set with noise:\n", "y=func(x,truepar)+np.random.normal(0.,sig,size=len(x))\n", "plt.errorbar(x,y,sig,fmt='ro')\n", "\n", "# design matrix\n", "design=\n", "print(design)\n", "\n", "# solve using either numpy or scipy routines (or both)\n", "\n", "#numpy\n", "b=\n", "ATA=\n", "par=\n", "print('True parameters: ',truepar)\n", "print('Derived parameters: ',par)\n", "\n", "#scipy\n", "par=\n", "print('Derived parameters: ',par)\n", "\n", "# fine grid for plotting the fitted model (match the range of your data)\n", "x=np.arange(0,1,0.01)\n", "plt.plot(x,func(x,par))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 1 }