{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "

# Markov Chain Monte Carlo (MCMC) and density estimation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import emcee" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start by simulating data from a straight line with uncertainties (as we've done before!)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "true = # [slope,intercept]\n", "x = np.arange(0,20)\n", "yerr= # uncertainties\n", "y = true[0]*x + true[1] + np.random.normal(0,yerr,size=len(x))\n", "plt.plot(x,y,'ro')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the Bayesian analysis, define log_likelihood, log_prior, and log_posterior routines. We'll supply the parameters as an array, followed by the input data." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def log_likelihood(par,x,y,yerr) :\n", " \"\"\" log(likelihood) for Gaussian uncertainties, up to constants (yerr is fixed)\n", " \"\"\"\n", " m,b=par \n", " f = m*x + b\n", " return -0.5 * np.sum(((y-f) / yerr) ** 2)\n", "\n", "def log_prior(par):\n", " \"\"\" Flat prior\n", " \"\"\"\n", " m,b=par\n", " if not (-10 < m < 10):\n", " return -np.inf\n", " if not (-10 < b < 10):\n", " return -np.inf\n", " return 0.0\n", "\n", "def log_posterior(par,x,y,yerr):\n", " \"\"\" log posterior\n", " \"\"\"\n", " return log_prior(par) + log_likelihood(par,x,y,yerr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we'll supply the MCMC Metropolis routine from last time, but packaged into a function" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def metropolis_step(x, y, yerr, log_posterior, theta_t, lnpost_t, step_cov):\n", " \"\"\" Take a single Metropolis step and return new chain value\n", " \"\"\"\n", " # draw from the proposal distribution and calculate the new posterior\n", " q = np.random.multivariate_normal(theta_t, step_cov)\n", " lp1 = log_posterior(q,x,y,yerr)\n", " \n", " # accept if the new posterior is higher than the previous one, or, failing\n", " # that, with probability given by the posterior ratio (Metropolis rule)\n", " if (lp1 - lnpost_t) > np.log(np.random.rand()):\n", " return q, lp1\n", " \n", " # otherwise, return old value\n", " return theta_t, lnpost_t\n", "\n", "def metropolis(x, y, yerr, log_posterior, p0, step=np.diag([1.e-4,1.e-4]), nstep=50000, nburn=0 ) :\n", " \"\"\" General Metropolis MCMC routine from Foreman-Mackey notebook\n", " Input :\n", " x,y,yerr : independent and dependent variables, and uncertainties\n", " log_posterior : function to calculate log(posterior) given parameters\n", " p0 : array of initial parameters, of length npar\n", " step : covariance array, (npar,npar)\n", " nstep : number of steps to take\n", " nburn : number of initial steps to discard\n", " \"\"\"\n", " \n", " lp0 = log_posterior(p0,x,y,yerr)\n", " chain = np.empty((nstep, len(p0)))\n", " for i in range(len(chain)):\n", " p0, lp0 = metropolis_step(x,y,yerr,log_posterior, p0, lp0, step)\n", " chain[i] = p0\n", " \n", " # Compute the acceptance fraction.\n", " acc = float(np.any(np.diff(chain, axis=0), axis=1).sum()) / (len(chain)-1)\n", " print(\"The acceptance fraction was: {0:.3f}\".format(acc))\n", "\n", " return chain\n", "\n", "def plotchain(chain,labels=None, nburn=0) :\n", " \"\"\" Plot the chain for each parameter, with optional burn-in marker\n", " \"\"\"\n", " npts,ndim = chain.shape\n", " fig,ax = plt.subplots(ndim,1, figsize=(8,5))\n", " for idim in range(ndim) :\n", " ax[idim].plot(chain[:,idim],'k')\n", " if labels is not None :\n", " ax[idim].set_ylabel(labels[idim])\n", " if nburn > 0 :\n", " ax[idim].axvline(nburn,color=\"g\",lw=2)" ] },
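{ "cell_type": "markdown", "metadata": {}, "source": [ "As a quick sanity check of the routines above, here's a minimal sketch on simulated data. All of the values (true parameters, uncertainties, step sizes, number of steps) are illustrative assumptions, not prescriptions for the exercise that follows." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# illustrative values only: slope/intercept, constant uncertainties, small diagonal steps\n", "true_demo = [1.5, 3.0]\n", "x_demo = np.arange(0,20)\n", "yerr_demo = 2.*np.ones(len(x_demo))\n", "y_demo = true_demo[0]*x_demo + true_demo[1] + np.random.normal(0,yerr_demo)\n", "chain_demo = metropolis(x_demo,y_demo,yerr_demo,log_posterior,p0=[1.,0.],\n", "    step=np.diag([1.e-3,1.e-2]),nstep=5000)\n", "plotchain(chain_demo,labels=['m','b'])" ] },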
" if nburn> 0 :\n", " ax[idim].axvline(nburn,color=\"g\",lw=2)\n", " \n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, now supply initial guess, number of steps, number of burning to discard, step size. Iterate on these until you get a reasonable looking chain and acceptance fraction: you should get acceptance fraction of 0.3 - 0.7 and chains should not show significant autocorrelation (no long coherence)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "p0 = # initial guess\n", "nburn = # number of burning to discard\n", "nstep = # number of steps\n", "step= np.diag() # step sizes in each parameter\n", "chain = metropolis(x,y,yerr,log_posterior,p0=p0,nburn=nburn,nstep=nstep,step=step)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plotchain(chain,labels=['m','b'],nburn=nburn)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Show results in corner plot" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import corner\n", "print(chain.shape)\n", "plot=corner.corner(chain[nburn:, :], labels=[\"m\", \"b\"], show_titles=True, quantiles=[0.16,0.4,0.84])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we're going to use the MCMC Hammer emcee() routine. This routine uses multiple \"walkers\" to investigate the posterior. The recommendation is for at least twice as many walkers as parameters you are trying to determine.\n", "
\n", "emcee() is written using an object-oriented style: you instantiate a sampler, then use methods of that object to run the MCMC and to look at the results.\n", "
\n", "For emcee, you have to supply a log_posterior routine with the parameter array as the first argument. If the posterior routine needs additional data to compute (e.g., the data points!), then you supply these using the\n", "args=[] argument when you instatiate a sampler." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import emcee\n", "\n", "# for emcee, we just need to set the number of walkers. Recommendation is at least twice\n", "# the number of parameters\n", "nwalkers=4\n", "ndim=2\n", "nstep=10000\n", "nburn=100\n", "sampler = emcee.EnsembleSampler(nwalkers,ndim,log_posterior,args=[x,y,yerr])\n", "\n", "# starting location for each walker: this is a 2D array, with dimensions (nwalker,npar)\n", "p0=np.array([np.random.uniform(0,5.,size=nwalkers),\n", " np.random.uniform(0,2.,size=nwalkers)]).T\n", "print('Initial parameters: ',p0)\n", "\n", "s=sampler.run_mcmc(p0,nstep)\n", "\n", "# print acceptance fraction which are stored in acceptance_fraction attribute and computed\n", "# with get_autocorr_time() method\n", "print('acceptance fraction: {:8.3f}'.format(sampler.acceptance_fraction[0]))\n", "print('mean autocorrelation time: {:8.3f}'.format(np.mean(sampler.get_autocorr_time())))\n", "\n", "# the chains are saved in the chain attribute, with dimension (nwalkers, nstep, ndim)\n", "print(sampler.chain.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Can use autocorrelation length to discard burn-in (set to several times autocorrelation length), and to thin (set to roughly half the autocorrelation length)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig,ax=plt.subplots(ndim,2)\n", "\n", "# The flatchain attribute puts the chains from all of the walkers together\n", "print(sampler.flatchain.shape)\n", "\n", "# the get_chain() method will discard initial points and \"thin\" the chain\n", "nburn= # set nburn\n", "thin= # set thinning\n", "flat_samples = sampler.get_chain(discard=nburn, thin=thin, flat=True)\n", "\n", "# show the raw chains on the left, and the get_chain() results on the right\n", "for idim in range(ndim) :\n", " ax[idim,0].plot(sampler.flatchain[:,idim])\n", " ax[idim,1].plot(flat_samples[:,idim])\n", " \n", "# corner plot\n", "plot=corner.corner(flat_samples,plot_contours=True,\n", " labels=['mean','variance'],show_titles=True)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

{ "cell_type": "markdown", "metadata": {}, "source": [ "## Allow for uncertainties on data points" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's add a dimension and determine the scatter as well as the slope and intercept" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def log_likelihood(par,x,y,yerr) :\n", " \"\"\" log(likelihood)\n", " \"\"\"\n", " m,b,sig=par \n", " f = m*x + b\n", " # remember to include the normalizing sigma term if sigma is a variable:\n", " # up to constants, log L = -0.5*sum((y-f)**2/sig**2) - N*log(sig)\n", " if sig <= 0:\n", " # check for negative sig\n", " return -np.inf\n", " else:\n", " return -0.5 * np.sum(((y-f) / sig)**2) - len(x)*np.log(sig)\n", "\n", "def log_prior(par):\n", " \"\"\" flat prior\n", " \"\"\"\n", " m,b,sig=par\n", " if not (-10 < m < 10):\n", " return -np.inf\n", " if not (-10 < b < 10):\n", " return -np.inf\n", " if sig <= 0 :\n", " return -np.inf\n", " return 0.0\n", "\n", "def log_posterior(par,x,y,yerr):\n", " return log_prior(par) + log_likelihood(par,x,y,yerr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set initial parameters, etc., and tune to get a good chain" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "p0 = # initial values\n", "nburn = # nburn\n", "nstep= # nsteps\n", "step= np.diag( ... ) # step sizes\n", "chain = metropolis(x,y,yerr,log_posterior,p0=p0,nburn=nburn,nstep=nstep,step=step)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plotchain(chain,labels=['m','b','sig'],nburn=nburn)\n", "plot=corner.corner(chain[nburn:, :], labels=[\"m\", \"b\", \"sig\"], show_titles=True, quantiles=[0.16,0.5,0.84])" ] },
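{ "cell_type": "markdown", "metadata": {}, "source": [ "To see what the fit implies in data space, here's a minimal sketch that overlays a few random posterior draws of the line on the data; the number of draws and the transparency are arbitrary illustrative choices." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# sketch: overplot posterior draws of the line on the data\n", "plt.plot(x,y,'ro')\n", "for i in np.random.randint(nburn,len(chain),size=50) :\n", "    m_s,b_s,sig_s = chain[i]\n", "    plt.plot(x, m_s*x + b_s, 'k', alpha=0.05)" ] },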
{ "cell_type": "markdown", "metadata": {}, "source": [ "Now do this case using emcee()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# for emcee, we just need to set the number of walkers. Recommendation is at least twice\n", "# the number of parameters\n", "ndim= # how many parameters\n", "nwalkers= # how many walkers\n", "nstep= # how many steps\n", "\n", "# instantiate the sampler\n", "sampler = emcee.EnsembleSampler(nwalkers,ndim,log_posterior,args=[x,y,yerr])\n", "\n", "# starting location for each walker: this is a 2D array, with dimensions (nwalker,npar)\n", "p0= # initial guesses\n", "print('Initial parameters: ',p0)\n", "\n", "# run the sampler\n", "s=sampler.run_mcmc(p0,nstep)\n", "\n", "# print the acceptance fraction (stored in the acceptance_fraction attribute) and the\n", "# autocorrelation time (computed with the get_autocorr_time() method)\n", "print('acceptance fraction: {:8.3f}'.format(sampler.acceptance_fraction[0]))\n", "print('mean autocorrelation time: {:8.3f}'.format(np.mean(sampler.get_autocorr_time())))\n", "\n", "# the chains are saved in the chain attribute, with dimension (nwalkers, nstep, ndim)\n", "print(sampler.chain.shape)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig,ax=plt.subplots(ndim,2)\n", "\n", "# The flatchain attribute puts the chains from all of the walkers together\n", "print(sampler.flatchain.shape)\n", "\n", "# the get_chain() method will discard initial points and \"thin\" the chain\n", "nburn= # use autocorrelation to set\n", "thin= # use autocorrelation to set\n", "flat_samples = sampler.get_chain(discard=nburn, thin=thin, flat=True)\n", "\n", "# show the raw chains on the left, and the get_chain() results on the right\n", "for idim in range(ndim) :\n", " ax[idim,0].plot(sampler.flatchain[:,idim])\n", " ax[idim,1].plot(flat_samples[:,idim])\n", " \n", "# corner plot\n", "plot=corner.corner(flat_samples,plot_contours=True,\n", " labels=['m','b','sig'],show_titles=True)\n" ] },
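{ "cell_type": "markdown", "metadata": {}, "source": [ "A common way to report the results is the median of each 1D marginal distribution, with the 16th/84th percentiles giving the uncertainty; here's a minimal sketch (the parameter names simply mirror the labels above)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# sketch: summarize each parameter by its 16th/50th/84th percentiles\n", "for idim,name in enumerate(['m','b','sig']) :\n", "    lo,med,hi = np.percentile(flat_samples[:,idim],[16,50,84])\n", "    print('{:4s}: {:8.3f} -{:6.3f} +{:6.3f}'.format(name,med,med-lo,hi-med))" ] },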

{ "cell_type": "markdown", "metadata": {}, "source": [ "## Kernel density estimation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, let's consider kernel density estimation. \n", "We'll consider both 1D and 2D distributions.\n", "\n", "If we want a 1D marginal distribution, we just use the values from the axis in question: no need to explicitly marginalize, the sampling does that for us automatically!\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Start with a histogram, using several of the \"rules\" described for choosing the number of bins. We can use either the chain results from our Metropolis run, or the flat_samples results from the emcee run" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# use matplotlib default histogram\n", "fig,ax=plt.subplots(3,1)\n", "for idim in range(3) :\n", " ax[idim].hist( ... ) # sampled values of idim parameter\n", "fig.tight_layout()\n", "fig.suptitle('matplotlib default nbins')\n", " \n", "# use Scott's rule\n", "fig,ax=plt.subplots(3,1)\n", "for idim in range(3) :\n", " ax[idim].hist( ... ,bins='scott') \n", "fig.tight_layout()\n", "fig.suptitle(\"Scott's rule\")\n", "\n", "# use Freedman-Diaconis rule\n", "fig,ax=plt.subplots(3,1)\n", "for idim in range(3) :\n", " ax[idim].hist( ... ,bins='fd')\n", "fig.tight_layout()\n", "fig.suptitle('Freedman-Diaconis rule')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, now let's do a 2D histogram, say showing the PDF for slope and intercept. Again, we just choose those two values of the sampled points to marginalize over sigma." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nbins= # number of bins\n", "out=np.histogram2d(..., ... ,bins=nbins)\n", "plt.imshow(out[0],origin='lower')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, now we get to real kernel density estimation, using scipy.stats.gaussian_kde() \n", "\n", "You can use the keyword bw_method= to set the bandwidth: the default is to use a value based on Scott's rule" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy.stats import gaussian_kde\n", "print(chain.shape)\n", "\n", "# instantiate a 1D KDE estimator for the first parameter (slope)\n", "t=gaussian_kde(chain[:,0])\n", "\n", "# choose points to evaluate it at, first in 1D\n", "xplot=np.linspace(...,...,100) # choose good range for slope\n", "plt.figure()\n", "plt.plot(xplot,t(xplot))\n", "\n", "# same for second parameter (intercept)\n", "t=gaussian_kde( ... ) # specify the second parameter\n", "yplot=np.linspace(...,...,100) # choose good range for intercept\n", "plt.figure()\n", "plt.plot(t(yplot),yplot)\n", "\n", "# 2D estimator for slope and intercept\n", "t=gaussian_kde( ... ) # specify first and second parameters\n", "# evaluate on 2D grid (use new loop variable names so the data x,y aren't clobbered)\n", "out=np.zeros([100,100])\n", "for ix,xx in enumerate(xplot) :\n", " for iy,yy in enumerate(yplot) :\n", " out[iy,ix] = t([xx,yy])\n", "\n", "plt.figure()\n", "plt.imshow(out,origin='lower',extent=[xplot[0],xplot[-1],yplot[0],yplot[-1]],aspect='auto')\n" ] },
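{ "cell_type": "markdown", "metadata": {}, "source": [ "To see what bw_method does, here's a hedged sketch comparing two bandwidth choices on the slope samples; the values 0.1 and 0.5 are arbitrary illustrations (a smaller bandwidth gives a bumpier estimate, a larger one a smoother estimate)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# sketch: effect of the bandwidth on the KDE\n", "t_narrow=gaussian_kde(chain[:,0],bw_method=0.1)\n", "t_wide=gaussian_kde(chain[:,0],bw_method=0.5)\n", "xgrid=np.linspace(chain[:,0].min(),chain[:,0].max(),200)\n", "plt.figure()\n", "plt.plot(xgrid,t_narrow(xgrid),label='bw_method=0.1')\n", "plt.plot(xgrid,t_wide(xgrid),label='bw_method=0.5')\n", "plt.legend()" ] },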

{ "cell_type": "markdown", "metadata": {}, "source": [ "## Old/experimental stuff below here -- IGNORE" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.neighbors import KernelDensity\n", "\n", "X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])\n", "kde = KernelDensity(kernel='gaussian', bandwidth=0.2).fit(X)\n", "X_plot = np.linspace(-3, 3, 1000)[:, np.newaxis]\n", "print(X_plot.shape)\n", "y=kde.score_samples(X_plot)" ] },

{ "cell_type": "markdown", "metadata": {}, "source": [ "## Determine the mean and standard deviation of a set of points" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def gauss(x,m,s):\n", " \"\"\" Gaussian function\"\"\"\n", " return 1/np.sqrt(2*np.pi)/s*np.exp(-0.5*(x-m)**2/s**2)\n", " \n", "def log_prior(m,s) :\n", " if s<0 : return -np.inf\n", " return np.log(1/s)\n", "\n", "def log_posterior(par,y) :\n", " \"\"\"Return log posterior\"\"\"\n", " m,s=par\n", " if s<0 : return -np.inf\n", " return np.sum(np.log(gauss(y,m,s)))+log_prior(m,s)\n", "\n", "m,s,n=10,3,10\n", "x=np.random.normal(m,s,size=n)\n", "print(x)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def pi(x) :\n", " ''' simple gaussian distribution, mean of 2, sigma of 1'''\n", " return np.array(np.exp(-0.5*(x-2.)**2)) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, now we will define the MCMC procedure" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def mcmc(niter=100,x0=-1,step=0.5) :\n", " ''' \n", " routine for simple MCMC using Metropolis-Hastings algorithm\n", " input: niter: number of iterations to run\n", " x0: starting parameter guess\n", " step: width of gaussian proposal distribution\n", " '''\n", " # get probability at initial point\n", " pi0 = pi(x0)\n", " \n", " # initialize the chain and number of accepted points accumulator\n", " x = []\n", " accept = 0\n", " \n", " # iterate!\n", " for i in range(niter) :\n", " # generate a candidate point using a gaussian generating function\n", " xc = x0 + np.random.normal(scale=step)\n", " \n", " # evaluate probability ratio of new point to old point\n", " alpha = pi(xc)/pi0\n", " \n", " # Metropolis Hastings:\n", " if alpha>1 or np.random.random() < alpha :\n", " # accept the new point!\n", " x0=xc\n", " pi0=pi(x0)\n", " accept +=1\n", " \n", " # record the current point: on a rejection, the chain repeats the old point\n", " x.append(x0)\n", " \n", " print('acceptance fraction: ',accept/float(niter))\n", " # return the MCMC chain\n", " return np.array(x),accept/float(niter)\n", " \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run it! Plot the chain and the resulting distribution" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "step=10\n", "chain,accept=mcmc(niter=10000,x0=-10,step=step)\n", "fig,ax=plt.subplots(2,1)\n", "ax[0].plot(chain)\n", "ax[0].text(0.1,0.1,'step: {:.1f} acceptance: {:.2f}'.format(step,accept),transform=ax[0].transAxes)\n", "ax[1].hist(chain,bins=20)\n", "ax[0].set_xlim(0,500)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What happens when you play around with changing the step size?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, now let's use a canned routine, emcee. This wants us to supply a routine that gives log(probability). It also uses the concept of multiple walkers, each of which starts at a different location, to produce multiple chains from different starting points." ] },
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def lnprob(x) :\n", " \"\"\" ln(probability) for gaussian distribution with mean=2, sigma=1\n", " \"\"\"\n", " return -0.5 * (x-2)**2\n", "\n", "import emcee\n", "\n", "# for emcee, we just need to set the number of walkers. Recommendation is at least twice\n", "# the number of parameters\n", "\n", "nwalkers=2\n", "ndim=1\n", "sampler = emcee.EnsembleSampler(nwalkers,ndim,lnprob)\n", "\n", "# starting location for each walker: this is a 2D array, with dimensions (nwalker,npar)\n", "x0=np.array([[5.],[-5.]])\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, let's try it." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "s=sampler.run_mcmc(x0,5000)\n", "\n", "# print acceptance fraction\n", "print('acceptance fraction: {:8.3f}'.format(sampler.acceptance_fraction[0]))\n", "print('mean autocorrelation time: {:8.3f}'.format(np.mean(sampler.get_autocorr_time())))\n", "\n", "print(sampler.chain.shape)\n", "fig,ax=plt.subplots(3,1)\n", "\n", "# plot the chains for each walker\n", "ax[0].plot(sampler.chain[0])\n", "ax[1].plot(sampler.chain[1])\n", "\n", "#plot the flattened chain\n", "ax[2].plot(sampler.flatchain)\n", "plt.tight_layout()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Can use autocorrelation length to discard burn-in (several times autocorrelation length), and to thin (half the autocorrelation length)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(sampler.chain.shape)\n", "fig,ax=plt.subplots(3,npar)\n", "\n", "# plot the chains for each walker\n", "ax[0].plot(sampler.chain[0])\n", "ax[1].plot(sampler.chain[1])\n", "\n", "#plot the flattened chain\n", "ax[2].plot(sampler.flatchain)\n", "plt.tight_layout()\n", "flat_samples = sampler.get_chain(discard=100, thin=10, flat=True)\n", "plt.plot(flat_samples)\n", "print(flat_samples.shape)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(sampler.chain.shape)\n", "fig,ax=plt.subplots(3,npar)\n", "\n", "# plot the chains for each walker\n", "ax[0].plot(sampler.chain[0])\n", "ax[1].plot(sampler.chain[1])\n", "\n", "#plot the flattened chain\n", "ax[2].plot(sampler.flatchain)\n", "plt.tight_layout()\n", "flat_samples = sampler.get_chain(discard=100, thin=10, flat=True)\n", "plt.plot(flat_samples)\n", "print(flat_samples.shape)\n", "flat_samples = sampler.get_chain(discard=100, thin=10, flat=True)\n", "plt.plot(flat_samples)\n", "print(flat_samples.shape)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.hist(flat_samples,bins=20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, let's move on to a problem with multiple parameters. Let's use MCMC on the problem of determining mean and standard deviation from a sample, where we've set xbar, variance, and number of points, as we did before with a grid calculation." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def gauss(x,m,s):\n", " \"\"\" Gaussian function\"\"\"\n", " return 1/np.sqrt(2*np.pi)/s*np.exp(-0.5*(x-m)**2/s**2)\n", " \n", "def logprior(m,s) :\n", " if s<0 : return -np.inf\n", " return np.log(1/s)\n", "\n", "def logPDF(par,y) :\n", " \"\"\"Return log likelihood\"\"\"\n", " m,s=par\n", " if s<0 : return -np.inf\n", " return np.sum(np.log(gauss(y,m,s)))+logprior(m,s)\n", "\n", "m,s,n=10,3,10\n", "x=np.random.normal(m,s,size=n)\n", "print(x)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# set up walkers with initial values\n", "nwalkers=4\n", "ndim=2\n", "pos=np.random.uniform(size=(ndim*nwalkers)).reshape(nwalkers,ndim)\n", "\n", "\n", "#run emcee\n", "sampler=emcee.EnsembleSampler(nwalkers,ndim,logPDF, args=[x])\n", "s=sampler.run_mcmc(pos,10000)\n", "\n", "print('acceptance fraction: {:8.3f}'.format(sampler.acceptance_fraction[0]))\n", "print('mean autocorrelation time: {:8.3f}'.format(np.mean(sampler.get_autocorr_time())))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To visualize the results, we will use a corner plot, but this time we don't have a grid of values, but a set of points that samples the posterior PDF. Foreman-Mackey has written a nice package called corner to do the plotting for us, with lots of nice options for labelling, etc." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#flat_samples = sampler.flatchain\n", "flat_samples = sampler.get_chain(discard=500,thin=30,flat=True)\n", "\n", "# corner plots to display output\n", "import corner\n", "plot=corner.corner(flat_samples,plot_contours=True,\n", " labels=['mean','variance'],show_titles=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's the same, fixing xbar and variance" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xbar, var, n = 0, 4, 10\n", "def logL(x, xbar, var, n) :\n", " #using sigma=exp(x[1]) to avoid negative sigma, or could just set likelihood\n", " # to something very small for negative sigma\n", " mu, sigma = x[0],np.exp(x[1])\n", " return -n * np.log(sigma) - n / 2. / sigma**2 * ((xbar-mu)**2 + var)\n", "\n", "def pi(x) :\n", " return np.exp(logL(x))\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# set up walkers with initial values\n", "nwalkers=4\n", "ndim=2\n", "pos=np.random.uniform(size=(ndim*nwalkers)).reshape(nwalkers,ndim)\n", "\n", "#run emcee\n", "sampler=emcee.EnsembleSampler(nwalkers,ndim,logL,args=[xbar,var,n])\n", "s=sampler.run_mcmc(pos,5000)\n", "print('acceptance fraction: ',np.mean(sampler.acceptance_fraction))\n", "\n", "#sampler.flatchain.shape\n", "sampler.flatchain[:,1]= np.exp(sampler.flatchain[:,1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To visualize the results, we will use a corner plot, but this time we don't have a grid of values,\n", "but a set of points that samples the posterior PDF. Foreman-Mackey has written a nice package \n", "called corner to do the plotting for us, with lots of nice options for labelling, etc." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# corner plots to display output\n", "import corner\n", "plot=corner.corner(sampler.flatchain,plot_contours=True,\n", " labels=['mean','variance'],show_titles=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Straight line fit to points" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def Lmain(par,xdata,ydata,sig) :\n", " \"\"\" likelihood of point \"\"\"\n", " if sig <= 0 : return -np.inf\n", " return 1./sig/np.sqrt(2*np.pi) * np.exp(-0.5*(ydata-par[0]-par[1]*xdata)**2/sig**2)\n", "\n", "def logPDF(par,xdata,ydata,sig) :\n", " \"\"\" log(likelihood) for sum of main and outlier, weighted by outlier fraction\"\"\"\n", " if sig <= 0 : return -np.inf\n", " Ltot = Lmain(par,xdata,ydata,sig)\n", " return np.sum(np.log(Ltot))\n", "\n", "# simulate and plots some data, setting one point to be an outlier\n", "xdata=np.arange(0,20)\n", "intercept,slope,sig = 2.,5.,10.\n", "ydata=intercept+slope*xdata+np.random.normal(0,sig,size=len(xdata))\n", "plt.plot(xdata,ydata,'ro')\n", "plt.plot(xdata,intercept+slope*xdata)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# do the MCMC using emcee\n", "nwalkers=6\n", "ndim=2\n", "sampler=emcee.EnsembleSampler(nwalkers,ndim,logPDF,args=[xdata,ydata,sig])\n", "\n", "# starting guesses for parameters for each walker\n", "start=np.array([np.random.uniform(0,10.,size=nwalkers),\n", " np.random.uniform(0,10.,size=nwalkers)]).T\n", "\n", "# run MCMC\n", "s=sampler.run_mcmc(start,5000)\n", "\n", "# show acceptance fractions and chains\n", "print('acceptance fraction: {:8.3f}'.format(sampler.acceptance_fraction[0]))\n", "print('mean autocorrelation time: {:8.3f}'.format(np.mean(sampler.get_autocorr_time())))\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "#flat_samples = sampler.flatchain\n", "flat_samples = sampler.get_chain(discard=100,thin=15,flat=True)\n", "\n", "fig,ax=plt.subplots(ndim,1)\n", "for i in range(ndim) :\n", " ax[i].plot(sampler.flatchain[:,i])\n", "\n", "# corner plot of parameters\n", "plot=corner.corner(flat_samples,plot_contours=False,\n", " labels=['intercept','slope'],show_titles=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Straight line fit with outliers, use a mixture model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def Lmain(par,xdata,ydata,sig) :\n", " \"\"\" likelihood of main distribution \"\"\"\n", " if sig <= 0 : return -np.inf\n", " return 1./sig/np.sqrt(2*np.pi) * np.exp(-0.5*(ydata-par[0]-par[1]*xdata)**2/sig**2)\n", "\n", "def Lout(par,xdata,ydata,sig) :\n", " \"\"\" outlier likelihood: broad gaussian \"\"\"\n", " if sig <= 0 : return -np.inf\n", " return 1./(5*sig)/np.sqrt(2*np.pi) * np.exp(-0.5*(ydata-par[0]-par[1]*xdata)**2/(5*sig)**2)\n", "\n", "def logL(par,xdata,ydata,sig) :\n", " \"\"\" log(likelihood) for sum of main and outlier, weighted by outlier fraction\"\"\"\n", " if sig <= 0 : return -np.inf\n", " if par[2] < 0 or par[2] > 1 : return -np.inf\n", " Ltot = (1.-par[2])*Lmain(par,xdata,ydata,sig) + par[2]*Lout(par,xdata,ydata,sig)\n", " return np.sum(np.log(Ltot))\n", "\n", "# simulate and plots some data, setting one point to be an outlier\n", "xdata=np.arange(0,20)\n", "intercept,slope,sig = 2.,5.,5\n", 
"ydata=intercept+slope*xdata+np.random.normal(0,sig,size=len(xdata))\n", "ydata[2]=ydata[2]+50\n", "plt.plot(xdata,ydata,'ro')\n", "plt.plot(xdata,intercept+slope*xdata)\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# do the MCMC using emcee\n", "\n", "nwalkers=6\n", "ndim=3\n", "sampler=emcee.EnsembleSampler(nwalkers,ndim,logL,args=[xdata,ydata,sig])\n", "\n", "# starting guesses for parameters for each walker\n", "start=np.array([np.random.uniform(0,10.,size=nwalkers),\n", " np.random.uniform(0,10.,size=nwalkers),\n", " np.random.uniform(0.,1.,size=nwalkers)]).T\n", "# run MCMC\n", "s=sampler.run_mcmc(start,5000)\n", "\n", "# show acceptance fractions and chains\n", "print(sampler.acceptance_fraction)\n", "fig,ax=plt.subplots(ndim,1)\n", "for i in range(ndim) :\n", " ax[i].plot(sampler.flatchain[:,i])\n", "\n", "# corner plot of parameters\n", "plot=corner.corner(sampler.flatchain,plot_contours=False,\n", " labels=['intercept','slope','outlier prob'],show_titles=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# calculate the likelihoods at each point, looking for outliers by looking\n", "# at ratio of probability it is an outler to total probability\n", "\n", "# use mean parameters for the calculation\n", "pmean=np.mean(sampler.flatchain,axis=0)\n", "print(pmean)\n", "# calculate the unweighted likelihoods\n", "Lo = Lout(pmean,xdata,ydata,sig)\n", "Lm = Lmain(pmean,xdata,ydata,sig)\n", "\n", "# loop over points, calculating the probability ratio\n", "for i in range(len(xdata)) :\n", " print('{:7.3f}{:8.3f}{:7.3f}'.format(\n", " xdata[i],ydata[i],pmean[2]*Lo[i]/((1-pmean[2])*Lm[i]+pmean[2]*Lo[i])))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, now let's generalize to the case where there are uncertainties in the x value as well. Now, to get the likelihood for each point, we need to integrate over all possible values of x given some model for the distribution of xdata." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# simulate and plot some data, with uncertainties on both axes\n", "sigx=3\n", "xdata0=np.arange(0,20)\n", "xdata=xdata0 + np.random.normal(0,sigx,size=len(xdata))\n", "intercept,slope,sig = 2.,3.,3\n", "ydata0=intercept+slope*xdata0\n", "ydata=ydata0+np.random.normal(0,sig,size=len(xdata))\n", "plt.plot(xdata,ydata,'ro')\n", "plt.plot(xdata0,ydata0,'go')\n", "plt.plot(xdata0,intercept+slope*xdata0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pdb\n", "def gauss(x,mu,sig) :\n", " return 1./sig/np.sqrt(2*np.pi)*np.exp(-0.5*(x-mu)**2/sig**2)\n", "\n", "import scipy.integrate\n", "def logL0(par,xdata,ydata,sig,sigx) :\n", " \"\"\" Likelihood function ignoring the uncertaintes in xdata \"\"\"\n", " L = gauss(ydata,par[0]+par[1]*xdata,sig)\n", " return np.sum(np.log(L))\n", "\n", "def logL(par,xdata,ydata,sig,sigx) :\n", " \"\"\" log(likelihood) including uncertainties in x\"\"\"\n", " if sig <= 0 : return -np.inf\n", " # loop over each point\n", " L =np.zeros(len(ydata))\n", " for i in range(len(ydata)) :\n", " # build up integrand, which has weight for the xprime, and y values at the xprime\n", " xprime=np.linspace(xdata[i]-3*sigx,xdata[i]+3*sigx,50)\n", " yint=gauss(xprime,xdata[i],sigx)*gauss(ydata[i],par[0]+par[1]*xprime,sig)\n", " L[i]=scipy.integrate.simps(yint,xprime)\n", " \n", " return np.sum(np.log(L))\n", " \n", "\n", "# do the MCMC using emcee\n", "\n", "nwalkers=6\n", "ndim=2\n", "\n", "# first do a run assuming no uncertainties in x, to get starting guesses\n", "sampler0=emcee.EnsembleSampler(nwalkers,ndim,logL0,args=[xdata,ydata,sig,sigx])\n", "# starting guesses for parameters for each walker\n", "start=np.array([np.random.uniform(0.,5.,size=nwalkers),\n", " np.random.uniform(0.,5.,size=nwalkers)]).T\n", "\n", "# run MCMC\n", "s=sampler0.run_mcmc(start,1000)\n", "# show acceptance fractions and chains\n", "print('acceptance fractions: ',sampler0.acceptance_fraction)\n", "# calculate mean and std of output chain\n", "pmean0=np.mean(sampler0.flatchain,axis=0)\n", "psig0=np.std(sampler0.flatchain,axis=0)\n", "print('pmean0: ', pmean0)\n", "print('psigma0: ',psig0)\n", "\n", "# plot the results\n", "fig,ax=plt.subplots(ndim,1)\n", "for i in range(ndim) :\n", " ax[i].plot(sampler0.flatchain[:,i])\n", "\n", "# corner plot of parameters\n", "plot=corner.corner(sampler0.flatchain,plot_contours=False,\n", " labels=['intercept','slope'],show_titles=True)\n", "\n", "\n", "# now allow for uncertainties in x\n", "#start=np.array([np.random.uniform(pmean0[0]-3*psig0[0],pmean0[0]+3*psig0[0],size=nwalkers),\n", "# np.random.uniform(pmean0[1]-3*psig0[1],pmean0[1]+3*psig0[1],size=nwalkers)]).T\n", "#sampler=emcee.EnsembleSampler(nwalkers,ndim,logL,args=[xdata,ydata,sig,sigx])\n", "#s=sampler.run_mcmc(start,1000)\n", "#print(sampler.acceptance_fraction)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "# now allow for uncertainties in x\n", "start=np.array([np.random.uniform(pmean0[0]-3*psig0[0],pmean0[0]+3*psig0[0],size=nwalkers),\n", " np.random.uniform(pmean0[1]-3*psig0[1],pmean0[1]+3*psig0[1],size=nwalkers)]).T\n", "sampler=emcee.EnsembleSampler(nwalkers,ndim,logL,args=[xdata,ydata,sig,sigx])\n", "s=sampler.run_mcmc(start,1000)\n", "print(sampler.acceptance_fraction)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# plot the results\n", 
"fig,ax=plt.subplots(ndim,1)\n", "for i in range(ndim) :\n", " ax[i].plot(sampler.flatchain[:,i])\n", "\n", "# corner plot of parameters\n", "plot=corner.corner(sampler.flatchain,plot_contours=False,\n", " labels=['intercept','slope'],show_titles=True)\n", "\n", "pmean=np.mean(sampler.flatchain,axis=0)\n", "fix,ax=plt.subplots(1,1)\n", "plt.plot(xdata,ydata,'ro')\n", "xplt=np.arange(0,20,0.1)\n", "#plot fit using mean parameters in blue, true relation in green, fit using mean\n", "# parameters for fit neglecting x uncertainties in red\n", "plt.plot(xplt,pmean[0]+pmean[1]*xplt,color='b')\n", "plt.plot(xplt,intercept+slope*xplt,color='g')\n", "plt.plot(xplt,pmean0[0]+pmean0[1]*xplt,color='r')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.11" } }, "nbformat": 4, "nbformat_minor": 1 }