The process of data reduction and analysis requires manipulation of
raw data. With the advent of digital detectors, this is well suited
to computers and software.
Historically, various software packages have been developed
specifically for astronomical image processing, e.g.:
- IRAF (note in particular the PyRAF Python interface)
- XVISTA
- GAIA
- FIGARO: Caltech
- MIDAS: European
- AIPS: radio
- Add-on packages: STSDAS (Space Telescope), PROS (X-ray), DAOPHOT (stellar photometry), SExtractor (source detection), ...
These all had various specialties and pros and cons: availability, cost,
GUI vs. command line, data handling (disk vs. memory), speed, ease of use
(e.g., keywords vs. parameter files), language and access to existing code,
ability to add new code, and scripts/procedures (internal control language).
Most of these packages recognized the value of scripting and repeating
operations for many images, and, as a result, included some built-in
scripting capabilities.
With the development of scripting languages such as IDL and Python, things
have been gradually shifting away from the astronomy-specific packages
towards routines written in the generic scripting language, or at least
callable from it.
Some image processing tasks are simple and some are complex. For complex
tasks, it is convenient to use pre-built tools. But it is generally
important to recognize what a given tool is doing and whether it has
limitations: you shouldn't be limited in what you choose to do by the tool
you are comfortable with, so always keep open the possibility of using
other tools, or improving the capability of a tool!
What software should you learn? These days, many instruments require rather
involved tasks for reducing data. Often, the instrument team or observatory
supplies routines (in some package) for doing these tasks. Generally, it
may be easier to use these routines rather than reprogram them using
your favorite tool! So you are probably in the position of having to be
comfortable with multiple tools.
An alternative way to look at things is that, to be at the forefront, you
will likely be working with new instruments and/or new techniques. Standard
analysis tools may not take full advantage of new data, or may not work
with it at all. So you want to be in the position of having the
flexibility to develop tools yourself.
There are several programming environments that make it fairly simple
to work with astronomical data. Here, we'll provide an introduction
to two of the more popular environments in the US: Python (especially
useful in conjunction with PyRAF) and IDL. Working in one of these
environments allows you to script the use of existing routines, and also
to develop your own routines. It is also extremely important to have
tools that allow you to explore data.
- Basics
- Start Python using: ipython --matplotlib
- Python works with objects. All objects can have attributes and methods.
- Get information:
- type(var) gives type of variable
- var? gives information on variable (IPython only)
- var.<tab> gives information on variable attributes and methods
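A minimal interactive sketch of these introspection tools (the variable here is just an illustrative numpy array):

    import numpy as np
    data = np.arange(10.)   # a simple numpy array to inspect
    type(data)              # -> numpy.ndarray
    # In IPython only:
    #   data?               # docstring and type information
    #   data.<tab>          # tab completion lists attributes and methods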
- Python as a language
- conditionals via if/elif/else
- looping via for, while
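As a quick illustration of both constructs:

    x = 5
    if x < 0:
        print('negative')
    elif x == 0:
        print('zero')
    else:
        print('positive')
    for i in range(3):      # loop over 0, 1, 2
        print(i)
    while x > 0:            # count down to zero
        x -= 1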
- File I/O with astropy
- FITS: header/data, data types, HDUList, etc.
- from astropy.io import fits
- hd=fits.open(filename) returns an HDUList
- hd[0].data is the data from the initial HDU, as a numpy array
- hd[0].header is the header from the initial HDU, as a fits.Header object
- ASCII:
- from astropy.io import ascii
- a=ascii.read(filename) returns Table with columns
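A minimal sketch putting these together (the filenames are placeholders):

    from astropy.io import fits, ascii
    hd = fits.open('image.fits')      # returns an HDUList
    data = hd[0].data                 # numpy array
    header = hd[0].header             # fits.Header object
    print(header['NAXIS1'], data.shape)
    tab = ascii.read('table.txt')     # returns an astropy Table
    print(tab.colnames)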
- Image statistics
- numpy array methods: e.g.:
- data.sum() : total
- data.mean() : mean
- data.std() : standard deviation
- np.median(data) : median
- subsections: data[y1:y2,x1:x2]
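For example, a short sketch of these statistics on a subsection (the coordinates are arbitrary; hd is the HDUList read above):

    import numpy as np
    sub = hd[0].data[100:200, 300:400]   # [y1:y2, x1:x2]
    print(sub.sum(), sub.mean(), sub.std(), np.median(sub))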
- Image display
- primitive display via imshow
- plt.imshow(hd[0].data,vmin=min,vmax=max)
- display using pyds9
- from pyds9 import *
- d=DS9() (opens a DS9 window, associates with object d)
- d.set("fits filename") (display from file)
- d.set_pyfits(hd) (display from HDULIST)
- d.set_np2arr(hd[0].data) (display from numpy array)
- d.set("scale limits 400 500") (sets display range)
- many other commands are available via d.set(); see the DS9 XPA command list
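A minimal end-to-end sketch, assuming DS9 and pyds9 are installed and hd is the HDUList read earlier:

    from pyds9 import DS9
    d = DS9()                        # open or attach to a DS9 window
    d.set_pyfits(hd)                 # display the HDUList
    d.set('scale limits 400 500')    # set the display range
    d.set('zoom to fit')             # zoom so the whole image is visible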
- display with tv
- import sys
- sys.path.append("/home/holtz/python") (so Python can find the package; note that setting os.environ["PYTHONPATH"] within a running session does not affect the import path)
- from holtz.pyvista import tv
- t=tv.TV()
- t.tv(hd[0],min=400,max=500)
- t.tv(hd[0].data)
- zoom, pan, colorbar
- blinking image buffers with +/-
- Plotting
- plt.figure() (start a new figure)
- plt.plot(hd[0].data[:,100]) (for a plot along column 100)
- plt.plot(hd[0].data[500,:]) (for a plot along row 500)
- Histogram: plt.hist(data.flatten(),bins=n) or plt.hist(data.flatten(),bins=np.arange(min,max,delta),log=True), where bins can be either a number of bins or an array of bin edges, and log=True gives a logarithmic y axis
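A short sketch combining these (the column/row indices and histogram bins are arbitrary):

    import numpy as np
    import matplotlib.pyplot as plt
    data = hd[0].data
    plt.figure()
    plt.plot(data[:, 100])            # cut along column 100
    plt.plot(data[500, :])            # cut along row 500
    plt.figure()
    plt.hist(data.flatten(), bins=np.arange(300., 700., 5.), log=True)
    plt.show()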
Let's continue to work with images in /home/holtz/raw/apo/a535/spicam/UT061215.
- Read in several images: SN17135_r.0103.fits, flat_r.0015.fits,
bias.0001.fits. Display them from inside a software environment (
Python/pyds9/tv/imshow or IDL/atv) and get familiar with using that.
- Construct image histograms for several images: SN17135_r.0103.fits,
flat_r.0015.fits, bias.0001.fits. See if you can explain where all of the
different values are coming from.
- Choose some sky subregions in SN17135_r.0103.fits,
excluding objects in the field. Look at some pixel histograms of these
regions. Determine the mean and standard deviation
in the regions. What do you expect for the standard deviations? Are the
standard deviations what you expect? Is the standard deviation in a region
with an object a meaningful/useful quantity to look at?
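One possible sketch for this exercise (the region coordinates are placeholders; choose your own object-free regions):

    import numpy as np
    from astropy.io import fits
    data = fits.open('SN17135_r.0103.fits')[0].data.astype(float)
    for (y1, y2, x1, x2) in [(100, 150, 200, 250), (400, 450, 600, 650)]:
        sub = data[y1:y2, x1:x2]
        print(sub.mean(), sub.std())   # mean and scatter in each region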
- Note the overscan region at the right side of the image (look
at SN17135_r.0103.fits or flat_r.0015.fits). This gives the bias
level, a constant that is added to the image to ensure that readout noise
does not lead to any negative values. Determine the mean in the bias region,
as well as the standard deviation, using image statistics. What do you think
the standard deviation is giving you information about?
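Continuing the previous sketch (the overscan column range is a hypothetical placeholder; read the actual range off the image):

    overscan = data[:, 1024:1060]      # hypothetical overscan columns
    bias_level = overscan.mean()
    print(bias_level, overscan.std())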
- Using the mean bias level, use image arithmetic to subtract this
level off of the entire image. How will this affect the previously calculated
means and standard deviations in the image subregions? Are the standard
deviations what you expect?
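For example, continuing the sketches above:

    data_sub = data - bias_level       # subtract the constant bias level
    sub = data_sub[100:150, 200:250]
    print(sub.mean(), sub.std())       # mean drops by bias_level; std is unchanged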
- For many astronomical detectors, the numbers recorded are not the
number of photons incident on each pixel, but are related to the number
of photons by a multiplicative factor, called the gain. For SPICAM, the
gain is about 4, i.e., the number of photons incident on each pixel is
given by 4 times the data numbers. Convert the images to photon counts
by multiplying the image by this factor. How will this affect the previously
calculated means and standard deviations?
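For example, continuing the sketches above:

    gain = 4.0                         # approximate photons per data number for SPICAM
    photons = data_sub * gain
    sub = photons[100:150, 200:250]
    print(sub.mean(), sub.std())       # both mean and std scale by the gain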
- Now that you've subtracted bias level and accounted for gain, are
the standard deviations in your subregions closer to your expectations?
- There is still some extra scatter coming from pixel-to-pixel
sensitivity variations. Looking at flat_r.0015.fits, estimate the level
of this variation.
- To correct for pixel-to-pixel variation, we divide the raw image by
a normalized flat field. Normalize the flat field by dividing it by the
mean in its central region (subtract the bias from the flat first!),
then divide the object frame by the normalized
flat field. Recalculate the image statistics in the subregion and
comment on how they have changed.
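A sketch of this procedure, continuing from above (the central-region coordinates are placeholders, and the gain scaling from the previous step can be applied as well):

    flat = fits.open('flat_r.0015.fits')[0].data.astype(float) - bias_level
    nflat = flat / flat[400:600, 400:600].mean()   # normalize by central mean
    corrected = (data - bias_level) / nflat
    sub = corrected[100:150, 200:250]
    print(sub.mean(), sub.std())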
- Write a program to loop over ALL images taken on UT061215 (note:
Python glob.glob('*.fits') returns a list of all FITS files), display
each one, and compute and output the mean bias level and the mean sky
level (for Python, check photutils.background or the mmm routine in
holtz.pyvista.tv; for IDL, check out the Astronomy Users Library sky
command).
Make a single plot of vertical cross sections
in the overscan regions, averaging over the width of the overscan region.
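One possible skeleton for this program (the overscan column range is a hypothetical placeholder, np.median stands in as a crude sky estimate in place of photutils.background or the local mmm routine, and the display step with pyds9/tv is omitted for brevity):

    import glob
    import numpy as np
    import matplotlib.pyplot as plt
    from astropy.io import fits

    plt.figure()
    for fname in sorted(glob.glob('*.fits')):
        data = fits.open(fname)[0].data.astype(float)
        overscan = data[:, 1024:1060]                       # hypothetical overscan columns
        bias_level = overscan.mean()
        sky_level = np.median(data[:, :1024]) - bias_level  # crude sky estimate
        print(fname, bias_level, sky_level)
        plt.plot(overscan.mean(axis=1))   # vertical cross section, averaged over overscan width
    plt.show()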