The process of data reduction and analysis requires manipulation of
raw data. With the advent of digital detectors, this is well suited
to computers and software.
Historically, various software packages have been developed
specifically for astronomical image processing, e.g.:
- IRAF (note in particular the PyRAF Python interface)
- XVISTA
- GAIA
- FIGARO: Caltech
- MIDAS: European
- AIPS: radio
- Add-on packages: STSDAS (Space Telescope), PROS (X-ray), DAOPHOT (stellar photometry), SExtractor (source detection), ...
These all had various specialties and pros and cons: availability, cost,
GUI vs. command line, data handling (disk vs. memory), speed, ease of use
(e.g., keywords vs. parameter files), language and access to existing code,
ability to add new code, and scripts/procedures (internal control language).
Most of these packages recognized the value of scripting and repeating
operations for many images, and, as a result, included some built-in
scripting capabilities.
With the development of scripting languages such as IDL and Python, things
have been gradually shifting away from the astronomy-specific packages
towards routines written in the generic scripting language, or at least
callable from it.
Some image processing tasks are simple and some are complex. For complex
tasks, it is convenient to use pre-built tools. But it is generally
important to recognize what a given tool is doing and whether it has
limitations: you shouldn't be limited in what you choose to do by the tool
you are comfortable with, so always keep open the possibility of using
other tools, or improving the capability of a tool!
What software should you learn? These days, many instruments require rather
involved tasks for reducing data. Often, the instrument team or observatory
supplies routines (in some package) for doing these tasks. Generally, it
may be easier to use these routines rather than reprogram them using
your favorite tool! So you are probably in the position of having to be
comfortable with multiple tools.
An alternative way to look at things is that, to be at the forefront, you
will likely be working with new instruments and/or new techniques. Standard
analysis tools may not take full advantage of new data, or may not work
with it at all. So you want to be in the position of having the
flexibility to develop tools yourself.
There are several programming environments that make it fairly simple
to work with astronomical data. Here, we'll provide an introduction
to two of the more popular environments in the US: Python (especially
useful in conjunction with PyRAF) and IDL. Working in one of these
environments allows you to script the use of existing routines, and also
to develop your own routines. It is also extremely important to have
tools that allow you to explore data.
- Basics
- Start Python using: ipython --matplotlib
- Python works with objects. All objects can have attributes and methods.
- Get information:
- type(var) gives type of variable
- var? gives information on variable (IPython only)
- var.<tab> gives information on variable attributes and methods
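A minimal interactive sketch of these introspection tools (the variable here is just an illustrative numpy array):

    import numpy as np
    data = np.arange(10.)   # a simple numpy array to inspect
    type(data)              # -> numpy.ndarray
    # In IPython only:
    #   data?               # docstring and type information
    #   data.<tab>          # tab completion lists attributes and methods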
- Python as a language
- conditionals via if/elif/else
- looping via for, while
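As a quick illustration of both constructs:

    x = 5
    if x < 0:
        print('negative')
    elif x == 0:
        print('zero')
    else:
        print('positive')
    for i in range(3):      # loop over 0, 1, 2
        print(i)
    while x > 0:            # count down to zero
        x -= 1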
- File I/O with astropy
- FITS: header/data, data types, HDUList, etc.
- from astropy.io import fits
- hd=fits.open(filename) returns an HDUList
- hd[0].data is the data from the initial HDU, as a numpy array
- hd[0].header is the header from the initial HDU, as a fits.Header object
- ASCII:
- from astropy.io import ascii
- a=ascii.read(filename) returns Table with columns
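A minimal sketch putting these together (the filenames are placeholders):

    from astropy.io import fits, ascii
    hd = fits.open('image.fits')      # returns an HDUList
    data = hd[0].data                 # numpy array
    header = hd[0].header             # fits.Header object
    print(header['NAXIS1'], data.shape)
    tab = ascii.read('table.txt')     # returns an astropy Table
    print(tab.colnames)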
- Image statistics
- numpy array methods: e.g.:
- data.sum() : total
- data.mean() : mean
- data.std() : standard deviation
- np.median(data) : median
- subsections: data[y1:y2,x1:x2]
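For example, a short sketch of these statistics on a subsection (the coordinates are arbitrary; hd is the HDUList read above):

    import numpy as np
    sub = hd[0].data[100:200, 300:400]   # [y1:y2, x1:x2]
    print(sub.sum(), sub.mean(), sub.std(), np.median(sub))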
- Image display
- primitive display via imshow
- plt.imshow(hd[0].data,vmin=min,vmax=max)
- display using pyds9
- from pyds9 import *
- d=DS9() (opens a DS9 window, associates with object d)
- d.set("fits filename") (display from file)
- d.set_pyfits(hd) (display from HDULIST)
- d.set_np2arr(hd[0].data) (display from numpy array)
- d.set("scale limits 400 500") (sets display range)
- many other commands are available via d.set(); see the DS9 XPA command list
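A minimal end-to-end sketch, assuming DS9 and pyds9 are installed and hd is the HDUList read earlier:

    from pyds9 import DS9
    d = DS9()                        # open or attach to a DS9 window
    d.set_pyfits(hd)                 # display the HDUList
    d.set('scale limits 400 500')    # set the display range
    d.set('zoom to fit')             # zoom so the whole image is visible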
- display with tv
- import sys
- sys.path.append("/home/holtz/python") (so Python can find the package; note that setting os.environ["PYTHONPATH"] within a running session does not affect the import path)
- from holtz.pyvista import tv
- t=tv.TV()
- t.tv(hd[0],min=400,max=500)
- t.tv(hd[0].data)
- zoom, pan, colorbar
- blinking image buffers with +/-
- Plotting
- plt.figure() (start a new figure)
- plt.plot(hd[0].data[:,100]) (for a plot along column 100)
- plt.plot(hd[0].data[500,:]) (for a plot along row 500)
- Histogram: plt.hist(data.flatten(),bins=n) or plt.hist(data.flatten(),bins=np.arange(min,max,delta),log=True), where bins can be either a number of bins or an array of bin edges, and log=True gives a logarithmic y axis
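A short sketch combining these (the column/row indices and histogram bins are arbitrary):

    import numpy as np
    import matplotlib.pyplot as plt
    data = hd[0].data
    plt.figure()
    plt.plot(data[:, 100])            # cut along column 100
    plt.plot(data[500, :])            # cut along row 500
    plt.figure()
    plt.hist(data.flatten(), bins=np.arange(300., 700., 5.), log=True)
    plt.show()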
Let's continue to work with images in /home/holtz/raw/apo/a535/spicam/UT061215.
- Read in several images: SN17135_r.0103.fits, flat_r.0015.fits,
bias.0001.fits. Display them from inside a software environment (
Python/pyds9/tv/imshow or IDL/atv) and get familiar with using that.
- Construct image histograms for several images: SN17135_r.0103.fits,
flat_r.0015.fits, bias.0001.fits. See if you can explain where all of the
different values are coming from.
- Choose some sky subregions in SN17135_r.0103.fits,
excluding objects in the field. Look at some pixel histograms of these
regions. Determine the mean and standard deviation
in the regions. What do you expect for the standard deviations? Are the
standard deviations what you expect? Is the standard deviation in a region
with an object a meaningful/useful quantity to look at?
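One possible sketch for this exercise (the region coordinates are placeholders; choose your own object-free regions):

    import numpy as np
    from astropy.io import fits
    data = fits.open('SN17135_r.0103.fits')[0].data.astype(float)
    for (y1, y2, x1, x2) in [(100, 150, 200, 250), (400, 450, 600, 650)]:
        sub = data[y1:y2, x1:x2]
        print(sub.mean(), sub.std())   # mean and scatter in each region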
- Note the overscan region at the right side of the image (look
at SN17135_r.0103.fits or flat_r.0015.fits). This gives the bias
level, a constant that is added to the image to ensure that readout noise
does not lead to any negative values. Determine the mean in the bias region,
as well as the standard deviation, using image statistics. What do you think
the standard deviation is giving you information about?
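Continuing the previous sketch (the overscan column range is a hypothetical placeholder; read the actual range off the image):

    overscan = data[:, 1024:1060]      # hypothetical overscan columns
    bias_level = overscan.mean()
    print(bias_level, overscan.std())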
- Using the mean bias level, use image arithmetic to subtract this
level off of the entire image. How will this affect the previously calculated
means and standard deviations in the image subregions? Are the standard
deviations what you expect?
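For example, continuing the sketches above:

    data_sub = data - bias_level       # subtract the constant bias level
    sub = data_sub[100:150, 200:250]
    print(sub.mean(), sub.std())       # mean drops by bias_level; std is unchanged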
- For many astronomical detectors, the numbers recorded are not the
number of photons incident on each pixel, but are related to the number
of photons by a multiplicative factor, called the gain. For SPICAM, the
gain is about 4, i.e., the number of photons incident on each pixel is
given by 4 times the data numbers. Convert the images to photon counts
by multiplying the image by this factor. How will this affect the previously
calculated means and standard deviations?
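For example, continuing the sketches above:

    gain = 4.0                         # approximate photons per data number for SPICAM
    photons = data_sub * gain
    sub = photons[100:150, 200:250]
    print(sub.mean(), sub.std())       # both mean and std scale by the gain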
- Now that you've subtracted bias level and accounted for gain, are
the standard deviations in your subregions closer to your expectations?
- There is still some extra scatter coming from pixel-to-pixel
sensitivity variations. Looking at flat_r.0015.fits, estimate the level
of this variation.
- To correct for pixel-to-pixel variation, we divide the raw image by
a normalized flat field. Normalize the flat field by dividing it by the
mean in its central region (subtract the bias from the flat first!),
then divide the object frame by the normalized
flat field. Recalculate the image statistics in the subregion and
comment on how they have changed.
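A sketch of this procedure, continuing from above (the central-region coordinates are placeholders, and the gain scaling from the previous step can be applied as well):

    flat = fits.open('flat_r.0015.fits')[0].data.astype(float) - bias_level
    nflat = flat / flat[400:600, 400:600].mean()   # normalize by central mean
    corrected = (data - bias_level) / nflat
    sub = corrected[100:150, 200:250]
    print(sub.mean(), sub.std())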
- Write a program to loop over ALL images taken on UT061215 (note:
Python glob.glob('*.fits') returns a list of all FITS files), display
each one, and compute and output the mean bias level and the mean sky
level (for Python, check photutils.background or the mmm routine in
holtz.pyvista.tv; for IDL, check out the Astronomy Users Library sky
command).
Make a single plot of vertical cross sections
in the overscan regions, averaging over the width of the overscan region.
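One possible skeleton for this program (the overscan column range is a hypothetical placeholder, np.median stands in as a crude sky estimate in place of photutils.background or the local mmm routine, and the display step with pyds9/tv is omitted for brevity):

    import glob
    import numpy as np
    import matplotlib.pyplot as plt
    from astropy.io import fits

    plt.figure()
    for fname in sorted(glob.glob('*.fits')):
        data = fits.open(fname)[0].data.astype(float)
        overscan = data[:, 1024:1060]                       # hypothetical overscan columns
        bias_level = overscan.mean()
        sky_level = np.median(data[:, :1024]) - bias_level  # crude sky estimate
        print(fname, bias_level, sky_level)
        plt.plot(overscan.mean(axis=1))   # vertical cross section, averaged over overscan width
    plt.show()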