Near-Infrared Spectroscopic Observing and Data Reduction

Working in the J, H, and K Bands

While near-IR photometry is only slightly more complicated than optical data, near-IR spectroscopy verges towards the boundary between art and science. But there is important work to be done at these wavelengths. First, you can do spectral typing and abundance work just like in the optical regime. Here is a K-band sequence of cool dwarfs:

Here's a classical nova observed using SPEX:

And here's the spectrum of a "hard x-ray transient" that all have turned out to be (highly reddened) O and B supergiants (this one has AV = 9 so even the SPEX J-band spectrum was not very good!):

Telluric Features (suck)

First, let's look a little more closely at the atmospheric transmission in these bandpasses. Here are the J, H, and K-bands:

The majority of the telluric absorption is from water vapor (0.92, 1.13, 1.35, 1.85, and beyond 2.4 μm). But there are strong carbon dioxide features near 1.6 μm, and at 2.0 and 2.06 μm. Water vapor absorption decreases rapidly with altitude, but carbon dioxide remains better mixed, so it decreases very slowly.

While these figures show the large scale telluric features quite well, we will see that there are innumerable smaller features that are just as problematic. We have to correct our spectra for these bloody features to have any chance at extracting useful science (but as shown at the top of this page, it must be possible ;-). The techniques we will use for reducing these spectra are like a combination of near-IR photometric reduction, and optical spectral reduction with the additional step of telluric correction. Let's look at a couple of raw frames. Here is an H-band spectrum of a bright star, followed by its trace:

Note that the blue cutoff is caused by the order sorting filter--it is just an H-band filter somewhere in front of the grating. Here are the same types of plots for the K-band:

It is easy enough to identify the telluric absorption features shown in the earlier transmission traces. Let's look at the entire near-IR spectrum to get a better view of what we are up against--this will show the emission lines too (mostly OH). Here is a cross-dispersed data set from the SPEX spectrograph at the IRTF. This instrument is very similar to TripleSpec at APO. Here we get the spectrum from 0.77 μm all the way to 2.48 μm in a single go. The figure on the right allows the identification of the various orders (we will reduce this type of data very shortly!).

Note that there are some reflections and other things going on inside SPEX. But the IRTF is a very good IR telescope, so the normal, large thermal emission component at the red end of the K-band is not very strong here (it is easy to see in other, higher emissivity telescopes like the Keck data in the exercise that follows). So what is the first step in reducing IR spectra? Background subtraction! Here is an A-B for that last image:

We can see how well this cleans things up, but note that there are some residual OH features in the H-band that do not completely disappear. Like optical spectra, we will extract and subtract a sky region that parallels the object spectrum. As long as the residuals are relatively weak, they do not cause too many problems (but these damn OH lines can be highly transient in strength--they are somewhat like the aurora borealis in nature).

IR Spectral Reduction

There are six steps to obtaining an infrared spectrum of an object:

  • Background subtraction
  • Flatfielding the resultant image
  • Tracing and extracting the spectrum
  • Wavelength calibration
  • Telluric correction
  • Flux calibration

    This is more or less like what you do in reducing optical spectra. We will go over all of these steps to insure everyone knows how to do it, but for some of you this will be familiar. For this exercise we are going to be reducing K-band spectra obtained with NIRSPEC on Keck. This instrument has two modes, a moderate (R ≈ 3000), and high (R ≈ 30000) modes. We will work with low resolution data. Here is a raw spectrum:

    Note that there is considerable curvature. In high resolution mode, the spectra are cross-dispersed. Due to the desire to get many orders on the chip, this means that the non-cross-dispersed data are severely curved. This can cause issues with sky-line subtraction. Also note the limited fill factor of the actual data---we will trim these images before using them (it saves disk space, and allows better statistics on flatfielding, etc.). There are three other things we need to reduce spectroscopic data:

    We need bias frames, flatfields, and arcs. As with IR images, we will be subtracting off the bias level when we form our A − B differences. The bias frames here are only needed to subtract from the flats and arcs. Obviously, you will have to use some sort of flatfield lamp to get useful flats for spectroscopic data. The typical IR arc lamp (like that used in the optical) is the HeNeAr lamp, with Argon supplying nearly all of the lines in the K-band. But Xenon lamps are often used to fill in the wavelength coverage for high resolution observations.

    Exercise #2 (Part 1: single bandpass, point source IR spectral reduction)

    Before we actually talk about HOW to obtain good IR spectroscopic data, it is worth reducing some first to see the process. We will talk about the pitfalls of this type of data along the way, and talk about how to get a useful data set. The name of the data set we will use here is "/home/nadir/tharriso/keckdata.tar" and included in this tarfile is an observation log constructed from the headers letting you know what these files are. In this file there are three sets of standard star observations (for telluric correction), flats, biases ("Lampsoff1"), an arc ("Argonlamps"), and six program object (ST LMi) spectra (by the way, if you wish to see the thermal emission from a telescope that is not optimized for IR observations, take a look at one of the raw ST LMi spectra). The first thing we want to do is trim these images. Start-up iraf and a ds9 window. Display a flatfield (e.g., feb17s0100.fits). Using the cursor, identify the x and y ranges of the actual, exposed data. It is easy to obtain image subsets using the imcopy command. For example: "imcopy feb17s0100.fits[50:400,1:1024] subset100.fits" (I made up that image section---I want you to find the correct one). You should do this for EVERY image with IDENTICAL image sections (ccdproc will barf if your images are not identical in size).

    Step 1: Form median biases and flats then subtract the bias from your flat and your arc.

    Step 2: Form the image differences like we did for the IR imaging. (Be careful here---the ST LMi data set is ABBA!)

    Step 3: Flatfield the difference images and the arc lamp (using ccdproc).

    After this you should have images that look something like this:

    Now comes the actual spectrum extraction process. In iraf type "twod" and "apex" and also "oned". We need to set the dispersion axis correctly for these NIRSPEC data so epar apextract:

    Ok, the next thing is the very complex package called "apall". Jon had you use "doslit" last semester for spectroscopic reduction, but here I will assume you know nothing and start fresh with apall. The first thing to do is just run apall to see what everything looks like. Let's do it on that first standard star (HD65781). The first window you will see looks something like this:

    It has correctly identified the spectrum here. This part of the program is very stupid, and will set the aperture at the highest peak (which is often a cosmic ray). If so, all you have to do to change that is to move the cursor to that (bad) aperture and hit "d". Then move to the correct aperture and hit "n". [Note: sometimes with very faint sources---like ST LMi!!!---the cosmic rays can render the source spectrum invisible at this step. If so, you will have to use the "imrep" command to reset all the wacky pixels to some manageable value before proceeding onward. There are ways to remove cosmic rays in iraf, but I have found them to be highly unreliable---my advice is to never use them.] We can adjust the size of the aperture later (it isn't that important), but let's continue through the rest of the process before editing the huge parameter file for apall. In the current irafterm window hit "b" to get to the background aperture windows, you should see something like this:

    These two little windows look ok. We will see what to do with them in a bit. Now we hit "q" to quit, and then "q" again, and then "yes" and "yes", and "yes" again, to get to the trace:

    I have default values in apall that are listed up at the top of this frame, and that's why my fit may be better than yours. To set these type ":order 3" in that irafterm window, and ":func legen", and then an "f" to have it fit the trace. Now hit "q", and "yes" until it stops asking, and you should get this:

    This looks pretty good, so we are done for the moment, so "q" out of this. We can take a look at the spectrum we just created using splot: "splot" (the name of the spectrum is now for my particular naming scheme--the "ms" is the iraf multispec format where errors and such are kept in separate image planes). In splot, you can "x" over any cosmic rays (first do an "a" and an "a" to expand the spectrum on either side of a cosmic ray, and then an "x" and an "x" to patch-over the cosmic ray, and then an "r" to redraw the spectrum, a "c" to redraw the entire spectrum--to save this modified spectrum you can hit "i" to write it out--don't do that yet) to produce a cleaner looking spectrum.

    Except for the feature near pixel # 300 (H I Brackett γ) in the plot above, all of those features are tellluric (except for possible edge effects at the ends). Note that you cannot get the entire K bandpass with NIRSPEC in one go, so this spectrum will run from about 2.04 to 2.46 μm.

    Ok, now it is time to fix-up apall, here is the first page of the parameter file:

    There are not too many things to set here, most of the defaults will work ok. The "Default Aperture Parameters" basically set the size of the aperture in the first window when running apall. They do not actually affect what is extracted, so you keep them set to +/-15 or reset them to +/-5. The one thing we do want to change on this page is the "DEFAULT BACKGROUND PARAMETERS". I have set the background function to chebyshev and its order to "1". This works great for spectra where there is no gradient across the chip. But if you have some sort of illumination issue (say the moon was 5 degrees away), you can set this to second order, or whatever. This is simply the function that is fit to our two background sample regions. Those are defined on the next line. You can see I have mine set to "-25:-15,15:25", this is relative to the aperture center. For curved spectra like these Keck data, it is best to have these pretty snug. The problem is that the sky lines are tilted, and thus on one side it may have a "y" value of 78, and on the other a "y" value of 80. The subtraction of the background then imparts a "P-Cygni" like profile in your spectrum where it subtracts too much flux at one position, but not enough at the other! While our sky line residuals are not too bad here, we cannot get this perfect using iraf, and we end up with "noisy" spectra that have nothing to do with the count level.

    For the first standard star the set-up of "-25:-15,15:25" was necessary as the seeing was terrible. But when you apall the ST LMi and HD130767 spectra you will see that the seeing improved and these background apertures are way too far from the source spectrum! For those data I would use "-15:-10,10:15". Seeing changes happen, so be prepared to adjust your reduction process (regardless if it is optical or IR). Ok, here is the next page of apall parameters:

    We won't change too many of these either---the first two blocks set the centering parameters and how many apertures apall will attempt to find (yes, you can reduce cross-dispersed data using apall!). The only time you might want to change these is when you have two close-by sources. You might want to be very careful in how apall identifies and sets the aperture in this situation. The important block here is the "RESIZING PARAMETERS" one. This is where the actual extraction aperture is defined. You can make apall extract a defined width (the llimit and ulimit parameters), or let it extract a width that depends on the source flux: peak = yes, and then set to what level to extract that spectrum ("ylevel=0.05"). I usually have apall do this instead of having fixed extraction apertures. The next block is the "TRACING PARAMETERS", and this is where to set the tracing function and order to "legendre", and "3". Sometimes you can do a simple order=2 fit if there isn't too much curvature in the spectrum, but I usually find that order=3 is necessary for most instruments. The other thing to adjust here is the "t_sampl" parameter. This sets the trace limits. You will often find that the pixel values at the ends of any image or spectrum are junk--this is because they ARE near the edge of the chip, where we get strange affects. You can exclude those regions by setting this parameter (and it is essential to insure the trace goes through your data---note you can delete individual pixels in the trace window by hitting "d". Cosmic rays and bad columns can skew your trace). In the case here, our spectrum runs from y=1 to y=1024. You can insure we ignore the ends by setting it to something that avoids these ends (5:1019 for example), as shown on this last page:

    The next block is vital. In fact, when you started apall up, it might not have even subtracted a background. You want to set "backgro" to average, and the other parameters to what you see here. In theory, fit2d would be the way to go with tilted sky lines, but I have difficulty getting it to work as it should. You can also use the saturation and clean parameters to kill-off any saturated pixels (which would have values like 65535 for 16 bit data), but I don't let iraf do these things since it takes detective work to see what was actually done to your data! Note that optical data will require a similar set-up to what we have just done, so getting good with apall is useful.

    Step #4: Extract (or re-extract) all of the spectra for the standard stars and ST LMi using the appropriate parameters in apall.

    In the process of extracting these data, you have probably noticed that our spectra did not all end up identically in the same place when doing a beam switch. These types of things happen at telescopes! With NIRSPEC, there is quite a lot of persistence on the chip after a bright source. So, sometimes redefining exactly where the A and B beams are is not too bad an idea. Now it is time to extract arc spectra, something that we have to do carefully, since these spectra have severe curvature so that the A and B spectra will actually cover different wavelength ranges! This is not too hard though, here is the command you would use:

    apall arc reference=a044 out=a044arc recen− trace− backg− interac−

    What is this command? It uses the fit for the aperture of spectrum "a044" (not ""!) to extract an arc spectrum along this path. The "recen−", etc. terms turn off all of the other things in apall (if you subtract a background aperture on an arc you will have nothing left!). Note you must do this for every spectrum! With IR data we have to worry about exactly where the telluric features are. In optical data you can sometimes get away with using a single arc extraction for the wavelength calibration. But here we cannot do that. The result of the previous command is shown here:

    Here is a plot of a wavelength-calibrated argon arc in the K-band (plots of JHK argon arcs can be found on my homepage):

    Not too difficult to figure this one out. Time to "identify" the arc. First, epar the identify task and edit it so it looks like this:

    For optical data you would probably use linelists$henear.dat, but it doesn't have the IR lines that argon.dat does. Note that my plot above has wavelengths in microns, but the argon.dat file is in Angstroms. Run identify on all of your arcs---put the cursor on the leftmost bright line and hit "m" to mark it and then enter its wavelength in Angstroms, then move to the right most line and repeat that process. Your screen should look like this:

    Hit "f", and it will switch to the fitting window. Hit "q" to go back to the line ID window and hit "l" to automatically identify the rest of the lines. You should see this:

    There are two arc lines that are very close to bright lines (at 22000 and near 20600) that are probably ok to delete using "d" in this window. Now hit "f" again to get this:

    For some reason that arc line at 23139.5 doesn't fit very well. Let's delete it with a "d" in this window and hit "f" again to get this:

    You probably have to "q" out of this window then "f" back into it to get the last plot. Note that our RMS error on this fit is 0.9663 Angstroms. Our central wavelength is 22000 Angstroms, so our fitting error is ≈0.0044%. This is good (you should always attempt to get similar quality fits).

    Step #5: Identify all of the arcs

    Once this is done, we assign these fits to our data, and dispersion correct them:

    refspec reference=a044arc

    dispcor hd40686a

    splot hd40686a (I like to convert to using their real names at this step so I don't have to figure out what I am looking at---this is the A-beam spectrum for HD40686) and you should see something like this:

    Step #6: "refspec" and "dispcor" all of your object spectra.

    You might be suprised how high the signal is in this spectrum---nearly 90,000 counts. How is this possible when the 16 bit saturation limit is 65,535? Well two things: There are multiple rows of data summed together in each wavelength bin and that can give you a larger number than the 16 bit saturation limit. But what is really going on here is that we used 5 coadds on the standard stars, and NIRSPEC sums (not averages or medians) these up when it writes them out.

    Telluric Correction

    Now comes the hard part---tellurically correcting our ST LMi data using the various standard stars. It is time to talk about this issue. As you can see in the traces at the top of this document, the amount of water vapor above you changes the strength of the telluric features. Unfortunately, this constantly changes throughout a night. Ideally if we simply observed our object at one airmass, all we would need to perform a telluric correction would be to simply observe our standard star at the same airmass. This sometimes works well, other times it is a miserable failure! I have seen changes within ten minutes which result in a poor telluric correction. The main problem is that you will usually be observing some faint program target and it may take a long time to get sufficient exposure time for your S/N goal---you may sit on this target for hours. The spectra at the beginning and end of this run will almost certainly need different telluric standards. There will probably be a gradient in the telluric features over time--or they may just vary at random! It is especially common to see the tellurics weaken over a night as the temperature declines. Thus, an object at the beginning of the night at an airmass of 1.0 may have stronger telluric features than an object at an airmass of 1.3 at the end of a night (of course, the reverse can happen). It is an especially acute problem at large airmasses. If your object is going to be at an airmass of > 1.5, take several standards with airmasses slightly below and above this object's airmass. This is the main difficulty with IR spectroscopic data reduction--it is complicated by the changing strength of the telluric features.

    So, what kind of standard star do we use to perform this "telluric correction"? Ideally, what we would like is to have a star that is completely free of any absorption features. Of course, such objects do not exist. For IR reduction two types of standards have been used: A0 dwarfs and G2 dwarfs. Here is the spectrum of an A0V star (Vega) in the three bands:

    A0V stars are especially nice in the K-band since there is really only the H I Brackett γ feature to deal with. But the H-band is quite a bit messier as we have the entire Brackett series to deal with. Here are the spectra of a G2V (the sun):

    While G stars have lots of metal lines, they tend to be weaker than any other spectral type. So, if you are trying to get good H I profiles in a hot star, or an HII region, you might consider using a G star for telluric correction. So what is the process? You divide your standard star spectrum into the object spectrum. This process will create artificial emission lines at the positions of any absorption features in the standard star. To get rid of these we next multiply the previous division by a model atmosphere for Vega (or the observed spectrum of the Sun). For an example of where I tested how well this process works, see Fig 1 in this paper. We can also use this as a flux calibration step. Since we know the K magnitude of our standard star, and that of Vega, we can normalize the model atmosphere spectrum so that we have something resembling a flux calibrated spectrum. If we don't care about the HI line profiles, we can always multiply the divided spectrum by a 9800K blackbody spectrum (if an A star) to get the same result. Expecting accurate fluxes from spectra is wishful thinking. Most IR spectrographs have narrow slits, and you can rarely recover fluxes better than +/−20%.

    With these warnings, we now will attempt to tellurically correct the ST LMi data using the three standards. As the observation log shows, the airmass of ST LMi was low, about 1.01. I have included three A0V stars with similar airmasses: HD40686 (1.07), HD96951 (1.09), and HD130767 (1.00) in this data set. We will have to investigate which one does the best job. Since ST LMi was close to zenith, the tellurics will be easier to deal with than is often the case.

    Ok, before proceeding we need to combine our A and B spectra of our standard stars together, and all six ST LMi spectra using scomb (epar scomb to have "combine=median", and "reject=avsigclip" and combine all six spectra using a comma-separated list). Note that if there are easily identifiable cosmic rays in the standard star spectra (since there are only two!), they should be x'd out in splot (scomb should get rid of any CR's for ST LMi). Use specplot to plot the A and B pairs together (the names are comma-separated). In that window, we can compare the wavelength solution and coverage before the scomb step to insure everything is clean. I have colored one of these spectra (using :color 3 in the irafterm window) and then shifted (using the "s" command) one spectrum vertically ("y") to insure that they EXACTLY overlay each other (except for the ends, here is where you see the affect of the extreme curvature of NIRSPEC data!):

    This step is really not essential, since it only took a minute or so to get the standard star data set (and NOTHING should change in that short time), but it provides a confirmation that the dispersion correction went well. Note that for long sequences on any spectrograph, there can be considerable flexure as the telescope slowly changes its position on the sky, so that the wavelength solution for the first spectrum in the series is not the same as in the last. If you are doing a long series, always take an arc at the beginning, at the end and possibly even in the middle. For our IR spectra, we will be adjusting the wavelength solutions to insure they overlap before proceeding anyway, so having lots of arcs is not too big of a deal---you can usually rely on the sky lines to let you know if the spectrum shifted. Since at most observatories arcs take little time, more is better than fewer (note that for this observing run I actually have arcs at each new object---or big telescope move, but for simplicity we have only used one here to minimize the number of steps required to produce something).

    The first step is a quick division of our combined ST LMi spectrum by the three standard stars using "sarith" ("sarith a / b c"), you should get something that looks like this (using specplot to plot all three):

    The three HD stars provide similar quality corrections. But the presence of spikes means that the wavelengths do not perfectly agree between any of the four spectra. So, we manually shift the wavelength of the ST LMi spectrum up and down by a few Angstroms using specshift (or you could do it to the standard star--but usually you will be reusing the standard), and then performing a new division. Here I demonstrate using HD96951 (the middle of the three, above, which seems the best match--it has the weakest telluric residuals, but not the same airmass as ST LMi!):

    In this plot, I blew up the red end (specplot xxx.fits,yyy.fits xmin=23000 xmax=24600) to look at that very strong telluric feature at 24375. Note that I had to shift the ST LMi spectrum by -5 Angstroms to (almost) get rid of that spike there caused by the slightly different wavelengths for the HD and ST LMi data sets. There was no discernable improvement going to -6 so I stopped here. Here is a cleaned-up spectrum of ST LMi smoothed (hit "s" once in splot) by 5 pixels:

    In this spectrum we are now looking at (mostly) real features in the spectrum of ST LMi. The emission line at 2.06 μm is He I, and the absorption feature at 2.20 μm is the Na I doublet. Since this is a CV, there is almost certainly intrinsic HI Brγ emission at 2.16 μm, but the feature here is almost completely due to the division by the A star. We have to get rid of this, and correct the slope of the continuum. We do this by multiplying this spectrum by a model atmosphere spectrum of Vega (as shown earlier, and included in this data set: VegaNorm.fits). After this step we get the final product:

    At this step we could also flux calibrate, but I won't do that for this data set (you just construct a spectrum of Vega with the correct flux level at 2.19 μm for the standard star you used using sarith). Note that HI disappeared after our correction--well almost, that little peak is at the right spot. This could be due to intrinsic HI emission, or simply due to the fact that this standard star had different HI profiles than the Vega model used to correct this star. It is extremely dangerous to believe weak HI features in IR spectra due to this fact. We will see how one might correct for this uncertainty in the next part of this exercise (especially critical in the H band).

    Step #7: Create this final ST LMi spectrum.

    Turn in a plot of this spectrum and answer the following question:

    Question #1: Using the figure at the top of this page, what is the approximate spectral classification for ST LMi?

    To insure I didn't waste your time reducing a standard star that did not match our target, I didn't include any raw data for such an object. But if you were paying attention, there was a spectrum entitled HD107193.fits when you unpacked the tar file. Go back and divide the ST LMi spectrum by this star to see what happens when you have a grossly mismatched telluric standard. Because of Mauna Kea's altitude, and the relative dryness of this night, the result is not as bad as some other places and times. But especially note the poor CO2 result at the blue end of the resulting spectrum.

    End of part one of exercise #2.

    Note that with some extra care (for example, insuring that each of the ST LMi spectra were shifted to the same exact wavelength position, and getting rid of serious cosmic rays before the scomb step) it is possible to produce a somewhat better spectrum, as shown below:

    ST LMi has K = 13.5, and could have benefited from twice as much integration time as we obtained here. This shows you how hard it is to obtain IR spectra even on the largest telescope there is! Our program goals here were to obtain spectra of several systems, so we obtained what we obtained, and it was adequate for our program's purposes.

    Back to the homepage