Using a uniform computing environment: Astronomy cluster vs your laptop
Python
Python is a programming language that has a well-developed package for
making plots (matplotlib). Some possibly useful references:
Side note: distinction between Python 2 and Python 3 : use Python 3!
Ways to run Python:
- python : for running existing programs, not best for interactive use
- ipython : provides better interactive experience, e.g. command recall, etc.
- ipython provides command-line recall across sessions, including recall of earlier commands that begin with the characters you have typed
- ipython startup scripts in ~/.ipython/profile_default/startup/
- brief intro to the concept of objects: ipython provides object attribute and method identification using TAB , see below
- ipython provides help with ?
- ipython magic commands , e.g. %run will run a
python program from within an iPython session
- ipython --matplotlib (best for interactive plotting, but not if you want to run programs that create plots in files)
- ipython --pylab (automatically loads matplotlib.pyplot as plt and numpy as np and sets up interactive plotting)
- Running an existing python program
- Run the commands: python file.py
- While running ipython: %run file.py or import file (note: no .py extension with import)
- Jupyter (formerly iPython) notebooks: allow you to intersperse text and python, including execution of code
- Starting a new Jupyter notebook
- jupyter notebook (may need to add --browser=firefox)
- in upper right, pull down New, and select Python 3
- after working, save your notebook using the File menu (standard extension .ipynb)
- Opening a pre-existing ipython notebook : jupyter notebook {name} (or select from file list)
- Working inside an ipython notebook: markdown vs. code cells
- Two main types of cells: code (for Python code) and markdown (for text/comments)
- Cells are executed using shift-enter
- Note that if you execute a blank markdown cell, you get some funny text!
- Display matplotlib figures inline
- great way to document and communicate thought processes with results. Perhaps not best for code development, but note that
code in a notebook can be exported to a file (Download as), although you will need to clean it up to make nice code
- Python code editors, e.g., spyder (but uses a graphical user interface)
Practice:
- Start an iPython notebook!
Some very simple Python:
- variable types : implicit (not declared) typing of: integers, floats, strings, bool
- Can convert between types with int(), float(), str() functions
- Note that floats are only accurate to machine precision, so be careful about testing them for exact equality
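A minimal sketch of the type conversion and float precision points above. The use of math.isclose for the comparison is one reasonable choice, not the only one, and the variable names are just for illustration.

```python
import math

# Floats are stored in binary, so 0.1 + 0.2 is not exactly 0.3
total = 0.1 + 0.2
exact_match = (total == 0.3)             # False: exact comparison fails
close_match = math.isclose(total, 0.3)   # True: compares within a tolerance
print(total)
print(exact_match, close_match)

# Converting between types
n = int('42')       # string -> integer
x = float('3.5')    # string -> float
s = str(n + x)      # number -> string
print(n, x, s)
```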
- print(var) (implicit in interactive mode: typing var alone displays its value)
- formatted printing using format method: {string}.format(var)
import math
print('{:7.4f}'.format(math.pi))
format_string='{:7.1f}{:7.2f}{:7.3f}{:7.4f}'
print(format_string.format(math.pi,math.pi,math.pi,math.pi))
- For every format in {}, there is a variable in ()
- arithmetic; +, -, *, /, **, //, %
- more complex operators in math and numpy packages: log(), log10(), exp(), sin(), etc
- Exercise: if temperature will hit 103 F today, what is that in C? C = 5/9 * (F - 32)
- Write a program to ask for temp to convert, and convert it
- Get input using var=input(prompt), which will give you var as a string
- strings and string operations
'help'+'me'
- collection variables: lists (brackets) and tuples (parentheses)
l = [1,2,3,1,5,7]
t = ('a','b')
- Lists are an easy way to store data, but for numerical data, not
the best way: see numpy arrays below!
- lists can contain multiple data types
- list arithmetic may not be what you expect
- Tuples are often returned by functions (see below)
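A short sketch of the list and tuple behavior noted above. The helper function minmax is hypothetical, written only to show a function returning a tuple.

```python
l = [1, 2, 3]

# "Arithmetic" on lists repeats or concatenates; it does not act on the elements
doubled = l * 2       # [1, 2, 3, 1, 2, 3]
joined = l + [4, 5]   # [1, 2, 3, 4, 5]

# Lists can hold mixed types
mixed = [1, 2.5, 'three', True]

# Element-by-element arithmetic on a list needs a loop or a list comprehension
plus_three = [item + 3 for item in l]

def minmax(values):
    """Return the smallest and largest entries as a tuple."""
    return min(values), max(values)

lo, hi = minmax(l)    # the returned tuple is unpacked into two variables
print(doubled, joined, plus_three, lo, hi)
```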
- variable types: dict and set
d = { 'name' : 'Jon', 'room' : 102}
set(l)
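As a small illustration of the dict and set types, here is a sketch using the same values as above. The added 'building' key is hypothetical.

```python
# A dict maps keys to values
d = {'name': 'Jon', 'room': 102}
name = d['name']                # look up a value by its key
d['building'] = 'Astronomy'     # add a new key (hypothetical entry)
keys = sorted(d.keys())

# A set keeps only the unique entries of a collection
l = [1, 2, 3, 1, 5, 7]
unique = set(l)
print(name, keys, unique)
```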
- variables as objects: they have attributes and methods. In ipython, type {object}. followed by TAB to view all of the possibilities
- Example, string methods
'hello '.strip()
'test.txt'.replace('txt','fits')
- arrays (through numpy): for scientific/numerical applications,
numpy arrays are almost always preferred to lists!
- import numpy
- initialize array using numpy.zeros([ny,nx]), numpy.ones([ny,nx]), numpy.arange(start,end,delta), numpy.linspace(start,end,n)
- turn a list into an array using numpy.array(listvar)
- array arithmetic is implicit: if one does arithmetic between a constant and an array, it applies to each element automatically; if one does arithmetic between arrays, it does it element-by-element (if the shapes are not compatible, then it generates an error!)
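The array creation routines and implicit arithmetic described above can be sketched as follows. The particular values are arbitrary, and the shorthand np is the common convention rather than a requirement.

```python
import numpy as np

zeros = np.zeros([2, 3])       # 2 rows (ny) by 3 columns (nx), filled with 0
steps = np.arange(0, 10, 2)    # 0, 2, 4, 6, 8 (the end value is excluded)
grid = np.linspace(0, 1, 5)    # 5 evenly spaced values, both ends included
arr = np.array([4, 5, 6])      # turn a list into an array

shifted = arr + 3                       # constant applied to every element
product = arr * np.array([1, 2, 3])     # element-by-element multiplication

# Arrays with incompatible shapes raise an error
try:
    arr + np.array([1, 2])
    shapes_ok = True
except ValueError:
    shapes_ok = False
print(shifted, product, shapes_ok)
```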
- Warning: Python variables may not behave as you expect from other languages.
In fact, they are perhaps better thought of as labels. Consider the
following examples:
a=1
b=a
a=2
print(a,b)
a=[1,2,3]
b=a
a[0]=0
print(a,b)
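To spell out what the two examples above do, here is a sketch that repeats them and adds an explicit copy. The use of the list copy() method is one way to get an independent list; it is an addition for illustration.

```python
# Rebinding a name: b keeps pointing at the original value
a = 1
b = a
a = 2
ints = (a, b)        # a is 2, b is still 1

# Two labels for one list: a change made through a is seen through b
a = [1, 2, 3]
b = a                # b is another label for the same list
c = a.copy()         # c is an independent (shallow) copy
a[0] = 0
same_object = a is b
print(ints, a, b, c, same_object)
```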
control statements:
if condition1 :
    execute code
elif condition2 :
    execute other code
else :
    execute yet other code
Note that indentation defines the blocks!
- Conditional operators: ==, >, <, etc
loops: for item in iterable :
a=[4,5,6]
for element in a:
    print(element+3)
- Note use of range(n) for integer indexed loops
a=[4,5,6]
for element in range(3) :
    print(a[element]+3)
- Note use of enumerate(iterable) to add integer index to loops
a=[4,5,6]
b=[7,8,9]
for i,element in enumerate(a) :
    print(element+3,b[i])
- Note use of zip(iterable1,iterable2) to loop through multiple iterables simultaneously
a=[4,5,6]
b=[7,8,9]
for i,j in zip(a,b) :
    print(i,j)
functions: used to avoid repetition of code for repeated calculations, also for code organization and readability. Think modularly! Use functions!
- Note that functions can return values that can be assigned to variables
- functions can take arguments, either positional or by keyword; keyword arguments can have default values
def addconst(a,const=1) :
    """
    Adds a constant to an input variable
    Args:
        a : value to which constant is added
    Keyword args:
        const= : specifies what constant to be added
    Returns:
        new : new value
    """
    return a+const
Usage:
addconst(4)
addconst(5,const=10)
Write a function to do temperature conversion
- Use if block to go both directions
- Use optional argument to specify direction
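One possible sketch of such a function, offered as a starting point rather than the expected answer. The function name convert_temp and the keyword name to are hypothetical choices.

```python
def convert_temp(temp, to='C'):
    """
    Convert a temperature between Fahrenheit and Celsius
    Args:
        temp : temperature to convert
    Keyword args:
        to= : 'C' to convert F to C (default), 'F' to convert C to F
    Returns:
        converted temperature
    """
    if to == 'C':
        return 5 / 9 * (temp - 32)
    elif to == 'F':
        return 9 / 5 * temp + 32
    else:
        raise ValueError("to must be 'C' or 'F'")

print('{:6.2f}'.format(convert_temp(103)))           # F to C
print('{:6.2f}'.format(convert_temp(100, to='F')))   # C to F
```

For an interactive version, the value could come from float(input(prompt)), since input returns a string.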
Advantages of writing code in a file, rather than just doing it at command line
- it can be saved and rerun!
- it can be commented
- # at the beginning of the line indicates a comment
- text between triple-quotes (''' or """) allows for multi-line comments
- all functions should have docstrings between triple-quotes!
Note the different docstring styles: reStructuredText, Google, numpy. See, e.g., the references
in this page
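As an illustration of comments and docstrings, here is a sketch of the addconst function from above written with a numpy-style docstring. The argument name val and the float type annotations in the docstring are assumptions made for the example.

```python
# A line starting with # is a comment and is ignored by Python

def addconst(val, const=1):
    """Add a constant to an input value.

    Parameters
    ----------
    val : float
        Value to which the constant is added.
    const : float, optional
        Constant to add (default 1).

    Returns
    -------
    float
        The sum val + const.
    """
    return val + const

# The docstring is stored on the function and is what help() and ipython's ? display
doc = addconst.__doc__
print(doc)
```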
Practice
- Start ipython interactively: ipython
- Define some numerical variables, and do some arithmetic with them
- Define some string variables, and do some operations with them
- Write some operations in a file, and then execute the file from the command line, and from within ipython
Using pre-existing python packages/modules:
- a module is a collection of routines (functions); a package may be a single module, but may also contain multiple modules
- import module
- import module as shortname
- from packagename import module
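The import forms above can be sketched with standard-library and numpy modules. The file path passed to basename is hypothetical and is only parsed as a string, not opened.

```python
import math                  # import module
import numpy as np           # import module as shortname
from os import path          # from packagename import module
import sys

root = math.sqrt(16.0)                   # names are reached through the module name
mean = np.mean([1, 2, 3])                # or through the short name
base = path.basename('/data/iso.dat')    # hypothetical path

# Directories searched for modules; entries from PYTHONPATH appear in this list
search_path = sys.path
print(root, mean, base, len(search_path))
```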
- Finding modules
- system-installed modules: PyPI and pip
- getting new modules installed at NMSU Astronomy (pip install --user and reasons not to use it!)
- user modules: PYTHONPATH environment variable
- Frequently used modules:
- matplotlib : plotting routines
- numpy : numerical functions, provides array functionality and operations
- scipy : many scientific analysis routines
- astropy : rapidly developing set of routines for astronomical analysis/calculation
- Convenience feature of ipython: startup files in ~/.ipython/profile_default/startup
Reading data files in Python:
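A minimal sketch of one common way to read a simple ASCII table of numbers, using numpy.loadtxt. This is only one option among several, and the small in-memory table here is made-up stand-in data so that the example runs without a file on disk; with a real file, its name would be passed to loadtxt instead.

```python
import io
import numpy as np

# Stand-in for a data file: two columns, with a comment line as header
text = io.StringIO("""# x y
1.0 2.0
2.0 4.1
3.0 5.9
""")

data = np.loadtxt(text)   # 2D array; lines starting with # are skipped by default
x = data[:, 0]            # first column
y = data[:, 1]            # second column
print(data.shape, x, y)
```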
Practice:
Python plotting with matplotlib
- remember, for automatic plot display, start ipython with --matplotlib!
- import matplotlib.pyplot as plt (or also use --pylab when starting ipython!)
- Simple plotting and overplotting:
- plt.plot(x,y[,pointstyle]) (with pointstyle, draws points, otherwise connects points)
- x, y are lists of numbers, or numpy arrays
- example pointstyles give color ([krgbcym]) and shape ([o+.^v<>sph]) : 'ro', 'g+', 'b.', 'c^', 'yv', ...
- plt.plot(x,z[,pointstyle])
- if plot window doesn't automatically show, restart ipython with --matplotlib, or use plt.show() to see plot
- plt.clf() # to clear the plot
- Using matplotlib window functions from the icons
- Customizing plots with limits and labels:
- plt.xlim(xmin,xmax) # x limits
- plt.ylim(ymin,ymax) # y limits
- plt.xlabel(xlabel)
- plt.ylabel(ylabel)
- plt.text(x,y,text) # add text at arbitrary location
- plt.title(title) # put a title on the plot
- Subwindows
- plt.clf()
- plt.subplot(1,2,1) # (ny, nx, id)
- plt.plot(x,y)
- plt.subplot(1,2,2)
- plt.plot(x,z)
- More sophisticated plot interface, allows for multiple figures to be open at a time, more control
- fig=plt.figure()
- ax1=fig.add_subplot(1,2,1) (shorthand: fig.add_subplot(121)) # note difference in function name from above
- ax1.plot(x,y)
- ax1.set_xlim(xmin,xmax) # note difference in function name from above
- ax1.set_ylim(ymin,ymax) # note difference in function name from above
- ax2=fig.add_subplot(1,2,2)
- ax2.plot(x,z)
- ax2.cla() # to clear an axis
- fig.tight_layout() # helps if you have things that overlap between plots
- plt.draw()
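The figure/axes interface above can be put together into a short runnable sketch. The sine and cosine data are made up for illustration, and the Agg backend is selected only so the example runs without a display; in an interactive ipython --matplotlib session that line would be omitted.

```python
import matplotlib
matplotlib.use('Agg')    # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)
y = np.sin(x)
z = np.cos(x)

fig = plt.figure()
ax1 = fig.add_subplot(1, 2, 1)    # 1 row, 2 columns, first panel
ax1.plot(x, y, 'ro')              # red circles
ax1.set_xlim(0, 2 * np.pi)
ax1.set_ylim(-1.5, 1.5)
ax1.set_xlabel('x')
ax1.set_ylabel('sin(x)')

ax2 = fig.add_subplot(1, 2, 2)    # second panel
ax2.plot(x, z)                    # no pointstyle: points are connected
ax2.set_xlabel('x')
ax2.set_ylabel('cos(x)')

fig.tight_layout()                # keeps the axis labels from overlapping
```

Calling fig.savefig with a file name would write the figure to a file instead of showing it in a window.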
- Packing more information into your plots: point colors and point sizes
- scatter: allows you to code points by color and/or size, e.g.
sc=ax.scatter(x,y,c=z,vmin=zlo,vmax=zhi,s=t)
fig.colorbar(sc)
will plot x vs y, color points according to z, size points by values in t, add colorbar
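A runnable sketch of a color- and size-coded scatter plot, using made-up random data. In matplotlib the marker-size keyword of scatter is s, and the colorbar is created from the figure (or pyplot) using the object that scatter returns. The Agg backend is chosen only so the example runs without a display.

```python
import matplotlib
matplotlib.use('Agg')    # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)    # seeded so the example is repeatable
x = rng.uniform(0, 1, 100)
y = rng.uniform(0, 1, 100)
z = x + y                         # quantity used for the point colors
t = 20 + 80 * x                   # marker sizes, in points squared

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=z, vmin=0, vmax=2, s=t)
cbar = fig.colorbar(points, ax=ax)    # colorbar tied to the scatter colors
cbar.set_label('z')
ax.set_xlabel('x')
ax.set_ylabel('y')
```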
Practice:
- Plot up isochrone data from ASCII data file
- Plot observational color-magnitude diagram (e.g., V vs B-V, what are appropriate limits for y-axis?)
- Plot theoretical HR diagram (Luminosity vs Teff)
- Make these subplots on the same figure
- Add additional information: color code points by log(age)
- Add even more information: make point sizes proportional to radius of stars
- How do you calculate the radius? Write a Python function!
- How do you map the radii into point sizes? Write a Python function!
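One possible sketch for the last two questions, offered as a starting point. It assumes the luminosity is in solar units and Teff is in K, so that the Stefan-Boltzmann relation L = 4 pi R^2 sigma Teff^4 gives R/Rsun = sqrt(L/Lsun) * (Tsun/Teff)^2. The solar temperature value, the function names, and the choice of a linear mapping in log radius are all assumptions.

```python
import numpy as np

TSUN = 5772.0   # K, nominal solar effective temperature (assumed value)

def radius(lum, teff):
    """Return stellar radius in solar radii from luminosity (solar units) and Teff (K)."""
    return np.sqrt(lum) * (TSUN / np.asarray(teff, dtype=float)) ** 2

def point_size(rad, smin=5.0, smax=200.0):
    """Map radii onto marker sizes between smin and smax, linearly in log10(radius)."""
    logr = np.log10(np.asarray(rad, dtype=float))
    span = logr.max() - logr.min()
    if span == 0:
        # all radii identical: give every point the smallest size
        return np.full(logr.shape, smin)
    frac = (logr - logr.min()) / span
    return smin + frac * (smax - smin)

sizes = point_size(np.array([1.0, 10.0, 100.0]))
print(radius(1.0, TSUN), sizes)
```

The resulting sizes could then be passed to scatter through its s keyword.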
Package development in Python
- consider software organization from day 1
- Imagine coming out of NMSU with a software portfolio
- Organization:
- software subdirectories under project/class directories? (multiple PYTHONPATH)
- project/class subdirectories under software directory (single PYTHONPATH)
- example
Practice
- Let's write a package involving fitting a straight line to a set of (x,y) data points by
the procedure of least-squares. The algorithm for doing this is:
- slope = ( N sum{x_i y_i} - sum{x_i} sum{y_i} ) / ( N sum{x_i x_i} - (sum{x_i})^2 )
- intercept = ( sum{x_i x_i} sum{y_i} - sum{x_i} sum{x_i y_i} ) / ( N sum{x_i x_i} - (sum{x_i})^2 )
- Put all of the following into a single file (a module)
- Write a function that returns an array of y values given as input an array of x-values, a slope, and an intercept, where the
latter two are optional (keyword) parameters that default to 1 and 0, respectively. Remember to document this with a docstring
- Write a function that returns a set of y values from a set of x values, a slope and an intercept with a random gaussian
uncertainty added to each output value. Remember to document this with a docstring
- Write a function that takes as input two arrays of x and y data, and returns the slope and intercept of the best fitting line. Remember to document this with a docstring
- Write a test function that tests your fitting function, using your functions that generate data with (or without)
uncertainties. Remember to document this with a docstring.
- Put this file in a python subdirectory under your home directory
- add that directory to your PYTHONPATH
- use your routine from another directory by importing it
- Upgrades:
- modify your fitting routine to allow for input uncertainties on the points
- add options/routines to plot the data with the best fit
- experiment with the quality of the fit parameters given uncertainties, by generating a large number of datasets, and making histograms of the distribution of resulting parameters, for different input parameters
- determine how to get formal uncertainties on your best fit parameters
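A compact sketch of what such a module might look like, using the standard closed-form expressions for an unweighted least-squares line. The function names, the sigma and seed keywords, and the use of numpy's default_rng are all choices made for this illustration; the upgrades (uncertainties on the points, plotting, histograms) are not covered.

```python
import numpy as np

def line(x, slope=1.0, intercept=0.0):
    """Return y = slope*x + intercept for an array of x values."""
    return slope * np.asarray(x, dtype=float) + intercept

def noisy_line(x, slope=1.0, intercept=0.0, sigma=1.0, seed=None):
    """Return line(x) with Gaussian noise of standard deviation sigma added."""
    rng = np.random.default_rng(seed)
    y = line(x, slope=slope, intercept=intercept)
    return y + rng.normal(0.0, sigma, size=y.shape)

def fit_line(x, y):
    """Return (slope, intercept) of the unweighted least-squares line through (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    sx, sy = x.sum(), y.sum()
    sxx, sxy = (x * x).sum(), (x * y).sum()
    denom = n * sxx - sx ** 2
    slope = (n * sxy - sx * sy) / denom
    intercept = (sxx * sy - sx * sxy) / denom
    return slope, intercept

def test_fit():
    """Check that fit_line recovers known parameters from noise-free data."""
    x = np.linspace(0, 10, 20)
    slope, intercept = fit_line(x, line(x, slope=2.0, intercept=-1.0))
    assert np.isclose(slope, 2.0) and np.isclose(intercept, -1.0)

test_fit()
```

With noisy data the recovered parameters will scatter around the input values, which is what the histogram experiment in the upgrades is meant to explore.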
Documenting python code using sphinx
- sphinx-quickstart
- autodoc
- Note that other documentation tools exist, including for other languages, e.g., doxygen