The Scientific Method

Statistics for Laboratory Experiments

The scientific method plays a significant role in conducting experiments, along with good experimental technique and basic statistics. Laboratory experiments will help you to better understand how scientists test their hypotheses. Here are a few reminders concerning some of the basic statistics used in laboratory experiments.

Histograms and normal distributions (μ and σ)
We often measure a certain quantity for a sample of objects, and apply the results of our investigations to a larger population. We might weigh the contents of 30 randomly chosen cans of soup in order to estimate the average weight of the contents of a can of soup, or measure the luminosity of 30 stars with the same surface temperature as that of the Sun in order to estimate the average luminosity of G-type stars.

When we have built up a large number of repeated measurements of the same quantity, we can bin the data (counting up how many measurements fall within each bin), and plot the results in a histogram. Here is a histogram showing the distribution of book lengths for 500 novels.
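The binning step can be sketched in a few lines of Python. This is only an illustration: the function name bin_counts is hypothetical, the 500 book lengths are simulated rather than real data, and the spread of 15 pages is an assumed value chosen for the example. The 4-page bin width and the mean of 345 pages follow the histogram described in this section.

```python
import random


def bin_counts(values, start, width, n_bins):
    """Count how many values fall into each of n_bins equal-width bins.

    Bin i covers the half-open interval [start + i*width, start + (i+1)*width).
    Values outside the binned range are ignored.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


random.seed(1)  # fixed seed so the sketch is repeatable

# Simulated data (an assumption, not real measurements):
# 500 book lengths with mean 345 pages and an assumed spread of 15 pages.
lengths = [random.gauss(345, 15) for _ in range(500)]

# 27 bins of 4 pages each, starting at 292 pages.
counts = bin_counts(lengths, start=292, width=4, n_bins=27)

# Print a sideways text histogram, one row per bin.
for i, c in enumerate(counts):
    left = 292 + 4 * i
    print(f"{left}-{left + 4} pages: {'#' * c}")
```

The printed rows should rise toward the middle bins and fall off on either side, with the individual bins jumping around a smooth bell shape.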

Histogram where x (horizontal axis) ranges from 292 to 397 (pages) and y ranges from 0 to 40 (books). A series of short horizontal line segments shows the number of books with lengths between 292 and 296 pages (0), between 296 and 300 pages (1), between 300 and 304 pages (0), and so on. The number of books in each bin rises on average until we reach a height of 36 at roughly 345 pages, and then descends back down to zero. A blue line is drawn smoothly through the line segments attempting to match their overall behavior: it starts at a height of zero at 292 pages, rises smoothly to a height of roughly 36 books at 345 pages, and then descends smoothly down to a height of zero again at 397 pages. While the blue line is smooth and resembles the shape of a bell (a normal curve), the heights of the individual bins jump up and down a bit around this line, forming a feature that resembles a city skyline of skyscrapers of varying heights.

The average value, or the sum of the measurements divided by the number of measurements, is called the mean value (μ). Note how the number of books per bin rises from both sides as we approach the mean value (345 pages). The smooth blue curve is called a normal curve, or bell curve, and shows what the distribution would tend to look like if we had a large number of measurements made to very high accuracy.

Once we have defined the mean value of our measured quantity, the next logical question is how scattered our measurements are around the mean. Do they cluster closely, or do we find a large variation in book lengths? Sigma (σ), the standard deviation, is a measure of how much the individual measurements typically differ from the mean value. The larger the σ value, the more widely distributed the measurements will be.
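As a small worked example, μ and σ can be computed directly from a list of measurements. The five book lengths below are made up for illustration, and the helper names mean and sigma are hypothetical. This sketch divides by the number of measurements N; many textbooks divide by N - 1 instead when the data are a sample from a larger population, which gives a slightly larger σ.

```python
import math


def mean(values):
    """Sum of the measurements divided by the number of measurements."""
    return sum(values) / len(values)


def sigma(values):
    """Root-mean-square deviation of the measurements from their mean.

    Divides by N (population form); dividing by N - 1 is a common
    alternative for small samples.
    """
    mu = mean(values)
    return math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))


# Hypothetical book lengths in pages (illustrative values only).
lengths = [330, 352, 345, 361, 337]

mu = mean(lengths)
s = sigma(lengths)
print(f"mean  = {mu:.1f} pages")
print(f"sigma = {s:.1f} pages")
```

For these five values the mean is 345 pages and σ is close to 10.9 pages, so a typical book in this made-up set differs from the mean by about 11 pages.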

Smooth curve drawn on a plot where x (horizontal axis) ranges from 292 to 397 (pages) and y ranges from 0 to 40 (counts of books). As with the previous plot, a smooth curve starts at zero on the left-hand side, rises to a height of roughly 36 in the middle, and drops down to zero again on the right-hand side. The central, or mean, value of 345 pages is labeled with the Greek letter mu. The points on the smooth curve which are one sigma to the left or right of the mean are labeled minus sigma and sigma, and those which are two sigma to the left or right are labeled minus two sigma and two sigma.

About two-thirds (68%) of the measurements should lie within 1σ of the mean, in the inner white region between the x-values of μ–σ and μ+σ in the plot shown above. A deviation of 1σ is quite typical. As we move further away from the central value, we observe that 95% of all measurements should lie within ±2σ of the mean (within the inner white region and the two hatched side regions). A mere 5% of all measurements will lie more than 2σ above or below the mean value. If you compare two sets of measurements of a given quantity and find that the difference between their mean values is more than 3σ, you should investigate whether there may be differences in the experimental technique (or in the definition of the measured quantity).
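These percentages can be checked numerically. For an ideal normal distribution, the fraction of measurements within k·σ of the mean equals erf(k/√2), where erf is the error function available in Python's math module. The helper name fraction_within is hypothetical, and real data will only follow these fractions approximately.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution lying within k sigma of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    inside = fraction_within(k)
    print(f"within {k} sigma: {inside:.4f}   outside: {1 - inside:.4f}")
```

The three fractions come out at roughly 0.68, 0.95, and 0.997, which is why a deviation of more than 3σ is rare enough to be worth investigating.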

Errors
Consider three types of errors.

Measurement error is defined by the precision of our equipment. If we used a stopwatch which only shows seconds, then time interval measurements would be recorded to the nearest second. The true value could be up to 0.5 seconds lower, or up to 0.5 seconds higher, than the recorded value.
Systematic errors are errors in experimental design which act to bias all measurements in the same way. If we measured how well people could see but forgot to ask them to take off their glasses and contact lenses first, then we would think that the human eye worked better than it does.
Natural variation refers to the innate width of the distribution of a measured quantity. Even with perfect measuring tools, we would not measure the exact same value for every object within a sample for a given quantity. (For example, people have feet of different lengths, and galaxy disks have different sizes.)
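The three types of error can be told apart in a small simulation. Everything numerical here is an assumption made for illustration: a true mean time interval of 10.0 seconds, a natural spread of 0.8 seconds, and a systematic offset of 0.3 seconds (for instance, from always stopping the stopwatch late). Rounding to the nearest second plays the role of the measurement error.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# All three constants are hypothetical values chosen for illustration.
TRUE_MEAN = 10.0      # true mean time interval, in seconds
NATURAL_SIGMA = 0.8   # natural variation between events, in seconds
BIAS = 0.3            # systematic offset added to every measurement, in seconds

# Natural variation: the true values themselves differ from event to event.
true_values = [random.gauss(TRUE_MEAN, NATURAL_SIGMA) for _ in range(10000)]

# Systematic error: every measurement is shifted in the same direction.
biased = [t + BIAS for t in true_values]

# Measurement error: the stopwatch only shows whole seconds.
recorded = [round(b) for b in biased]

# Largest difference introduced by rounding alone.
max_rounding_error = max(abs(r - b) for r, b in zip(recorded, biased))

# Shift of the recorded mean relative to the true mean.
mean_shift = statistics.mean(recorded) - statistics.mean(true_values)

print(f"largest rounding error: {max_rounding_error:.2f} s")
print(f"shift of the mean:      {mean_shift:.2f} s")
print(f"spread of true values:  {statistics.pstdev(true_values):.2f} s")
```

In this sketch the rounding errors are never larger than 0.5 seconds and largely average out over many measurements, while the systematic offset shifts the mean by about 0.3 seconds no matter how many measurements are taken. The spread of the true values remains even with a perfect stopwatch.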