Chapter 17: Issues of Study Design and Statistical Analysis for Environmental Microbiology
The practice of good science requires a concise description of study objectives and a study design that matches those objectives. A good study design requires (i) definition of the population or factors of interest in the study, (ii) identification of study units, (iii) collection of representative measurements, and (iv) a statistical analysis matched to study objectives and data characteristics. The goal of this chapter is to help readers recognize the statistical issues involved in study design and data analysis. The majority of studies in environmental microbiology involve some form of comparison, since it is through comparisons that we learn where differences exist and what factors can influence microbial populations. Experimental designs specify the nature and extent of the comparisons that are of interest in a particular study. The goal of data collection, whether in a sampling study or a designed experiment, is to obtain a set of measurements from a population in order to gain insight into how values of a particular population characteristic vary from sample unit to sample unit. In this chapter the authors attempt to raise readers' awareness of the role and function of statistics at both the study design and the data analysis steps of environmental microbiological studies.
Spatial sampling plans, each containing 40 locations. (a) Simple random locations; (b) random locations within cells of a systematic grid (spatial stratification); (c) systematic locations with a random starting point and fixed between-sample distance, δ; and (d) overlapping grid samples of equal between-sample distance and two random starting points.
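The four sampling plans in this caption can be sketched in a few lines of Python. This is a minimal illustration rather than the chapter's own procedure: the 8 × 5 study area, the grid dimensions, the spacing values, and all function names are assumptions chosen so that each plan yields 40 locations.

```python
import random

def simple_random(n, width, height, rng):
    """(a) n independent, uniformly distributed locations."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]

def stratified_random(nx, ny, width, height, rng):
    """(b) One random location inside each cell of an nx-by-ny grid."""
    dx, dy = width / nx, height / ny
    return [((i + rng.random()) * dx, (j + rng.random()) * dy)
            for i in range(nx) for j in range(ny)]

def systematic(nx, ny, delta, rng):
    """(c) Regular grid with spacing delta and one random starting point."""
    x0, y0 = rng.uniform(0, delta), rng.uniform(0, delta)
    return [(x0 + i * delta, y0 + j * delta)
            for i in range(nx) for j in range(ny)]

# Hypothetical study area of 8 x 5 units; the seed is fixed for repeatability.
WIDTH, HEIGHT = 8.0, 5.0
rng = random.Random(17)

plan_a = simple_random(40, WIDTH, HEIGHT, rng)
plan_b = stratified_random(8, 5, WIDTH, HEIGHT, rng)
plan_c = systematic(8, 5, 1.0, rng)
# (d) Two overlapping grids of equal spacing, each with its own random start.
plan_d = systematic(5, 4, 1.25, rng) + systematic(5, 4, 1.25, rng)
```

Plan (b) spreads locations more evenly than plan (a) while keeping every location random, which is the usual motivation for spatial stratification.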
Typical shapes for common discrete statistical distributions. Distribution parameters: n, number of experiments, ranging from 2 to infinity; π, probability of success, ranging between 0 and 1; λ > 0, Poisson mean.
Typical shapes for common continuous statistical distributions. Distribution parameters are displayed as Greek letters. For the normal and lognormal distributions, µ and σ are the mean and standard deviation controlling the center and spread, respectively. The Weibull distribution is controlled by a shape and a scale parameter. The Student t, F, and chi-square distributions are controlled by one or two parameters referred to as the degrees of freedom, 1 ≤ df, df1, df2 < infinity. For these distributions, changing the degrees of freedom changes both the shape and spread of the distribution as demonstrated by the dotted lines.
Examples of normal quantile plots. Closeness to the straight line is a measure of the normality of the sample data.
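The coordinates of a normal quantile plot can be computed as in the sketch below. It assumes the plotting positions (i − 0.5)/n, which is one common convention among several; the function name and the sample values are hypothetical.

```python
from statistics import NormalDist

def normal_quantile_points(sample):
    """Return (theoretical z-quantile, ordered observation) pairs."""
    ordered = sorted(sample)
    n = len(ordered)
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    probs = [(i - 0.5) / n for i in range(1, n + 1)]
    z = [NormalDist().inv_cdf(p) for p in probs]
    return list(zip(z, ordered))

# Hypothetical measurements, for illustration only.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.6, 4.4, 5.0]
points = normal_quantile_points(sample)
```

Plotting the pairs in `points` and comparing them with a straight line gives the visual check described in the caption.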
Frequency and density histograms for normally distributed data. Note that the shapes of the distribution are similar and only the vertical axis scale is changed.
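The relationship between the two histogram types can be made concrete with a short sketch: each density height is the bin count divided by (sample size × bin width), so the bars keep their relative heights and the total area becomes 1. The simulated data, the bin count, and the function name are assumptions for illustration.

```python
import random

def histogram(values, lo, hi, bins):
    """Return (counts, densities) for equal-width bins on [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v <= hi:
            # Values equal to hi are placed in the last bin.
            k = min(int((v - lo) / width), bins - 1)
            counts[k] += 1
    n = sum(counts)
    densities = [c / (n * width) for c in counts]
    return counts, densities

# Simulated normal data (mean 10, standard deviation 2), fixed seed.
rng = random.Random(1)
values = [rng.gauss(10.0, 2.0) for _ in range(500)]
lo, hi, bins = min(values), max(values), 12
bin_width = (hi - lo) / bins
counts, densities = histogram(values, lo, hi, bins)
```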
Comparative box plots indicating the median (middle of notch), upper and lower quartiles (solid box top and bottom), interquartile range (IQR = height of box), upper and lower fence values (line terminators = median ± 3 IQR), and potential outliers (circle dots).
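The quantities named in this caption can be computed as in the following sketch. It applies the fence rule as the caption states it (median ± 3 IQR), exposed as the parameter `k`; quartile definitions differ between software packages, and this example assumes the default "exclusive" method of Python's `statistics.quantiles`. The function name and data are hypothetical.

```python
from statistics import median, quantiles

def box_plot_summary(values, k=3.0):
    """Median, quartiles, IQR, fences, and potential outliers."""
    q1, _, q3 = quantiles(values, n=4)  # default "exclusive" method
    med = median(values)
    iqr = q3 - q1
    # Fence rule taken from the caption: median +/- k * IQR, with k = 3.
    lower, upper = med - k * iqr, med + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return {"median": med, "q1": q1, "q3": q3, "iqr": iqr,
            "lower_fence": lower, "upper_fence": upper,
            "outliers": outliers}

# Hypothetical counts with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
summary = box_plot_summary(data)
```

Here the value 100 lies beyond the upper fence and would be drawn as a potential outlier.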
Biplot of the first two principal components of a 4-variate analysis which is known to have three groups, identified as A, B, and C.