Wednesday, 28 November 2007

Experimental Design and Clustering

One issue with experimental design derives from the economic constraints placed upon it. For instance, cost constraints often force researchers to use other techniques in place of truly random samples. In multi-site studies, cluster randomised designs assign entire sites to the same treatment group, with different sites assigned to different treatments. This creates statistical clusters.
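As a rough illustration of what assigning whole sites to a treatment means in practice, the following minimal sketch randomises at the site level and lets each participant inherit the arm of their site. The site names, participant IDs, the two arm labels and the helper `assign_clusters` are all hypothetical and chosen only for the example.

```python
import random


def assign_clusters(sites, treatments, seed=0):
    """Randomly assign whole sites (clusters) to treatment arms."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    shuffled = list(sites)
    rng.shuffle(shuffled)
    # Deal the shuffled sites round-robin so the arms stay as balanced as possible.
    return {site: treatments[i % len(treatments)] for i, site in enumerate(shuffled)}


# Hypothetical sites and participants, for illustration only.
sites = ["clinic_A", "clinic_B", "clinic_C", "clinic_D"]
participants = [
    ("p1", "clinic_A"),
    ("p2", "clinic_A"),
    ("p3", "clinic_B"),
    ("p4", "clinic_C"),
    ("p5", "clinic_D"),
]

site_arm = assign_clusters(sites, ["treatment", "control"])

# The unit of randomisation is the site: every participant inherits its arm.
participant_arm = {pid: site_arm[site] for pid, site in participants}
```

Because p1 and p2 attend the same site, they can never end up in different arms, which is exactly what makes their outcomes statistically dependent.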

Although sophisticated methods such as hierarchical linear modelling (e.g., Raudenbush and Bryk, 2002) have been formulated, a variety of problems remain in the analysis of cluster randomised trials (e.g., Raudenbush and Bryk, 2002; Donner and Klar, 2000; Klar and Donner, 2002; Murray, Varnell, and Blitstein, 2004). It should also be noted that many researchers have failed to apply these more advanced techniques for clustered data in the design and analysis of their experiments.

Each cluster may be regarded as a quasi-experiment in itself, contributing to the results of the cluster randomised trial. Rooney and Murray (1996) pointed out the problems this creates for meta-analysis of cluster randomised trials: effect size estimation is affected because conventional estimates are not appropriate and standard errors can be incorrect.
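One common way to see why conventional standard errors can be wrong is the design effect for equal-sized clusters, 1 + (m - 1) * ICC, where m is the cluster size and ICC is the intraclass correlation. This formula is not taken from the post or from Rooney and Murray's specific adjustment; it is the standard textbook approximation, and the numbers below are made up purely for illustration.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (cluster_size - 1) * icc


def adjusted_standard_error(naive_se, cluster_size, icc):
    """Inflate a standard error that was computed as if observations were independent."""
    return naive_se * math.sqrt(design_effect(cluster_size, icc))


# Hypothetical values: 21 participants per site, ICC of 0.05, naive SE of 0.10.
deff = design_effect(cluster_size=21, icc=0.05)
se = adjusted_standard_error(naive_se=0.10, cluster_size=21, icc=0.05)

print(f"design effect: {deff:.2f}, adjusted SE: {se:.3f}")
```

Even a small intraclass correlation roughly doubles the variance here, so an analysis that ignores clustering would report a standard error that is too small.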

In assessing the validity of descriptive and experimental research techniques, it is necessary to differentiate between patterns and models. When seeking patterns, we are in effect representing local properties of the data. A model, on the other hand, aims to describe the data fully. A commonly used example of a pattern is the association rule. An association rule records the rate at which two variables occur together. In psychology this could be used to represent occurrences that appear together more often than would be expected if they were statistically independent.
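The idea of two things occurring together more often than independence would predict can be made concrete with the usual association rule measures of support, confidence and lift. The sketch below uses a small invented set of records (sets of reported symptoms); the data, the symptom names and the helper `rule_metrics` are assumptions for the example, not anything from a real study.

```python
def rule_metrics(records, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    n = len(records)
    n_a = sum(1 for r in records if antecedent in r)
    n_c = sum(1 for r in records if consequent in r)
    n_both = sum(1 for r in records if antecedent in r and consequent in r)

    support = n_both / n                        # how often both occur together
    confidence = n_both / n_a if n_a else 0.0   # P(consequent | antecedent)
    expected = (n_a / n) * (n_c / n)            # co-occurrence rate under independence
    lift = support / expected if expected else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical records: each set lists the symptoms one participant reported.
records = [
    {"insomnia", "anxiety"},
    {"insomnia", "anxiety"},
    {"insomnia", "anxiety"},
    {"insomnia"},
    {"anxiety"},
    {"fatigue"},
    {"fatigue"},
    {"fatigue"},
]

metrics = rule_metrics(records, "insomnia", "anxiety")
print(metrics)
```

A lift above 1 indicates that the two items appear together more often than would be expected if they were statistically independent; a lift near 1 suggests no association.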

Classification differs from clustering in that it is predictive rather than descriptive. With clustering, there is no correct answer for the allocation of observations to groups. With classification, the current data set includes group labels, and the psychologist seeks to derive a method for assigning labels to future data that arrive without them.
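The contrast can be sketched with two deliberately simple routines on one-dimensional data: a two-group k-means that never sees labels, and a nearest-centroid classifier that learns from labelled observations and then labels a new one. Both algorithms are stand-ins chosen for brevity, and the scores and the labels "low" and "high" are invented for the example.

```python
def two_means_1d(values, iterations=20):
    """Clustering: split unlabelled values into two groups (1-D k-means, k=2)."""
    centres = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearer of the two current centres.
            idx = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[idx].append(v)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres, groups


def train_nearest_centroid(labelled):
    """Classification: learn one centroid per known group label."""
    totals = {}
    for value, label in labelled:
        total, count = totals.get(label, (0.0, 0))
        totals[label] = (total + value, count + 1)
    return {label: total / count for label, (total, count) in totals.items()}


def predict(centroids, value):
    """Label a new, unlabelled value by its nearest learned centroid."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


# Hypothetical test scores, for illustration only.
scores = [2.0, 3.0, 2.5, 8.0, 9.0, 8.5]
centres, groups = two_means_1d(scores)  # descriptive: groups have no names

labelled = [(2.0, "low"), (3.0, "low"), (2.5, "low"),
            (8.0, "high"), (9.0, "high"), (8.5, "high")]
centroids = train_nearest_centroid(labelled)
new_label = predict(centroids, 7.0)  # predictive: a future observation gets a label
```

The clustering step only describes structure in the scores it was given, whereas the classifier can be checked against known labels and then applied to observations it has not seen.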

  • Donner, A. & Klar, N. (2000). Design and analysis of cluster randomization trials in health research. London: Arnold.

  • Donner, A. & Klar, N. (2002). Issues in the meta-analysis of cluster randomized trials. Statistics in Medicine, 21, 2971-2980.

  • Klar, N. & Donner, A. (2001). Current and future challenges in the design and analysis of cluster randomization trials. Statistics in Medicine, 20, 3729-3740.

  • Murray, D. M., Varnell, S. P., & Blitstein, J. L. (2004). Design and analysis of group-randomized trials: A review of recent methodological developments. American Journal of Public Health, 94, 423-432.

  • Raudenbush, S. W. & Bryk, A. S. (2002). Hierarchical linear models. Newbury Park, CA: Sage Publications.

  • Rooney, B. L. & Murray, D. M. (1996). A meta-analysis of smoking prevention programs after adjustment for errors in the unit of analysis. Health Education Quarterly, 23, 48-64.
