The design of empirical studies: towards a unified view.
Nuffield College, Oxford, OX1 1NF, UK. email@example.com.
A broad review is given of the general principles underlying study design with emphasis on applications in medical and epidemiological contexts. The main theme of the paper is that, while the distinction between interventionist studies, that is, experiments, and purely observational ones is important, there are many common threads. A wide range of specific applications is used in outline to illustrate the discussion.
Bias; Case–control study; Clinical trial; Cohort study; Confounding; Generalizability; Longitudinal study; Metrology; Objectives of study; Prospective study; Randomization; Retrospective study; Specificity
- PMID: 26968840; DOI:10.1007/s10654-016-0135-9
THE DESIGN OF STUDIES
The most common role of statistical considerations in biomedical contexts is in the analysis and interpretation of data whenever those data are subject to appreciable unexplained or uncontrollable variation. That role in turn takes two broad forms.
The first concerns the security of conclusions. Suppose that there seems to be a systematic difference between the survival times of two groups of patients. Could that difference reasonably be regarded as a consequence of the “play of chance” or is it in some sense real, that is, likely to be reproducible in new similar groups of patients? This leads on to such notions as confidence intervals and, particularly prone to misunderstanding, significance tests.
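The "play of chance" question can be made concrete with a permutation test. The following is a minimal sketch, not a method taken from the paper: it assumes fully observed (uncensored) survival times, uses the difference in group means as the summary statistic, and the function name `permutation_p_value` and its parameters are purely illustrative. Real survival data usually involve censoring and call for more specialized methods.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided permutation p-value for a difference in means.

    Assumption: observations are fully observed (no censoring) and, under
    the null hypothesis, group labels are exchangeable.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random, keeping the group sizes fixed.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance guards against floating-point rounding in ties.
        if diff >= observed - 1e-12:
            extreme += 1
    # Adding one to numerator and denominator keeps the estimate above zero.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Made-up survival times in months, for illustration only.
    group_a = [6.1, 7.4, 8.0, 9.3, 10.2, 11.8]
    group_b = [9.0, 10.5, 12.1, 13.3, 14.7, 15.2]
    print(permutation_p_value(group_a, group_b))
```

A small value indicates that a difference as large as the one observed would rarely arise from random relabelling alone. It says nothing directly about the size or clinical importance of the difference, which is one reason significance tests are prone to misunderstanding.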
The second role concerns the exploration and, it is hoped, the understanding of complex patterns of dependence. Such issues may arise in particular in attempting to interpret hospital records or, very particularly, large-scale observational studies.
A rather less emphasized role of statistical thinking is in the planning of investigations. The traditional literature on this falls into some broad sections. The first has its primary origin in the work of the statistician and geneticist R. A. Fisher, encapsulated in his book The Design of Experiments. A second area, possibly of less direct concern in the present context, is the design of sampling procedures for the reliable estimation of features of a population on the basis of a well-chosen sample. Somewhat parallel sampling procedures do, however, arise, for example in a laboratory investigation in which the purpose is to assess the properties of a large quantity of material on the basis of an appropriate sample.
In the present context, however, the main contrast is between interventionist studies, in particular randomized clinical trials, and purely observational investigations in which a population of individuals is observed, possibly over an appreciable time span, but no intervention by the investigator takes place. That is, treatments or exposures are observed as they are, not assigned by the investigator. An early US cardiovascular example is the Framingham study and more recent ones are the Rotterdam study, the Million Women study and the Kadoorie (Chinese) study, the last three being very large, wide-ranging observational investigations.
The object of the present paper, more an extended essay than a conventional research paper, is to discuss the broad principles of study design, especially but by no means only in an epidemiological setting. Numerous illustrative examples are given. All are real, not contrived for expository purposes, but they are described merely in outline to avoid distracting detail.
Why are these issues called statistical? The specific details of each study are peculiar to that context and it is crucial that due account is taken of those details, that is, of specifically subject-matter issues. Yet there are common considerations that link different investigations across many fields of study, and formulation of these general ideas is part of statistical theory. While virtually all the ideas emerge from the challenges of specific applications, their general formulation can be very fruitful in many ways. For example, R. A. Fisher’s work mentioned above stemmed from agricultural field trials. Bradford Hill, a pioneer in the design and implementation of randomized clinical trials, was, I believe, at that point on friendly personal terms with Fisher and the idea of formal randomization as an allocation device certainly came from Fisher. At its best, statistical theory formulates ideas usually emanating from one field of application, develops their consequences and puts those ideas in a form for fruitful general application.
The most successful applications hinge on merging general statistical principles seamlessly with the specific details of each new situation.
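The idea of formal randomization as an allocation device, mentioned above, can be illustrated by a short sketch. This is one simple scheme (a random permutation followed by alternating assignment, which keeps arm sizes balanced), not a description of any particular trial's procedure. The function name `randomize_allocation`, the arm labels and the participant identifiers are all hypothetical.

```python
import random


def randomize_allocation(participant_ids, arms=("treatment", "control"), seed=None):
    """Allocate participants to arms by random permutation.

    Assumption: a balanced design with no stratification or blocking.
    The investigator's judgement plays no part in who receives which arm.
    """
    rng = random.Random(seed)
    ids = list(participant_ids)
    # Shuffle the participants, then deal them out to the arms in turn
    # so that arm sizes differ by at most one.
    rng.shuffle(ids)
    return {pid: arms[i % len(arms)] for i, pid in enumerate(ids)}


if __name__ == "__main__":
    # Hypothetical participant identifiers, for illustration only.
    allocation = randomize_allocation(range(1, 11), seed=2016)
    for pid in sorted(allocation):
        print(pid, allocation[pid])
```

Recording the seed makes the allocation reproducible for audit. In practice the list would normally be concealed from those recruiting participants.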