Division of Rheumatology, Department of Pediatrics, Alfred I. duPont Hospital for Children, Delaware, USA
Sample size estimation is required for the comparison of two groups. It is meant for comparing two different groups of subjects, not for two measurements in the same group of people.
Let us say that when you treat 100 children with TB, 10% develop a paradoxical reaction (PR). You want to know what the incidence of PR is in children with both TB and HIV, and whether any difference is clinically significant.
The response from Dr. Shah suggests that she wants to establish the incidence of PR in both groups in her own clinic. This will require a pilot study of children with TB but without HIV; she can also use available data from the literature.
The next step, before studying the population with both TB and HIV, is to decide the level at which the percentage incidence will be considered significantly different. Let us assume that you know it will be higher than in children without HIV. Should the incidence be 15%, 20%, or 25% to be clinically important? You may be able to establish this limit from the data in the pilot study or from published studies. Let us say that several studies have shown that the SD for the 10% figure is 3 (10 +/- 3) in children with TB who develop PR. In that case, you may wish to choose an incidence of PR of 20% or more in patients with TB and HIV to be considered significantly different.
The next question is "How sure do you want to be?" when you aim for an incidence of 20%. This is the confidence level (CL), also referred to here as the study power. If you choose 90%, this means that if you find the incidence to be, say, 23%, you can be 90% certain that this increased incidence is real and therefore may be significant. In general, the higher the confidence level, the lower the chance of a type II error (i.e., fewer false negatives), but the greater the number of subjects required (the sample size). Therefore, most clinical studies set the power at 80%.
The next question is "How much margin of error can you accept?" This is the confidence interval (CI). Let us choose a CI of 5. We have already decided that the incidence of PR may be 20% in children with TB and HIV. What the CI means is that if we chose several samples of children with HIV and TB and repeated the study several times, the observed values should fall between 15% and 25% (5 below and 5 above 20). If you choose a wider CI, you can select a smaller sample size, but you may catch false positives (type I error) and conclude that an important difference is present when there is none.
Now you can choose a sample size to give a CI of 5 and CL of 90% (or whatever limits you set for yourself) by using one of several formulas.
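One such formula is the standard normal-approximation formula for comparing two proportions. The sketch below applies it to the example in the text (10% PR incidence in TB alone versus a postulated 20% in TB with HIV); the function name and the choices of alpha = 0.05 and 80% power are illustrative assumptions, not a prescription.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group for comparing two proportions
    (normal approximation, two-sided test)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    z_b = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    p_bar = (p1 + p2) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)

# 10% incidence in TB-only children vs. a postulated 20% in TB + HIV:
print(n_per_group(0.10, 0.20))   # 199 children per group under these assumptions

# The smaller the difference worth detecting, the larger the sample:
for p2 in (0.15, 0.20, 0.25):
    print(p2, n_per_group(0.10, p2))
```

Note how strongly the chosen difference drives the answer: detecting 15% versus 10% needs several hundred children per group, while 25% versus 10% needs about a hundred.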
Three important points to remember are:
1. Sample sizes and CIs should not be calculated after the study is completed. Those of you who wish to understand the reasons for this are referred to some of the articles at the end of this essay.
2. Sampling comes in only when the numbers are large or it would be too expensive to study all the patients. If the population of patients with HIV and TB is small, say 200 or so, it is best to study the entire population.
3. The type of variable to be studied (dichotomous, continuous, etc.) will affect the method by which the sample size is calculated.
It is easy to use one of the many software packages or tables available to calculate sample size (resource page: http://statpages.org). You fill in the data in the questionnaire and the computer gives you the sample size. Using one of the available tables, I found that if the population of children with TB and HIV is 1000 and I want a confidence level of 95% with a precision of 5, I will need to include 169 of those 1000 children.
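For estimating a single proportion to a given precision, as in the table lookup above, the usual formula is n = z^2 p (1 - p) / e^2, reduced by a finite population correction when the whole population is small. The sketch below uses the conservative assumption p = 0.5; tables built on other assumed proportions or other formulas will give different numbers, which may account for figures such as the 169 quoted above.

```python
from math import ceil
from statistics import NormalDist

def n_for_precision(e, p=0.5, alpha=0.05, population=None):
    """Sample size to estimate a proportion p to within +/- e
    (normal approximation), with an optional finite population
    correction for small populations."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n0 = z ** 2 * p * (1 - p) / e ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)  # finite population correction
    return ceil(n0)

print(n_for_precision(0.05))                   # 385 for an unlimited population
print(n_for_precision(0.05, population=1000))  # 278 when only 1000 children exist
```

The finite population correction illustrates point 2 above: when the whole population is only 1000, far fewer subjects are needed than the unlimited-population formula suggests.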
Finally, I wish to add a few words from an article written by a clinician several years ago. He wrote it at "the risk of offending statistical purists" and explained some of the concepts in a different manner. He asks the clinical researcher to think about three questions before starting a study.
1. What is the acceptable level of type I, or alpha, error? This is the risk of concluding that an important difference is present when it is not. In standard statistics this is the p value, and it can also be expressed as a confidence interval. In sample size calculation it is written as P (capital P) to denote probability.
2. What is the acceptable level of type II, or beta, error? This is the error of concluding that an important difference is not present when there is one. Its complement (1 minus beta) is the power used in sample size calculation, often set at 80% or 90%; this corresponds to the confidence level discussed above.
3. What difference between averages or proportions is important enough to justify doing the study? In other words, what difference is worth searching for, and what will you do if you find it?
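The alpha and power choices above enter the sample size formulas only through their standard normal multipliers, which Python's standard library can supply directly; the snippet below simply shows where the familiar constants come from.

```python
from statistics import NormalDist

z = NormalDist().inv_cdf

# A two-sided alpha of 0.05 gives the familiar 1.96 multiplier:
print(round(z(1 - 0.05 / 2), 2))   # 1.96

# Power of 80% and 90% give the multipliers for the beta term:
print(round(z(0.80), 2))           # 0.84
print(round(z(0.90), 2))           # 1.28
```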
(I wish to thank Dr. M. Attia of the Alfred I. duPont Hospital for Children, Wilmington, DE, USA for reviewing my answer and making the necessary corrections and improvements.)
References: » Brown GW. Sample size. Am J Dis Child. 142:1213-1215, 1988.
» Campbell MJ, Julious SA, Altman DG. Estimating sample sizes for binary, ordered categorical, and continuous outcomes in two group comparisons. BMJ. 311:1145-1148, 1995.
» Hsieh FY, Bloch DA, Larsen MD. A simple method of sample size calculation for linear and logistic regression. Statistics in Medicine. 17:1623-1634, 1998.
» Goodman SN, Berlin JA. The use of predicted confidence intervals when planning experiments and the misuse of power when interpreting results. Ann Intern Med. 121: 200-206, 1994
» Simon R. Confidence intervals for reporting results of clinical trials. Ann Intern Med. 105:429-435, 1986.
» Sackett D, Haynes RB, Guyatt GH, Tugwell P. Clinical Epidemiology. 4th edition. Little, Brown Co, Boston, 2005.
» Guyatt G, Rennie D (Eds). Users' Guides to the Medical Literature. JAMA publication, 2002.
» Hulley SB, Cummings SR, Browner WS, Grady DG, Newman TB (Eds). Designing Clinical Research. 3rd edition. Lippincott Williams and Wilkins, 2007.