6.7. Sample Size and Power Estimation
Before testing a hypothesis, it is desirable to know how large a sample is needed to achieve a desired precision. This depends on a number of factors specific to the nature of the test, such as the sample variance, the significance level (α, the probability of a Type I error) and the minimum detectable difference.
It is also important to know the probability of failing to reject a null hypothesis when it is in fact false (β, the probability of a Type II error) or, equivalently, the power of the test (1 – β), i.e. the probability of rejecting the null hypothesis when it is in fact false.
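The relationship between α, β and power can be illustrated with a minimal sketch for the simplest case, a two-tailed one sample z test with a known standard deviation. This is not UNISTAT's own procedure; the function name `z_test_power` and its parameters (`delta` for the minimum detectable difference, `sigma` for the standard deviation) are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(delta, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-tailed one sample z test with known sigma.

    Hypothetical helper for illustration; assumes normally distributed data.
    """
    std_normal = NormalDist()
    # Critical value that leaves alpha/2 in each tail under the null hypothesis
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Shift of the test statistic when the true difference is delta
    shift = delta * sqrt(n) / sigma
    # Probability of landing in the upper or lower rejection region
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


power = z_test_power(delta=0.5, sigma=1.0, n=32)
beta = 1 - power  # probability of a Type II error
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

When `delta` is zero the null hypothesis is true, and the function returns α itself, which is the Type I error rate.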
This section brings together seven broad classes of commonly used hypothesis tests and provides methods of estimating the sample size, power of the test and other parameters. The types of tests supported here are:
1) One Sample
2) Two Samples
3) Variance
4) Correlation
7) ANOVA
An eighth option is also provided to compute the power of the test from the phi statistic and vice versa. These are used in estimating the sample size and power of the test in ANOVA and two sample tests. UNISTAT therefore does not require the use of the OC Curves published by Pearson and Hartley (1951), pp. 112-130.
Although some topics may seem to have been excluded from this list, these are often special cases of the methods already provided. For instance, sample size and power of the test in Regression Analysis can be estimated using the Correlation option above. Many different types of ANOVA can also be accommodated simply by entering the relevant statistics in place of the existing parameters. In such cases, you are advised to consult a statistics book to establish which of the existing procedures can be used as a substitute (see Zar, J. H. 2010).
In procedures where a choice between one-tailed and two-tailed estimation is available, the default is always two-tailed, corresponding to the null hypothesis that “the entities tested are equal” against the alternative hypothesis that “they are not equal”. Where the alternative hypothesis states that one entity is greater or less than the other, the one-tailed option should be selected.
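The effect of this choice on the required sample size can be sketched with the textbook normal-approximation formula n = ((z_α + z_β) σ / δ)², where only the critical value z_α changes between the two options. The function `z_sample_size` and its `two_tailed` flag are hypothetical names used for illustration and do not describe UNISTAT's internal calculation.

```python
from math import ceil
from statistics import NormalDist


def z_sample_size(delta, sigma, alpha=0.05, power=0.80, two_tailed=True):
    """Sample size for a one sample z test (normal approximation, known sigma)."""
    z = NormalDist().inv_cdf
    # Two-tailed tests split alpha between both tails; one-tailed tests do not
    z_alpha = z(1 - alpha / 2) if two_tailed else z(1 - alpha)
    z_beta = z(power)
    # Round up, since a sample size must be a whole number
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


n_two = z_sample_size(delta=0.5, sigma=1.0)                    # two-tailed
n_one = z_sample_size(delta=0.5, sigma=1.0, two_tailed=False)  # one-tailed
print(n_two, n_one)
```

With these inputs the one-tailed test needs fewer observations, because all of α is placed in the tail where the difference is expected.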
In some procedures, the parameter to be estimated occurs on both sides of an equation and therefore cannot be calculated directly. In such cases, an iterative algorithm is employed to determine the value of the parameter, and convergence is usually achieved within a few iterations. These procedures provide two further input fields to control the convergence parameters: the tolerance and the maximum number of iterations. Their default values are 0.001 and 100 respectively, which produce satisfactory results in most cases. If convergence cannot be achieved within these limits, the program reports this in the output. You may then edit the default values to obtain convergence.
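A minimal sketch of such an iteration, under stated assumptions, is the one sample t test sample size n = (s / δ)² (t(α, n – 1) + t(β, n – 1))², where n appears on both sides because the t quantiles depend on the degrees of freedom. The algorithm UNISTAT actually uses is not documented here; the fixed-point scheme, the function names and the Cornish-Fisher approximation to the t quantile (used only to keep the example free of external libraries) are all assumptions.

```python
from math import ceil
from statistics import NormalDist


def t_quantile(p, df):
    """Approximate Student t quantile (Cornish-Fisher expansion; an assumption)."""
    z = NormalDist().inv_cdf(p)
    return (z + (z**3 + z) / (4 * df)
            + (5 * z**5 + 16 * z**3 + 3 * z) / (96 * df**2))


def iterate_sample_size(delta, s, alpha=0.05, power=0.80,
                        tolerance=0.001, max_iter=100):
    """Fixed-point iteration for a two-tailed one sample t test sample size.

    Returns (n, iterations_used, converged).
    """
    z = NormalDist().inv_cdf
    # Starting value from the normal approximation, kept above 1 so df > 0
    n = max(2.0, ((z(1 - alpha / 2) + z(power)) * s / delta) ** 2)
    for iteration in range(1, max_iter + 1):
        df = n - 1
        t_sum = t_quantile(1 - alpha / 2, df) + t_quantile(power, df)
        n_new = max(2.0, (t_sum * s / delta) ** 2)
        if abs(n_new - n) < tolerance:
            return n_new, iteration, True
        n = n_new
    # Convergence not reached within max_iter; the caller should report this
    return n, max_iter, False


n, iterations, converged = iterate_sample_size(delta=0.5, s=1.0)
print(ceil(n), iterations, converged)
```

The returned flag mirrors the behaviour described above: when it is false, the tolerance or the maximum number of iterations can be relaxed and the calculation repeated.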