Abstract:
In the classical approach to statistical hypothesis testing the roles of the null hypothesis H0
and the alternative H1 are highly asymmetric. Power, calculated from the distribution of the
test statistic under H1, is treated as a theoretical construct that can be used to guide the
choice of an appropriate test statistic or sample size, but power calculations do not explicitly
enter the testing process in practice. In a significance test a decision to accept or reject H0
is driven solely by an examination of the strength of evidence against H0, summarized in the
P–value calculated from the distribution of the test statistic under H0. A small P–value is
taken to represent strong evidence against H0, but it need not indicate strong evidence
in favour of H1. More recently, Moerkerke et al. (2006) have suggested that the special
status of H0 is often unwarranted or inappropriate, and argue that evidence against H1 can
be equally meaningful. They propose a balanced treatment of both H0 and H1 in which the
classical P–value is supplemented by the P–value derived under H1. The alternative P–value
is the dual of the null P–value and summarizes the evidence against a target alternative.
Here we review how the dual P–values are used to assess the evidential tension between
H0 and H1, and use decision-theoretic arguments to explore a balanced hypothesis testing
technique that exploits this evidential tension. The operational characteristics of balanced
hypothesis tests are outlined and their relationship to conventional notions of optimal tests
is laid bare. The use of balanced hypothesis tests as a conceptual tool is illustrated via
model selection in linear regression and their practical implementation is demonstrated by
application to the detection of cancer-specific protein markers in mass spectroscopy.
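
As a rough illustration of the dual P–values described above, the following sketch assumes the simplest possible setting: a one-sided test of a normal mean with known standard deviation, a point null mu0 and a point target alternative mu1 > mu0. The function name `dual_p_values` and all numerical inputs are hypothetical choices made for this example and are not taken from Moerkerke et al. (2006).

```python
from math import sqrt
from statistics import NormalDist


def dual_p_values(xbar, n, mu0, mu1, sigma):
    """Return (p0, p1) for a one-sided z-test with mu1 > mu0.

    Assumed setting (illustrative only): the sample mean xbar of n
    observations is normal with known standard deviation sigma.
    p0 is the classical P-value: the probability under H0 of a sample
    mean at least as large as the one observed.
    p1 is the alternative P-value: the probability under H1 of a sample
    mean at least as small as the one observed.
    """
    se = sigma / sqrt(n)              # standard error of the sample mean
    std_normal = NormalDist()         # standard normal distribution
    p0 = 1.0 - std_normal.cdf((xbar - mu0) / se)   # evidence against H0
    p1 = std_normal.cdf((xbar - mu1) / se)         # evidence against H1
    return p0, p1


# Hypothetical inputs: the observed mean lies midway between mu0 and mu1,
# so the two P-values coincide (both are roughly 0.159).
p0, p1 = dual_p_values(xbar=0.5, n=16, mu0=0.0, mu1=1.0, sigma=2.0)
print(f"p0 = {p0:.4f}, p1 = {p1:.4f}")
```

In this toy setting a sample mean close to mu0 gives a large p0 and a small p1, while a mean close to mu1 does the opposite; a mean in between leaves both P–values moderate, which is one simple way to picture the evidential tension between the two hypotheses.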