ICH E9(R1) Section 3.5
The probability that a statistical test will correctly reject a false null hypothesis, representing the likelihood of detecting a true treatment effect when one actually exists.
Statistical power quantifies a clinical trial's ability to identify true treatment effects, serving as a critical measure of study adequacy. A trial with 80% power has an 80% probability of reaching statistical significance if the true treatment effect and outcome variability match the assumptions used in the power calculation. Conversely, such a trial has a 20% probability of failing to detect a real effect of that size, which is a Type II error. Higher power reduces the risk of false-negative conclusions but requires larger sample sizes and greater resources.
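The "80% probability" interpretation can be checked by simulation. The following is a minimal sketch, not a validated design tool: it assumes a two-arm trial with a normally distributed outcome, a known common standard deviation, and a two-sided z-test. The function name simulate_power and all parameter values are hypothetical choices made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def simulate_power(delta, sigma, n_per_arm, alpha=0.05, n_trials=2000, seed=1):
    """Estimate power as the fraction of simulated trials reaching significance.

    Assumes normal outcomes with known standard deviation `sigma` and a
    two-sided z-test on the difference in arm means (illustrative only).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma * sqrt(2.0 / n_per_arm)  # standard error of the mean difference
    rejections = 0
    for _ in range(n_trials):
        control = [rng.gauss(0.0, sigma) for _ in range(n_per_arm)]
        treated = [rng.gauss(delta, sigma) for _ in range(n_per_arm)]
        z_stat = (fmean(treated) - fmean(control)) / se
        if abs(z_stat) > z_crit:
            rejections += 1
    return rejections / n_trials

if __name__ == "__main__":
    # Assumed design: true effect of 0.5 standard deviations, 64 per arm.
    print(f"Estimated power: {simulate_power(0.5, 1.0, 64):.2f}")
    # With no true effect, the rejection rate approximates alpha instead.
    print(f"Rejection rate under the null: {simulate_power(0.0, 1.0, 64):.2f}")
```

Under these assumptions the first estimate should land near 80%, and the second near the 5% significance level, which is the Type I error rate rather than power.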
Power is determined by the interplay of several factors, most of which can be influenced through trial design. Larger sample sizes increase power by providing more precise estimates of treatment effects. Larger true treatment effects are easier to detect, yielding more power for the same sample size or requiring fewer participants for the same power. Lower variability in outcomes improves precision and increases power. More stringent significance thresholds reduce power because they require stronger evidence to declare significance. Understanding these relationships enables investigators to design trials that balance scientific requirements against practical constraints.
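The relationships among sample size, effect size, variability, and significance threshold can be made concrete with a closed-form approximation. The sketch below assumes a two-sided z-test comparing two means with equal allocation and a known common standard deviation. The helper name power_two_sample and the numeric inputs are illustrative assumptions, not part of any guideline.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample(delta, sigma, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two means.

    delta: assumed true difference in means
    sigma: assumed common standard deviation
    n_per_arm: participants per arm (equal allocation assumed)
    """
    z = NormalDist()
    se = sigma * sqrt(2.0 / n_per_arm)   # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)    # two-sided critical value
    shift = abs(delta) / se              # standardized true effect
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

if __name__ == "__main__":
    base = dict(delta=0.5, sigma=1.0, n_per_arm=64, alpha=0.05)
    scenarios = {
        "baseline": base,
        "larger sample": {**base, "n_per_arm": 128},
        "larger effect": {**base, "delta": 0.7},
        "lower variability": {**base, "sigma": 0.8},
        "stricter alpha": {**base, "alpha": 0.01},
    }
    for label, kwargs in scenarios.items():
        print(f"{label:>18}: power = {power_two_sample(**kwargs):.3f}")
```

Relative to the baseline scenario, the first three variations raise power and the stricter significance threshold lowers it, matching the directions described above.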
Adequate power is essential for ethical trial conduct because underpowered studies may fail to detect beneficial treatments, potentially delaying or preventing patient access to effective therapies. Regulatory guidance typically recommends power of 80% to 90% for confirmatory trials, though the appropriate power level depends on the consequences of false-negative conclusions in the specific clinical context. Post-hoc power calculations after study completion, using observed effects rather than assumed effects, are generally discouraged: power computed from the observed effect is a direct function of the p-value, so it adds no new information and can be misleading.
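One way to see why post-hoc power is uninformative is to compute it directly from a p-value. The sketch below assumes a two-sided z-test and plugs the observed effect back in as if it were the true effect. The helper name observed_power is hypothetical, and the p-values are arbitrary illustrative inputs.

```python
from statistics import NormalDist

def observed_power(p_value, alpha=0.05):
    """Post-hoc ("observed") power implied by a two-sided z-test p-value.

    Treats the observed standardized effect as the true effect, which is
    the practice being cautioned against. Requires 0 < p_value <= 1.
    """
    z = NormalDist()
    z_obs = z.inv_cdf(1 - p_value / 2)   # |z| statistic implied by the p-value
    z_crit = z.inv_cdf(1 - alpha / 2)    # two-sided critical value
    return z.cdf(z_obs - z_crit) + z.cdf(-z_obs - z_crit)

if __name__ == "__main__":
    for p in (0.01, 0.05, 0.20, 0.50):
        print(f"p = {p:.2f} -> observed power = {observed_power(p):.2f}")
```

Because the function takes nothing but the p-value and the significance level, it cannot convey anything the p-value does not already convey. In particular, a result sitting exactly at p = 0.05 always corresponds to observed power of about 50%, and any non-significant result corresponds to observed power below that.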
Trial design
"The trial was designed with 90% power to detect a 20% relative risk reduction, meaning that if the treatment truly reduced risk by 20%, there was a 90% probability that the trial would demonstrate statistical significance."
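A design statement like this one implies a sample size once a control-group event risk is assumed. The sketch below uses the standard normal-approximation formula for comparing two proportions with equal allocation. The 30% control risk is a hypothetical value, since the example does not state one, and n_per_arm_two_proportions is an illustrative helper name.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm_two_proportions(p_control, relative_risk_reduction,
                              power=0.90, alpha=0.05):
    """Approximate participants per arm for a two-sided test of two proportions.

    Uses the unpooled normal approximation with equal allocation.
    p_control is an assumed control-group event risk (hypothetical input).
    """
    z = NormalDist()
    p_treat = p_control * (1 - relative_risk_reduction)
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance threshold
    z_beta = z.inv_cdf(power)            # quantile matching the target power
    variance_sum = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p_control - p_treat) ** 2
    return ceil(n)

if __name__ == "__main__":
    # Assumed 30% control risk; a 20% relative reduction gives 24% on treatment.
    print("90% power:", n_per_arm_two_proportions(0.30, 0.20, power=0.90))
    print("80% power:", n_per_arm_two_proportions(0.30, 0.20, power=0.80))
```

Under the assumed 30% control risk, 90% power calls for more than a thousand participants per arm, and lowering the target to 80% reduces that requirement. A lower control risk would increase it, which is why the assumed event rate matters as much as the targeted relative reduction.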
Underpowered study
"The Phase II study enrolled only 60 participants and had approximately 50% power to detect the hypothesized effect, meaning that even if the treatment was effective, there was a substantial chance the study would fail to demonstrate statistical significance."
A range of values calculated from study data and constructed to contain the true treatment effect with a specified level of confidence, typically 95%, providing information about both the estimated effect size and the precision of that estimate.
A statistical analysis strategy that includes all randomized participants in the groups to which they were originally assigned, regardless of whether they completed the study treatment or adhered to the protocol.
A planned statistical analysis conducted before all participants have completed the study, typically to evaluate accumulating data for evidence of efficacy, futility, or safety concerns that might warrant early termination of the trial.
The probability of obtaining results at least as extreme as those observed in the study, assuming that the null hypothesis of no treatment effect is true.
A statistical analysis that includes only participants who completed the study according to protocol requirements, with no major protocol violations, adequate treatment exposure, and complete outcome assessments.