Sample size calculation and power: estimating the sample size is one of the most important stages in planning a clinical trial.

This post discusses why sample size calculation and power matter, the basic inputs to the calculation, and two worked examples of sample size estimation.

## Importance

To begin, we need to understand what underestimating and overestimating the sample size mean.

Assume we’re going to compare product A to product B in terms of reducing knee discomfort.

Underestimating the sample size means that fewer subjects are enrolled than the study requires. In that case, even if a real difference exists between the products, the study is unlikely to reach statistical significance.

Overestimating the sample size means that many more subjects are enrolled than the study requires.

For example, the comparison might yield a highly significant result such as p<0.000001. Because the sample is so large, even small differences become statistically significant, although they may not be clinically meaningful.
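
To make the two failure modes concrete, the sketch below uses base R's `power.prop.test()` to compute power at a few sample sizes. The success rates and the per-group sizes are illustrative assumptions, not data from a real trial.

```r
# Illustrative assumption: true success rates of 55% (product A) and 42% (product B)
p_a <- 0.55
p_b <- 0.42

# Hypothetical per-group sample sizes: too small, moderate, very large
n_per_group <- c(30, 230, 5000)

# power.prop.test() solves for the argument left unspecified (here: power)
power_by_n <- sapply(n_per_group, function(n) {
  power.prop.test(n = n, p1 = p_a, p2 = p_b, sig.level = 0.05)$power
})
print(data.frame(n_per_group, power = round(power_by_n, 3)))

# Oversized study: a one-percentage-point difference (assumed to be clinically
# trivial) is still detected with high probability at 50,000 subjects per group
power_trivial <- power.prop.test(n = 50000, p1 = 0.50, p2 = 0.51,
                                 sig.level = 0.05)$power
print(round(power_trivial, 3))
```

With the smallest group size the real difference is usually missed, while the very large study will flag even a negligible difference as significant.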

## Basics of Sample size calculation

**1) Study Design**

Several statistical study designs (for example, parallel-group or crossover) are available in clinical trials, and the sample size formula depends on the design chosen.

**2) Hypothesis**

The hypothesis states whether the study aims to show non-inferiority, superiority, or equivalence. Non-inferiority and superiority trials are one-sided, while equivalence trials are two-sided.

**3) Primary Endpoints**

The sample size is determined by the primary endpoint. If there are secondary endpoints, a sample size can be calculated for each of them separately.

**4) Response**

The response variable is summarized by a mean and variance (continuous outcomes) or by a proportion of individuals (binary outcomes).

**5) Meaningful Difference**

This is the smallest difference between the test and control groups that would be considered clinically meaningful.

**6) Significance level**

The significance level is usually set at 0.05 (95% confidence level).

**7) Power**

A power of **80% or higher** is generally considered acceptable.
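
The significance level and power enter the sample size formulas as standard normal quantiles. A minimal sketch, assuming a two-sided test at 0.05 and 80% power (the variable names are illustrative):

```r
alpha <- 0.05   # significance level
power <- 0.80   # desired power, expressed as a proportion

# Two-sided test: alpha is split across both tails
z_alpha <- qnorm(1 - alpha / 2)

# One-sided test (e.g. non-inferiority): all of alpha sits in one tail
z_alpha_one_sided <- qnorm(1 - alpha)

# Quantile corresponding to the desired power (1 - beta)
z_beta <- qnorm(power)

print(round(c(z_alpha = z_alpha,
              z_alpha_one_sided = z_alpha_one_sided,
              z_beta = z_beta), 2))
```

Rounded to two decimals, these are the familiar z-table values 1.96, 1.64 and 0.84.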

### Case 1: Comparing two proportions

Let’s take an example with the following information:

    power   <- 80    # desired power, in percent
    p1      <- 0.55  # probability of success, group 1
    p2      <- 0.42  # probability of success, group 2
    dropout <- 0.20  # expected dropout rate
    alpha   <- 0.05  # two-sided significance level

    # Critical values, rounded to two decimals as in a standard z-table
    zalpha <- round(qnorm(1 - alpha / 2), 2)  # 1.96
    zbeta  <- round(qnorm(power / 100), 2)    # 0.84 for 80% power

    # Subjects per group, total for both groups, and total inflated for dropouts
    n1    <- ((zalpha + zbeta)^2 * (p1 * (1 - p1) + p2 * (1 - p2))) / (p1 - p2)^2
    n     <- n1 * 2
    Total <- n + n * dropout
    Total
    # [1] 546.7774

Allowing for 20% dropouts, a total of 547 subjects is required for 80% power.
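
As a cross-check, base R's `power.prop.test()` can solve for the per-group sample size directly. The sketch below assumes 80% power and a two-sided 0.05 significance level, and inflates for dropouts with the same `n + n * dropout` rule used in this post. The function uses a slightly different variance formula from the hand calculation, so a small discrepancy is expected.

```r
# Leave n unspecified so that power.prop.test() solves for it
res_prop <- power.prop.test(p1 = 0.55, p2 = 0.42,
                            power = 0.80, sig.level = 0.05)

# Round up: a fraction of a subject cannot be enrolled
n_per_group <- ceiling(res_prop$n)

# Inflate the two-group total for an assumed 20% dropout rate
dropout <- 0.20
total_with_dropout <- ceiling(2 * n_per_group * (1 + dropout))

print(c(per_group = n_per_group, total_with_dropout = total_with_dropout))
```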

### Case 2: Comparing two means

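Let’s take continuous data with the following details:

    mean1   <- 8     # expected mean, group 1
    mean2   <- 6     # expected mean, group 2
    sd      <- 2     # common standard deviation
    power   <- 85    # desired power, in percent
    alpha   <- 0.05  # two-sided significance level
    dropout <- 0.20  # expected dropout rate

    # Critical values, rounded to two decimals as in a standard z-table
    zalpha <- round(qnorm(1 - alpha / 2), 2)  # 1.96
    zbeta  <- round(qnorm(power / 100), 2)    # 1.04 for 85% power

    # Subjects per group, total for both groups, and total inflated for dropouts
    n1    <- ((zalpha + zbeta)^2 * (2 * sd^2)) / (mean1 - mean2)^2
    n     <- n1 * 2
    Total <- n + n * dropout
    Total
    # [1] 43.2

Allowing for 20% dropouts, a total of 44 subjects is required.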
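
As a cross-check, base R's `power.t.test()` solves the same two-means problem using the t distribution rather than the normal approximation, so it typically asks for slightly more subjects per group. A minimal sketch, using the means, standard deviation and 85% power from this example and assuming a two-sided, two-sample test:

```r
# Leave n unspecified so that power.t.test() solves for it
res_t <- power.t.test(delta = 8 - 6, sd = 2,
                      power = 0.85, sig.level = 0.05,
                      type = "two.sample", alternative = "two.sided")

# Round up to whole subjects per group
n_per_group_t <- ceiling(res_t$n)

# Inflate the two-group total for an assumed 20% dropout rate
dropout <- 0.20
total_t <- ceiling(2 * n_per_group_t * (1 + dropout))

print(c(per_group = n_per_group_t, total_with_dropout = total_t))
```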