Not Satisfied with statistical significance (p-value)

by finnstats

Not satisfied with statistical significance (p-value)? Labeling a result as “statistically significant” or not is how humans prefer to categorize things.

The same argument has been debated for almost ten years. On the one hand, people have asserted that statistical significance offers the benefit of simplicity and clarity (and many people are already struggling to achieve this).

The drawbacks of the p-value, however, led some to advocate for more sophisticated models (such as the Bayesian approach).

Although it is straightforward and still incorporates the p-value, the approach we are adopting today seems to address the main drawbacks of relying solely on statistical significance.

With this idea, we wish to move beyond the excessive emphasis on statistical significance.

Two key sections make up this article.

After this introduction, we present the key issues with statistical significance (p-value) and then the solution (forest plots), followed by a conclusion.

What are the problems with p-value statistical significance?

First things first. What is the p-value, and why is relying on statistical significance alone problematic?

“The p-value is the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct”- Wikipedia

To explain the p-value would require a full article. Therefore, we recommend you read more on this topic.

These are the two key problems:

1. Practical relevance:

The size of the effect is not indicated by the statistical significance. Imagine a commercial ad for a shampoo that prevents hair loss.

The advertisement says that the results are statistically significant and that the results have been validated in the laboratory by a major experiment (those who use the shampoo lose less hair).

The underlying research paper reveals, however, that the average user of this shampoo will have five additional hairs on their head. Who cares? Right?

Here’s another illustration: the impact of COVID-19 lockdowns on the spread of the virus. The size of the effect is important in this case.

Lockdowns have had a significant negative impact on the economy, many people’s mental health, etc. To evaluate the cost/benefit of such efforts, we, therefore, need to know not only if they have stopped the virus from spreading, but also by how much.

Because of this, we need to treat statistical significance as a prerequisite for interpreting the effect magnitude, not as the end of the analysis.

One can then determine whether or not the effect has practical importance based on the effect magnitude.

While statistical significance provides evidence that the outcomes are unlikely to be due to chance alone, the emphasis should be on magnitude.
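
To make the shampoo example concrete, here is a minimal Python sketch. All numbers are hypothetical (an average gain of 5 hairs and a standard deviation of 1,000 hairs between people), and the helper `two_sample_z` is our own illustration of a two-sided z-test with a known, common standard deviation. It is not taken from any study. The point is that the same tiny effect can move from "not significant" to "significant" simply because the sample grows.

```python
import math
from statistics import NormalDist


def two_sample_z(mean_diff, sd, n_per_group):
    """Two-sided z-test for a difference in means.

    Assumes two groups of equal size and a known, common standard deviation.
    Returns the z statistic and the two-sided p-value.
    """
    std_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / std_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical numbers: 5 extra hairs on average, SD of 1,000 hairs.
effect, sd = 5.0, 1000.0
print(f"standardized effect size: {effect / sd:.3f}")  # identical for every n

for n in (100, 10_000, 1_000_000):
    z, p = two_sample_z(effect, sd, n)
    print(f"n per group = {n:>9,}  z = {z:6.3f}  p = {p:.4f}")
```

The effect size never changes across the three runs. Only the p-value does, which is why the p-value alone says nothing about practical relevance.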

2. Manipulation:

Two key issues arise when a fixed criterion (like 5%) is the emphasis. First, there is often some freedom of choice when working with data, for example when selecting the statistical test, the sample, etc.

As a result, some people might be able to manipulate the findings just enough to fall below the cutoff.

Second, you might not be able to publish your findings if you are just a little bit above the threshold.

A p-value of 4.9% and a p-value of 5.1%, however, essentially show the same amount of support for the hypothesis under consideration. This causes significant issues in research.
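
A quick way to see how close these two results are is to convert each p-value back into the test statistic that would produce it. The sketch below assumes a two-sided test with a normally distributed statistic, and the function name `z_for_two_sided_p` is ours.

```python
from statistics import NormalDist


def z_for_two_sided_p(p_value):
    """Absolute z statistic that yields the given two-sided p-value."""
    return NormalDist().inv_cdf(1.0 - p_value / 2.0)


z_below = z_for_two_sided_p(0.049)  # "significant" at the 5% level
z_above = z_for_two_sided_p(0.051)  # "not significant" at the 5% level

print(f"p = 4.9% corresponds to |z| = {z_below:.3f}")
print(f"p = 5.1% corresponds to |z| = {z_above:.3f}")
print(f"difference in |z|: {z_below - z_above:.3f}")
```

The two statistics differ only in the second decimal place, yet a hard 5% cutoff puts them in opposite categories.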

Forest Plots

To fully comprehend a test result, assess statistical significance (with a grain of salt), concentrate on the magnitude (if the result is statistically significant), and take variability into consideration.

Why are forest plots a sophisticated approach? We can see all of this information on a forest plot, which also makes it simpler to compare the coefficients.

1. Statistical Significance:

A statistical test’s statistical significance is still a crucial component. It is not worthwhile to interpret a result if it is not statistically significant.

You can read statistical significance directly from a forest plot. The two-sided null hypothesis that the true coefficient is 0 can be rejected at the 5% level if the bar of the 95% confidence interval does not include 0. The representation is not, however, overly simplistic.

For example, journals sometimes attach little stars to regression or statistical test coefficients to denote their statistical significance, such as * if the p-value is less than 5%.

This simplification has the drawback that we are unable to determine whether the results are close to or far from statistical significance.

By using forest plots, we can more easily see how much of the CI extends beyond the 0 line and can relax our obsession with a sharp cut-off.
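
The rule above (reject when the 95% confidence interval excludes 0) can be sketched in a few lines of Python. This is only an illustration: the coefficient names, estimates, and standard errors are made up, and the interval uses a normal approximation, whereas a small-sample regression would normally use a t distribution.

```python
from statistics import NormalDist


def confidence_interval(estimate, std_error, level=0.95):
    """Normal-approximation confidence interval for a coefficient."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_error, estimate + z * std_error


def excludes_zero(lower, upper):
    """True when the whole interval lies on one side of 0."""
    return lower > 0.0 or upper < 0.0


# Hypothetical coefficients: (name, estimate, standard error).
coefficients = [("x1", 0.50, 0.10), ("x2", 0.21, 0.10), ("x3", 0.19, 0.10)]

for name, estimate, std_error in coefficients:
    lower, upper = confidence_interval(estimate, std_error)
    verdict = "excludes 0" if excludes_zero(lower, upper) else "includes 0"
    print(f"{name}: {estimate:.2f} [{lower:.3f}, {upper:.3f}] {verdict}")
```

Here x2 and x3 land on opposite sides of the cutoff, but printing the interval bounds shows that they are practically the same result, which a star notation would hide.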

2. Magnitude:

Both the magnitude of each coefficient and the relative magnitude between the coefficients are simple to read. The coefficients, however, cannot always be compared directly.

The scales of the variables, for instance, may differ greatly in a regression. Standardizing the variables by their standard deviation puts them on a comparable scale.
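
The effect of the scale can be shown with a small Python sketch. The data are invented, the helpers `ols_slope` and `standardize` are our own, and only the predictor is standardized here. The same height variable expressed in metres or in centimetres gives slopes that differ by a factor of 100, while the standardized slope is the same in both cases.

```python
from statistics import mean, stdev


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def standardize(values):
    """Center the values and divide by their sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical data: the same heights in two different units.
height_m = [1.60, 1.65, 1.70, 1.75, 1.80, 1.85]
height_cm = [h * 100 for h in height_m]
weight_kg = [55.0, 61.0, 66.0, 70.0, 77.0, 81.0]

slope_m = ols_slope(height_m, weight_kg)      # kg per metre
slope_cm = ols_slope(height_cm, weight_kg)    # kg per centimetre
slope_std_m = ols_slope(standardize(height_m), weight_kg)
slope_std_cm = ols_slope(standardize(height_cm), weight_kg)

print(f"raw slope (metres):      {slope_m:.3f}")
print(f"raw slope (centimetres): {slope_cm:.3f}")
print(f"standardized slopes:     {slope_std_m:.3f} and {slope_std_cm:.3f}")
```

The standardized slope reads as "kilograms per one standard deviation of height", which is comparable across predictors but less intuitive than the original units.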

3. Variability:

The width of the confidence intervals draws attention to uncertainty. Once more, the plot lets us see the lower and upper bounds of each coefficient and gauge variability quickly.
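
To show how the three elements (significance, magnitude, variability) sit together on one line per coefficient, here is a rough text-only sketch of a forest plot in Python. It is a toy rendering under our own assumptions (the function `forest_row`, the axis range of -1 to 1, and all estimates and bounds are hypothetical), not a substitute for a proper plotting library.

```python
def forest_row(label, estimate, lower, upper, lo=-1.0, hi=1.0, width=41):
    """Render one coefficient as a row of text.

    '-' spans the confidence interval, '|' marks the 0 line,
    and 'o' marks the point estimate.
    """
    def column(value):
        clipped = min(max(value, lo), hi)
        return round((clipped - lo) / (hi - lo) * (width - 1))

    cells = [" "] * width
    for i in range(column(lower), column(upper) + 1):
        cells[i] = "-"
    cells[column(0.0)] = "|"       # keep the 0 line visible inside a CI
    cells[column(estimate)] = "o"  # the estimate is drawn last
    return f"{label:<10}" + "".join(cells)


# Hypothetical coefficients: (name, estimate, lower bound, upper bound).
rows = [
    ("x1", 0.50, 0.30, 0.70),
    ("x2", 0.21, 0.01, 0.41),
    ("x3", 0.19, -0.01, 0.39),
    ("x4", -0.10, -0.70, 0.50),
]

for name, estimate, lower, upper in rows:
    print(forest_row(name, estimate, lower, upper))
```

A bar that crosses the `|` marker is not statistically significant at the 5% level, the position of `o` gives the magnitude, and the length of the bar gives the variability.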

Conclusion

Forest plots make all the essential components of a statistical test (statistical significance, magnitude, and variability) easily accessible, without the need to be an expert in statistics.

Additionally, by including more information, this richer representation automatically places less emphasis on statistical significance.

If you can successfully conduct a statistical test, you should be able to create a forest plot and analyze the findings.

Keep in mind that, even though standardizing the coefficients by the standard deviation in a regression makes it easier to compare their relative magnitude, the interpretation may become less clear.

In the next article, we will cover how to create a forest plot in R.
