Test for Randomness in R: How to Check Dataset Randomness

The runs test checks whether the elements of a sequence occur in random order. Assume that a and b are symbols denoting the two kinds of items or numbers that make up a sequence. The hypotheses are:

H₀: The symbols occur in random order.

H₁: The symbols occur in a set pattern.

Suppose a sample of size n contains n1 symbols of type a and n2 symbols of type b, so that n1 + n2 = n. A run is an unbroken stretch of identical symbols. Let r1 be the number of runs of a and r2 the number of runs of b, so that the total number of runs is r = r1 + r2.
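To make the notation concrete, here is a small base-R sketch that counts n1, n2, r1, r2 and r for a short sequence. The sequence itself is made up purely for illustration; rle() is the base-R run-length encoding function.

```r
# Hypothetical two-symbol sequence (not taken from any real dataset)
symbols <- c("a", "a", "b", "a", "b", "b", "b", "a")

# rle() collapses the sequence into consecutive runs of identical symbols
runs <- rle(symbols)

n1 <- sum(symbols == "a")       # number of a symbols
n2 <- sum(symbols == "b")       # number of b symbols
n  <- n1 + n2                   # sample size

r1 <- sum(runs$values == "a")   # number of runs of a
r2 <- sum(runs$values == "b")   # number of runs of b
r  <- r1 + r2                   # total number of runs

print(c(n1 = n1, n2 = n2, n = n, r1 = r1, r2 = r2, r = r))
```

Here the runs are aa, b, a, bbb, a, so r1 = 3, r2 = 2 and r = 5.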

To test H₀, the observed number of runs r is compared with the lower and upper critical numbers of runs taken from a runs-test table.

If the observed number of runs lies between the lower and upper critical values, H₀ is not rejected; if it falls outside them, H₀ is rejected.

In short, the runs test lets us assess whether the values in a dataset occur in random order. Let’s see how to run it in R.

Test For Randomness

Several R libraries implement the runs test. Two of them are shown below.

Approach 1: snpar library

Let’s use the runs.test() function from the snpar library.

Syntax:

runs.test(x, exact = FALSE, alternative = c("two.sided", "less", "greater"))

Load the library

library(snpar)

Create a dataset for testing

data <- c(10, 6, 18, 5, 10, 12, 12, 18, 15, 18)

Execute the runs test in R

runs.test(data)
          Approximate runs rest
data:  data
Runs = 4, p-value = 0.2061
alternative hypothesis: two.sided

The p-value of the runs test is 0.2061. Since this is greater than 0.05, we cannot reject the null hypothesis: there is not sufficient evidence to conclude that the data were generated in a non-random manner.

Approach 2: randtests library

The randtests library also provides a runs.test() function, and its usage is almost the same as in Approach 1.

Let’s load the library first,

library(randtests)

Let’s make use of the same dataset.

randtests::runs.test(data)
          Runs Test
data:  data
statistic = -0.76376, runs = 4, n1 = 4, n2 = 4, n = 8, p-value = 0.445
alternative hypothesis: nonrandomness

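The p-value differs from Approach 1 (0.445 versus 0.2061), but it points to the same inference. The output reports n = 8, which indicates that the two observations equal to the median (12) were left out of the run count.

Since the p-value of 0.445 is greater than 0.05, we again fail to reject the null hypothesis: there is no evidence that the data depart from a random order.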
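The figures reported by randtests can be reproduced by hand with the usual large-sample normal approximation for the number of runs. The following is a minimal base-R sketch, not the actual implementation of either library; the helper name runs_z_test and its drop_ties argument are made up for illustration. Dropping the values equal to the median reproduces the randtests output, while keeping them on the "not above" side happens to give the p-value 0.2061 seen in Approach 1. That suggests, but does not prove, that the two libraries differ mainly in how they treat ties with the median.

```r
# Large-sample runs test about the median, written in base R.
# runs_z_test and drop_ties are hypothetical names used only for this sketch.
# drop_ties = TRUE  : discard observations equal to the median
# drop_ties = FALSE : keep them and group them with the values below the median
#                     (an assumption about tie handling, not documented behaviour)
runs_z_test <- function(x, drop_ties = TRUE) {
  m <- median(x)
  if (drop_ties) x <- x[x != m]
  above <- x > m                       # TRUE = above the median
  n1 <- sum(above)                     # observations above the median
  n2 <- sum(!above)                    # remaining observations
  n  <- n1 + n2
  r  <- length(rle(above)$lengths)     # observed number of runs
  mu    <- 2 * n1 * n2 / n + 1         # expected number of runs under H0
  sigma <- sqrt(2 * n1 * n2 * (2 * n1 * n2 - n) / (n^2 * (n - 1)))
  z <- (r - mu) / sigma
  list(runs = r, n1 = n1, n2 = n2, n = n,
       statistic = z, p.value = 2 * pnorm(-abs(z)))  # two-sided p-value
}

data <- c(10, 6, 18, 5, 10, 12, 12, 18, 15, 18)

res_drop <- runs_z_test(data, drop_ties = TRUE)   # ties with the median removed
res_keep <- runs_z_test(data, drop_ties = FALSE)  # ties with the median kept

print(unlist(res_drop))  # runs = 4, n = 8, statistic about -0.764, p-value about 0.445
print(unlist(res_keep))  # runs = 4, n = 10, p-value about 0.206
```

With only 8 to 10 observations the normal approximation is rough, so these p-values are best read as a guide rather than an exact result.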
