# Test for Randomness in R: How to Check Dataset Randomness

Test for randomness in R: how do you check whether a dataset is random?

Assume that a and b are symbols denoting the two kinds of items or numbers that make up a sequence. The hypotheses to be tested are:

H_{0}: The symbols occur in random order.

H_{1}: The symbols occur in a set pattern.

Suppose a sample of size n contains n1 symbols of type a and n2 symbols of type b, so that n1 + n2 = n. A run is an unbroken stretch of the same symbol. Let r1 be the number of runs of a and r2 the number of runs of b, so that the total number of runs is r = r1 + r2.
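To make the notation concrete, here is a minimal sketch in base R that counts the symbols and runs in a short sequence. The sequence `s` is made up purely for illustration; `rle()` is the base R run-length encoding function.

```r
# Hypothetical two-symbol sequence, for illustration only
s <- c("a", "a", "b", "a", "b", "b", "b", "a")

n1 <- sum(s == "a")   # number of a symbols
n2 <- sum(s == "b")   # number of b symbols

# rle() collapses consecutive repeats, so each entry of $values is one run
runs <- rle(s)
r1 <- sum(runs$values == "a")   # runs of a
r2 <- sum(runs$values == "b")   # runs of b
r  <- r1 + r2                   # total number of runs

c(n1 = n1, n2 = n2, r1 = r1, r2 = r2, r = r)
```

Here the runs are aa, b, a, bbb, a, so r1 = 3, r2 = 2 and r = 5.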

To decide on H0, the observed value of r is compared with the lower and upper critical numbers of runs taken from tables.


If the observed number of runs in a sample lies between these critical values, H0 is not rejected; if it falls outside them, H0 is rejected.
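When tables are not at hand, a common large-sample alternative is the normal approximation for the Wald-Wolfowitz runs test. It is not part of the table-based procedure described above and is given here only as the standard textbook result. Under H0, the number of runs r has approximately the following mean and variance, from which a z statistic follows:

$$
\mu_r = 1 + \frac{2 n_1 n_2}{n}, \qquad
\sigma_r^2 = \frac{2 n_1 n_2 \,(2 n_1 n_2 - n)}{n^2 (n - 1)}, \qquad
z = \frac{r - \mu_r}{\sigma_r}
$$

For example, with n1 = n2 = 4 and r = 4, this gives a mean of 5, a variance of 12/7 (about 1.714) and z of about -0.764. A two-sided test rejects H0 when |z| exceeds the normal critical value (1.96 at the 5% level).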

In short, the runs test lets us assess whether a dataset is random. Let’s see how to carry it out in R.

## Test For Randomness

Several R libraries provide a runs test; two of them are covered here.

### Approach 1: snpar library

Let’s use the runs.test() function from the snpar library.


Syntax:

    runs.test(x, exact = FALSE, alternative = c("two.sided", "less", "greater"))

Load the library:

    library(snpar)

Create a dataset for testing:

    data <- c(10, 6, 18, 5, 10, 12, 12, 18, 15, 18)

Execute the runs test in R:

    runs.test(data)

Output:

        Approximate runs rest

    data:  data
    Runs = 4, p-value = 0.2061
    alternative hypothesis: two.sided

The p-value of the runs test is 0.2061. Since this is greater than 0.05, we cannot reject the null hypothesis. In other words, there is not sufficient evidence to say that the observed data were produced in a non-random manner.


### Approach 2: randtests library

The randtests library also provides a runs.test() function, and its usage and syntax are almost the same as in approach 1.

Let’s load the library first:

    library(randtests)

Let’s make use of the same dataset:

    randtests::runs.test(data)

Output:

        Runs Test

    data:  data
    statistic = -0.76376, runs = 4, n1 = 4, n2 = 4, n = 8, p-value = 0.445
    alternative hypothesis: nonrandomness

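The p-value differs from approach 1 (0.445 versus 0.2061), but it points to the same inference. Note that randtests reports n = 8 although the dataset has 10 values: the two observations equal to the median (12) are left out of the count.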

Since the p-value of the test is 0.445, which is greater than 0.05, we again fail to reject the null hypothesis: there is not sufficient evidence to say that the data were produced in a non-random manner.
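The gap between the two p-values can be explored by hand. The sketch below is a base R reimplementation under stated assumptions, not the actual code of either package. It assumes that both functions split the data around the sample median and use the large-sample normal approximation, and that they differ only in how values equal to the median are handled. `runs_p_value` is a hypothetical helper written for this illustration.

```r
data <- c(10, 6, 18, 5, 10, 12, 12, 18, 15, 18)

# Hypothetical helper: normal-approximation runs test on a logical
# vector that marks each observation as above (TRUE) or below (FALSE)
runs_p_value <- function(above) {
  n1 <- sum(above)
  n2 <- sum(!above)
  n  <- n1 + n2
  r  <- length(rle(above)$lengths)            # total number of runs
  mu <- 1 + 2 * n1 * n2 / n                   # expected runs under H0
  sigma2 <- 2 * n1 * n2 * (2 * n1 * n2 - n) / (n^2 * (n - 1))
  z <- (r - mu) / sqrt(sigma2)
  c(runs = r, n1 = n1, n2 = n2, z = z, p = 2 * pnorm(-abs(z)))
}

med <- median(data)

# Variant A (assumption): drop values equal to the median, then split
res_drop <- runs_p_value(data[data != med] > med)

# Variant B (assumption): keep all values, counting ties as "above"
res_keep <- runs_p_value(data >= med)

round(res_drop, 4)
round(res_keep, 4)
```

Under these assumptions, dropping the two values equal to the median gives a p-value of about 0.445, and keeping them gives about 0.206. Both agree with the outputs shown above, although the packages' internals may differ in detail.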
