# Test for Randomness in R: How to Check Dataset Randomness

How do you check whether a dataset is random? This post explains the runs test and shows how to perform it in R.

Assume that a and b are symbols denoting the two kinds of items or numbers that make up a sequence. The hypotheses to test are:

H0: The symbols occur in random order.

H1: The symbols occur in a set pattern.

Suppose a sample of size n contains n1 symbols of type a and n2 symbols of type b, so that n1 + n2 = n. A run is an unbroken stretch of the same symbol. Let r1 be the number of runs of a and r2 the number of runs of b, so that the total number of runs is r = r1 + r2.
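The counts defined above can be obtained in base R with `rle()`. This is a minimal sketch; the sequence `s` is a made-up example for illustration only and does not come from the dataset used later in this post.

```r
# Hypothetical two-symbol sequence, used only for illustration
s <- c("a", "a", "b", "a", "b", "b", "b", "a", "a", "b")

# rle() collapses consecutive repeats of the same symbol into runs
runs <- rle(s)

n1 <- sum(s == "a")            # number of a symbols
n2 <- sum(s == "b")            # number of b symbols
r1 <- sum(runs$values == "a")  # number of runs of a
r2 <- sum(runs$values == "b")  # number of runs of b
r  <- r1 + r2                  # total number of runs

c(n1 = n1, n2 = n2, r1 = r1, r2 = r2, r = r)
```

Here the runs are `aa`, `b`, `a`, `bbb`, `aa`, `b`, so r1 = 3, r2 = 3 and r = 6.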

To decide on H0, the observed value of r is compared with the lower and upper critical numbers of runs taken from tables.


If the observed number of runs lies between these critical values, H0 is not rejected; if it falls outside them, H0 is rejected.
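When tables are not at hand, a commonly used large-sample alternative is a normal approximation to the number of runs. The formulas below are the standard textbook result and are not stated in the original post; they use the same symbols n1, n2, n and r defined above.

$$
\mu_r = 1 + \frac{2 n_1 n_2}{n}, \qquad
\sigma_r^2 = \frac{2 n_1 n_2 \,(2 n_1 n_2 - n)}{n^2 (n - 1)}, \qquad
z = \frac{r - \mu_r}{\sigma_r}
$$

Under H0, z is approximately standard normal, so a two-sided test rejects H0 at the 5% level when |z| > 1.96.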

In short, the runs test lets us assess the randomness of a dataset. Let's see how to perform it in R.

## Test For Randomness

Several R libraries provide a runs test. Two of them are shown here.

### Approach 1: snpar library

Let's use the `runs.test()` function from the snpar library.


Syntax:

`runs.test(x, exact = FALSE, alternative = c("two.sided", "less", "greater"))`

`library(snpar)`

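Create a numeric dataset for testing. For numeric data, the test forms the two symbols by classifying each value as above or below the median.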

`data <- c(10, 6, 18, 5, 10, 12, 12, 18, 15, 18)`

Run the test in R:

```
runs.test(data)
Approximate runs rest
data:  data
Runs = 4, p-value = 0.2061
alternative hypothesis: two.sided
```

The p-value of the runs test is 0.2061. Since the p-value is greater than 0.05, we cannot reject the null hypothesis. In other words, there is not sufficient evidence to say that the observed data were produced in a non-random manner.


### Approach 2: randtests library

The randtests library also provides a `runs.test()` function. Its purpose and syntax are almost the same as in approach 1.

Let’s load the library first,

`library(randtests)`

Let’s make use of the same dataset.

```
randtests::runs.test(data)
Runs Test
data:  data
statistic = -0.76376, runs = 4, n1 = 4, n2 = 4, n = 8, p-value = 0.445
alternative hypothesis: nonrandomness
```

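The p-value differs from approach 1 because this implementation drops the two observations equal to the median (note n = 8 in the output), but it points to the same inference.

Since the p-value of the test is 0.445, which is greater than 0.05, we again fail to reject the null hypothesis: there is not sufficient evidence to say that the data were produced in a non-random manner.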
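To see where the two p-values come from, the normal-approximation runs test can be computed by hand in base R. This is a sketch under stated assumptions: `runs_pvalue()` is a hypothetical helper written for this post, not a function from snpar or randtests, and the "keep ties" rule is a guess at how approach 1 treats values equal to the median.

```r
data <- c(10, 6, 18, 5, 10, 12, 12, 18, 15, 18)

# Hypothetical helper (not part of snpar or randtests):
# two-sided runs test by normal approximation on a logical vector,
# where TRUE marks one symbol and FALSE the other.
runs_pvalue <- function(above) {
  n1 <- sum(above)
  n2 <- sum(!above)
  n  <- n1 + n2
  r  <- length(rle(above)$lengths)      # observed number of runs
  mu <- 1 + 2 * n1 * n2 / n             # expected number of runs under H0
  v  <- 2 * n1 * n2 * (2 * n1 * n2 - n) / (n^2 * (n - 1))  # variance under H0
  z  <- (r - mu) / sqrt(v)
  c(n = n, runs = r, z = z, p = 2 * pnorm(-abs(z)))
}

m <- median(data)  # threshold used to form the two symbols

# Variant 1: drop observations equal to the median before classifying
drop_ties <- runs_pvalue(data[data != m] > m)

# Variant 2 (assumption): keep all observations, counting ties with the upper group
keep_ties <- runs_pvalue(data >= m)

round(rbind(drop_ties, keep_ties), 4)
```

The first variant gives n = 8, z ≈ -0.7638 and p ≈ 0.445, matching the randtests output. The second gives n = 10 and p ≈ 0.2061, matching the value reported in approach 1, which suggests (but does not confirm) that the gap between the two libraries comes from how ties with the median are handled.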
