How to Compare Strings in R with Examples

In R, you can compare strings using the following methods.

Approach 1:

Compare Two Strings

For a case-sensitive comparison:

string1 == string2

For a case-insensitive comparison:

tolower(string1) == tolower(string2)

Approach 2:

Compare Two Vectors of Strings

For a case-sensitive comparison:

identical(vector1, vector2)

For a case-insensitive comparison:

identical(tolower(vector1), tolower(vector2))

Approach 3:

Find Similarities Between Two Vectors of Strings

Return the strings in vector1 that are also present in vector2.

vector1[vector1 %in% vector2] 

The examples below demonstrate how to apply each strategy in practice.

Example 1: Compare Two Strings

The code below demonstrates how to compare two strings in R to see if they are equal.

First, define two strings.

string1 <- "Hello"
string2 <- "hello"

Case-sensitive comparison:

string1 == string2
[1] FALSE

Case-insensitive comparison:

tolower(string1) == tolower(string2)
[1] TRUE

Because the two strings differ in the case of their first letter (“H” versus “h”), the case-sensitive comparison returns FALSE.

However, because the two strings contain the same characters in the same order once case is ignored, the case-insensitive comparison returns TRUE.
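If the case-insensitive check is needed in several places, it can be wrapped in a small helper. The sketch below is only an illustration: the function name equal_ignore_case is hypothetical and is not part of base R; it simply applies tolower() to both arguments before using ==.

```r
# Hypothetical helper (not part of base R): compares two strings
# after converting both to lower case.
equal_ignore_case <- function(a, b) {
  tolower(a) == tolower(b)
}

string1 <- "Hello"
string2 <- "hello"

equal_ignore_case(string1, string2)
# [1] TRUE

equal_ignore_case(string1, "help")
# [1] FALSE
```

Using toupper() on both sides instead of tolower() should give the same result for these ASCII strings.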


Example 2: Compare Two Vectors of Strings

The following code demonstrates how to use the identical() function to compare two string vectors.

Let’s define two vectors of strings

vector1 <- c("This", "is", "egg")
vector2 <- c("This", "is", "Egg")

Case-sensitive comparison of the two vectors:

identical(vector1, vector2)
[1] FALSE

Now let’s try the case-insensitive comparison:

identical(tolower(vector1), tolower(vector2))
[1] TRUE

Because the third elements differ in case (“egg” versus “Egg”), the two vectors are not exactly the same, so the case-sensitive comparison returns FALSE.

The case-insensitive comparison, on the other hand, returns TRUE since both vectors contain the same strings, regardless of case.
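identical() answers with a single TRUE or FALSE for the whole vector. To see which positions differ, the == operator from Approach 1 can also be applied to vectors, where it compares element by element. The sketch below reuses the vectors from this example; which() and all() are base R functions, and the sketch assumes both vectors have the same length (with unequal lengths, == recycles the shorter vector).

```r
vector1 <- c("This", "is", "egg")
vector2 <- c("This", "is", "Egg")

# Element-wise comparison: one TRUE/FALSE per position
vector1 == vector2
# [1]  TRUE  TRUE FALSE

# Positions at which the two vectors differ
which(vector1 != vector2)
# [1] 3

# Collapse the element-wise, case-insensitive result into a single value
all(tolower(vector1) == tolower(vector2))
# [1] TRUE
```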

Example 3: Find String Similarities Between Two Vectors

The following code demonstrates how to discover which strings in one vector also appear in another vector using the %in% operator:

Create two string vectors.

vector1 <- c("Hi", "sony", "how are you")
vector2 <- c("hi", "sony", "how are you")

Return the strings in vector1 that are also present in vector2.

vector1[vector1 %in% vector2]
[1] "sony"        "how are you"

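According to the result, the strings “sony” and “how are you” appear in both vector1 and vector2. “Hi” is not returned because %in% matching is case-sensitive and vector2 contains “hi” instead.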
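The %in% operator matches case-sensitively. If case should be ignored here as well, one possible approach is to combine %in% with tolower(), as in the earlier approaches. The sketch below also shows the base R functions intersect() and setdiff() as alternatives for listing common and unmatched strings; it is meant as an illustration and reuses the vectors from this example.

```r
vector1 <- c("Hi", "sony", "how are you")
vector2 <- c("hi", "sony", "how are you")

# Case-insensitive matching: lower-case both sides before using %in%,
# but return the strings as they are spelled in vector1
vector1[tolower(vector1) %in% tolower(vector2)]
# [1] "Hi"          "sony"        "how are you"

# Strings common to both vectors (case-sensitive)
intersect(vector1, vector2)
# [1] "sony"        "how are you"

# Strings in vector1 that have no exact match in vector2
setdiff(vector1, vector2)
# [1] "Hi"
```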
