How to Compare Strings in R with Examples
In R, you can compare strings using the following methods.
Approach 1:
Compare Two Strings
For a case-sensitive comparison:
string1 == string2
For a case-insensitive comparison:
tolower(string1) == tolower(string2)
Approach 2:
Compare Two Vectors of Strings
For a case-sensitive comparison:
identical(vector1, vector2)
For a case-insensitive comparison:
identical(tolower(vector1), tolower(vector2))
Approach 3:
Find Similarities Between Two Vectors of Strings
Return the strings in vector1 that are also present in vector2:
vector1[vector1 %in% vector2]
The examples below demonstrate how to apply each strategy in practice.
Example 1: Compare Two Strings
The code below demonstrates how to compare two strings in R to see if they are equal.
First, define two strings:
string1 <- "Hello"
string2 <- "hello"
Case-sensitive comparison:
string1 == string2
[1] FALSE
Case-insensitive comparison:
tolower(string1) == tolower(string2)
[1] TRUE
Because the two strings differ in the case of their first letter, the case-sensitive comparison returns FALSE.
The case-insensitive comparison returns TRUE because, once both strings are converted to lower case, they contain the same characters in the same order.
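To make the mechanism visible, the short sketch below prints the intermediate lower-case values and wraps the comparison in a small helper. The helper name same_ignoring_case is hypothetical and only used for illustration; it is not a base R function.

```r
string1 <- "Hello"
string2 <- "hello"

# tolower() converts every character to lower case before the comparison
tolower(string1)   # "hello"
tolower(string2)   # "hello"

# toupper() achieves the same thing in the other direction
toupper(string1) == toupper(string2)   # TRUE

# Hypothetical helper (illustration only): case-insensitive equality
same_ignoring_case <- function(a, b) tolower(a) == tolower(b)

same_ignoring_case("R Studio", "r studio")   # TRUE
same_ignoring_case("R Studio", "RStudio")    # FALSE: the space still matters
```

Only letter case is ignored here; whitespace and punctuation differences still make the strings unequal.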
Example 2: Compare Two Vectors of Strings
The following code demonstrates how to use the identical() function to compare two string vectors.
First, define two vectors of strings:
vector1 <- c("This", "is", "egg")
vector2 <- c("This", "is", "Egg")
Case-sensitive comparison of the two vectors:
identical(vector1, vector2)
[1] FALSE
Case-insensitive comparison of the two vectors:
identical(tolower(vector1), tolower(vector2))
[1] TRUE
Because the two vectors do not include the exact same strings in the same case, the case-sensitive comparison returns FALSE.
The case-insensitive comparison, on the other hand, returns TRUE since both vectors contain the same strings, regardless of case.
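identical() gives a single TRUE or FALSE for the whole vector. If you also want to know which elements differ, the == operator can be applied to the vectors directly, as in this minimal sketch using the same two vectors (it assumes both vectors have the same length):

```r
vector1 <- c("This", "is", "egg")
vector2 <- c("This", "is", "Egg")

# == compares position by position and returns one result per element
vector1 == vector2                          # TRUE TRUE FALSE

# which() reports the positions where the strings differ
which(vector1 != vector2)                   # 3

# all() collapses the element-wise result into a single TRUE/FALSE
all(tolower(vector1) == tolower(vector2))   # TRUE
```

Note that == recycles the shorter vector when the lengths differ, whereas identical() simply returns FALSE, so identical() is the safer choice for a strict equality check.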
Example 3: Find String Similarities Between Two Vectors
The following code demonstrates how to find which strings in one vector also appear in another vector using the %in% operator:
Create two string vectors.
vector1 <- c("Hi", "sony", "how are you")
vector2 <- c("hi", "sony", "how are you")
Let’s see if any of the strings in vector1 are also present in vector2.
vector1[vector1 %in% vector2]
[1] "sony" "how are you"
According to the result, the strings "sony" and "how are you" appear in both vector1 and vector2. "Hi" is not returned because %in% is case-sensitive and vector2 contains "hi" instead.
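Because %in% matches strings exactly, "Hi" in vector1 does not match "hi" in vector2. One way to ignore case, sketched below with the same vectors, is to lower-case both sides for the match while still indexing the original vector. The base R set functions intersect() and setdiff() are shown as related alternatives.

```r
vector1 <- c("Hi", "sony", "how are you")
vector2 <- c("hi", "sony", "how are you")

# Lower-case both sides for matching, but index the original vector
# so the returned strings keep their original case
vector1[tolower(vector1) %in% tolower(vector2)]
# "Hi" "sony" "how are you"

# intersect() returns the common strings (case-sensitive, duplicates dropped)
intersect(vector1, vector2)   # "sony" "how are you"

# setdiff() returns the strings in vector1 that are missing from vector2
setdiff(vector1, vector2)     # "Hi"
```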