How to Extract Patterns in R

R’s str_extract() function extracts the first match of a pattern from each string in a character vector. It is part of the stringr package.

The syntax for this function is as follows:

str_extract(string, pattern)

where:

string: Character vector

pattern: Pattern to extract, interpreted as a regular expression by default

The practical application of this function is demonstrated in the examples that follow.


Example 1: Take a String and Extract One Pattern

The R code below demonstrates how to extract the word “for” from a given string.

library(stringr)

Let’s define the string

string <- "datascience.com for data science articles"

Now we can extract “for” from the string

str_extract(string, "for")
[1] "for"

The pattern “for” was successfully extracted from the string.


Note that str_extract() returns NA if the pattern isn’t present in the string.
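As a minimal sketch of the missing-pattern case, the code below reuses the string from Example 1. The pattern "python" is an arbitrary choice for illustration and does not come from the original example.

```r
library(stringr)

string <- "datascience.com for data science articles"

# "python" does not occur in the string, so there is nothing to extract
result <- str_extract(string, "python")
result
# [1] NA

# The result is a missing value, so is.na() can be used to detect failed matches
is.na(result)
# [1] TRUE
```

Checking with is.na() is one simple way to separate strings that matched from those that did not.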

Example 2: Take String Data and Extract Numeric Values

Use the regex \\d+ (one or more digits) to extract only the numeric values from a string, as in the following code.

library(stringr)

Now we can define the string

string <- "There are 100 phones over there"

Extract only the numeric values from the string


str_extract(string, "\\d+")
[1] "100"
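Note the quotes around "100" in the output: the extracted value is still text. The sketch below, which assumes the goal is to do arithmetic on the extracted number, converts it with base R’s as.numeric(). The variable names digits and n_phones are illustrative only.

```r
library(stringr)

string <- "There are 100 phones over there"

# str_extract() returns a character vector, even when the match is all digits
digits <- str_extract(string, "\\d+")
class(digits)
# [1] "character"

# Convert to a number before doing arithmetic
n_phones <- as.numeric(digits)
n_phones + 1
# [1] 101
```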

Example 3: Take Strings from a Vector and Extract Characters

The code below demonstrates how to extract only the lowercase letters from each element of a vector of strings using the regex [a-z]+.

Let’s define a vector of strings

strings <- c("3 phones", "3 battery", "7 pen") 

Now let’s try to extract only the letters from each string in the vector

str_extract(strings, "[a-z]+")
[1] "phones"  "battery" "pen" 

Note that only the first run of lowercase letters in each string is returned; the digits and spaces are dropped.
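Two details of this pattern are easy to miss: [a-z]+ is case-sensitive, and str_extract() returns only the first match per string. The sketch below uses a made-up input vector (not from the example above) to show both, along with stringr’s str_extract_all(), which returns every match.

```r
library(stringr)

# Hypothetical input: mixed case, and more than one word per string
strings <- c("3 phones and 2 cases", "7 Pen")

# [a-z]+ is case-sensitive and only the first match per string is returned
first_lower <- str_extract(strings, "[a-z]+")
first_lower
# [1] "phones" "en"

# Adding A-Z to the character class keeps the capital letter
first_any <- str_extract(strings, "[A-Za-z]+")
first_any
# [1] "phones" "Pen"

# str_extract_all() returns every match, as a list with one element per string
all_lower <- str_extract_all(strings, "[a-z]+")
all_lower[[1]]
# [1] "phones" "and"    "cases"
```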
