How to Scale Only Numeric Columns in R
To scale only the numeric columns of a data frame in R, use the following syntax from the dplyr package.
library(dplyr)
df %>% mutate(across(where(is.numeric), scale))
The example below shows how to use this syntax in practice.
Example: Use dplyr to Scale Only Numeric Columns
Suppose we have the following data frame in R, which contains details about several basketball players.
First, create the data frame:
df <- data.frame(Team=c('P1', 'P2', 'P3', 'P4', 'P5'), points=c(2, 3, 7, 22, 8), value=c(27, 39, 49, 82, 54))
Now view the data frame:
df
  Team points value
1   P1      2    27
2   P2      3    39
3   P3      7    49
4   P4     22    82
5   P5      8    54
Technical Remarks
R’s scale() function uses the following basic syntax:
scale(x, center = TRUE, scale = TRUE)
where:
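x: The object to scale (a numeric vector or matrix)
center: Whether to subtract the mean when scaling. The default is TRUE.
scale: Whether to divide by the standard deviation when scaling. The default is TRUE. (If center is FALSE, the values are divided by the root mean square instead.)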
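To see what each argument does on its own, here is a minimal sketch using the points values from the example data frame. The vector name pts and the result names are only for illustration.

```r
# Illustrative vector: the points values from the example data frame
pts <- c(2, 3, 7, 22, 8)

# Default: subtract the mean, then divide by the standard deviation
z_default <- scale(pts)

# Centering only: subtract the mean, leave the spread unchanged
centered_only <- scale(pts, scale = FALSE)

# Scaling only: with center = FALSE, scale() divides by the root mean square,
# sqrt(sum(x^2) / (n - 1)), rather than by the standard deviation
scaled_only <- scale(pts, center = FALSE)

# scale() returns a one-column matrix; as.numeric() flattens it to a vector
as.numeric(centered_only)
```

Leaving both arguments at TRUE is what produces z-scores; the other two combinations are shown only to make the role of each argument visible.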
The function calculates scaled values using the following formula:
x_scaled = (x_original – x̄) / s
where:
x_original: The original x-value
x̄: The sample mean
s: The sample standard deviation
This process, which converts each original value into a z-score, is known as standardizing data (it is also sometimes loosely called normalizing).
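As a quick check of the formula, the sketch below computes the z-scores by hand for the points values from the example and compares them with the result of scale(). The variable names are illustrative.

```r
# Points values from the example data frame
pts <- c(2, 3, 7, 22, 8)

# Manual z-scores: (x - sample mean) / sample standard deviation
z_manual <- (pts - mean(pts)) / sd(pts)

# The same calculation via scale(), flattened from a one-column matrix
z_scale <- as.numeric(scale(pts))

# TRUE if the two approaches agree (up to floating-point tolerance)
all.equal(z_manual, z_scale)

# Rounded z-scores for inspection
round(z_manual, 4)
```

Note that sd() uses the sample standard deviation (dividing by n - 1), which is the same s that scale() uses by default.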
Suppose we want to scale only the numeric columns of the data frame using R’s scale() function.
To do this, we can use the syntax shown below.
library(dplyr)
Scale only the numeric columns of the data frame:
df %>% mutate(across(where(is.numeric), scale))
  Team      points      value
1   P1 -0.79813157 -1.1284228
2   P2 -0.67342351 -0.5447558
3   P3 -0.17459128 -0.0583667
4   P4  1.69602958  1.5467175
5   P5 -0.04988322  0.1848279
The Team column has remained the same, but the values in the two numeric columns (points and value) have been scaled.
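One detail worth knowing: scale() returns a one-column matrix rather than a plain vector, so the scaled columns in the result are stored as matrices inside the data frame. If ordinary numeric columns are preferred, one possible variation (a sketch, not part of the syntax shown above) wraps the call in as.numeric(). The name df_scaled is illustrative.

```r
library(dplyr)

# Same example data frame as above
df <- data.frame(Team = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 points = c(2, 3, 7, 22, 8),
                 value = c(27, 39, 49, 82, 54))

# as.numeric() drops the matrix structure and attributes that scale() attaches,
# so each scaled column comes back as an ordinary numeric vector
df_scaled <- df %>%
  mutate(across(where(is.numeric), ~ as.numeric(scale(.x))))

# Inspect the column types of the result
str(df_scaled)
```

The scaled values are the same either way; only the column type differs, which can matter for later steps that expect plain vectors.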