How to Create Summary Tables in R
The describe() and describeBy() functions from the psych package are among the simplest ways to create summary tables in R.
library(psych)
# create a summary table
describe(df)
# create a summary table grouped by a specific variable
describeBy(df, group=df$var_name)
The examples that follow show how to use these functions in practice.
Example 1: Create a simple summary table
Let’s say we have the R data frame shown below:
# create the data frame
df <- data.frame(team=c('P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P1'),
                 points=c(150, 222, 229, 421, 330, 211, 219),
                 rebounds=c(17, 28, 36, 16, 17, 29, 15),
                 steals=c(11, 151, 152, 73, 85, 79, 58))

# view the data frame
df

  team points rebounds steals
1   P1    150       17     11
2   P1    222       28    151
3   P1    229       36    152
4   P2    421       16     73
5   P2    330       17     85
6   P2    211       29     79
7   P1    219       15     58
The describe() function produces a summary table with one row for each variable in the data frame.
library(psych)
# create the summary table
describe(df)
         vars n   mean    sd median trimmed   mad min max range skew kurtosis    se
team*       1 7   1.43  0.53      1    1.43  0.00   1   2     1 0.23    -2.20  0.20
points      2 7 254.57 90.56    222  254.57 16.31 150 421   271 0.71    -1.03 34.23
rebounds    3 7  22.57  8.30     17   22.57  2.97  15  36    21 0.44    -1.73  3.14
steals      4 7  87.00 50.34     79   87.00 31.13  11 152   141 0.08    -1.47 19.03
Here’s how to interpret each value in the output:
vars: The column number
n: The number of valid cases
mean: The mean value
sd: The standard deviation
median: The median value
trimmed: The trimmed mean (default trims 10% of observations from each end)
mad: The median absolute deviation (from the median)
min: The minimum value
max: The maximum value
range: The range of values (max – min)
skew: The skewness
kurtosis: The kurtosis
se: The standard error
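Most of these statistics can be reproduced with base R, which is a handy way to check what each column means. The sketch below recomputes several of them for the points column. It assumes the default settings of describe() (a 10% trimmed mean and the scaled median absolute deviation), and the variable names are only illustrative.

```r
# points column from the example data frame
points <- c(150, 222, 229, 421, 330, 211, 219)

n       <- sum(!is.na(points))        # number of valid cases
avg     <- mean(points)               # mean
std_dev <- sd(points)                 # standard deviation
med     <- median(points)             # median
trimmed <- mean(points, trim = 0.1)   # trimmed mean (10% from each end)
mad_val <- mad(points)                # median absolute deviation (scaled by 1.4826)
rng     <- max(points) - min(points)  # range
se      <- std_dev / sqrt(n)          # standard error of the mean

# collect the values and round them the way the summary table does
round(c(n = n, mean = avg, sd = std_dev, median = med, trimmed = trimmed,
        mad = mad_val, range = rng, se = se), 2)
```

With only seven observations, trimming 10% from each end removes no values, which is why the trimmed mean equals the ordinary mean for every variable in this table.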
Any variable marked with an asterisk (*) is a categorical or logical variable that has been converted to numeric codes representing the ordering of its values.
Because the variable “team” has been converted to numeric codes, its summary statistics are not meaningful and should not be interpreted.
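To see where the numbers in the team* row come from, the conversion can be imitated in base R. This is a sketch that assumes the conversion is equivalent to turning the column into a factor and taking its integer codes, which reproduces the values in the table above.

```r
# team column from the example data frame
team <- c('P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P1')

# convert the labels to numeric codes: factor levels are sorted, so P1 -> 1 and P2 -> 2
team_codes <- as.numeric(factor(team))
team_codes

# the "mean" of the codes matches the team* row but says nothing useful about the teams
round(mean(team_codes), 2)
round(sd(team_codes), 2)
```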
Also note that setting fast=TRUE computes only the most common summary statistics.
# create a smaller summary table
describe(df, fast=TRUE)
         vars n   mean    sd min  max range    se
team        1 7    NaN    NA Inf -Inf  -Inf    NA
points      2 7 254.57 90.56 150  421   271 34.23
rebounds    3 7  22.57  8.30  15   36    21  3.14
steals      4 7  87.00 50.34  11  152   141 19.03
We can also compute the summary statistics for only a subset of the data frame’s variables:
# create a summary table for only the “points” and “rebounds” columns
describe(df[ , c('points', 'rebounds')], fast=TRUE)

         vars n   mean    sd min max range    se
points      1 7 254.57 90.56 150 421   271 34.23
rebounds    2 7  22.57  8.30  15  36    21  3.14
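When a data frame has many columns, listing the numeric ones by name becomes tedious. One possible approach, sketched here with base R, is to select the numeric columns programmatically before passing them to describe(). The name numeric_cols is only illustrative.

```r
# example data frame from above
df <- data.frame(team=c('P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P1'),
                 points=c(150, 222, 229, 421, 330, 211, 219),
                 rebounds=c(17, 28, 36, 16, 17, 29, 15),
                 steals=c(11, 151, 152, 73, 85, 79, 58))

# keep only the columns that hold numeric data
numeric_cols <- df[ , sapply(df, is.numeric), drop = FALSE]
names(numeric_cols)

# the result can then be summarized as before, for example:
# describe(numeric_cols, fast=TRUE)
```

This drops the “team” column, so the row of NaN and Inf values no longer appears in the fast summary.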
Example 2: Create a summary table grouped by a specific variable
The following code uses the describeBy() function to group the data frame’s summary table by the variable “team”:
# create a summary table grouped by team
describeBy(df, group=df$team, fast=TRUE)
Descriptive statistics by group
group: P1
         vars n mean    sd min  max range    se
team        1 4  NaN    NA Inf -Inf  -Inf    NA
points      2 4  205 36.91 150  229    79 18.45
rebounds    3 4   24  9.83  15   36    21  4.92
steals      4 4   93 70.22  11  152   141 35.11
-------------------------------------------------------------
group: P2
         vars n   mean     sd min  max range    se
team        1 3    NaN     NA Inf -Inf  -Inf    NA
points      2 3 320.67 105.31 211  421   210 60.80
rebounds    3 3  20.67   7.23  16   29    13  4.18
steals      4 3  79.00   6.00  73   85    12  3.46
The output displays the summary statistics for each of the two teams (P1 and P2) in the data frame.
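The grouped values can be cross-checked with base R. The following sketch uses tapply() to compute the mean and standard error of points for each team. It is an independent check of the numbers, not a description of how describeBy() works internally, and the names group_means and group_se are only illustrative.

```r
# team and points columns from the example data frame
df <- data.frame(team=c('P1', 'P1', 'P1', 'P2', 'P2', 'P2', 'P1'),
                 points=c(150, 222, 229, 421, 330, 211, 219))

# mean of points within each team
group_means <- tapply(df$points, df$team, mean)
round(group_means, 2)

# standard error of the mean within each team: sd / sqrt(n)
group_se <- tapply(df$points, df$team, function(x) sd(x) / sqrt(length(x)))
round(group_se, 2)
```

The results agree with the points rows of the grouped table: means of 205 and 320.67, and standard errors of 18.45 and 60.80.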

