# R Summary Statistics Table

The describe() and describeBy() functions from the psych package are a simple way to produce summary tables in R.

`library(psych)`

The basic syntax for a summary table is:


`describe(df)`

To create a summary table grouped by a specific variable:

`describeBy(df, group=df$varname)`

## R Summary Statistics Table

The following examples show how to use these functions in practice.

### Example 1: Create a Basic Summary Table

Let’s use the built-in iris dataset as our data frame for illustration purposes.

`df <- iris`

Now we can view the data frame

`head(df)`
```
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
```

To construct a summary table for each variable in the data frame, we may use the describe() function.

```
library(psych)
describe(df)
```
```
             vars   n mean   sd median trimmed  mad min max range  skew kurtosis   se
Sepal.Length    1 150 5.84 0.83   5.80    5.81 1.04 4.3 7.9   3.6  0.31    -0.61 0.07
Sepal.Width     2 150 3.06 0.44   3.00    3.04 0.44 2.0 4.4   2.4  0.31     0.14 0.04
Petal.Length    3 150 3.76 1.77   4.35    3.76 1.85 1.0 6.9   5.9 -0.27    -1.42 0.14
Petal.Width     4 150 1.20 0.76   1.30    1.18 1.04 0.1 2.5   2.4 -0.10    -1.36 0.06
Species*        5 150 2.00 0.82   2.00    2.00 1.48 1.0 3.0   2.0  0.00    -1.52 0.07
```

Here is how to interpret each column in the output.


vars: The column index of the variable

n: The number of valid (non-missing) cases

mean: The mean value

sd: The standard deviation

median: The median value

trimmed: The trimmed mean (by default, 10% of observations are trimmed from each end)

mad: The median absolute deviation (from the median)

min: The minimum value

max: The maximum value

range: The range of values (max – min)

skew: The skewness

kurtosis: The kurtosis

se: The standard error
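Most of these columns can be cross-checked with base R. The following is a minimal sketch for `Sepal.Length`, not a description of how psych computes the values internally. The names `x`, `n`, and `stats` are our own, and skewness and kurtosis are left out because their exact formula depends on the estimator that is chosen.

```r
# Reproduce a few describe() columns for one variable using base R only.
x <- iris$Sepal.Length
n <- sum(!is.na(x))                # n: number of non-missing cases

stats <- c(
  n       = n,
  mean    = mean(x),
  sd      = sd(x),
  median  = median(x),
  trimmed = mean(x, trim = 0.1),   # drops 10% of observations from each end
  mad     = mad(x),                # median absolute deviation, scaled by 1.4826
  min     = min(x),
  max     = max(x),
  range   = max(x) - min(x),
  se      = sd(x) / sqrt(n)        # standard error of the mean
)

print(round(stats, 2))
```

Rounded to two decimals, these values should agree with the Sepal.Length row of the describe() output.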

Any variable marked with an asterisk (*) is a categorical or logical variable that has been converted to a numeric variable, with values that represent the order of its categories.

Because the variable ‘Species*’ has been converted to a numeric variable in our example, its summary statistics should not be interpreted literally.
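To see why, here is a small base R sketch of this kind of conversion: each factor level is replaced by its integer code, so a "mean" of 2 only reflects that the three species are equally frequent. Whether psych uses exactly `as.integer()` internally is an assumption on our part. `table()` is shown as a more meaningful summary for a categorical column.

```r
# A factor is stored as integer codes, one per level, in level order.
species <- iris$Species
print(levels(species))

# setosa -> 1, versicolor -> 2, virginica -> 3
codes <- as.integer(species)
print(mean(codes))

# Counts per level are usually the more useful summary for a factor.
counts <- table(species)
print(counts)
```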


The argument fast=TRUE can also be used to calculate only the most common summary statistics, which reduces the size of the summary table:

`describe(df, fast=TRUE)`
```
             vars   n mean   sd min  max range   se
Sepal.Length    1 150 5.84 0.83 4.3  7.9   3.6 0.07
Sepal.Width     2 150 3.06 0.44 2.0  4.4   2.4 0.04
Petal.Length    3 150 3.76 1.77 1.0  6.9   5.9 0.14
Petal.Width     4 150 1.20 0.76 0.1  2.5   2.4 0.06
Species         5 150  NaN   NA Inf -Inf  -Inf   NA
```

We can also compute summary statistics for only a subset of the data frame’s variables. For example, to include only the ‘Petal.Length’ and ‘Sepal.Length’ columns in the summary table:

`describe(df[ , c('Petal.Length', 'Sepal.Length')], fast=TRUE)`
```
             vars   n mean   sd min max range   se
Petal.Length    1 150 3.76 1.77 1.0 6.9   5.9 0.14
Sepal.Length    2 150 5.84 0.83 4.3 7.9   3.6 0.07
```
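Column names do not have to be typed by hand. As a sketch, the numeric columns can be selected programmatically with base R, which also leaves out the non-numeric Species column. The name `num_cols` is our own, and the describe() call is guarded so the snippet still runs if psych is not installed.

```r
df <- iris

# Names of the columns that hold numeric data.
num_cols <- names(df)[sapply(df, is.numeric)]
print(num_cols)

# Summarise only those columns (skipped if psych is unavailable).
if (requireNamespace("psych", quietly = TRUE)) {
  print(psych::describe(df[num_cols], fast = TRUE))
}
```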

### Example 2: Create a Summary Table Grouped by a Variable

The following code groups the data frame by the ‘Species’ variable and uses the describeBy() function to build a summary table for each group.


Make a summary table based on the ‘Species’ variable.

`describeBy(df[,-5], group=df$Species, fast=TRUE)`

```
Descriptive statistics by group
group: setosa
             vars  n mean   sd min  max range   se
Sepal.Length    1 50 5.01 0.35 4.3  5.8   1.5 0.05
Sepal.Width     2 50 3.43 0.38 2.3  4.4   2.1 0.05
Petal.Length    3 50 1.46 0.17 1.0  1.9   0.9 0.02
Petal.Width     4 50 0.25 0.11 0.1  0.6   0.5 0.01
---------------------------------------------------------------------------
group: versicolor
             vars  n mean   sd min  max range   se
Sepal.Length    1 50 5.94 0.52 4.9  7.0   2.1 0.07
Sepal.Width     2 50 2.77 0.31 2.0  3.4   1.4 0.04
Petal.Length    3 50 4.26 0.47 3.0  5.1   2.1 0.07
Petal.Width     4 50 1.33 0.20 1.0  1.8   0.8 0.03
---------------------------------------------------------------------------
group: virginica
             vars  n mean   sd min  max range   se
Sepal.Length    1 50 6.59 0.64 4.9  7.9   3.0 0.09
Sepal.Width     2 50 2.97 0.32 2.2  3.8   1.6 0.05
Petal.Length    3 50 5.55 0.55 4.5  6.9   2.4 0.08
Petal.Width     4 50 2.03 0.27 1.4  2.5   1.1 0.04
```

The output displays the summary statistics for each of the three species in the data frame.
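A quick way to sanity-check the grouped output is to recompute one statistic with base R. This sketch uses aggregate() to get the per-species means. The name `group_means` is our own, and this is a cross-check rather than what describeBy() does internally.

```r
# Per-species means of the four numeric columns, using base R.
group_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(group_means)
```

Rounded to two decimals, each row should match the mean column of the corresponding group in the describeBy() output.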