# Calculating Z-Scores in Python: A Step-by-Step Guide

Z-scores are a fundamental concept in statistics: a z-score measures how many standard deviations a value lies from the mean of its dataset.
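As a minimal sketch of that definition, a z-score can be computed by hand as `(value - mean) / standard deviation`. The sample `values` and the variable names below are illustrative only, and the snippet assumes the population standard deviation (`ddof=0`), chosen here because it matches SciPy's default.

```python
import numpy as np

# Illustrative sample data (hypothetical, not from a real dataset)
values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = values.mean()  # 5.0
std = values.std()    # population standard deviation (ddof=0): 2.0

# z = (x - mean) / std, applied element-wise
manual_z = (values - mean) / std
print(manual_z)
```

Here the value 9.0 sits 4 units above the mean of 5.0, and with a standard deviation of 2.0 that gives a z-score of 2.0.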

In this article, we’ll explore how to calculate z-scores in Python using various libraries and data structures.

**Using SciPy’s zscore Function**

The `zscore` function in SciPy’s `stats` module provides a convenient way to calculate z-scores for one-dimensional or multi-dimensional arrays. The function takes the following arguments:

- `a`: an array-like object containing the data
- `axis`: the axis along which to calculate the z-scores (default is 0)
- `ddof`: degrees of freedom correction in the calculation of the standard deviation (default is 0)
- `nan_policy`: how to handle NaN values (default is `propagate`, which returns NaN)
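To make the `ddof` and `nan_policy` arguments more concrete, here is a small sketch. The `sample` array is made up for illustration, and the comments describe the behavior as I understand it from the SciPy documentation, so it is worth confirming against your installed version.

```python
import numpy as np
from scipy import stats

# Illustrative data with one missing value
sample = np.array([1.0, 2.0, 3.0, 4.0, np.nan])

# Default nan_policy='propagate': one NaN makes every z-score NaN
propagated = stats.zscore(sample)

# nan_policy='omit': NaNs are ignored when computing the mean and
# standard deviation; the NaN position itself stays NaN in the result
omitted = stats.zscore(sample, nan_policy='omit')

# ddof=1 uses the sample standard deviation (divides by n - 1)
clean = sample[:4]
sample_z = stats.zscore(clean, ddof=1)

print(propagated)
print(omitted)
print(sample_z)
```

Choosing `ddof=1` gives slightly smaller z-scores in absolute value than the default, because the sample standard deviation is larger than the population one.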

**Example 1: Calculating Z-Scores for a One-Dimensional NumPy Array**

Let’s start with a simple example using a one-dimensional NumPy array.

```
import numpy as np
import scipy.stats as stats
data = np.array([6, 7, 7, 12, 13, 13, 15, 16, 19, 22])
z_scores = stats.zscore(data)
print(z_scores)
```

This will output:

`[-1.394 -1.195 -1.195 -0.199 0. 0. 0.398 0.598 1.195 1.793]`

Each z-score tells us how many standard deviations away an individual value is from the mean.
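One way to check that interpretation is to reverse the transformation: multiplying each z-score by the standard deviation and adding the mean should recover the original values. The sketch below reuses the data from Example 1; the names `mean`, `std`, and `reconstructed` are introduced here for illustration.

```python
import numpy as np
from scipy import stats

data = np.array([6, 7, 7, 12, 13, 13, 15, 16, 19, 22])
z_scores = stats.zscore(data)

mean = data.mean()  # 13.0
std = data.std()    # population standard deviation, about 5.02

# Reverse the transformation: value = mean + z * std
reconstructed = mean + z_scores * std
print(np.allclose(reconstructed, data))  # True
```

For instance, 22 is 9 units above the mean of 13, and 9 divided by roughly 5.02 gives the z-score of about 1.793 shown above.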

**Example 2: Calculating Z-Scores for a Multi-Dimensional NumPy Array**

What if we have a multi-dimensional array? We can use the `axis` parameter to specify the axis along which the z-scores are calculated. For example:

```
import numpy as np
import scipy.stats as stats
data = np.array([[5, 6, 7, 7, 8],
                 [8, 8, 8, 9, 9],
                 [2, 2, 4, 4, 5]])
z_scores = stats.zscore(data, axis=1)
print(z_scores)
```

This will output:

```
[[-1.569 -0.588  0.392  0.392  1.373]
 [-0.816 -0.816 -0.816  1.225  1.225]
 [-1.167 -1.167  0.5    0.5    1.333]]
```

With `axis=1`, each z-score is calculated relative to the mean and standard deviation of its own row.
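To see how the choice of axis changes the result, the following sketch standardizes the same array three ways. The variable names are illustrative, and passing `axis=None` to standardize over the whole array follows my reading of the SciPy documentation.

```python
import numpy as np
from scipy import stats

data = np.array([[5, 6, 7, 7, 8],
                 [8, 8, 8, 9, 9],
                 [2, 2, 4, 4, 5]])

by_row = stats.zscore(data, axis=1)      # each row uses its own mean and std
by_column = stats.zscore(data, axis=0)   # each column uses its own mean and std (the default)
overall = stats.zscore(data, axis=None)  # one mean and std for the whole array

# Along the standardized axis, the z-scores average to (approximately) zero
print(by_row.mean(axis=1))
print(by_column.mean(axis=0))
print(overall.mean())
```

The same input value can therefore get a different z-score depending on which group of values it is compared against.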

**Example 3: Calculating Z-Scores for a Pandas DataFrame**

Finally, let’s use the `apply` method to calculate z-scores for each column of a Pandas DataFrame.

```
import pandas as pd
import numpy as np
import scipy.stats as stats
data = pd.DataFrame(np.random.randint(0, 10, size=(5, 3)), columns=['A', 'B', 'C'])
z_scores = data.apply(stats.zscore)
print(z_scores)
```

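Because the data is generated randomly, the values differ on every run. The output will look something like this (treat the numbers as illustrative of the layout only):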

```
          A         B         C
0  0.659380 -0.802955  0.836080
1 -0.659380 -0.802955 -0.139347
2  0.989071 -0.917663 -0.487713
3 -1.648451 -1.491202 -1.950852
4  0.659380 -0.802955 -0.487713

Each z-score is calculated relative to its own column.
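As a cross-check, the same column-wise z-scores can be computed with plain pandas arithmetic. This is a sketch using a small fixed DataFrame with made-up values so the result is reproducible. One detail to watch: pandas' `std()` defaults to `ddof=1`, while SciPy's `zscore` defaults to `ddof=0`, so `ddof=0` is passed explicitly here.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Small fixed DataFrame (illustrative values) so results are reproducible
df = pd.DataFrame({'A': [7, 3, 8, 0, 7],
                   'B': [1, 4, 4, 6, 9],
                   'C': [2, 5, 5, 8, 3]})

via_scipy = df.apply(stats.zscore)

# Subtract each column's mean and divide by its population std (ddof=0)
via_pandas = (df - df.mean()) / df.std(ddof=0)

print(np.allclose(via_scipy, via_pandas))  # True
print(np.allclose(via_scipy.mean(), 0.0))  # True: each column averages to zero
```

Because every column is centered on its own mean, the z-scores within a column always sum to zero, up to floating-point error.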

## Conclusion

Calculating z-scores in Python is a straightforward process using SciPy’s `zscore` function, either directly on NumPy arrays or through the `apply` method of a Pandas DataFrame.

By following these examples, you can calculate z-scores for your own data and see how far each value lies from the mean of its array, row, or column.