Matthew Bennett

Logo

I am a data scientist working on time series forecasting (using R and Python 3) at the London Ambulance Service NHS Trust. I earned my PhD in cognitive neuroscience at the University of Glasgow working with fmri data and neural networks. I favour linux machines, and working in the terminal with Vim as my editor of choice.

View my GitHub Profile

View my LinkedIn Profile

View my CV

Z-score

Z-scores are a way of transforming a set of observations such that each represents the number of standard deviations above or below the mean. This is useful since it allows you to compare sets of observations measured on different scales - when fitting models, using standardised scores can aid the interpretation of how much different variables are contributing to the model.

Code implementation

The Z-score is computed in two steps: zero centering the columns (or rows), followed by dividing each value by the standard deviation of the column (or row). For the division step, use the tile method to produce a matrix the same size as the input, with the standard deviation of each column (or row) repeated for each row (or column) and perform a element-wise division:

def zscore(A, axis=0, sample=False):
    A_zc = zero_center(A, axis)
    A_sd = sd(A_zc, axis, sample)
    if axis == 1:
        A_sd_rep = la.tile(A_sd, axes=[1, la.size(A)[1]])
    else:
        A_sd_rep = la.tile(A_sd, axes=[la.size(A)[0], 1])
    return A_zc.div_elwise(A_sd_rep)

Demo

We create a matrix, call the zscore method and print the result:

import linalg as la

A = la.Mat([[1, 2, 3],
            [2, 1, 4],
            [3, 0, 4],
            [4, 2, 2]])

la.print_mat(la.stats.zscore(A, axis=0), 3)

la.print_mat(la.stats.zscore(A, axis=1), 3)

Outputs:

>>> la.print_mat(la.stats.zscore(A, axis=0), 3)
[-1.342, 0.905, -0.302]
[-0.447, -0.302, 0.905]
[0.447, -1.508, 0.905]
[1.342, 0.905, -1.508]

>>> la.print_mat(la.stats.zscore(A, axis=1), 3)
[-1.225, 0.0, 1.225]
[-0.267, -1.069, 1.336]
[0.392, -1.373, 0.981]
[1.414, -0.707, -0.707]

< Variance, covariance, standard deviation and standard error

Pearson correlation >

back to project main page
back to home