
Student’s t

One sample t-test

A common task is to check whether the value of a population parameter, as estimated from a random sample of that population, differs meaningfully from zero (or some other number). We can form the null hypothesis that any deviation of our estimate from zero is the result of measurement error - and then see if the observed deviation is large enough to reject the null hypothesis. To do this, we could assume that such measurement errors are normally distributed around zero (this is often a good approximation for samples larger than ~30, thanks to the central limit theorem). Then we could calculate the number of standard deviations our estimate is away from zero - rejecting the null hypothesis beyond some critical value (often set at about 2; more precisely, 1.96 for a two-tailed test at the 5% level). This would be a z-test.
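
As a minimal sketch of that logic (plain Python; the function name and sample values here are made up for illustration):

from math import sqrt

def ztest_one_sample(xs, mu0=0, sigma=1):
    # z: how many standard errors the sample mean lies from mu0,
    # with the population standard deviation sigma assumed known
    n = len(xs)
    x_bar = sum(xs) / n
    return (x_bar - mu0) / (sigma / sqrt(n))

# reject the null at the 5% level (two-tailed) if |z| > 1.96
z = ztest_one_sample([0.4, 1.2, -0.3, 0.8, 1.1], mu0=0, sigma=1)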

However, it's uncommon to know the population standard deviation in advance, and so the standard deviation of our estimate (the standard error) must itself be estimated from the sample:

\[\begin{equation} t = \frac{\bar x - \mu_0}{SE} \end{equation}\]

In addition, if the sample is smaller than we'd like (say 10-20), then the distribution of the measurement errors is likely not well approximated by a normal distribution. Instead, the measurement errors would have more extreme values - that is, the distribution would have heavier tails - and would be better approximated by the Student's t distribution. The shape of the t-distribution varies with the sample size (becoming closer to the normal distribution as the sample size increases) and thus so does the critical value of t that we would require to reject the null hypothesis.
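
To make this concrete, here is a quick illustration of how the two-tailed 5% critical value of t falls towards the normal distribution's 1.96 as the degrees of freedom grow (this uses SciPy, which is assumed to be available and is separate from the linalg library developed in this project):

from scipy.stats import t, norm

for df in [5, 10, 30, 100]:
    # 97.5th percentiles: roughly 2.57, 2.23, 2.04, 1.98
    print(df, round(t.ppf(0.975, df), 2))

print(round(norm.ppf(0.975), 2))  # the normal equivalent: 1.96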

Code implementation

Here we define a method that performs a one sample t-test on each column (or row) of a matrix. We compute the mean and standard error of each column (or row) and perform an element-wise division. The vector of t values is returned along with the degrees of freedom:

def ttest_one_sample(A, axis=0, H=0):
    # difference between each column (or row) mean and the hypothesised mean H
    A_diff = mean(A, axis).subtract(H)
    # sample standard error of each column (or row)
    A_se = se(A, axis, sample=True)
    # t statistic: the mean difference in units of standard error
    ts = A_diff.div_elwise(A_se)
    # degrees of freedom: number of observations minus one
    df = A.size(axis)-1
    return ts, df

Paired samples t-test

In the case where we have two samples of observations in which each observation in one sample is 'paired' with an observation in the other (e.g. if a set of measurements is repeated at two time points), we can compute the paired differences and run the same test as above:

def ttest_paired(u, v):
    # the paired differences form a single sample...
    A_diff = u.subtract(v)
    # ...which we test against a hypothesised mean difference of zero
    return ttest_one_sample(A_diff)

Unpaired samples t-test

If, on the other hand, we have two samples without a simple pairing, then the variation of the two samples cannot be considered 'shared' and used to normalise the observed difference as before.

Instead, we have to normalise by the 'pooled' standard deviation (which in turn gives rise to a 'pooled' standard error). If the number of observations in each sample were equal, we could simply average the two variances:

\[\begin{equation} s = \sqrt{\frac{\sigma_u^2}{2} + \frac{\sigma_v^2}{2}} \end{equation}\]

And then, instead of dividing $s$ by $\sqrt{n}$ to obtain the standard error, we divide it by $\sqrt{n/2}$ - equivalently, we multiply by $\sqrt{2/n}$ - because the variance of a difference between two independent means is the sum of their individual variances:

\[\begin{equation} t = \frac{\bar u - \bar v}{s \sqrt{\frac{1}{n} + \frac{1}{n}}} \end{equation}\]
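
As a quick worked example with made-up numbers: if $\bar u = 10$, $\bar v = 8$, $\sigma_u^2 = 4$, $\sigma_v^2 = 6$ and $n = 10$ per sample, then $s = \sqrt{2 + 3} \approx 2.24$, the denominator $s\sqrt{1/10 + 1/10}$ works out to exactly $1$, and so $t = 2$.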

When the sample sizes differ, we can take a 'weighted' average of the variances - weighted, that is, by the proportion of the total degrees of freedom that each sample accounts for. The 'pooled' standard deviation in that case is:

\[\begin{equation} s = \sqrt{\frac{\sigma_u^2(n_u-1)}{(n_u - 1) + (n_v - 1)} + \frac{\sigma_v^2(n_v - 1)}{(n_u - 1) + (n_v - 1)}} \end{equation}\] \[\begin{equation} = \sqrt{\frac{\sigma_u^2(n_u-1) + \sigma_v^2(n_v - 1)}{(n_u - 1) + (n_v - 1)}} \end{equation}\]

Just like in equation $(3)$ above, we convert the pooled standard deviation into the standard error of the difference between the two sample means:

\[\begin{equation} t = \frac{\bar u - \bar v}{s \sqrt{\frac{1}{n_u} + \frac{1}{n_v}}} \end{equation}\]

Code implementation

Since we allow for different sample sizes, we require two vectors, u and v. Unless explicitly told to assume equal variances, we perform Welch's unpaired t-test (see below). Otherwise, we compute a few preliminaries, then proceed to the pooled variance and pooled standard deviation. Finally, we compute and return the t statistic (along with the degrees of freedom):

def ttest_unpaired(u, v, assume_equal_vars=False):
    # default to Welch's test, which does not assume equal variances
    if not assume_equal_vars:
        return ttest_welchs(u, v)

    # difference between the two sample means
    diff = mean(u).subtract(mean(v)).make_scalar()
    Nu, Nv = u.size(0), v.size(0)
    u_df, v_df = Nu-1, Nv-1
    df = u_df + v_df
    u_var, v_var = var(u).make_scalar(), var(v).make_scalar()

    # each sample's variance, weighted by its share of the degrees of freedom
    pooled_var = (( (u_var * u_df) / (u_df + v_df) ) +
                  ( (v_var * v_df) / (u_df + v_df) ))

    pooled_sd = sqrt(pooled_var)

    # scale the pooled standard deviation to the standard error of the difference
    t = diff / (pooled_sd * sqrt(1/Nu + 1/Nv))

    return t, df

Welch’s unpaired samples t-test

The unpaired t-test above pooled the variances, taking into account the unequal sample sizes but assuming that the magnitude of the variation did not differ between the two samples. If the variances do differ, we can't pool them. A method for dealing with this situation is Welch's unpaired t-test. The method is robust and gives very similar results to the pooled method above when the variances are equal. For this reason, many statistical software packages default to it when performing unpaired t-tests.

Welch's t is conceptually simple: we divide the mean difference between the two samples by the square root of the sum of the squared standard errors:

\[\begin{equation} t = \frac{\bar u - \bar v}{\sqrt{SE_u^2 + SE_v^2}} \end{equation}\]

To calculate the degrees of freedom, we can use the Welch–Satterthwaite equation:

\[\begin{equation} df = \frac{\Big(\frac{\sigma_u^2}{n_u} + \frac{\sigma_v^2}{n_v}\Big)^2}{\frac{\sigma_u^4}{n_u^2(n_u-1)} + \frac{\sigma_v^4}{n_v^2(n_v-1)}} \end{equation}\]
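
A useful sanity check: when the two samples share the same variance and size ($\sigma_u = \sigma_v$ and $n_u = n_v = n$), this expression reduces to $2(n - 1)$ - exactly the $n_u + n_v - 2$ degrees of freedom used by the pooled test above.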

Code implementation

def ttest_welchs(u, v):
    # difference between the two sample means
    diff = mean(u).subtract(mean(v)).make_scalar()
    # normalise by the square root of the summed squared standard errors
    u_se, v_se = se(u).make_scalar(), se(v).make_scalar()
    t = diff / sqrt(u_se**2 + v_se**2)

    # compute degrees of freedom by the Welch–Satterthwaite equation
    Nu, Nv = u.size(0), v.size(0)
    u_sd, v_sd = sd(u).make_scalar(), sd(v).make_scalar()
    df = ((u_sd**2/Nu + v_sd**2/Nv)**2 /
          (u_sd**4 / (Nu**2 * (Nu-1)) + v_sd**4 / (Nv**2 * (Nv-1))))

    return t, df

Demo

We create a matrix with three columns - the third being more variable. We then store each column separately as u, v and z, and extend z by concatenating it with v (so that the unpaired tests see unequal sample sizes). We call the various ttest methods and print the results:

import linalg as la

A = la.Mat([[85, 84, 81],
            [86, 83, 92],
            [88, 93, 103],
            [75, 99, 85],
            [78, 97, 97],
            [94, 90, 84],
            [98, 88, 52],
            [79, 90, 88],
            [71, 87, 105],
            [80, 86, 86],
            ])

# extract each column as a vector
u = A.ind('',0)
v = A.ind('',1)
z = A.ind('',2)
# concatenate to make z twice as long as u and v
z = la.cat(v, z)

ts, df = la.stats.ttest_one_sample(A, H=80)
print(df); la.print_mat(ts, 2)

t, df = la.stats.ttest_paired(u, v)
print(df); print(round(t, 2))

t, df = la.stats.ttest_unpaired(u, z, assume_equal_vars=True)
print(df); print(round(t, 2))

t, df = la.stats.ttest_welchs(u, z)
print(round(df, 2)); print(round(t, 2))

Outputs:

>>> ts, df = la.stats.ttest_one_sample(A, H=80)
>>> print(df); la.print_mat(ts, 2)
9
[1.27, 5.8, 1.56]

>>> t, df = la.stats.ttest_paired(u, v)
>>> print(df); print(round(t, 2))
9
-1.8

>>> t, df = la.stats.ttest_unpaired(u, z, assume_equal_vars=True)
>>> print(df); print(round(t, 2))
28
-1.45

>>> t, df = la.stats.ttest_welchs(u, z)
>>> print(round(df, 2)); print(round(t, 2))
22.79
-1.41
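
As a sanity check, the same tests can be reproduced with NumPy and SciPy (assumed to be installed; they are separate from the linalg library built in this project, and the `_np` variable names are just for this snippet). The t statistics should agree with the outputs above to rounding:

import numpy as np
from scipy import stats

# the same data as the matrix A above
A_np = np.array([[85, 84, 81], [86, 83, 92], [88, 93, 103], [75, 99, 85],
                 [78, 97, 97], [94, 90, 84], [98, 88, 52], [79, 90, 88],
                 [71, 87, 105], [80, 86, 86]])
u_np, v_np = A_np[:, 0], A_np[:, 1]
z_np = np.concatenate([v_np, A_np[:, 2]])

print(stats.ttest_1samp(A_np, 80))                   # one sample, per column
print(stats.ttest_rel(u_np, v_np))                   # paired
print(stats.ttest_ind(u_np, z_np, equal_var=True))   # unpaired, pooled
print(stats.ttest_ind(u_np, z_np, equal_var=False))  # Welch's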
