Permutation Tests¶
Alpaca Tutorial: https://www.jwilber.me/permutationtest/¶
Steps of a Permutation Test¶
- Identify a null hypothesis (e.g., there is no difference between groups) and an alternative hypothesis (e.g., Group A differs from Group B, or the mean of Group A is greater than that of Group B).
- Choose a test statistic.
- Permute the group labels: combine the data from both groups, randomly re-assign the values into new groups that match the original sample sizes, and recompute the test statistic.
- Repeat this shuffling (permutation) many times, each time recording the test statistic.
- Compare the distribution of recorded test statistics with the observed value; the fraction of permuted statistics at least as extreme as the observed one gives the p-value.
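The steps above can be sketched in a few lines of NumPy. The data here are made up for illustration; the actual alpaca data set is loaded below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example data (two independent groups)
group_a = np.array([7.2, 8.3, 6.1, 5.9, 7.7])
group_b = np.array([4.6, 5.1, 4.4, 5.6, 4.2])

# Step 2: test statistic = difference in means
obs_stat = group_a.mean() - group_b.mean()

# Steps 3-4: pool the data, shuffle many times, record the statistic each time
combined = np.concatenate([group_a, group_b])
n_a = len(group_a)
perm_stats = []
for _ in range(10_000):
    shuffled = rng.permutation(combined)
    perm_stats.append(shuffled[:n_a].mean() - shuffled[n_a:].mean())

# Step 5: p-value = fraction of permuted statistics >= the observed one
p_value = np.mean(np.array(perm_stats) >= obs_stat)
print(obs_stat, p_value)
```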
In [6]:
# Permutation test with alpaca data set
import numpy as np
import pandas as pd
df = pd.read_csv('alpaca.csv')
df
Out[6]:
|    | treatment | control |
|----|-----------|---------|
| 0  | 7.2       | 4.6     |
| 1  | 8.3       | 4.6     |
| 2  | 8.3       | 5.1     |
| 3  | 7.1       | 2.8     |
| 4  | 4.4       | 5.4     |
| 5  | 4.1       | 4.4     |
| 6  | 4.8       | 4.2     |
| 7  | 6.2       | 5.6     |
| 8  | 7.7       | 5.8     |
| 9  | 7.4       | 4.1     |
| 10 | 4.3       | 3.9     |
| 11 | 5.8       | 3.8     |
In [12]:
# Visualize the data
import matplotlib.pyplot as plt
treat_xvals = np.zeros(len(df['treatment']))
ctrl_xvals = np.zeros(len(df['control'])) + 1
fig, ax = plt.subplots(figsize=(3,3))
ax.scatter(treat_xvals, df['treatment'])
ax.scatter(ctrl_xvals, df['control'])
ax.plot(0.1, np.mean(df['treatment']), '<')
ax.plot(1.1, np.mean(df['control']), '<')
ax.set_xticks([0,1])
ax.set_xticklabels(['treatment', 'control'])
ax.set_ylabel('wool quality')
Out[12]:
Text(0, 0.5, 'wool quality')
Setting up the permutation test, using the difference in means as the test statistic¶
In [ ]:
# combining arrays
combined_array = np.append(df['treatment'], df['control'])
print(combined_array)
# shuffling the array
shuffled_array = np.random.permutation(combined_array)
print(shuffled_array)
[7.2 8.3 8.3 7.1 4.4 4.1 4.8 6.2 7.7 7.4 4.3 5.8 4.6 4.6 5.1 2.8 5.4 4.4 4.2 5.6 5.8 4.1 3.9 3.8]
[5.8 4.8 7.4 4.2 5.6 7.1 4.6 3.8 3.9 4.1 6.2 8.3 4.4 4.4 4.1 5.8 4.3 7.7 5.1 7.2 4.6 2.8 5.4 8.3]
In [15]:
# splitting into groups
shuffled_treat = shuffled_array[0:len(df['treatment'])]
shuffled_control = shuffled_array[len(df['treatment']):]
# visualize random groupings
fig, ax = plt.subplots(figsize=(3,3))
ax.scatter(treat_xvals, shuffled_treat)
ax.scatter(ctrl_xvals, shuffled_control)
ax.plot(0.1, np.mean(shuffled_treat), '<')
ax.plot(1.1, np.mean(shuffled_control), '<')
ax.set_xticks([0,1])
ax.set_xticklabels(['treatment', 'control'])
ax.set_ylabel('wool quality')
Out[15]:
Text(0, 0.5, 'wool quality')
Repeating the shuffle many times¶
In [25]:
number_of_shuffles = 1000
mean_diff_list = []
for each_shuffle in np.arange(number_of_shuffles):
shuffled_array = np.random.permutation(combined_array)
shuffled_treat = shuffled_array[0:len(df['treatment'])]
shuffled_control = shuffled_array[len(df['treatment']):]
mean_diff = np.mean(shuffled_treat) - np.mean(shuffled_control)
mean_diff_list.append(mean_diff)
# Visualizing the test statistic from many shuffles
fig, ax = plt.subplots(figsize = (3,3))
ax.hist(mean_diff_list)
ax.set_ylabel('counts')
ax.set_xlabel('mean treatment - mean control')
# Overlaying observed difference
obs_diff = np.mean(df['treatment']) - np.mean(df['control'])
ax.plot(obs_diff,50,'rv')
ax.text(obs_diff-0.75,60, 'obs diff', color = 'r')
Out[25]:
Text(1.0250000000000004, 60, 'obs diff')
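As an aside, the shuffle loop above can also be vectorized: `np.tile` stacks copies of the combined data into a matrix and `Generator.permuted` shuffles each row independently. This is a sketch, not part of the original notebook; the values are copied from the table above.

```python
import numpy as np

rng = np.random.default_rng(42)
combined = np.array([7.2, 8.3, 8.3, 7.1, 4.4, 4.1, 4.8, 6.2, 7.7, 7.4, 4.3, 5.8,
                     4.6, 4.6, 5.1, 2.8, 5.4, 4.4, 4.2, 5.6, 5.8, 4.1, 3.9, 3.8])
n_treat = 12
n_shuffles = 10_000

# One (n_shuffles, 24) matrix: each row is an independent shuffle of the pooled data
shuffled = rng.permuted(np.tile(combined, (n_shuffles, 1)), axis=1)

# Difference in means for every shuffle at once
mean_diffs = shuffled[:, :n_treat].mean(axis=1) - shuffled[:, n_treat:].mean(axis=1)
print(mean_diffs.shape)
```

Under the null, `mean_diffs` is centered near zero, which is what the histogram above shows.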
Calculating a p-value (the probability of observing a difference at least as extreme as ours, assuming the null hypothesis of no difference between group means is true)¶
In [28]:
# Counting how many shuffled differences are greater than or equal to the observed difference
num_greater_than_observed = sum(np.array(mean_diff_list)>=obs_diff)
pvalue = num_greater_than_observed/number_of_shuffles
pvalue
Out[28]:
0.002
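For comparison, SciPy provides a ready-made `scipy.stats.permutation_test` that carries out the same procedure. This is a sketch applying it to the same data (values copied from the table above), assuming SciPy >= 1.7 is available:

```python
import numpy as np
from scipy import stats

treatment = np.array([7.2, 8.3, 8.3, 7.1, 4.4, 4.1, 4.8, 6.2, 7.7, 7.4, 4.3, 5.8])
control = np.array([4.6, 4.6, 5.1, 2.8, 5.4, 4.4, 4.2, 5.6, 5.8, 4.1, 3.9, 3.8])

def mean_diff(x, y):
    # Same test statistic as above: difference in group means
    return np.mean(x) - np.mean(y)

res = stats.permutation_test((treatment, control), mean_diff,
                             permutation_type='independent',  # shuffle labels between groups
                             alternative='greater',           # one-sided, matching >= above
                             n_resamples=9_999, random_state=0)
print(res.statistic, res.pvalue)
```

The one-sided p-value should agree with the hand-rolled estimate above up to resampling noise.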