Student's T-Test

Named after William Sealy Gosset, who published under the pseudonym "Student" because his employer, the Guinness brewery, did not allow staff to publish under their own names.

Student's t-test helps us compare two sets of data to see whether the difference between their means is real or could just be random chance.

There are two main types:

Independent samples t-test: to compare two distinct groups where members of one are not related to the other. Examples:

Comparing the effectiveness of two different teaching methods on students' performance, where one group receives Method A and the other Method B.
Assessing the impact of a new drug treatment by comparing the recovery rates of patients who receive the drug versus those who receive a placebo.

Paired samples t-test: when the same participants are in both groups being compared, such as before-and-after observations. Examples:

Evaluating the effectiveness of a weight loss program by measuring the weight of participants before and after the program.
Testing the impact of a training program on employee productivity by comparing their performance before and after training.

The data should be approximately normally distributed. The test is not reliable if the sample sizes are too small or the variances are very different.

Note: you might also like to investigate the Chi-Square Test.

Independent Samples T-Test

We typically use this formula (but there are other formulas):

t = (x₁ − x₂) / √(s₁²/n₁ + s₂²/n₂)

Where:

x₁ and x₂ are sample means
s₁² and s₂² are sample variances
n₁ and n₂ are sample sizes

Note: we use s² for variance, because the variance is the square of the standard deviation s.

To do the t-test:

Calculate the mean, variance and sample size for each set of sample data
Use the formula above to calculate "t"
Then we look up a special table to find out how significant that "t" value is

In effect the t-value is a measure of how much the means of the two groups differ from each other compared to the variability of the data.

A larger t-value (ignoring its sign) suggests a more significant difference between the groups (i.e. less likely to be due to random chance).
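As a rough illustration of how the t-value responds to the gap between the means and to the spread of the data, here is a minimal Python sketch of the formula above. The function name t_statistic and the summary statistics passed to it are hypothetical, chosen only to give round numbers.

```python
import math

def t_statistic(mean1, var1, n1, mean2, var2, n2):
    """t = (mean1 - mean2) / sqrt(var1/n1 + var2/n2)."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error

# Hypothetical summary statistics (not from any real experiment).
baseline = t_statistic(52.0, 16.0, 8, 50.0, 16.0, 8)     # gap of 2, variance 16
bigger_gap = t_statistic(54.0, 16.0, 8, 50.0, 16.0, 8)   # gap of 4, same variance
less_spread = t_statistic(52.0, 4.0, 8, 50.0, 4.0, 8)    # gap of 2, variance 4

print(baseline)      # 1.0
print(bigger_gap)    # 2.0
print(less_spread)   # 2.0
```

Doubling the gap between the means, or halving the standard deviation of both groups, each doubles the t-value in this sketch.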

Let's try an example:

Example: Battery Life

This is an experiment you can do yourself (at the cost of 20 batteries).

Two batches of identical batteries are tested:

One group of 10 were tested in a cool environment
Another group of 10 were tested in a hot environment

The batteries power identical devices until they run out of charge, with these results (in hours):

Cool Group: 85, 90, 76, 88, 93, 87, 91, 89, 95, 85
Hot Group: 78, 82, 79, 88, 91, 85, 90, 87, 84, 86

Let's calculate the mean for each:

Cool Group Mean = (85+90+76+88+93+87+91+89+95+85) / 10 = 87.9 hours
Hot Group Mean = (78+82+79+88+91+85+90+87+84+86) / 10 = 85 hours

The Cool Group lasted, on average, about 3 hours longer.

But is this a significant result, or just random in nature?

To calculate the t-statistic, we first need the variances.

Variance is a measure of how much the data points spread out from the mean. For a sample, we calculate it by adding up the squared differences between each data point and the mean, then dividing by n − 1 (one less than the sample size), as shown here:

For the Cool Group:

n = 10
Mean = 87.9
Variance = [(85-87.9)² + (90-87.9)² + (76-87.9)² + (88-87.9)² + (93-87.9)² + (87-87.9)² + (91-87.9)² + (89-87.9)² + (95-87.9)² + (85-87.9)²] / 9 = 27.88

For the Hot Group:

n = 10
Mean = 85
Variance = [(78-85)² + (82-85)² + (79-85)² + (88-85)² + (91-85)² + (85-85)² + (90-85)² + (87-85)² + (84-85)² + (86-85)²] / 9 = 18.89
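The two variance calculations above can be checked with a short Python sketch. The helper name sample_variance is an illustrative assumption. It divides by n − 1, which is also what Python's built-in statistics.variance does.

```python
import math
import statistics

cool = [85, 90, 76, 88, 93, 87, 91, 89, 95, 85]
hot = [78, 82, 79, 88, 91, 85, 90, 87, 84, 86]

def sample_variance(values):
    """Sum of squared differences from the mean, divided by n - 1."""
    mean = sum(values) / len(values)
    squared_diffs = [(x - mean) ** 2 for x in values]
    return sum(squared_diffs) / (len(values) - 1)

cool_var = sample_variance(cool)
hot_var = sample_variance(hot)

print(round(cool_var, 2))  # 27.88
print(round(hot_var, 2))   # 18.89

# Cross-check against the standard library's sample variance.
print(math.isclose(cool_var, statistics.variance(cool)))  # True
print(math.isclose(hot_var, statistics.variance(hot)))    # True
```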

Start with:

t = (x₁ − x₂) / √(s₁²/n₁ + s₂²/n₂)

Put in our values:

t = (87.9 − 85) / √(27.88/10 + 18.89/10)

Calculate:

t = 1.341

Now let's compare our t-value to the "critical t-value" from the Student's t-test Table. We choose a significance level, typically 0.05 (a 95% confidence level), and use degrees of freedom df = n₁ + n₂ − 2 = 18. We use the two-tails value, as the change in battery life could theoretically go either way.

The table shows us that for 0.05 and df=18 we get a "critical t-value" of 2.101

Our actual t-value is 1.341. This is less than 2.101, so our result is not significant at the 95% confidence level.

So there is not enough evidence to say there is a significant difference between the cool and hot groups.
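The whole battery example can be reproduced with a minimal Python sketch using only the standard library. The critical value 2.101 is typed in from the t-table as quoted above rather than computed, and the variable names are illustrative.

```python
import math
import statistics

cool = [85, 90, 76, 88, 93, 87, 91, 89, 95, 85]
hot = [78, 82, 79, 88, 91, 85, 90, 87, 84, 86]

# Step 1: mean, sample variance (n - 1 divisor) and size of each group.
mean_cool, mean_hot = statistics.mean(cool), statistics.mean(hot)
var_cool, var_hot = statistics.variance(cool), statistics.variance(hot)
n_cool, n_hot = len(cool), len(hot)

# Step 2: the t-statistic from the formula used in this section.
t = (mean_cool - mean_hot) / math.sqrt(var_cool / n_cool + var_hot / n_hot)

# Step 3: compare with the table value for 0.05, two tails, df = 18.
df = n_cool + n_hot - 2
CRITICAL_T = 2.101  # read from the t-table, not calculated here
significant = abs(t) > CRITICAL_T

print(round(t, 3))   # 1.341
print(df)            # 18
print(significant)   # False
```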

Hypothesis

The t-test is often done more formally using the idea of a hypothesis and null hypothesis.

A hypothesis is a testable, informed guess about a relationship between variables.

Example: Fruits and Vegetables

You want to investigate whether eating more fruits and vegetables reduces the risk of heart disease. Your hypothesis might be:

"Consuming at least 5 servings of fruits and vegetables daily leads to a lower incidence of heart disease among adults."

In this case, your hypothesis is testable because you can collect data on people's fruit and vegetable consumption and compare it to their heart health outcomes.

But we usually take the opposing point of view:

A "Null Hypothesis" is a statement that there is no significant relationship between two variables. It is the "default position" that any observed effect is just due to chance.

The null hypothesis is the baseline assumption for statistical analysis.

The goal of statistical analysis is often to provide evidence that lets us reject the null hypothesis.

Rejecting the null hypothesis gives credibility to (but does not prove) our original hypothesis. In other words if we can show the skeptic is likely wrong, we can suggest our original idea might be right.

Example continued

The Null Hypothesis might be:

"Consuming at least 5 servings of fruits and vegetables daily does not lead to a significantly lower incidence of heart disease among adults."

If our data does not support the Null Hypothesis then we have evidence for our original hypothesis.

Hypothesis Steps

Formulate the null hypothesis (H₀): There is no significant difference between the two groups.
Formulate the alternative hypothesis (H_a): There is a significant difference between the two groups.
Choose a significance level (commonly 0.05), which is the probability of rejecting the null hypothesis when it is actually true.
Collect the data and calculate the t-statistic based on the two sample means, standard deviations, and sample sizes.
Compare the calculated t-statistic to the critical value from the t-distribution table to determine whether to reject or fail to reject the null hypothesis.

Let's do our original example that way!

Example: Battery Life

Null Hypothesis (H₀): The difference between a cool and hot environment has no effect on a battery's running time

Alternative Hypothesis (H_a): The difference between a cool and hot environment will affect a battery's running time

Using the values calculated earlier (27.88 and 18.89 are already variances, so they are not squared again):

t = (87.9 − 85) / √(27.88/10 + 18.89/10)

Calculate:

t = 1.341

From the Student's t-test Table for 0.05 (which is a 95% confidence level) and df=18 we get a "critical t-value" of 2.101

Since our calculated t-value of 1.341 is less than the critical t-value of 2.101, we do not have enough evidence to reject the null hypothesis.

We cannot reject the null hypothesis

This means that based on our sample, we cannot confidently say a statistically significant difference exists.

In everyday terms, it's as if we're saying, "Based on what we've seen, we can't conclude that the change is anything more than just random chance."

Paired Samples T-Test

A paired samples t-test, also known as a dependent samples t-test, is a statistical test that is used to compare two means (averages) when the data is paired in some way, such as in a before-and-after trial.

So the same subjects are involved in both sets of data.

Step-by-Step Calculation

To do the Paired Samples t-test:

Calculate the difference (d) for each pair of scores
Calculate the mean difference (d̄)
Calculate the variance of the differences (var)
Calculate the standard deviation of the differences (s)
Calculate the t-statistic.
Determine the degrees of freedom (df)
Compare the calculated t-statistic to the critical value from the t-distribution table to determine whether to reject or fail to reject the null hypothesis.

Example: Test Scores

We have test scores from 7 students before and after they did a special math program. Let's compare the two sets of scores!

Null Hypothesis (H₀): The special math program has no effect on test scores

Alternative Hypothesis (H_a): The special math program does affect test scores

Results Before: 72, 75, 80, 72, 68, 92, 84
Results After: 84, 74, 86, 79, 78, 91, 88

Put that in a table and do some calcs:

Before	After	Diff (d)	(d − d̄)²
72	84	12	45.08
75	74	-1	39.51
80	86	6	0.51
72	79	7	2.94
68	78	10	22.22
92	91	-1	39.51
84	88	4	1.65
	Sum:	37	151.42

Mean of Differences (d̄) = 37/7 = 5.29

Variance (var) = 151.42/(7−1) = 25.24

Standard Deviation (s) = √(25.24) = 5.02

We can calculate the t-value using either one of these formulas (they are mathematically equivalent):

t = d̄ / (s/√n) = 5.29 / (5.02/√7) = 2.78

t = d̄ / √(var/n) = 5.29 / √(25.24/7) = 2.78

Now let's compare our t value to the critical t-value in the table.

The degrees of freedom (df) for a paired t-test is n − 1: df = 7−1 = 6
For a two-tailed test with df=6 at the 0.05 level, we find the critical t-value is 2.447
Our value of 2.78 is above that, so we reject the Null Hypothesis

If the Null Hypothesis were true, a t-value this extreme would occur less than 5% of the time, so we have good reason to believe that the alternative hypothesis "The special math program does affect test scores" is true.
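The paired calculation can be sketched in Python with the standard library. The variable names are illustrative, and the critical value 2.447 is typed in from the t-table as quoted above rather than computed.

```python
import math
import statistics

before = [72, 75, 80, 72, 68, 92, 84]
after = [84, 74, 86, 79, 78, 91, 88]

# Difference (after - before) for each student.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)

mean_diff = statistics.mean(diffs)   # mean of the differences
sd_diff = statistics.stdev(diffs)    # sample standard deviation (n - 1 divisor)
t = mean_diff / (sd_diff / math.sqrt(n))
df = n - 1

CRITICAL_T = 2.447  # two tails, 0.05, df = 6, read from the t-table

print(diffs)                 # [12, -1, 6, 7, 10, -1, 4]
print(round(mean_diff, 2))   # 5.29
print(round(sd_diff, 2))     # 5.02
print(round(t, 2))           # 2.78
print(abs(t) > CRITICAL_T)   # True
```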

One-Tail and Two-Tail T-Tests

One-Tail T-Test: Tests for a significant effect in one specific direction (either greater than or less than).
Two-Tail T-Test: Tests for a significant effect in both directions (either greater or less than), without committing to one specific direction before the test.

Choose a one-tail test if you have a specific hypothesis about the direction of the effect. Use a two-tail test if you're interested in detecting any significant difference, regardless of direction.
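To see how the choice of tails can change the outcome, here is a small Python sketch. The critical values are those given by a standard t-table for df = 18 at the 0.05 level (1.734 for one tail, 2.101 for two tails). The function names and the t-value of 1.9 are hypothetical, used only for illustration.

```python
# Critical values for df = 18 at the 0.05 level, as read from a standard t-table.
ONE_TAIL_CRITICAL = 1.734
TWO_TAIL_CRITICAL = 2.101

def two_tail_reject(t):
    """Reject H0 if t is extreme in either direction."""
    return abs(t) > TWO_TAIL_CRITICAL

def one_tail_reject_greater(t):
    """Reject H0 only if t is extreme in the predicted (positive) direction."""
    return t > ONE_TAIL_CRITICAL

t_hypothetical = 1.9  # made-up t-value, not from the examples in this article

print(two_tail_reject(t_hypothetical))          # False
print(one_tail_reject_greater(t_hypothetical))  # True
print(one_tail_reject_greater(-1.9))            # False
print(two_tail_reject(-2.5))                    # True
```

A one-tail test reaches significance with a smaller t-value, but only in the direction chosen before looking at the data. An effect in the opposite direction is never counted as significant.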

Single Set of Data

For one set of data we can calculate a t value like this:

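t = (X̄ − μ) / (s/√n)

Where X̄ is the sample mean, μ is the value we are comparing the mean against, s is the sample standard deviation and n is the sample size.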
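A minimal Python sketch of this single-sample formula follows. The function name one_sample_t, the sample values and the comparison value of 10 are all hypothetical, chosen so the arithmetic is easy to follow.

```python
import math
import statistics

def one_sample_t(sample, mu):
    """t = (sample mean - mu) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    return (statistics.mean(sample) - mu) / (statistics.stdev(sample) / math.sqrt(n))

# Made-up data: mean 11, sample variance 2.5.
sample = [9, 10, 11, 12, 13]
t = one_sample_t(sample, mu=10)
df = len(sample) - 1

print(round(t, 3))  # 1.414
print(df)           # 4
```

As with the paired test, the degrees of freedom are n − 1.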
