Student's T-Test

(Named after William Sealy Gosset, who published under the name "Student" to respect his employer's privacy.)

DRAFT

 

Student's t-test helps us compare two sets of data to see if the difference between their means is just random chance.

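There are two main types:

  • Independent Samples T-Test: compares the means of two separate groups
  • Paired Samples T-Test: compares two sets of measurements from the same subjects (such as before and after)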

The data should be normally distributed, and the test is not reliable if the sample sizes are too small or the variances are very different.

Note: you might also like to investigate the Chi-Square Test.

Independent Samples T-Test

We typically use this formula (but there are other formulas):

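$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$$

Where:

  • $\bar{x}_1$ and $\bar{x}_2$ are the means of the two groups
  • $s_1^2$ and $s_2^2$ are the variances of the two groups
  • $n_1$ and $n_2$ are the sample sizes of the two groups

Note: we use $s^2$ for variance, because the variance is the square of the standard deviation $s$.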

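To do the t-test:

  1. Calculate the mean of each group
  2. Calculate the variance of each group
  3. Put the means, variances and sample sizes into the formula to get the t-value
  4. Compare the t-value with the critical t-value from a t-table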

In effect the t-value is a measure of how much the means of the two groups differ from each other compared to the variability of the data.

A larger t-value (ignoring its sign) suggests a more significant difference between the groups (i.e. one less likely to be due to random chance).
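As a minimal sketch, the formula above can be written as a small Python function that works from summary statistics. The function name `t_from_summary` and its parameter names are illustrative choices, not part of any library.

```python
import math

def t_from_summary(mean1, mean2, var1, var2, n1, n2):
    """t-value for two independent samples, from their summary statistics.

    var1 and var2 are sample variances (s squared), not standard deviations.
    """
    # The denominator measures the variability of the difference in means.
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / standard_error
```

The sign of the result only tells us which group has the larger mean; the size of the value is what gets compared with a table.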

Let's try an example:

Example: Battery Life

This is an experiment you can do yourself (at the cost of 20 batteries).

Two batches of identical batteries are tested:

  • One group of 10 were tested in a cool environment
  • Another group of 10 were tested in a hot environment

The batteries power identical devices until they run out of charge, with these results (in hours):

  • Cool Group: 85, 90, 76, 88, 93, 87, 91, 89, 95, 85
  • Hot Group: 78, 82, 79, 88, 91, 85, 90, 87, 84, 86

Let's calculate the mean for each:

  • Cool Group Mean = (85+90+76+88+93+87+91+89+95+85) / 10 = 87.9 hours
  • Hot Group Mean = (78+82+79+88+91+85+90+87+84+86) / 10 = 85 hours

The Cool Group lasted, on average, about 3 hours longer.
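The two means can be checked with a few lines of Python. This is only a sketch using the standard `statistics` module; the variable names are our own.

```python
from statistics import mean

# Battery life in hours, copied from the results above.
cool = [85, 90, 76, 88, 93, 87, 91, 89, 95, 85]
hot = [78, 82, 79, 88, 91, 85, 90, 87, 84, 86]

cool_mean = mean(cool)  # 879 / 10
hot_mean = mean(hot)    # 850 / 10

print(f"Cool: {cool_mean:.1f} h, Hot: {hot_mean:.1f} h, "
      f"difference: {cool_mean - hot_mean:.1f} h")
```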

But is this a significant result, or just random in nature?

Now, it's time to calculate the t-statistic.


Our next step is to calculate Variances.

Variance is a measure of how much the data points spread out from the mean. For a sample, we calculate it by adding up the squared differences between each data point and the mean, then dividing by n − 1 (one less than the number of data points), as shown here:

For the Cool Group:

  • n = 10
  • Mean = 87.9
  • Variance = [(85−87.9)² + (90−87.9)² + (76−87.9)² + (88−87.9)² + (93−87.9)² + (87−87.9)² + (91−87.9)² + (89−87.9)² + (95−87.9)² + (85−87.9)²] / 9 = 27.88

For the Hot Group:

  • n = 10
  • Mean = 85
  • Variance = [(78−85)² + (82−85)² + (79−85)² + (88−85)² + (91−85)² + (85−85)² + (90−85)² + (87−85)² + (84−85)² + (86−85)²] / 9 = 18.89
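The variance calculation for both groups can be sketched in Python as follows. The helper `sample_variance` is a hypothetical name of our own; it spells out the squared differences and the n − 1 divisor rather than calling a library routine.

```python
def sample_variance(values):
    """Sum of squared differences from the mean, divided by n - 1."""
    n = len(values)
    m = sum(values) / n
    squared_diffs = [(x - m) ** 2 for x in values]
    return sum(squared_diffs) / (n - 1)

# Battery life in hours, copied from the results above.
cool = [85, 90, 76, 88, 93, 87, 91, 89, 95, 85]
hot = [78, 82, 79, 88, 91, 85, 90, 87, 84, 86]

cool_var = sample_variance(cool)
hot_var = sample_variance(hot)

print(f"Cool variance: {cool_var:.2f}, Hot variance: {hot_var:.2f}")
```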

Start with:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$$

Put in our values:

$$t = \frac{87.9 - 85}{\sqrt{27.88/10 + 18.89/10}}$$

Calculate:

$$t = \frac{2.9}{\sqrt{4.677}} = 1.341$$
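To check the arithmetic, here is a short Python sketch that goes from the raw battery data straight to the t-value. It assumes the standard `statistics` module, whose `variance` function uses the n − 1 divisor.

```python
import math
from statistics import mean, variance

# Battery life in hours, copied from the results above.
cool = [85, 90, 76, 88, 93, 87, 91, 89, 95, 85]
hot = [78, 82, 79, 88, 91, 85, 90, 87, 84, 86]

# statistics.variance is the sample variance (divides by n - 1).
standard_error = math.sqrt(variance(cool) / len(cool) + variance(hot) / len(hot))
t = (mean(cool) - mean(hot)) / standard_error

print(f"t = {t:.3f}")
```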

Now let's compare our t-value to the "critical t-value" from the Student's t-test Table. We choose a significance level, typically 0.05 (a 95% confidence level), with degrees of freedom df = n1 + n2 − 2 = 18. We use the two-tails value, as the change in battery life could theoretically go either way.

The table shows us that for 0.05 and df=18 we get a "critical t-value" of 2.101.

Our actual t-value is 1.341. As this is less than 2.101, our result does not pass the 95% confidence level.

So there is not enough evidence to say there is a significant difference between the cool and hot groups.

Hypothesis

The t-test is often done more formally using the idea of a hypothesis and null hypothesis.

A hypothesis is a testable, informed guess about a relationship between variables.

Example: Fruits and Vegetables

You want to investigate whether eating more fruits and vegetables reduces the risk of heart disease. Your hypothesis might be:

"Consuming at least 5 servings of fruits and vegetables daily leads to a lower incidence of heart disease among adults."

In this case, your hypothesis is testable because you can collect data on people's fruit and vegetable consumption and compare it to their heart health outcomes.

But we usually take the opposing point of view:

A "Null Hypothesis" is a statement that there is no significant relationship between two variables. It is the "default position" that any observed effect is just due to chance.

The null hypothesis is the baseline assumption for statistical analysis.

The goal of statistical analysis is often to provide evidence that lets us reject the null hypothesis.

Rejecting the null hypothesis gives credibility to (but does not prove) our original hypothesis. In other words, if we can show the skeptic is likely wrong, we can suggest our original idea might be right.

Example continued

The Null Hypothesis might be:

"Consuming at least 5 servings of fruits and vegetables daily does not lead to a significantly lower incidence of heart disease among adults."

If our data does not support the Null Hypothesis then we have evidence for our original hypothesis.

Hypothesis Steps

  1. Formulate the null hypothesis (H0): There is no significant difference between the two groups.
  2. Formulate the alternative hypothesis (Ha): There is a significant difference between the two groups.
  3. Choose a significance level (commonly 0.05), which is the probability of rejecting the null hypothesis when it is actually true.
  4. Collect the data and calculate the t-statistic based on the two sample means, standard deviations, and sample sizes.
  5. Compare the calculated t-statistic to the critical value from the t-distribution table to determine whether to reject or fail to reject the null hypothesis.
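The final comparison step can be sketched as a small Python function. The name `two_tailed_decision` is hypothetical, and the numbers plugged in are the t-value and table value already found for the battery example.

```python
def two_tailed_decision(t, critical_value):
    """Compare the size of t (ignoring sign) with the critical value."""
    if abs(t) > critical_value:
        return "reject H0"
    return "fail to reject H0"

n1, n2 = 10, 10
df = n1 + n2 - 2        # degrees of freedom, as used in this article
t = 1.341               # t-statistic from the battery example
critical_value = 2.101  # table value for 0.05, df = 18, two tails

print(f"df = {df}: {two_tailed_decision(t, critical_value)}")
```

Note that the function never returns "accept H0": a small t-value only means the data did not give us enough evidence against the null hypothesis.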

Let's do our original example that way!

Example: Battery Life

Null Hypothesis (H0): The difference between a cool and hot environment has no effect on a battery's running time

Alternative Hypothesis (Ha): The difference between a cool and hot environment will affect a battery's running time

Using the values calculated earlier:

$$t = \frac{87.9 - 85}{\sqrt{27.88/10 + 18.89/10}}$$

Calculate:

$$t = 1.341$$

From the Student's t-test Table for 0.05 (which is a 95% confidence level) and df=18 we get a "critical t-value" of 2.101.

Since our calculated t-value of 1.341 is less than the critical t-value of 2.101, we do not have enough evidence to reject the null hypothesis.

We cannot reject the null hypothesis.

This means that based on our sample, we cannot confidently say a statistically significant difference exists.

In everyday terms, it's as if we're saying, "Based on what we've seen, we can't conclude that the change is anything more than just random chance."

Paired Samples T-Test

A paired samples t-test, also known as a dependent samples t-test, compares two means (averages) when the data is paired in some way, such as in a before-and-after trial.

So the same subjects are involved in both sets of data.

Step-by-Step Calculation

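To do the Paired Samples t-test:

  1. Calculate the difference (d) for each pair
  2. Calculate the mean of the differences (d̄)
  3. Calculate the standard deviation (s) of the differences
  4. Calculate the t-value from d̄, s and the number of pairs n
  5. Compare the t-value with the critical t-value from a t-table, using df = n − 1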

Example: Test Scores

We have test scores from 7 students before and after they did a special math program. Let's compare the two sets of scores!

Null Hypothesis (H0): The special math program has no effect on test scores

Alternative Hypothesis (Ha): The special math program does affect test scores

  • Results Before: 72, 75, 80, 72, 68, 92, 84
  • Results After: 84, 74, 86, 79, 78, 91, 88

Put that in a table and do some calcs:

Before   After   Diff (d)   (d − d̄)²
72       84      12         45.08
75       74      -1         39.51
80       86      6          0.51
72       79      7          2.94
68       78      10         22.22
92       91      -1         39.51
84       88      4          1.65
Sum:             37         151.42

Mean of Differences (d̄) = 37/7 = 5.29

Variance (var) = 151.42/(7−1) = 25.24

Standard Deviation (s) = √(25.24) = 5.02

We can calculate the t-value using either one of these formulas (they are mathematically equivalent):

$$t = \frac{\bar{d}}{s/\sqrt{n}} = \frac{5.29}{5.02/\sqrt{7}} = 2.78$$

$$t = \frac{\bar{d}}{\sqrt{\text{var}/n}} = \frac{5.29}{\sqrt{25.24/7}} = 2.78$$
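The paired calculation can be reproduced with this Python sketch. It assumes the standard `statistics` module, whose `stdev` function uses the n − 1 divisor; the variable names are our own.

```python
import math
from statistics import mean, stdev

# Test scores for the same 7 students, copied from the results above.
before = [72, 75, 80, 72, 68, 92, 84]
after = [84, 74, 86, 79, 78, 91, 88]

# One difference per student: after minus before.
diffs = [a - b for a, b in zip(after, before)]

d_bar = mean(diffs)  # mean of the differences
s = stdev(diffs)     # sample standard deviation of the differences
n = len(diffs)

t = d_bar / (s / math.sqrt(n))

print(f"mean difference = {d_bar:.2f}, s = {s:.2f}, t = {t:.2f}")
```

Working from the unrounded mean and standard deviation avoids the small rounding drift you get when plugging 5.29 and 5.02 into the formula by hand.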

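Now let's compare our t-value to the critical t-value in the table. For 0.05 (two tails) and df = n − 1 = 6 the critical t-value is 2.447. Our t-value of 2.78 is greater than 2.447, so we reject the null hypothesis.

If the Null Hypothesis were true, a result this extreme would happen less than 5% of the time, so we have good reason to believe the alternative hypothesis "The special math program does affect test scores".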

One-Tail and Two-Tail T-Tests

Choose a one-tail test if you have a specific hypothesis about the direction of the effect. Use a two-tail test if you're interested in detecting any significant difference, regardless of direction.
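The difference between the two kinds of test can be sketched in Python. The function `reject_null` and its `tails` argument are hypothetical names of our own. The one-tail critical value of 1.734 used below is the standard table value for 0.05 and df = 18; it is not quoted elsewhere in this article.

```python
def reject_null(t, critical_value, tails="two"):
    """Return True if t falls in the rejection region.

    tails="two":   reject when t is far from zero in either direction
    tails="upper": reject only when t is above the critical value
    tails="lower": reject only when t is below minus the critical value
    """
    if tails == "two":
        return abs(t) > critical_value
    if tails == "upper":
        return t > critical_value
    if tails == "lower":
        return t < -critical_value
    raise ValueError("tails must be 'two', 'upper' or 'lower'")

t = 1.341  # battery example

# Two-tail: "temperature changes battery life" (table value 2.101).
two_tail_result = reject_null(t, 2.101, tails="two")

# One-tail: "cool batteries last longer" (assumed table value 1.734).
one_tail_result = reject_null(t, 1.734, tails="upper")

print(two_tail_result, one_tail_result)
```

The critical value itself changes with the choice: at the same significance level, the one-tail value is smaller because the whole rejection region sits on one side.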

Single Set of Data

For one set of data we can calculate a t value like this:

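$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

where $\bar{X}$ is the sample mean, $\mu$ is the population mean we are testing against, $s$ is the sample standard deviation and $n$ is the sample size.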
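A minimal Python sketch of the single-sample formula follows. The function name `one_sample_t` is our own, and the value μ = 85 is a made-up claim used only for illustration (imagine a maker stating that the batteries last 85 hours); the sample is the Cool Group from the battery example.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, mu):
    """t-value for one sample tested against a claimed mean mu."""
    n = len(sample)
    return (mean(sample) - mu) / (stdev(sample) / math.sqrt(n))

cool = [85, 90, 76, 88, 93, 87, 91, 89, 95, 85]
mu = 85  # hypothetical claimed battery life in hours (an assumption)

t = one_sample_t(cool, mu)
print(f"t = {t:.2f}")
```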

 
