Monday, July 27, 2009

Paired Student's t-test

Comparison of the means of two sets of paired samples, taken from two populations with unknown variance.

A school athletics team has taken on a new instructor and wants to test the effectiveness of the proposed new type of training by comparing the average times of 10 runners in the 100 meters. Below are the times in seconds before and after training for each athlete.

Before training: 12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3
After training: 12.7, 13.6, 12.0, 15.2, 16.8, 20.0, 12.0, 15.9, 16.0, 11.1


In this case we have two sets of paired samples, since the measurements were made on the same athletes before and after the training. To see whether there was an improvement, a deterioration, or whether the mean times have remained substantially the same (hypothesis H0), we perform a Student's t-test for paired samples, proceeding in this way:



a = c(12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3)
b = c(12.7, 13.6, 12.0, 15.2, 16.8, 20.0, 12.0, 15.9, 16.0, 11.1)

t.test(a,b, paired=TRUE)

Paired t-test

data: a and b
t = -0.2133, df = 9, p-value = 0.8358
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.5802549 0.4802549
sample estimates:
mean of the differences
-0.05


The p-value is greater than 0.05, so we cannot reject the null hypothesis H0 of equality of the means. In conclusion, there is no evidence that the new training produced any significant improvement (or deterioration) in the team's times.
Similarly, we calculate the t-tabulated value (two-sided test at the 5% level, 9 degrees of freedom):


qt(0.975, 9)
[1] 2.262157


|t-computed| < t-tabulated (0.2133 < 2.262), so we do not reject the null hypothesis H0.
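
The comparison between the computed and tabulated t values can also be reproduced by hand. The following is a minimal sketch, assuming the same vectors a and b as in the first example; the names d, n, t_manual, t_crit and p_manual are introduced here only for illustration. It computes the paired statistic as the mean of the per-athlete differences divided by its standard error.

```r
# Times before (a) and after (b) the first training, as in the example above
a <- c(12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3)
b <- c(12.7, 13.6, 12.0, 15.2, 16.8, 20.0, 12.0, 15.9, 16.0, 11.1)

d <- a - b                 # per-athlete differences (before minus after)
n <- length(d)             # number of pairs

# Paired t statistic: mean difference divided by its standard error
t_manual <- mean(d) / (sd(d) / sqrt(n))

# Two-sided critical value at the 5% level with n - 1 degrees of freedom
t_crit <- qt(0.975, df = n - 1)

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

# H0 is not rejected when the absolute statistic is below the critical value
abs(t_manual) < t_crit
```

The values of t_manual and p_manual should agree with the t and p-value fields printed by t.test(a, b, paired=TRUE).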




Suppose now that the manager of the team, given the results obtained, fires the coach who has not brought any improvement and hires another, more promising one. Here are the athletes' times after the second training:

Before training: 12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3
After the second training: 12.0, 12.2, 11.2, 13.0, 15.0, 15.8, 12.2, 13.4, 12.9, 11.0


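Now we check whether there was actually an improvement, i.e. we perform a one-sided t-test for paired data. R tests the difference a - b, and the direction of the alternative hypothesis H1 is set by adding the argument alt (short for alternative) to the call to t.test. We start with alt = "less", which tests whether the mean of a is less than the mean of b: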


a = c(12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3)
b = c(12.0, 12.2, 11.2, 13.0, 15.0, 15.8, 12.2, 13.4, 12.9, 11.0)

t.test(a,b, paired=TRUE, alt="less")

Paired t-test

data: a and b
t = 5.2671, df = 9, p-value = 0.9997
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf 2.170325
sample estimates:
mean of the differences
1.61


With this syntax we asked R to check whether the mean of the values in vector a is less than the mean of the values in vector b, that is, whether the times before training were lower than the times after it. Since lower times are better, this alternative hypothesis H1 corresponds to a deterioration, not an improvement. We obtained a p-value well above 0.05, so we cannot reject the null hypothesis H0 in favor of this alternative: there is no evidence that the new training made the times worse.

If we had written: t.test(a, b, paired = TRUE, alt = "greater"), we would have asked R to check whether the mean of the values in vector a is greater than the mean of the values in vector b, i.e. whether the times decreased after training. This is the test that corresponds to an improvement. In light of the previous result (t is large and positive), we can expect the p-value to be much smaller than 0.05, so that H0 is rejected and we conclude that the new training significantly improved the times. And in fact:


a = c(12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3)
b = c(12.0, 12.2, 11.2, 13.0, 15.0, 15.8, 12.2, 13.4, 12.9, 11.0)

t.test(a,b, paired=TRUE, alt="greater")

Paired t-test

data: a and b
t = 5.2671, df = 9, p-value = 0.0002579
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
1.049675 Inf
sample estimates:
mean of the differences
1.61
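
Because the direction of a one-sided paired test depends on the order of the arguments (R tests the difference a - b), it can help to check the equivalences explicitly. The following is a minimal sketch using the same vectors as in the second example; the names res_greater, res_less, res_swapped and res_onesample are introduced here only for illustration.

```r
# Times before (a) and after (b) the second training, as in the example above
a <- c(12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3)
b <- c(12.0, 12.2, 11.2, 13.0, 15.0, 15.8, 12.2, 13.4, 12.9, 11.0)

# H1: mean(a - b) > 0, i.e. times decreased after training (improvement)
res_greater <- t.test(a, b, paired = TRUE, alternative = "greater")

# H1: mean(a - b) < 0, i.e. times increased after training (deterioration)
res_less <- t.test(a, b, paired = TRUE, alternative = "less")

# Swapping the arguments and the direction tests the same hypothesis
res_swapped <- t.test(b, a, paired = TRUE, alternative = "less")

# A paired test is a one-sample test on the differences
res_onesample <- t.test(a - b, mu = 0, alternative = "greater")

all.equal(res_greater$p.value, res_swapped$p.value)
all.equal(res_greater$p.value, res_onesample$p.value)

# The two one-sided p-values are complementary
res_greater$p.value + res_less$p.value
```

Writing the argument out in full as alternative = "greater" is equivalent to the abbreviated alt = "greater" used above, since R matches argument names partially.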

5 comments:

  1. I saw your post on http://www.r-bloggers.com/paired-students-t-test/ and got to this page to say thanks, it has helped me a lot!

  2. Thanks for this and the Wilcoxon piece

  3. Thank you for this topic! it is very helpfull !

    see also :
    http://www.sthda.com/english/wiki/t-test

  4. "In response, we obtained a p-value well above 0.05, which leads us to conclude that we can reject the null hypothesis H0 in favor of the alternative hypothesis H1: the new training has made substantial improvements to the team."

    This is not correct.
    If the p-value is large, say larger than 0.05, you do NOT reject the null-hypothesis.

  5. This is very confusing - when you say "With this syntax we asked R to check whether the mean of the values contained in the vector a is less of the mean of the values contained in the vector b." ... this would actually be the opposite of improvement, since lower times are faster. Also, per the comment above, I'm also very confused about the p-value interpretation.
