## Monday, July 27, 2009

### Paired Student's t-test

Comparison of the means of two sets of paired samples, taken from two populations with unknown variance.

A school athletics team has hired a new instructor and wants to test the effectiveness of the proposed training by comparing the average times of 10 runners in the 100 meters. Below are the times in seconds before and after training for each athlete.

Before training: 12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3
After training: 12.7, 13.6, 12.0, 15.2, 16.8, 20.0, 12.0, 15.9, 16.0, 11.1

In this case we have two sets of paired samples, since the measurements were made on the same athletes before and after the training. To see whether there was an improvement, a deterioration, or whether the mean times have remained substantially the same (hypothesis H0), we perform a Student's t-test for paired samples, proceeding in this way:

```r
a <- c(12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3)
b <- c(12.7, 13.6, 12.0, 15.2, 16.8, 20.0, 12.0, 15.9, 16.0, 11.1)

t.test(a, b, paired = TRUE)
```

```
	Paired t-test

data:  a and b
t = -0.2133, df = 9, p-value = 0.8358
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.5802549  0.4802549
sample estimates:
mean of the differences
                  -0.05
```

The p-value is greater than 0.05, so we cannot reject the null hypothesis H0 of equal means. In conclusion, the data show no significant improvement (or deterioration) in the athletes' times after the new training.
Equivalently, we can compare the computed t with the tabulated t-value for a two-sided test at the 5% level with 9 degrees of freedom:

```r
qt(0.975, 9)
```

```
[1] 2.262157
```

In absolute value, t-computed is smaller than t-tabulated (0.2133 < 2.262157), so we cannot reject the null hypothesis H0.
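The figures reported by `t.test` can be reproduced by hand, because a paired t-test is a one-sample t-test on the per-athlete differences. The sketch below recomputes the statistic, the two-sided p-value and the critical value for the first training data; the names `d`, `n`, `t_stat`, `p_two_sided`, `t_crit`, `reject_h0` and `res` are illustrative and not part of the original analysis.

```r
# Times before (a) and after (b) the first training
a <- c(12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3)
b <- c(12.7, 13.6, 12.0, 15.2, 16.8, 20.0, 12.0, 15.9, 16.0, 11.1)

d <- a - b        # per-athlete differences (before minus after)
n <- length(d)    # number of pairs

# t = mean(d) / (sd(d) / sqrt(n)), with n - 1 degrees of freedom
t_stat <- mean(d) / (sd(d) / sqrt(n))

# Two-sided p-value: probability of a |t| at least this large under H0
p_two_sided <- 2 * pt(-abs(t_stat), df = n - 1)

# Critical value for a two-sided test at the 5% level
t_crit <- qt(0.975, df = n - 1)

# H0 is rejected only if |t_stat| exceeds the critical value
reject_h0 <- abs(t_stat) > t_crit

# The same test expressed as a one-sample t-test on the differences
res <- t.test(d, mu = 0)
```

Here `t_stat` and `p_two_sided` should agree with the t and p-value printed by `t.test(a, b, paired = TRUE)`, and `reject_h0` is `FALSE`.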

Suppose now that the manager of the team, given these results, fires the coach who has not brought any improvement and hires another, more promising one. Here are the athletes' times after the second training:

Before training: 12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3
After the second training: 12.0, 12.2, 11.2, 13.0, 15.0, 15.8, 12.2, 13.4, 12.9, 11.0

Now we check whether there was actually an improvement by performing a one-sided t-test for paired data. R computes the differences as `a - b` (before minus after), so an improvement in times corresponds to a positive mean difference and a deterioration to a negative one. The direction of the alternative hypothesis H1 is chosen with the `alt` argument (short for `alternative`) in the call to `t.test`. Let us first try `alt = "less"`:

```r
a <- c(12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3)
b <- c(12.0, 12.2, 11.2, 13.0, 15.0, 15.8, 12.2, 13.4, 12.9, 11.0)

t.test(a, b, paired = TRUE, alt = "less")
```

```
	Paired t-test

data:  a and b
t = 5.2671, df = 9, p-value = 0.9997
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
     -Inf 2.170325
sample estimates:
mean of the differences
                   1.61
```

With this syntax we asked R to check whether the mean of the values in the vector `a` is less than the mean of the values in the vector `b`, that is, whether the times got worse after the training. We obtained a p-value well above 0.05, so we cannot reject the null hypothesis H0 in favor of this alternative: there is no evidence of a deterioration. The large positive t statistic (5.2671) in fact points in the opposite direction.

If we had written `t.test(a, b, paired = TRUE, alt = "greater")`, we would have asked R to check whether the mean of the values in the vector `a` is greater than the mean of the values in the vector `b`, that is, whether the times decreased after the training. In light of the previous result, we can expect the p-value to be much smaller than 0.05, and in fact:

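```r
a <- c(12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3)
b <- c(12.0, 12.2, 11.2, 13.0, 15.0, 15.8, 12.2, 13.4, 12.9, 11.0)

t.test(a, b, paired = TRUE, alt = "greater")
```

```
	Paired t-test

data:  a and b
t = 5.2671, df = 9, p-value = 0.0002579
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
 1.049675      Inf
sample estimates:
mean of the differences
                   1.61
```

The p-value is well below 0.05, so we reject the null hypothesis H0 in favor of the alternative H1 that the times before training were greater than the times after it: the second training significantly improved the athletes' times.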
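To see how the two one-sided tests relate to each other, the p-values can be pulled out of the objects returned by `t.test`. The sketch below uses the second training data; the names `p_less`, `p_greater`, `p_two` and `p_swapped` are illustrative, and the full argument name `alternative` is used in place of the abbreviation `alt`.

```r
# Times before (a) and after (b) the second training
a <- c(12.9, 13.5, 12.8, 15.6, 17.2, 19.2, 12.6, 15.3, 14.4, 11.3)
b <- c(12.0, 12.2, 11.2, 13.0, 15.0, 15.8, 12.2, 13.4, 12.9, 11.0)

# t.test returns a list-like object; the p-value is stored in $p.value
p_less    <- t.test(a, b, paired = TRUE, alternative = "less")$p.value
p_greater <- t.test(a, b, paired = TRUE, alternative = "greater")$p.value
p_two     <- t.test(a, b, paired = TRUE)$p.value

# The two one-sided p-values are complementary: they sum to 1
p_less + p_greater

# The two-sided p-value is twice the smaller one-sided p-value
2 * min(p_less, p_greater)

# Swapping the argument order flips the sign of the differences,
# so "less" on (b, a) tests the same hypothesis as "greater" on (a, b)
p_swapped <- t.test(b, a, paired = TRUE, alternative = "less")$p.value
```

Because the direction depends on the order of the arguments, it helps to state explicitly which vector is subtracted from which before choosing between `"less"` and `"greater"`.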

1. I saw your post on http://www.r-bloggers.com/paired-students-t-test/ and got to this page to say thanks, it has helped me a lot!

2. Thanks for this and the Wilcoxon piece

3. Thank you for this topic! It is very helpful!