Statisfaction

Wilcoxon Champagne test

Posted in R, Sport by Julyan Arbel on 14 June 2011

As an appetizer for the Paris triathlon, Jérôme and I ran as a team in an adventure race in the Champagne region last weekend (it mainly consists of running, cycling and canoeing, with a flavor of orienteering, and the Champagne is kept for the end). It was organized by Ecole Polytechnique students who, for the first time, divided Saturday’s legs into two parts: to reduce congestion on each leg, the odd-numbered teams were to do part 1 then part 2, and the even-numbered teams the reverse.

As the results came out, we wondered whether the order had favored one of the groups. A crucial question for us, as we were the only odd-numbered team in the top five. Using ggplot2 and a data frame donnees with the variables time and Group, the code for the (non-normalized) histograms of the two groups (even: 0, odd: 1) looks like this:

```
library(ggplot2)
qplot(time, data = donnees, geom = "histogram", binwidth = 2,
      colour = Group, facets = Group ~ ., xlim = c(0, 30))
```

There are roughly the same number of teams in each group (36 and 38). Time is in hours; the effective racing time is around 12 to 15 hours, but penalties and bonuses are added or subtracted, which explains total times between 5 and 30 hours. The overall impression is that the even group has a flatter histogram, and that it might be slightly more to the left than the odd one. To test this last hypothesis, I used a non-parametric test, the Wilcoxon rank sum test (or Mann-Whitney test): the null hypothesis is that the distributions of the two groups (say time0 and time1) do not differ by a location shift, and the alternative is that they differ by some nonzero location shift.
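The vectors time0 and time1 are not defined in the post. A minimal sketch of how they might be extracted from donnees follows. It assumes Group is a factor coded 0/1 and uses simulated times as a stand-in, since the race data are not given here; which group has 36 teams and which has 38 is also an assumption.

```r
# Hypothetical stand-in for `donnees`: simulated times, NOT the race results
set.seed(42)
donnees <- data.frame(
  time  = c(rnorm(36, mean = 14, sd = 5), rnorm(38, mean = 15, sd = 3)),
  Group = factor(rep(c(0, 1), times = c(36, 38)))  # even: 0, odd: 1
)

# One vector of times per group
time0 <- donnees$time[donnees$Group == "0"]
time1 <- donnees$time[donnees$Group == "1"]

# split() does the same in one call and returns a list named by group
by_group <- split(donnees$time, donnees$Group)
```

With the real data frame, only the extraction lines would be needed.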

```
> wilcox.test(time0, time1, paired=FALSE)

	Wilcoxon rank sum test

data:  time0 and time1
W = 640, p-value = 0.64
alternative hypothesis: true location shift is not equal to 0
```
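To see where W comes from, here is a small sketch that recomputes the statistic by hand on simulated data (the race times are not reproduced here, so the numbers will not match the output above). W is the sum of the ranks of the first sample in the pooled sample, minus its smallest possible value; equivalently, it counts the pairs in which a time from the first group exceeds a time from the second.

```r
# Simulated times in hours, NOT the race results; group sizes as in the post
set.seed(1)
time0 <- rnorm(36, mean = 14, sd = 5)  # hypothetical even-numbered teams
time1 <- rnorm(38, mean = 15, sd = 3)  # hypothetical odd-numbered teams

n0 <- length(time0)

# Rank all times together, then sum the ranks of the first sample
r <- rank(c(time0, time1))
W_manual <- sum(r[seq_len(n0)]) - n0 * (n0 + 1) / 2

# Equivalent count of pairs (x, y) with x > y (no ties in continuous data)
W_pairs <- sum(outer(time0, time1, ">"))

# Statistic reported by wilcox.test, for comparison
W_test <- unname(wilcox.test(time0, time1, paired = FALSE)$statistic)
```

Under the null hypothesis, W has expectation n0 × n1 / 2, which is 684 for groups of 36 and 38, so an observed value of 640 sits close to the middle of its null distribution.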

The p-value is greater than 0.05, so we cannot reject the null hypothesis: there is no significant evidence of a location shift in time. This test is certainly not the most appropriate one, though, since the two distributions do not have the same shape.
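One possible follow-up, which is not what was done here, is a test that is sensitive to any difference between the two distributions rather than to a location shift only, such as the two-sample Kolmogorov-Smirnov test. The sketch below uses simulated data with equal means and different spreads, purely as an illustration; it is not a result about the race.

```r
# Simulated times in hours, NOT the race results: same center, different spread
set.seed(7)
time0 <- rnorm(36, mean = 14, sd = 5)  # hypothetical flatter group
time1 <- rnorm(38, mean = 14, sd = 2)  # hypothetical more peaked group

# Two-sample Kolmogorov-Smirnov test: compares the whole distributions
ks <- ks.test(time0, time1)

# The statistic D is the largest gap between the two empirical CDFs
pooled <- c(time0, time1)
D_manual <- max(abs(ecdf(time0)(pooled) - ecdf(time1)(pooled)))

ks$statistic  # D
ks$p.value
```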