Wilcoxon Champagne test

Posted in R, Sport by Julyan Arbel on 14 June 2011

As an appetizer for the Paris triathlon, Jérôme and I teamed up last weekend for an adventure race in the Champagne region (it mainly consists of running, cycling and canoeing, with a flavor of orienteering, and Champagne is kept for the end). It was organized by Ecole Polytechnique students who, for the first time, split Saturday’s legs into two parts: to reduce congestion on each leg, the odd-numbered teams were to do part 1, then part 2, and the even-numbered teams did them in the reverse order.

As the results came out, we wondered whether the order had favored one of the groups. A crucial question for us, as we were the only odd-numbered team in the top five. Using ggplot2 and a data frame donnees with time and Group variables, the code for the (non-normalized) histograms of the two groups (even: 0, odd: 1) looks like this:

library(ggplot2)
# one histogram panel per group, 2-hour bins, times from 0 to 30 hours
qplot(time, data = donnees, geom = "histogram", binwidth = 2,
      colour = Group, facets = Group ~ ., xlim = c(0, 30))

There are roughly the same number of teams in each group (36 and 38). Time is in hours; the effective racing time is around 12 to 15 hours, but penalties or bonuses are added to or subtracted from it, which explains total times between 5 and 30 hours. The overall impression is that the even group has a flatter histogram, and that it might sit slightly more to the left than the odd one. To test this last hypothesis, I used a non-parametric test, the Wilcoxon rank sum test (or Mann-Whitney test): the null hypothesis is that the distributions of the two groups (say time0 and time1) do not differ by a location shift, and the alternative is that they differ by some nonzero location shift.

> wilcox.test(time0,time1, paired=FALSE)

	Wilcoxon rank sum test

data:  time0 and time1
W = 640, p-value = 0.64
alternative hypothesis: true location shift is not equal to 0

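The p-value is greater than 0.05, so we cannot reject the null hypothesis: there is no significant evidence of a location shift in time. This test is certainly not the most appropriate one, though, since its location-shift interpretation assumes that the two distributions have the same shape, which does not seem to be the case here.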
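
For readers who want to reproduce the steps, here is a minimal sketch on simulated data. The race times themselves are not given in this post, so the numbers below are purely hypothetical (only the group sizes, 36 and 38, come from the text, and which group has which size is an assumption). It shows one way to build time0 and time1 from a data frame like donnees, asks wilcox.test for an estimate of the location shift, and, since the shapes seem to differ, also runs a two-sample Kolmogorov-Smirnov test, which is a different test from the one used above and reacts to any difference between the two distributions.

```r
# Hypothetical data: simulated times in hours, NOT the real race results.
# The even group (0) is given a larger spread to mimic a flatter histogram.
set.seed(1)
donnees <- data.frame(
  time  = c(rnorm(36, mean = 14, sd = 5), rnorm(38, mean = 15, sd = 3)),
  Group = factor(rep(c(0, 1), times = c(36, 38)))
)

# Split the times by group (even: 0, odd: 1).
time0 <- donnees$time[donnees$Group == 0]
time1 <- donnees$time[donnees$Group == 1]

# Wilcoxon rank sum test; conf.int = TRUE also returns an estimate of the
# location shift together with a confidence interval.
res_w <- wilcox.test(time0, time1, paired = FALSE, conf.int = TRUE)
print(res_w)

# Two-sample Kolmogorov-Smirnov test: sensitive to differences in spread
# or shape as well as to a location shift.
res_ks <- ks.test(time0, time1)
print(res_ks)
```

The estimated shift and its confidence interval say more about the size of a possible effect than the p-value alone does.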


2 Responses

  1. xi'an said, on 14 June 2011 at 21:31

    Bravo! Three races in four days if we include Cross de Bercy and a third position in the end. Wow!

  2. Jérôme Lê said, on 15 June 2011 at 10:01

    Good old Juju!
    That said, I am always a bit skeptical about the value of these tests. With the same distribution but 100 times more individuals, you could find that the times are significantly different…

    Then again, this could be somewhat endogenous: if the odd teams are shortchanged on the first day (where there is an even/odd difference), they may be extra motivated to make up for lost time the next day.
