The dressage world was once again dazzled by the performance of Valegro at the CDIW last weekend in Amsterdam. In the Grand Prix they would have set a new World record if the final Piaffe had not gone wrong. Plugging in their “usual” scores of 9.6 for the Piaffe and 9.5 for the Transitions would have given a final score of at least 87.86, edging above their Olympia record of 87.74. Oh well, that’s for another day….
First place was never seriously in doubt, and second place, both in the Grand Prix (78.32) and in the Freestyle, was safely secured by Danielle Heijkoop and Siro. Rather than compare them with the other riders in Amsterdam, we instead compare them with Edward Gal and Glock’s Undercover at Olympia (80.12) a month earlier. In the spider-web plot we show the scores of each combination for each figure type:
Comparing the two, we see that the canter, passage and piaffe work is scored essentially identically. Siro had a bad Rein-back in Amsterdam, while Undercover really picked up points in the Trot Half-Passes and did slightly better in the Collected Walk and Halts; both combinations show excellent Piaffe scores of 9. Two strong Dutch combinations that we will hopefully see in Aachen in the summer.
Another interesting feature of the Grand Prix in Amsterdam was the very close scores of the riders in 3rd and 4th places: 74.88 for Jessica von Bredow-Werndl and Unee BB of Germany, and 74.86 for Diederik van Silfhout with Arlando of the Netherlands. (Summed over all judges, a combined difference of just 0.5 points!) But again, the spider-plot shows us they have different strengths:
Unee BB picks up points in the Halt, Rein-Back and Walk paces, while Arlando has the stronger Canter movements. Both combinations have very similar Trot and Passage notes, though possibly Unee BB had a stronger Piaffe.
Now a little observation about judging differences. The full results of the Grand Prix are visible on the Sports Computer Graphics site at http://results.scgvisual.com/2015/amsterdam/r10.html and the event was as well judged as any other at this level. But as everyone is aware, there are sometimes large judging differences that are often poorly understood. First the important news: there were no big changes of rank due to any single judge. Dropping each judge in turn would result in the 3rd, 4th and 5th places flip-flopping, but that is because they were very closely scored, as noted above. One ride (Danielle Heijkoop) had a score difference of 5.9%, that is, the difference between one judge (in this case Isabelle Judet at M) and the average of her 4 colleagues. But note, this is “pretty normal”: with 105 such judge-versus-colleagues comparisons in the class, it is very likely that at least one exceeds 5%. When these big differences occur it is always because one judge is consistently higher or lower than their colleagues across the whole test; it is impossible for a single “mistake”, such as a missed error in a change line, to introduce such big differences.
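That judge-versus-colleagues difference is simple to compute. Here is a minimal sketch using invented final percentages for the five judging positions (these are not the actual Amsterdam scores):

```python
# Hypothetical final percentages awarded by the five judges (invented numbers).
scores = {"E": 78.9, "H": 78.1, "C": 79.8, "M": 73.6, "B": 79.2}

# For each judge, compare their score with the average of the other four.
for judge, s in scores.items():
    colleagues = [v for j, v in scores.items() if j != judge]
    avg = sum(colleagues) / len(colleagues)
    print(f"{judge}: {s - avg:+.2f} vs colleagues' average of {avg:.2f}")
```

With these made-up numbers the judge at M sits 5.4% below the average of the other four, the same order of difference discussed above.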
Big scoring differences are seldom about the position of the judge either. It only takes being low or high by 0.5 points on each figure to end up 5% different at the end of the day. That is one of the reasons the half-points were introduced, and, I think, one of the reasons we will eventually move to decimal scoring (0.1-point precision per figure).
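To see why a consistent half-point offset matters so much, here is the arithmetic, simplified by ignoring the coefficient-2 movements of the real test sheet:

```python
# Each figure is marked out of 10; the final score is the percentage of
# the maximum. A judge marking 0.5 lower on every figure ends up
# exactly 5% lower overall, regardless of the number of figures.
n_figures = 33                       # roughly a Grand Prix test
base = [7.0] * n_figures             # some baseline marks
low = [m - 0.5 for m in base]        # consistently half a point lower

def pct(marks):
    return 100 * sum(marks) / (10 * len(marks))

print(pct(base) - pct(low))          # -> 5.0
```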
These big differences always occur because, for essentially the entire ride, a judge is either consistently higher or lower than their colleagues. In this case the judge at M averages 0.5 points below the judge at B, and around 2 points less for every Piaffe and Transition. It is not about a different view of the test (over the entire test such effects average out); it is about a different overall appreciation of the ride.
How important is it? Well, in this case, if the judge at M’s notes had not been taken into account, the final score for Danielle would have been 79.50 instead of 78.32. If the proposal currently being considered by the FEI, to correct a very low or very high score to the next-nearest judge’s score, were applied, then she would have ended up with 79.24. If instead a more balanced approach of removing the highest and lowest score for every figure had been adopted, her final score would have been 78.433, a very small change from the awarded score, but perhaps one with more justice to it: for each figure the majority of the jury would determine the final score, and a single high or low judge would neither strongly influence the result nor risk having their scores dropped entirely.
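The three aggregation rules can be sketched for a single figure. The marks below are invented for illustration, and the function names and the exact mechanics of the FEI proposal are my reading of it, not an official specification:

```python
def fei_correct(marks):
    """Pull the single most extreme mark to the nearest other judge's mark."""
    mean = sum(marks) / len(marks)
    i = max(range(len(marks)), key=lambda k: abs(marks[k] - mean))
    rest = marks[:i] + marks[i + 1:]
    nearest = min(rest, key=lambda m: abs(m - marks[i]))
    return (sum(rest) + nearest) / len(marks)

def trimmed(marks):
    """Drop the highest and lowest mark, average the rest."""
    s = sorted(marks)
    return sum(s[1:-1]) / (len(s) - 2)

# Invented marks from five judges for one figure; one judge is well below.
marks = [8.0, 7.5, 8.0, 6.0, 7.5]
print(sum(marks) / 5)      # plain average            -> 7.4
print(fei_correct(marks))  # 6.0 pulled up to 7.5     -> 7.7
print(trimmed(marks))      # drop 8.0 and 6.0, average the middle three
```

Note how the trimmed mean moves the result less than the FEI-style correction: the outlying judge is neutralised per figure without being silenced for the whole test.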
All this just to help you think about the ways in which the system might be improved.