Tour de France

Analysing the roles and careers of individual athletes, as well as the overall team compositions and distributions, in the 92nd Tour de France in 2005.

Stages of a Career

The idea behind the plot is to visualize and analyze the career of a road bicycle racer as seen across all teams.

A young racer usually starts at the age of 24 as a helper. Later on he becomes either a time trial specialist or sprinter, or a leader or top rider. An older racer continues as a climber or stage points specialist and finishes his (approximately 10-year-long) career around the age of 34. A better analysis would require a multi-year data set that follows the careers of individual racers.

Team Composition

The idea behind the plot is to group the riders by role in order to understand the strategy adopted by most teams.

The findings show that the number of leaders or top riders is very high. Our hypothesis is that Lance Armstrong's dominance led other teams to add top riders to their rosters in order to be competitive during the race. A better analysis would compare this with historical data on team composition.

Roles Distribution

The idea behind the plot is to provide the user with a macro view of the distribution of roles for each team.

The analysis suggests that several riders were capable of filling multiple roles: team composition ranged from three to five roles, not every team brought a rider for each specialty, and there was not a single, unique leader for each team. It is worth noting that the Discovery Channel team, the final winner, had a small number of helpers and no time trial specialist, which suggests that its top riders were multi-role. A better analysis would need more detail on the multiple roles of individual riders.
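The per-team role count behind a view like this can be sketched in a few lines of Python. This is only an illustration: the team names, role labels and the roles_by_team helper are hypothetical and are not taken from TDF2005.txt, and the original charts were built with spreadsheet tools rather than code.

```python
from collections import Counter, defaultdict

def roles_by_team(riders):
    """Count how many riders fill each role, per team.

    `riders` is an iterable of (team, role) pairs.
    """
    counts = defaultdict(Counter)
    for team, role in riders:
        counts[team][role] += 1
    return counts

# Made-up sample rows (hypothetical teams and roles, not real race data).
sample_riders = [
    ("Team A", "leader"),
    ("Team A", "helper"),
    ("Team A", "helper"),
    ("Team B", "leader"),
    ("Team B", "sprinter"),
    ("Team B", "climber"),
]

distribution = roles_by_team(sample_riders)
for team, roles in distribution.items():
    # Number of distinct roles, followed by the riders per role.
    print(team, len(roles), dict(roles))
```

The number of distinct roles per team (len(roles) above) corresponds to the "three to five roles" range mentioned in the text.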


The goal was to compare the performance of the leader with that of his teammates, in order to see whether the two were linked.

To get an overview of the helpers' performance, we calculated an “average position” from the 2nd, 3rd, 4th, 5th and 6th riders of each team. A line chart of the data for the top 3 teams makes it quite clear that, at the end of the first half of the competition, the performance of both the Discovery Channel and CSC helpers dropped, possibly because they were helping their leaders (Armstrong and Basso) stay at the top of the general standings. The T-Mobile comparison shows how the performance of the leader (Ullrich) and that of the helpers are related. Discarding race #4 (a team one), it seems that when the helpers raced to improve their own average rank (from 37 to 25), the leader's position suffered (from 3rd to 9th). Conversely, once the helpers went back to supporting the leader to help him reach 3rd position at the end, their average rank fell back to 31.

We would like to point out that this analysis is based on a data set that includes 11 athletes who were disqualified for using performance-enhancing drugs.
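As a minimal sketch of the "average position" calculation, assuming that the 2nd to 6th riders are picked by sorting each team's riders by their rank in the standings, the following Python function averages those five ranks. The function name and the sample ranks are hypothetical and do not come from TDF2005.txt.

```python
from statistics import mean

def helper_average_rank(team_ranks):
    """Average the ranks of a team's 2nd to 6th best-placed riders.

    `team_ranks` holds the standings rank of every rider in one team.
    Assumption: the best-placed rider is treated as the leader.
    """
    ordered = sorted(team_ranks)
    helpers = ordered[1:6]  # skip the best-placed rider, keep the next five
    if not helpers:
        raise ValueError("a team needs at least two riders")
    return mean(helpers)

# Made-up ranks for one hypothetical team after one stage.
sample_ranks = [3, 12, 20, 31, 44, 58, 90, 120, 150]
print(helper_average_rank(sample_ranks))
```

Repeating this for every stage gives one value per stage and team, which is the series a line chart of helper performance would plot next to the leader's rank.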



Inspired by the chronological data set from Amaury Sport Organisation on the history of the Tour de France, we focused our work on a single data set for the year 2005, which is provided as an appendix to the book Interactive Graphics for Data Analysis (TDF2005.txt).


For the exploration and analysis of the data set we relied on tools like Microsoft Excel and OpenRefine. The visualizations on this page were created using Google Docs and Raw. The website was built as a theme on top of Bootstrap with the help of Sublime Text.


The people behind this project came together based on a shared interest in visualizing information about endurance sports. Our group consists of Carlo Cerati, Alberto Scotta (deltatre), Juan Huerta, Joao Lopes and Ivo Vasconcelos (Universal Postal Union), Martin Vögeli and Benjamin Wiederkehr.