My aim in designing a ranking system for sports teams was to determine the relative strength of teams in such a way that the outcome of matches or competitions can be predicted. In essence this is done by giving each team a rating. This rating is a value which can be compared to the ratings of other teams in order to determine their relative strength. I have called this relative strength the Expected Performance Level, or EPL, and the corresponding rating the EPL-rating.
The basis of the rating model is that there is a certain threshold for the difference in strength, at which one team will always beat the other. I have defined this threshold as 1000 rating points. If two teams differ in rating by 1000 points or more, one would expect the stronger team (with the higher rating) to always beat the weaker team. For rating differences below 1000 points, I have defined a probabilistic model for the chances of a win, draw or loss. Based on this model, an expected number of points can be calculated, for either a single match or a competition.
When the actual results are known, they are compared to the expected results. Depending on the difference between the expected number of points and the actual number of points received, the team which performed beyond expectation wins rating points from the team which performed below expectation. In this way the ratings are updated to show the progress in strength and to compensate for the inaccuracies of the earlier ratings (which are at best estimates of relative strength).
I have two different ways of calculating the expected and actual points, and of updating the ratings correspondingly. The first is to do this for single matches, the second is to use entire competitions.
In the case of single matches, the number of points to be distributed is 2. The expected value for a match is a linear function, which runs from 0 at a rating difference of -1000, to 2 at a rating difference of +1000. At equal ratings, the expected number of points for both teams is 1.
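The linear expectation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `expected_points` and the `threshold` parameter are hypothetical, and the clamping outside the -1000 to +1000 range is inferred from the rule that a difference of 1000 points or more means the stronger team is always expected to win.

```python
def expected_points(rating_diff: float, threshold: float = 1000.0) -> float:
    """Expected points (0 to 2) for a team, given its rating minus the opponent's.

    Hypothetical helper: linear between -threshold and +threshold,
    and clamped to 0 or 2 beyond that range (an inferred assumption).
    """
    clamped = max(-threshold, min(threshold, rating_diff))
    return 1.0 + clamped / threshold


if __name__ == "__main__":
    for diff in (-1200, -500, 0, 250, 1000):
        print(diff, expected_points(diff))
```

Because the function is symmetric around a difference of 0, the expected points of the two teams in a match always add up to 2.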
The actual number of points is determined not by looking at win, loss or draw, but by looking at the number of goals scored and conceded. In order to make the first goal scored or conceded weigh more heavily than subsequent goals, I have decided to base the number of points on the square roots of the number of goals scored and conceded. The reason for this is that I want the ratings to reflect defensive strength as well. A 2-0 victory usually shows a greater difference in strength than a 4-2 victory, because in the latter case the losing team was at least able to score twice.
The effect of this is, for example, that a 1-0 victory receives the same number of points as a 4-1 victory or a 9-4 victory, namely 1.5 points. To receive the full 2 points, a 4-0, 9-1 or 16-4 victory is needed. A draw always results in 1 point for each team, unless penalty kicks were taken to force a decision, in which case the winner gets 1.1 points and the loser 0.9.
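The scores listed above are all consistent with one simple formula: 1 plus half the difference between the square roots of goals scored and goals conceded, limited to the 0 to 2 range. The sketch below uses that formula, but note that it is inferred from the examples and is not necessarily the exact calculation used. The function name `actual_points` and the `won_on_penalties` parameter are hypothetical.

```python
import math
from typing import Optional


def actual_points(goals_for: int, goals_against: int,
                  won_on_penalties: Optional[bool] = None) -> float:
    """Match points (0 to 2) from the square roots of goals scored and conceded.

    Assumption: points = 1 + (sqrt(gf) - sqrt(ga)) / 2, clamped to [0, 2].
    This reproduces the examples in the text but is an inferred formula.
    """
    # A draw decided by penalty kicks gives 1.1 to the winner and 0.9 to the loser.
    if goals_for == goals_against and won_on_penalties is not None:
        return 1.1 if won_on_penalties else 0.9
    raw = 1.0 + (math.sqrt(goals_for) - math.sqrt(goals_against)) / 2.0
    return max(0.0, min(2.0, raw))


if __name__ == "__main__":
    for score in ((1, 0), (4, 1), (9, 4), (4, 0), (2, 0), (4, 2), (1, 1)):
        print(score, round(actual_points(*score), 3))
```

Under this assumption a 2-0 win is worth about 1.71 points and a 4-2 win about 1.29 points, which matches the intention that a clean sheet counts for more.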
The difference between the actual number of points and the expected number of points is multiplied by a 'rating factor' to reach the number of rating points gained or lost. For single matches, I use rating factors ranging from 25 for friendly matches to 100 for official tournament matches.
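As a rough sketch of the single-match update, the following Python function combines the linear expectation with the rating factor. The name `update_ratings` and its parameters are hypothetical, and the sketch assumes that the points gained by one team are exactly the points lost by the other, as suggested by the description that points are won from the underperforming team.

```python
from typing import Tuple


def update_ratings(rating_a: float, rating_b: float,
                   points_a: float, rating_factor: float) -> Tuple[float, float]:
    """Return the new ratings of teams A and B after a single match.

    points_a is team A's actual match score on the 0 to 2 scale.
    Assumption: the exchange is zero-sum, so B loses what A gains.
    """
    # Linear expectation, clamped at a rating difference of +/-1000 points.
    diff = max(-1000.0, min(1000.0, rating_a - rating_b))
    expected_a = 1.0 + diff / 1000.0
    # Rating points exchanged: rating factor times (actual - expected).
    delta = rating_factor * (points_a - expected_a)
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Example values only: a 1600-rated team wins with 1.5 match points
    # against a 1500-rated team in an official tournament match (factor 100).
    new_a, new_b = update_ratings(1600.0, 1500.0, 1.5, 100.0)
    print(round(new_a, 1), round(new_b, 1))
```

In this example the expected score for the stronger team is 1.1, so it outperforms expectation by 0.4 and gains about 40 rating points, while the opponent loses the same amount.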
In the case of competitions, the strength of each participating team is compared with the average strength of the entire competition. Based on that difference, an expected number of points is calculated. In this calculation, the number of matches played and the official scoring system (e.g. 3-1-0 for win-draw-loss) are taken into consideration. The expected number of points for all teams thus reflects the expected ranking table of the competition after it is completed.
The actual number of points can simply be taken from the final ranking table of the competition. Again, the rating points gained or lost by each team are calculated from the difference between actual and expected points, multiplied by a 'rating factor' which depends on the number of games played. The more games played, the lower the rating factor, because over many games the difference between actual and expected points can grow much larger.
For both ways of updating ratings, I have aimed at rating factors which allow for correction of ratings towards the actual results, but which prevent overshooting beyond them. In this way, historical results still make up part of the new ratings.
I use the competition-type of rating calculation mostly for the qualification campaigns for the continental and world championships. This is reflected by the relatively large and sudden shifts in rating after such qualification stages have ended. Had I used the single match type, then this shift would have been more gradual. Personally, I do not mind this, because similar large shifts tend to occur during the big tournaments. Moreover, I feel it is better to process larger competitions as a whole, because they are played as a whole by the participants. Singling out individual matches can produce distorted results, as probably would have happened if Switzerland's loss against Luxembourg had been processed individually, just to give an example.
I process all international matches played by the teams which are included in the system. I use FIFA's fixtures and results as a source for the scores of these matches, in order not to miss any.
When using the rating system for tennis matches, I always process single matches. The main difference between the football ranking system and the tennis ranking system is that I have developed a method for converting tennis scores (best-of-3 or best-of-5 sets) into a point score on a 0-2 range. E.g. in best-of-3, a win in two sets can vary between 1.54 for a 7-6 7-6 win and 2.00 for a 6-0 6-0 win. A three set win can vary between 1.04 for a 0-6 7-6 7-6 win and 1.73 for a 6-7 6-0 6-0 win.
At a rating difference of a thousand points or more, a 2.00 score is expected (meaning a double bagel win). This usually doesn't happen, so the higher-rated player loses some rating points, but over the course of an entire tournament the top players often compensate for these losses by gaining points from stronger opponents.
At this time, I only keep score of the women's tour, which is a whole lot more unpredictable than the men's, which makes having a rating system that helps predictions more interesting. For tennis tournaments, I use rating factors of 50 for Grand Slams, 40 for the big Premiers (Premier Mandatory/Premier 5, ATP-1000), 30 for regular tour tournaments, 25 for WTA125/Challengers and 20 for ITF-tournaments and qualification matches.
Another feature of my ranking system is the policy I use for adding players to the ranking system. Female players get a ranking when
The start rating is determined by taking the last direct acceptance to the tournament (i.e. the one with the lowest rating) and deducting 50 points from that rating.
Apart from that, the top 10 players from the WTA-ranking that don't have a rating (right now, these players are spread between #180 and #250 on the WTA-ranking list) have some privileges:
The strongly occupied ITF-events are those with 25K or higher prize money which have at least 23 players with a rating in the main draw (players from the privileged list mentioned above are counted partially when evaluating this number of players). These tournaments are processed using the rating system (along with all WTA, Grand Slam and Fed Cup events). The lower-classed ITF-events and junior tournaments are not processed; they are "under the radar", so to speak.
Players can be out of competition for half a year without losing rating points. After that, the rating is reduced by 1 point per day not played. When players retire from the circuit, this reduction by 1 point per day starts immediately. Players who are subject to this reduction are no longer included in the ranking list. Players regain a stable ranking as soon as they complete their first official match on the circuit (even in "under the radar" tournaments). "Complete" here means that the player in question does not withdraw from this first match.
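The inactivity rule can be illustrated with a small Python function. This is a sketch under assumptions: the name `decayed_rating` is hypothetical, and "half a year" is taken to be 183 days, which the text does not specify.

```python
def decayed_rating(rating: float, days_inactive: int,
                   retired: bool = False, grace_days: int = 183) -> float:
    """Rating after a period without matches.

    Assumptions: the grace period of half a year is modelled as 183 days,
    and retired players have no grace period at all.
    """
    grace = 0 if retired else grace_days
    # One rating point is deducted for every day beyond the grace period.
    penalty_days = max(0, days_inactive - grace)
    return rating - penalty_days


if __name__ == "__main__":
    print(decayed_rating(1500, 100))                # within the grace period
    print(decayed_rating(1500, 200))                # 17 days past the grace period
    print(decayed_rating(1500, 30, retired=True))   # retired: reduction starts at once
```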
In May 2016, I compared my ranking system, which I have dubbed Expected Performance Level, or EPL, to the WTA-rankings for female tennis players. The resulting analysis is available on this website.