Winning odds are calculated based on expected goals. In this app, they are either forecasted average goals before the game is played or true expected goals after the game has been played. For more information on how expected goals are calculated, visit the “FiveThirtyEight” and “Expected Goals” sections of this site.

Why are these win probabilities helpful? They summarize both what the expectations were going into the game and how well Wimbledon actually played. A probability is also more succinct and easier to interpret than other measures such as possession and shot share.

From goals to probability

Goals are converted to an outcome distribution using the Poisson distribution. FiveThirtyEight has a very good explanation for how they forecast matches. It's far better than anything I can do here.

The quick version is that the Poisson process is used to calculate the expected distribution of goals for each team. Each team's distribution is calculated to create a matrix of possibilities, an adjustment is applied to account for the increased likelihood of ties, and then the cases are summed to calculate odds of each team winning.
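As a rough illustration of the steps above, here is a minimal Python sketch. It is not FiveThirtyEight's code: the function names, the 10-goal cutoff, and especially the `draw_boost` parameter (a simple multiplier on the tied scorelines) are my own assumptions, standing in for whatever draw adjustment they actually apply.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the average is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probabilities(goals_a, goals_b, max_goals=10, draw_boost=0.0):
    """Return (win, draw, loss) probabilities for team A.

    goals_a and goals_b are the average (expected) goals for each team.
    draw_boost is a hypothetical knob: every tied scoreline is scaled by
    (1 + draw_boost) and the matrix is renormalised afterwards.
    """
    size = max_goals + 1
    # Matrix of scorelines: entry [i][j] is P(A scores i and B scores j).
    matrix = [
        [poisson_pmf(i, goals_a) * poisson_pmf(j, goals_b) for j in range(size)]
        for i in range(size)
    ]
    # Make draws a bit more likely than independent Poissons would suggest.
    for k in range(size):
        matrix[k][k] *= 1 + draw_boost
    total = sum(sum(row) for row in matrix)

    # Sum the cases: above the diagonal B wins, below it A wins.
    win = sum(matrix[i][j] for i in range(size) for j in range(size) if i > j)
    draw = sum(matrix[k][k] for k in range(size))
    loss = sum(matrix[i][j] for i in range(size) for j in range(size) if i < j)
    return win / total, draw / total, loss / total


win, draw, loss = outcome_probabilities(1.5, 1.0, draw_boost=0.1)
print(f"win {win:.1%}, draw {draw:.1%}, loss {loss:.1%}")
```

Cutting the matrix off at 10 goals loses a negligible amount of probability for typical soccer scores, and the final division by `total` makes the three numbers sum to 1 regardless.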

Score-Adjusted xG Model

When mapping expected goals to probability of winning, it's important to account for score effects, or the notion that the score affects how the game is played. The cliche adage “goals change games” is cliche because it's true.

When a team is winning, they generally play a more defensive style. This leads them to concede more possession, shots, and expected goals, but if they defend well this doesn't actually hurt their chances of winning; it improves them. Therefore, we should take expected goals accumulated while teams are ahead or behind with a grain of salt.

Score-adjusted xG reduces a team's xG by the time they are trailing in the second half. The longer they're trailing, the more their xG is reduced. When their odds of victory are calculated using the Poisson method based on this reduced xG, we get a more realistic picture of how the game actually went. Another way to think about this is that the score-adjusted xG model gives more credit for generating offense in the first half, or while leading or tied in the second half, than generating offense in the second half when trailing.
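To make the idea concrete, here is one possible shape such an adjustment could take. This is purely an illustrative assumption and not the formula used in this app: the linear discount, the `max_discount` value of 0.5, and the function name are all made up for the example.

```python
def score_adjusted_xg(xg, minutes_trailing_second_half, max_discount=0.5):
    """Hypothetical linear discount on a team's xG.

    The more of the second half a team spends trailing, the larger the
    reduction, up to max_discount if they trail for all 45 minutes.
    """
    # Clamp to the 45 minutes of the second half (ignores stoppage time).
    minutes = min(max(minutes_trailing_second_half, 0), 45)
    share_trailing = minutes / 45
    return xg * (1 - max_discount * share_trailing)


# A team with 2.0 xG that trailed for 30 minutes of the second half.
print(round(score_adjusted_xg(2.0, 30), 2))
```

The adjusted value would then be fed into the Poisson step in place of the raw xG.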

It's important to note that this score-adjusted xG is not useful for evaluating how a team played offensively or defensively. Expected goals are expected goals whether they occur in the 10th minute or the 80th minute. Instead, it is only useful when mapping these expected goals to win probabilities.

FiveThirtyEight's Club Soccer Model calculates the strength of pretty much every club team in the world and uses those ratings to make predictions, including game-by-game odds of winning and league table predictions. This page covers only Wimbledon-specific results and doesn't include odds related to promotion and relegation. Head to FiveThirtyEight directly to get that information.

FiveThirtyEight is very transparent about their methods, but the highlights will be covered here.

Soccer Power Index (SPI)

At the heart of FiveThirtyEight's club soccer forecasts are its SPI ratings, which are its best estimate of a team's overall strength. In this system, every team has an offensive rating that represents the number of goals it would be expected to score against an average team on a neutral field, and a defensive rating that represents the number of goals it would be expected to concede. These ratings, in turn, produce an overall SPI rating, which represents the percentage of available points (a win is worth 3 points, a tie worth 1 point, and a loss worth 0 points) the team would be expected to take if that match were played over and over again.

In general, a League One side has an SPI in the range of 15-35, while a Premier League team will be above 60, with the “Big Six” normally in the neighborhood of 90.

A team's SPI is based on the value of their players (via transfermarkt), their performance the previous season, and, as the season progresses, how they do in matches relative to their expectations. A team that consistently beats their SPI projections will see their SPI increase over time, and a team that consistently underperforms their SPI will see their SPI decrease over time.

It's important to note that the model updates SPI based on how a team performs relative to expectations, not just based on wins and losses. If Wimbledon beat a team destined for relegation we should not expect their SPI to move much. If they beat a strong team destined for promotion (especially if they play well in that game rather than getting lucky), then we can expect to see their SPI increase.

Forecasting Games

The FiveThirtyEight model uses each team's SPI, plus their offense and defense ratings, to forecast the average score of each game - this is like a projected version of expected goals. From there, they create a score distribution for each team, turn this into a matrix, account for the fact that draws are more likely than they “ought” to be, and then turn these into win, loss, and draw probabilities.

Forecasting Seasons

Once every game in a season has been forecasted, the FiveThirtyEight model runs a Monte Carlo simulation, where the season is played out 20,000 times based on the individual game forecasts. The results for every team in each simulated season are recorded, and then the probabilities of certain events (like promotion and relegation) are calculated from how often they occur across the simulations.
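A stripped-down sketch of the Monte Carlo idea is below. It is a simplification and not FiveThirtyEight's implementation: it simulates only one team's points total from made-up game forecasts, whereas real promotion and relegation odds require simulating every team in the league and building the full table each time. The function names and the fixed seed are my own choices.

```python
import random


def simulate_points(forecasts, n_sims=20000, seed=0):
    """Play out a list of (p_win, p_draw, p_loss) forecasts n_sims times.

    Returns the list of season points totals, one per simulated season.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = []
    for _ in range(n_sims):
        points = 0
        for p_win, p_draw, _p_loss in forecasts:
            r = rng.random()
            if r < p_win:
                points += 3  # win
            elif r < p_win + p_draw:
                points += 1  # draw
            # otherwise a loss, worth 0 points
        totals.append(points)
    return totals


def probability_at_least(totals, threshold):
    """Share of simulated seasons that reach at least `threshold` points."""
    return sum(t >= threshold for t in totals) / len(totals)


# Made-up forecasts for a 46-game season where every game looks the same.
forecasts = [(0.35, 0.28, 0.37)] * 46
totals = simulate_points(forecasts)
print(probability_at_least(totals, 60))
```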

FiveThirtyEight's publicly available data includes all the information needed to assemble a Monte Carlo simulation and calculate the same probabilities they publish in their league tables, but this is computationally expensive and not worth doing. This is why this app doesn't include odds related to the league table, such as promotion and relegation chances.

Expected goals (xG) are statistical measures of offense and defense tracked during a game. Essentially, every scoring chance is assigned a probability of turning into a goal based on a number of factors, and every chance is summed across the game to get the number of expected goals. This is considered a better indicator of quality possession than either raw possession or goals because it takes in-game data into consideration and, because it's continuous, is less noisy than just counting goals. While goals measure success and failure, xG measures the long-term predictability of that success or failure.
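As a tiny illustration of the summing step, here is a sketch in Python. The shot probabilities are made up for the example; in practice they come from whichever xG model is being used.

```python
def expected_goals(shot_probabilities):
    """Sum each chance's probability of becoming a goal to get total xG."""
    return sum(shot_probabilities)


# Made-up chances from one game: a speculative long shot, a good look in
# the box, a header from a tight angle, a penalty, and a blocked effort.
shots = [0.08, 0.35, 0.02, 0.76, 0.11]
print(round(expected_goals(shots), 2))  # about 1.32 xG
```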

There are several xG models used around the world, and each one considers different factors and weights them differently to produce its xG number. Most are proprietary, but two websites have publicly available xG: footystats and footballxg. They produce different results, so looking at both of them gives a more holistic view of how the Dons are playing, and gives us an intuition for how different xG models can be. More detail on their respective algorithms can be found below.

Something all xG models have in common is that they are team agnostic. This means that xG models look at data from all teams in all situations. This is important to keep in mind when looking at xG models in the context of a specific team, because there could be attributes the team has (like a world-class shooter or keeper) that mean the models aren't well calibrated to their particular talent. Right now, Wimbledon have proven to be aces at set pieces and Tzanev has shown himself to be a great shot stopper, which means, arguably, the Dons should beat the xG models in the long run. Whether this is actually true is up for debate: there is a long history in sports of fans arguing models like xG don't apply to their team, only to be proven wrong in the long run.

Goal Difference Above Expected

Goal Difference Above Expected (GDAE) is the gap between actual goal difference and expected goal difference (actual minus expected). This can be calculated both for individual games and over the course of the season. If GDAE is greater than 0, that means Wimbledon came out ahead of the xG model. If it's less than 0, Wimbledon came out behind the xG model. If xG models perform well over the long run, then the GDAE trends towards 0 as the season progresses, and can be used as a coarse measure of luck. If GDAE is above 0, perhaps the Dons were lucky, and if GDAE is less than 0, perhaps they were unlucky. In theory, this is true when considering both individual games and the entire season.
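The calculation can be sketched in a few lines of Python. The function names and the sample numbers are made up for illustration; the only thing taken from the definition above is that GDAE is actual goal difference minus expected goal difference.

```python
def gdae(goals_for, goals_against, xg_for, xg_against):
    """Goal Difference Above Expected for a single game."""
    actual_gd = goals_for - goals_against
    expected_gd = xg_for - xg_against
    return actual_gd - expected_gd


def season_gdae(games):
    """Cumulative GDAE over (goals_for, goals_against, xg_for, xg_against) tuples."""
    return sum(gdae(*game) for game in games)


# Two made-up games: a 2-1 win that beat the model, then a 0-1 loss that didn't.
games = [(2, 1, 1.2, 1.5), (0, 1, 1.0, 0.7)]
print(round(gdae(*games[0]), 2))    # positive: came out ahead of the xG model
print(round(season_gdae(games), 2))
```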

Measuring Luck with xG: RAGE

No matter what, measuring luck with xG should be done with some caution. Because xG is based on all teams at all times, it cannot account for the combinations of talents and tactics that any individual team plays with. Nonetheless, it is useful, in part because comparing real results to xG can be a good starting point for deeper analysis of the team.

To that end, coming up with a measure of luck based on xG does have some value, even if it is a blunt instrument at best. GDAE gets at the idea of luck, but it's not very detailed. Goal difference is a combination of offense and defense, and it doesn't give much insight into how each part of the game has really played out. This is where RAGE comes in: Ratio of Actual Goals to Expected.

RAGE is a coarse measure of luck which can separate out offense and defense. It's calculated based on the following equation:

\[ RAGE = \frac{1}{2} \left[ \frac{G_{for}}{xG_{for}} + \frac{xG_{against}}{G_{against}} \right] \]

RAGE is composed of two basic terms: the ratio of goals for to expected goals for, and the ratio of expected goals against to goals against. The first term is a measure of offense, and the second of defense. If a team matches their xG model, then the value of both ratios is 1. If they score more than their xG for, then the offensive RAGE pushes above 1, and if they score fewer goals than their xG then the offensive RAGE pushes below 1. The same is true for defense, but flipped - if they allow fewer goals than their xG against, the defensive RAGE is above 1, and if they allow more goals, their defensive RAGE is below 1. Finally, there is combined RAGE - this is the average of the two, which is why the RAGE equation has a multiplication by one half.
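Here is a minimal Python sketch of the three RAGE values, using season totals. The function name and the sample totals are invented for the example, and the zero checks are my own addition, since the ratios are undefined if a team has no xG for or has conceded no goals.

```python
def rage(goals_for, xg_for, goals_against, xg_against):
    """Return (offensive, defensive, combined) RAGE from cumulative totals."""
    if xg_for == 0 or goals_against == 0:
        raise ValueError("RAGE is undefined with zero xG for or zero goals against")
    offensive = goals_for / xg_for          # above 1: scoring more than expected
    defensive = xg_against / goals_against  # above 1: conceding less than expected
    combined = (offensive + defensive) / 2  # the average of the two terms
    return offensive, defensive, combined


# Made-up season totals: 30 goals from 25 xG, 20 conceded from 24 xG against.
offensive, defensive, combined = rage(30, 25.0, 20, 24.0)
print(round(offensive, 2), round(defensive, 2), round(combined, 2))
```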

In short, RAGE is built so that it should be close to 1 if an xG model is well calibrated. Good luck (either scoring a lot on offense or conceding very little on defense) increases the value above 1, and bad luck (either scoring very little or conceding a lot) decreases the value below 1.

RAGE is meant to be interpreted only as a cumulative measure throughout the season - it doesn't apply to individual games. This is because RAGE doesn't capture the difference between when a team scores 0 goals in a game when their xG is 3 versus a game when it's 0.5, even though the former is clearly more unlucky. However, because of its cumulative nature, RAGE becomes a useful measure fairly quickly - after about 7 games.

Details of xG Algorithms

footystats offers a subscription, so their xG model is proprietary. However, they do provide a list of what their xG model accounts for with each shot:

  • Assist Type
  • Distance from Goal
  • Type of Attack
  • Shot Angle
  • Body Part Used to Shoot

footballxg also provide very little transparency on how their xG model works, because they also offer a service to gamblers. The best documentation they have is the following paragraph:

The process for calculating an expected goal from any given chance was originally created by Opta. They reviewed hundreds of thousands of historical shots to then work out the percentage chance of a shot being scored from any particular situation. There are now a range of different models that are getting more and more advanced (taking into account the location of the shot, the position of defenders and goalkeeper, height of shot).


In general, FiveThirtyEight updates about an hour after league games, which generally end around 5 PM UTC (12 PM EST) on Saturdays and 10 PM UTC (5 PM EST) on weekdays.

Expected goal models generally lag by 2-3 days and I update them manually.

Like what you see? You can find the code and supporting data for this page on GitHub.

- Eventually I may have the plots scale more elegantly with the size of the screen (may be too lazy to implement this one)