Adjusting player metrics based on team strength

Jan 28, 2022

Disclaimer: this text shares some similarities with academic papers structurally. If you’re not into that or don’t have time to read everything, I’d suggest you jump straight to the “interpretation” section of this text.

Introduction

Attackers on top teams naturally contribute more goals and assists than their counterparts on bottom teams. Much of this difference can be explained by sheer individual skill. Players on top teams simply tend to be better individually than players on bottom teams.

However, football is a team game. During the 2012-13 season, Messi contributed 1.80 non-penalty goals and assists per 90 minutes in the league for a 100-point Barcelona team. No one can convince me that he wasn’t the best player in the world at the time. Yet, I don’t think anyone would argue against me if I said that Messi’s output would've been considerably lower if he played for, say, Levante, a mid-table team that managed just 40 goals throughout the season.

Good attackers on good teams are, of course, better than bad attackers on bad teams, but attackers on top teams also benefit from having higher-quality players around them to create opportunities for them.

Furthermore, if you are a decision-maker or even just a fan evaluating potential signings for your club, what really matters to you is how much these players would contribute to your team, not what they contribute to their current team.

With all of this in mind, I attempted to find a way to predict the change in attacking output when teammate quality changes. Here’s what I found.

Data

I used data from FBref, a fantastic website for sports data. The dataset covers the top 5 European men’s leagues from the 2017–18 season to the 2021–22 season (as of 24 Jan 2022).

I decided to use expected metrics (xG and xA) instead of regular ones (goals and assists) since they have lower variance. A season of football for a player is a small sample size, especially if they are not a guaranteed starter, so expected metrics should give a more accurate indication of performance. The metrics I used for attacking output are npxG per 90 (non-penalty expected goals per 90) and xA per 90 (expected assists per 90). Obviously, there is a lot more to an attacking player’s game, but these metrics were the best ones available without over-complicating things.

For team strength, I used a metric I call “rest-of-team xG per 90,” which uses the following calculation:

rest-of-team xG per 90 = (total team xG per 90 when player x is on the field) - (player x’s npxG + xA per 90)

Essentially, how much a player’s team creates in attack if you subtract all the chances that the given player is directly involved with. The higher the rest-of-team xG, the better the player’s teammates are at attacking. I acknowledge that certain players are better than others at creating chances for their teammates indirectly, through movement, dangerous passes that are not the final pass, or even defensive work. Frankly, I don’t have a way to account for this, so keep in mind that this model will be slightly biased against players who excel at these things.
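As a rough sketch, the calculation above could look like this in Python. The function name, argument names, and the example season totals are all made up for illustration (they are not real FBref values), and I'm assuming the inputs are season totals that get converted to per-90 figures using the player's minutes.

```python
def rest_of_team_xg_per90(team_xg_on_pitch, player_npxg, player_xa, minutes):
    """Team xG per 90 while the player is on the pitch, minus the
    player's own npxG + xA per 90.

    Assumption: all three xG inputs are season totals accumulated over
    the same `minutes` the player spent on the pitch.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    per90 = 90.0 / minutes
    team_per90 = team_xg_on_pitch * per90
    player_per90 = (player_npxg + player_xa) * per90
    return team_per90 - player_per90


# Hypothetical season: 2700 minutes played, 45.0 team xG while on the pitch,
# 12.0 npxG and 6.0 xA for the player.
example = rest_of_team_xg_per90(
    team_xg_on_pitch=45.0, player_npxg=12.0, player_xa=6.0, minutes=2700
)
```

With these invented numbers the team creates 1.5 xG per 90 with the player on the pitch, and the player is directly involved in 0.6 of it, leaving 0.9 for the rest of the team.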

Since individual skill is such an important factor, I couldn’t just run a regression with team strength as the explanatory variable and attacking output as the response variable.

Instead, I looked at the output of individual players when team strength changed. To accomplish that, I used season pairs for individual players, making the dataset longitudinal.

I used the following criteria when filtering out my desired dataset:

Every player must have…

played at least 1000 minutes (each season). Sample size, sample size, sample size. One could even argue that the threshold is too low, but I wanted to include more players than just the nailed-on starters for every team.
played in the same league in both seasons. Some leagues are just stronger than others, and transfers between leagues add another layer of complexity that requires a more advanced model.
an npxG of at least 11% of the team’s total xG (for the npxG model) or an xA of at least 8% of the team’s total xG (for the xA model) when the player was on the field. This is to filter out players who are not evaluated based on their direct attacking output. Don’t ask me why the thresholds are what they are.
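The three criteria above can be sketched as a single check on a pair of player-seasons. The dictionary keys and the example values are hypothetical, and I'm assuming the share threshold has to hold in both seasons, like the minutes threshold.

```python
MIN_MINUTES = 1000
MIN_NPXG_SHARE = 0.11  # player's npxG as a share of team xG while on the field
MIN_XA_SHARE = 0.08    # player's xA as a share of team xG while on the field


def qualifies(season_1, season_2, metric):
    """Return True if a season pair passes the filters for the given model.

    `metric` is "npxg" or "xa". Assumption: the share threshold is applied
    to both seasons of the pair.
    """
    share_key, threshold = {
        "npxg": ("npxg_share", MIN_NPXG_SHARE),
        "xa": ("xa_share", MIN_XA_SHARE),
    }[metric]
    seasons = (season_1, season_2)
    return (
        all(s["minutes"] >= MIN_MINUTES for s in seasons)
        and season_1["league"] == season_2["league"]
        and all(s[share_key] >= threshold for s in seasons)
    )


# Hypothetical season pair (not real data).
s1 = {"minutes": 2400, "league": "Premier League", "npxg_share": 0.20, "xa_share": 0.05}
s2 = {"minutes": 1800, "league": "Premier League", "npxg_share": 0.15, "xa_share": 0.06}

in_npxg_model = qualifies(s1, s2, "npxg")
in_xa_model = qualifies(s1, s2, "xa")
```

In this made-up example the player would be included in the npxG model but not in the xA model, since their xA share is below 8%.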

I considered including age as a variable and only using peak-age players, but I ultimately decided not to.

Now that I had the season pairs, I was able to calculate the variables I eventually used in my model:

Percentage change in attacking output (npxG or xA per 90) from season 1 to season 2
Percentage change in team strength (rest-of-team xG per 90) from season 1 to season 2
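A minimal sketch of how these two variables could be computed for one season pair. The key names and values are invented for illustration, and I'm expressing percentage change as a fraction (0.10 means +10%).

```python
def pct_change(before, after):
    """Percentage change from `before` to `after`, as a fraction."""
    if before == 0:
        raise ValueError("percentage change is undefined when `before` is 0")
    return (after - before) / before


def build_observation(season_1, season_2, metric_key):
    """Turn a pair of seasons for one player into one model observation."""
    return {
        "pc_output": pct_change(season_1[metric_key], season_2[metric_key]),
        "pc_team": pct_change(
            season_1["rest_of_team_xg_per90"], season_2["rest_of_team_xg_per90"]
        ),
    }


# Hypothetical player: team strength rises from 1.00 to 1.20 and
# npxG per 90 rises from 0.40 to 0.44.
season_1 = {"npxg_per90": 0.40, "rest_of_team_xg_per90": 1.00}
season_2 = {"npxg_per90": 0.44, "rest_of_team_xg_per90": 1.20}
obs = build_observation(season_1, season_2, "npxg_per90")
```

Here the observation would be roughly a 20% increase in team strength paired with a 10% increase in npxG per 90.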

As the last filter, I removed observations with a percentage change in rest-of-team xG greater than 150%, as well as observations with a percentage change in the output variable (depending on model) greater than 100%. My thought process was that such extreme outliers would affect the linear models more than they should, but I could be wrong about this. Regardless, only a few observations were removed through these filters.
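This last filter might be sketched as follows. The observation format (fractions stored under hypothetical "pc_team" and "pc_output" keys) and the sample values are assumptions for illustration only.

```python
MAX_PC_TEAM = 1.50    # drop pairs where rest-of-team xG changed by more than 150%
MAX_PC_OUTPUT = 1.00  # drop pairs where the output metric changed by more than 100%


def drop_extreme(observations):
    """Keep only observations at or below both percentage-change limits."""
    return [
        o
        for o in observations
        if o["pc_team"] <= MAX_PC_TEAM and o["pc_output"] <= MAX_PC_OUTPUT
    ]


# Invented observations, as fractions (0.20 means +20%).
sample = [
    {"pc_team": 0.20, "pc_output": 0.10},   # kept
    {"pc_team": 1.80, "pc_output": 0.30},   # removed: team change above 150%
    {"pc_team": -0.10, "pc_output": 1.25},  # removed: output change above 100%
]
kept = drop_extreme(sample)
```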

Model and Results

In order to explore the relationship between these variables, I used linear regression with percentage change in team strength as the explanatory variable and percentage change in attacking output as the response variable.
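For readers who want to try something similar, here is a minimal sketch of a one-variable least-squares fit in plain Python. It is not necessarily the tooling used for this article, the data points are invented, and it only returns the coefficient, constant, and R-squared (a statistics package would be needed for P-values).

```python
def fit_simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With a single explanatory variable, R-squared is the squared correlation.
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Invented data: PC in rest-of-team xG (x) and PC in npxG per 90 (y).
pc_team = [-0.2, -0.1, 0.0, 0.1, 0.2]
pc_output = [-0.05, -0.04, 0.01, 0.02, 0.06]
slope, intercept, r_squared = fit_simple_ols(pc_team, pc_output)
```

Real season pairs are far noisier than this toy data, so the R-squared from an actual fit would be much lower.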

Here are the regression results:

Response variable: PC in npxG per 90 / PC in xA per 90

Explanatory variable: PC in rest-of-team xG per 90

Constant: Not statistically significantly different from 0 for either

Coefficient: 0.271 / 0.369

P-value: < 0.001 / < 0.001

R-squared: 0.049 / 0.070

No. observations: 860 / 804

Interpretation

If you have a decent grasp of statistics, your first reaction might be that the R-squared values are very low for these models. I won’t lie, I was a little concerned at first, and ideally, they would be higher. However, football is extremely hard to predict. A player’s metrics like npxG per 90 or xA per 90 will almost never stay the same from one season to another even if the team and teammates around them don’t change. Therefore, I’m mostly interested in the trend and the averages. Furthermore, the P-values are essentially 0. I’d like to add that my understanding of statistics is far from perfect, so I welcome any criticism of my methods and reasoning.

So, how do you interpret these results?

Based on the dataset used, the PC in npxG per 90 tends to be around 27% of the PC in rest-of-team xG. In other words, a 10% increase in rest-of-team xG goes with roughly a 2.7% increase in npxG per 90 on average. Put xA per 90 in place of npxG per 90, and the value is around 37%.

Can we say that a drop in teammate quality causes a decrease in attacking output? Technically, I believe the answer is no (again, correct me if I’m wrong). Logically though, I do think that if a player suddenly got a worse team around them, it would cause them to get fewer goals and assists. Still, it’s important to be careful.

Faults in the model

Football is exceptionally unpredictable and complex. It would be impossible to include every single variable in the model, so I chose to make it as simple as possible. Hopefully, future work in the subject can include more variables. These variables include but are not limited to:

Position and role: Some players play in different positions and are tasked with different roles in different seasons.

Centrality in attack: Perhaps, when certain players move to a worse team, their new team tends to play around those players more than their previous teams did.

Team fit: Sometimes players move between clubs that have completely different playing styles. This could also happen within the same club.

Form: Perhaps, when a player is in good form throughout a season (and shows higher levels of expected output than before), they tend to be rewarded with a move to a better team. Maybe in their next season, their form reverts to normal levels, and output takes a hit.

Set-pieces (excl. penalties): Being on set-piece duties benefits a player’s xA. What if they don’t retain these duties next season? The reason I didn’t use this is that the data wasn’t available to me. If I could use open-play xA instead of xA, I would.

These are some potentially important variables that were not accounted for in this model, and there are many more.

Team strength-adjusted metrics?

My goal from the start was to create a model that could adjust attacking output based on teammate quality. Such a model could provide a baseline prediction of productivity if a player moves to a certain club. After that, you could factor in other variables like fit, etc.

Such a model could also adjust every player’s value to a standard, for example, a mid-tier attacking team. Then you could compare players on a more even playing field.

I know you should be careful about inferring causality. But maybe we could still try. The model would obviously be hugely imperfect, but so are many other statistical models, especially in such a complex sport as football. After all, no models are perfect, but some are very useful.

Let’s try it out!

I find that the median and mean rest-of-team xG per 90 for attacking players are both around 0.95 for the top 5 leagues. The Bundesliga and Serie A generally have more goals per game than the Premier League, La Liga, and Ligue 1, so these numbers would probably be different in different leagues. But to keep it simple, let’s just use 0.95 rest-of-team xG as a representation of an average attacking team if you exclude one attacking player.

Then, we could adjust npxG or xA per 90 like this:

predicted PC in <variable> = regression coeff. * ((0.95/rest-of-team xG)-1)

Team-adjusted <variable> = <variable> * (predicted PC in <variable>+1)
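The two formulas above translate directly into a small Python function. The coefficients and the 0.95 baseline come from this article, while the function name and the example players are hypothetical (invented inputs, not real FBref numbers).

```python
BASELINE_REST_OF_TEAM_XG = 0.95
COEFFICIENTS = {"npxg": 0.271, "xa": 0.369}  # regression coefficients from above


def team_adjusted(value_per90, rest_of_team_xg_per90, metric,
                  baseline=BASELINE_REST_OF_TEAM_XG):
    """Adjust npxG or xA per 90 to a team with baseline rest-of-team xG."""
    if rest_of_team_xg_per90 <= 0:
        raise ValueError("rest-of-team xG per 90 must be positive")
    # Predicted PC in the metric if team strength moved to the baseline.
    predicted_pc = COEFFICIENTS[metric] * (baseline / rest_of_team_xg_per90 - 1.0)
    return value_per90 * (1.0 + predicted_pc)


# Hypothetical striker on a strong attacking team (rest-of-team xG of 1.50).
strong_team = team_adjusted(0.60, 1.50, "npxg")

# Hypothetical striker on a weak attacking team (rest-of-team xG of 0.60).
weak_team = team_adjusted(0.30, 0.60, "npxg")
```

With these made-up inputs, the first player's 0.60 npxG per 90 is adjusted down to about 0.54, and the second player's 0.30 is adjusted up to about 0.35. A player whose rest-of-team xG is already 0.95 is left unchanged.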

I applied this formula to every player in the top 5 European leagues this season:

Largest raw decrease in npxG?

Robert Lewandowski: 0.92 npxG per 90 for Bayern → 0.82 npxG per 90 for a theoretical Bundesliga team with an average attack

Seems reasonable that Lewandowski’s scoring frequency would decrease if he moved to a Bundesliga team with an average attack. Is the difference too small even? Maybe, but Lewandowski has arguably been the best player in the world this season, and would likely perform well anywhere. And it’s not like he’s moving to a bottom team.

Largest raw increase in npxG?

Callum Wilson: 0.28 npxG per 90 for Newcastle → 0.38 npxG per 90 for a theoretical PL team with an average attack

Again, seems pretty feasible. Wilson is a decent PL forward, and Newcastle hardly create anything unless he’s involved. This adjustment would put him at a slightly-above-average tier in terms of goal threat.

Largest raw decrease in xA?

Thomas Müller: 0.47 xA per 90 for Bayern → 0.38 xA per 90 for a theoretical Bundesliga team with an average attack

Müller is one of the best and most consistent creators of his generation, but he would never be able to put up those numbers for a mid-tier Bundesliga club. 0.38 xA per 90 is a lot more reasonable while still being phenomenal.

Largest raw increase in xA?

Moses Simon: 0.31 xA per 90 for Nantes → 0.39 xA per 90 for a theoretical Ligue 1 team with an average attack

Okay, Simon’s numbers are already insane this season, but the fact that he’s doing it at Nantes makes it even more impressive.

Overall, I like what I’m seeing. Players are rewarded for making things happen at smaller clubs, but not to an unfair degree. On the other end, the largest decreases come from dominant teams like Bayern, Man City, and Liverpool. Sané, Jota, Gündogan, and Foden are among the players with the largest decreases from the adjustment.

Finally, are these metrics useful?

Personally, I think so, as long as they’re used with caution. Obviously, they’re far from perfect, and it probably wouldn’t take much to generate more accurate coefficients. But from what I’ve seen, the results seem quite reasonable.

When decision-makers and fans evaluate potential signings, they factor in all sorts of different variables. Team strength-adjusted metrics are helpful in that they provide a baseline prediction. After team-strength adjustment, they can think about other factors such as tactical and cultural fit, playing style, squad dynamic, etc. Furthermore, team strength-adjusted metrics offer a way to compare players on a more level playing field.

I will definitely play around with team strength-adjusted metrics in the near future.

Lastly, thank you for taking the time to read this far! I’d love to have discussions around this topic, so you’re very welcome to give your thoughts and criticism.