The Rodak to the Premier League (long read about xG warning)

Started by Craven_Chris, August 08, 2020, 04:38:22 PM

Craven_Chris

Hi – this is going to be quite a long-form post, based on some numbers I have been looking at regarding Fulham's performance over the year. If xG data is not your thing please disregard, but hopefully some will find this of interest.

Exceeding Expectations

This post is based on stats held on Fulham at Infogol, and I am focusing on data for expected goals (xG – the expected number of goals a team should score based on the number and quality of chances generated) and expected goals conceded (xGc – the expected number of goals a team should concede based on the number and quality of chances given to the opponent).

This data shows that, over the season, Fulham generated expected goals (xG) of 1.43 per game and had expected goals conceded (xGc) of 1.30 per game.  So, give or take, based on the balance of play, the xG model says Fulham should finish the season with a positive goal difference of around 6. That is upper mid-table form, typical of teams just outside the play-offs.

But hold on! Fulham's end of season goal difference was actually +16, the fourth highest in the league and a match to their final league position. So, in reality, they scored quite significantly more goals than they conceded. If the xG model is to be believed Fulham's actual goal difference is around 10 goals better than it "should" be!
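As a rough check on the arithmetic above, here is a minimal Python sketch. It assumes a 46-game Championship season and uses only the per-game averages and season goal totals (64 scored, 48 conceded) quoted in this post; the variable names are purely illustrative.

```python
# Sanity check of the goal-difference figures quoted in the post.
# Assumption: a 46-game Championship season; names are illustrative only.
GAMES = 46

xg_per_game = 1.43    # expected goals scored per game (Infogol, as quoted)
xgc_per_game = 1.30   # expected goals conceded per game (Infogol, as quoted)

expected_gd = (xg_per_game - xgc_per_game) * GAMES

goals_for = 64        # actual goals scored
goals_against = 48    # actual goals conceded
actual_gd = goals_for - goals_against

overperformance = actual_gd - expected_gd

print(f"Expected goal difference: {expected_gd:+.1f}")
print(f"Actual goal difference:   {actual_gd:+d}")
print(f"Overperformance:          {overperformance:+.1f} goals")
```

The per-game averages are rounded to two decimal places, so the expected figure is only good to about one goal either way.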

Must be Mitro?

You may think this overperformance makes logical sense. After all, Golden Boot winner Aleksandar Mitrović plays for us, so surely this result is explained by our star striker gobbling up every chance coming his way, scoring the goals other strikers wouldn't?

Well, no it isn't – Fulham have scored 64 goals this season against an aggregate xG of 65.98, so offensively they have slightly underperformed, by about 2 goals.

The chart below shows our cumulative expected against actual goals through the season. You can see the two numbers have been closely entwined all year.


FFC Cumulative xg by Chris Frank, on Flickr

If we want to understand the secret to Fulham's success (at least relative to the xG model) we need to look at the other end of the pitch. It's goals conceded, not scored, where the overperformance lives.

Navigating the xGc

As mentioned above, Fulham have averaged an expected goals conceded (xGc) of 1.3 per game. This means that, given the number and quality of chances given up, Fulham should concede 1.3 goals a game on average, or 59.83 goals in total across the season.

But the reality is that, rather than give up 60-odd goals this year, Fulham actually only conceded 48, an average of only 1.04 goals per game and nearly 20% fewer than expected.

More simply put, for every 5 goals the xGc statistics say we should concede, we only actually concede 4. For a team like Fulham, with 15 league wins this season by a single-goal margin, this overperformance is likely to be material in terms of points accumulation and league table position.
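The "4 in every 5" figure can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, using the season totals quoted above and assuming 46 league games; the names are illustrative.

```python
# How far below the xGc model did Fulham's actual goals conceded come in?
# All figures are the season totals quoted in the post; names are illustrative.
GAMES = 46

expected_conceded = 59.83   # aggregate xGc over the season
actual_conceded = 48        # goals actually conceded

conceded_per_game = actual_conceded / GAMES
shortfall = expected_conceded - actual_conceded
shortfall_pct = 100 * shortfall / expected_conceded
ratio = actual_conceded / expected_conceded

print(f"Actual conceded per game: {conceded_per_game:.2f}")
print(f"Goals 'missing' vs model: {shortfall:.2f}")
print(f"Share below expectation:  {shortfall_pct:.1f}%")
print(f"Actual / expected:        {ratio:.2f}")
```

A ratio of about 0.80 is where "concede 4 for every 5 expected" comes from.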

Is it unusual to overperform to this extent? Well, looking at the Championship, Swansea demonstrate similar xGc overperformance and Nott'm Forest are not far behind, but it does seem that this overperformance is large compared to most other teams.

However, this level of xGc overperformance is much more commonplace in the Premier League – indeed 8 teams in the competition (from highest overperformance down: Sheff Utd, Liverpool, Spurs, Arsenal, Man Utd, Brighton, Palace, Leicester) achieved xGc overperformance better than Fulham & Swansea.

To dig into this further, below is the chart of the cumulative expected goals conceded and actual goals conceded across the season for Fulham – unlike the goals scored chart, there is a clear and widening gap through the season!

FFC Cumulative xgc by Chris Frank, on Flickr

So why has this happened? What does this overperformance in xGc mean and how do we explain it?

Well there are generally 3 possible explanations for this kind of anomaly:

1)   Fulham have simply been very lucky this year and their goal has led a charmed life. But unless some sort of voodoo is involved, this kind of luck is unlikely to be sustainable.

2)   When teams shoot against Fulham, they tend to take shots which are of a consistently lower quality than we would expect based on the position and nature of the opportunity they get. Perhaps opponents are tired from chasing the ball during Fulham's long periods of possession? Perhaps Fulham defenders put opponents under more pressure when they get into good shooting positions?

3)   When teams take good shots against Fulham, our goalkeeper is just much better than those of other teams, and simply saves more of the shots that he faces.

Or it could be some combination of all 3 of the above... Let's investigate.

Is it Luck?

This is a hard question to answer. We can do a simple statistical test (a two-tailed t-test) to see if the data demonstrates some bias, i.e. is it random or is some other factor driving the team to overperform? This test shows that the overperformance data does not exhibit characteristics of randomness, but with a sample size of only 46 matches we can only get to a 95% level of confidence. That might sound high, but it also means that in a league of 24 teams, we would expect roughly one team (24 × 5% = 1.2) to show a deviation as large as Fulham's by chance alone.

So it could be that Fulham have been that lucky team this year!
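The "one lucky team" point is a multiple-comparisons argument, and a short sketch makes the numbers concrete. The assumptions here are not from the data itself: each team is treated as independent, and each is given a 5% chance of clearing a 95% confidence threshold through luck alone.

```python
# Multiple-comparisons arithmetic behind the "lucky team" point.
# Assumptions (not from the post's data): each team's result is independent,
# and each has a 5% chance of passing a 95% confidence test by luck alone.
TEAMS = 24
ALPHA = 0.05

expected_flukes = TEAMS * ALPHA
p_at_least_one = 1 - (1 - ALPHA) ** TEAMS

print(f"Expected teams flagged by chance:    {expected_flukes:.1f}")
print(f"Chance at least one team is flagged: {p_at_least_one:.0%}")
```

Because the test is two-tailed, that 5% covers unusually bad seasons as well as unusually good ones, so the chance of a fluke in the overperforming direction specifically would be roughly half as large.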

Is it the defence or the way we play?

The question to consider here is whether Fulham's defence or playing style is doing something to reduce the quality of shots so that the probability of opponents scoring is lower than we would expect for the situation / position they are shooting from. This typically requires use of a 'post-shot' xG model which measures the quality of the actual shot produced in each goal scoring opportunity, rather than the quality of the situation. We could then see if teams are taking worse shots than we would expect. Unfortunately, such data is kept behind paywalls so I don't have access to it!

We can see from the public data that Fulham's defensive stats generally look unremarkable. Being 13th in the league for proportion of shots on target suggests that if Fulham are doing something to worsen opponents' shooting, it doesn't extend to making them miss the goal entirely!


So could it be goalkeepers?

The chart below tracks the growth of the xGc overperformance through the season, it measures the cumulative number of 'missing' conceded goals by Fulham this season.

Fulhamxg by Chris Frank, on Flickr

The chart clearly shows that we did not overperform our xGc from the start of the year, we started to do it around game 16 (marked by the green line). As you may have gathered from the title of the post, this also coincides with Marek Rodak taking the gloves and becoming Fulham's starting keeper (actually there was a slightly bumpy introduction of Rodak into the side, he came in game 13, got sent off in 14, was suspended for 15 then properly took over from 16, but the point stands)!

The xGc overperformance starts almost immediately with Rodak coming into the side, then grows pretty steadily and consistently through the season (there are a couple of setbacks: in games 38 & 39 when Brentford and Leeds scored 5 against us after lockdown off of quite low xG and in game 23 when we conceded 3 at Luton).

The plot thickens...

So is this the story of our season? We were basically mediocre all year and then Rodak came in and made us great (well done TK for locking in a long-term contract btw)? Well, maybe, but the story is not quite that simple...

You see, when Rodak took over in goal, Fulham did not actually start conceding fewer goals (wait, what? Didn't you just say...)!

No, what actually happened is the number of expected goals against us (the xGc) increased massively, so when Rodak came in our defensive performances got worse, a lot worse. It was in xGc terms, comfortably our worst run of form of the season.

In fact our xGc increased from 1.15 over the 13 games with Betts starting in goal to a relatively whopping 1.6 over Rodak's first 13 games. How bad is this form? Well, based on the Infogol data, only three teams (Charlton, Bristol City & Luton) produced such a high xGc across the season.

But this phase of the season is also where the overperformance comes in, because despite this high xGc, Rodak conceded the same number of goals (15) in his 13 starts in this period as Betts did in his 13 starts earlier in the season. Indeed, Rodak's save percentage over his first 13 appearances was 74%, compared to 57.1% for Bettinelli over his first (and only) 13 appearances. (This is actually quite a low save percentage, but it should be noted that Betts matched his expected goals conceded, which tells us he faced some tough shots. Indeed, just 2 more saves, and so 2 fewer goals conceded, would have lifted Betts' save % to a more typical Championship level.)
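The save percentages can be turned back into approximate shot counts. The post does not give the number of shots on target each keeper faced, so the sketch below infers them from goals conceded and save percentage; those inferred counts are an assumption, not Infogol data, and the function names are hypothetical.

```python
# Back out approximate shots on target faced from save % and goals conceded.
# Assumption: save % = saves / shots on target, and every shot on target is
# either saved or a goal. The shot counts are inferred, not taken from Infogol.

def shots_on_target(goals_conceded: int, save_pct: float) -> int:
    """Nearest whole number of shots on target implied by the save %."""
    return round(goals_conceded / (1 - save_pct / 100))

def save_percentage(goals_conceded: int, shots: int) -> float:
    """Save percentage given goals conceded and shots on target faced."""
    return 100 * (shots - goals_conceded) / shots

betts_shots = shots_on_target(15, 57.1)   # Bettinelli: 15 conceded, 57.1%
rodak_shots = shots_on_target(15, 74.0)   # Rodak: 15 conceded, 74%

# "Just 2 more saves" for Betts means 2 fewer goals from the same shots.
betts_with_two_more_saves = save_percentage(15 - 2, betts_shots)

print(f"Betts faced about {betts_shots} shots on target")
print(f"Rodak faced about {rodak_shots} shots on target")
print(f"Betts with 2 more saves: {betts_with_two_more_saves:.1f}%")
```

On these inferred numbers Rodak faced far more shots on target over his 13 games than Betts did over his, which fits the jump in xGc from 1.15 to 1.6 per game.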

So, when Rodak came in, we gave up more and better chances (I have no explanation for why we got so much worse during this phase of the season) and our offensive stats stayed the same, but for whatever reason these extra chances given up did not lead to more goals conceded, so results remained stable during this period.

The next major event to impact Fulham's defence was the inclusion of Hector in the starting line-up. Following his introduction and for the remainder of the season (20 games) our xGc came back down to 1.22, close to what it was at the start with Betts in goal, but, crucially, the xGc overperformance continued, with average actual goals conceded falling to just 0.9 per game. I suspect the fall in xG at both ends of the pitch reflects Fulham's developing game management under Parker, where we were able to get a lead and then suffocate the games, limiting both xG and xGc.
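As a rough consistency check, the three phases described above can be added back up to see whether they reproduce the season totals. The per-game figures are as quoted; the clean 13 + 13 + 20 split and the phase labels are a simplified reading of the post (Rodak's bumpy introduction around games 13 to 16 is ignored), so treat this as a sketch.

```python
# Do the three phases described in the post add up to the season totals?
# Per-game figures are as quoted; the phase lengths (13 + 13 + 20 = 46) and
# the labels are a simplified reading of the post, so this is a rough check.
phases = [
    # (label, games, xGc per game, actual conceded per game)
    ("Betts in goal",        13, 1.15, 15 / 13),
    ("Rodak, before Hector", 13, 1.60, 15 / 13),
    ("Rodak with Hector",    20, 1.22, 0.90),
]

total_games = sum(games for _, games, _, _ in phases)
total_xgc = sum(games * xgc for _, games, xgc, _ in phases)
total_conceded = sum(games * actual for _, games, _, actual in phases)

for label, games, xgc, actual in phases:
    print(f"{label:<22} expected {games * xgc:5.2f}, actual {games * actual:5.2f}")

print(f"Season: {total_games} games, xGc {total_xgc:.2f}, conceded {total_conceded:.0f}")
print(f"Season xGc per game: {total_xgc / total_games:.2f}")
```

The phases sum to 48 goals conceded and an xGc of about 60, within rounding of the 59.83 season figure, so the phase-by-phase numbers hang together with the totals.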

Another relevant piece of data – during his 18/19 loan with Rotherham, Rodak actually underperformed his xGc data, conceding 83 goals from an xGc of 75.7 (despite impressing Millers fans during the year). Again this is using pre-shot data so we cannot be sure how much of this over/under performance against xG is explained by shot quality.

Summary and the Eye Test

In trying to answer the homework question as to whether Rodak has been a key driver of Fulham's success this year, I offer one more chart.

It shows for each team, the goalkeeper save percentage on the x axis and the average xG per shot conceded on the y axis (this can be seen as a measure of the average quality of chance given up by each team).

As you might expect there is a clear inverse relationship between the quality of shot given up and the save percentage of the keepers.

save vs quality by Chris Frank, on Flickr

I have shown this by adding a trend line, and in general teams / goalkeepers to the right of the line have performed better than average (given the quality of chances conceded) and those to the left of the line have performed worse than average.


For Fulham, I have split out the two goalkeepers used, so Betts and Rodak are shown separately. You can clearly see that Rodak's save %, approaching 75%, betters that of any team, but you can also see that the quality of chances conceded is towards the higher end of the range. Rodak's peers in terms of the chance quality faced would be Luton, Derby, Hull and of course Betts. He significantly outperforms those peers.

We cannot discount the role of luck in these performance stats and the lack of post-shot xG data is, very much, a limiting factor. But whether due to luck or individual brilliance, the numbers do suggest that with Rodak in goal, Fulham concede a lot fewer goals than they should given chances conceded.

But what do our eyes tell us about Rodak, does he look like he is having this kind of positive effect on the team?

Well, as a goalkeeper myself (although to no standard of note), I would say that being a good goalkeeper is less about making the spectacular 'highlight reel' stops (although Rodak has a few of those) and more about consistency. It's about not getting beaten by a mediocre shot, ever. If it's savable, you save it – always. This is how Rodak has appeared to me: I can only recall a small number of goals conceded where, on reflection, Rodak should have done better. How many goals go 'through' Rodak? I would argue it's almost none.

So by way of a conclusion I will offer this – I think the 'eye test' supports the data in positioning Rodak as a keeper who exceeds the Championship standard, and I think Fulham owe a great many points this season to the consistency of the man between the sticks – simply doing his job, consistently, without fuss, with minimal error.

For Parkerball to work, with an emphasis on game management and defending narrow leads, we need to not give away silly unnecessary goals, and I believe Rodak is perfect for such an approach.

So in short, I think Rodak has been Fulham's player of the season!

I hope you found that interesting! I started looking into this a little while ago and this forum looked like the best place to put it, sorry if too long and boring!





HobGoblin

An excellent post, enjoyed the read, and the bottom graph shows, to me anyhow, how much of an improvement Rodak has brought to the overall defence. That is not to say Betts couldn't have improved his % also.

The addition of Hector: does that show in the second graph as a quite clear improvement in the number of goals conceded? Forgive me, I don't recall the game number he started in.

Craven_Chris

Quote from: HobGoblin on August 08, 2020, 04:55:36 PM
An excellent post, enjoyed the read, and the bottom graph shows, to me anyhow, how much of an improvement Rodak has brought to the overall defence. That is not to say Betts couldn't have improved his % also.

The addition of Hector: does that show in the second graph as a quite clear improvement in the number of goals conceded? Forgive me, I don't recall the game number he started in.

Thanks, glad you enjoyed it!
It is hard to see the impact of Hector in these charts, he came in about game 27 though. You are right that there was a big change in defensive stats at this time, the average number of expected goals conceded per game fell from 1.6 in the run of games before he started to about 1.2 for the games he played. But it is Rodak, rather than Hector whose presence seems to drive the 'overperformance' against xGc - whereas Hector's presence seems to reduce the xGc stat itself if that makes sense?


Arthur

Thanks Chris.

Like Hobgoblin, I found it a worthwhile and interesting read.

I note the statistical figures were complemented by your own observational analysis in the final conclusion. While it might be perceived that statistics diminish the need for a sharp understanding of the game, your summary suggests they sharpen that understanding further.

I've two questions (which roll into one) - which you may be able to answer: Is the quality of the goalscoring opportunity the same in every situation irrespective of the footballer to whom the chance presents itself? Or do xG and xGc take into account the skill of the player?

For example, as supporters, we may form the view that a goalscoring opportunity that just fell to, say, Knockaert on his right foot (and was not converted) would far more likely have resulted in a goal if the chance had been Mitrovic's to score. Is this factored into the figures or is xG more of a raw score based on a standardised goalscoring continuum (running from 'Worldie' at one end to 'Sitter' at the other)? If it's the latter, this would make the 'Eye Test' even more critical of course.

Sting of the North

Excellent and interesting post! The opposite of boring!

Craven_Chris

Quote from: Arthur on August 08, 2020, 11:15:59 PM
Thanks Chris.

Like Hobgoblin, I found it a worthwhile and interesting read.

I note the statistical figures were complemented by your own observational analysis in the final conclusion. While it might be perceived that statistics diminish the need for a sharp understanding of the game, your summary suggests they sharpen that understanding further.

I've two questions (which roll into one) - which you may be able to answer: Is the quality of the goalscoring opportunity the same in every situation irrespective of the footballer to whom the chance presents itself? Or do xG and xGc take into account the skill of the player?

For example, as supporters, we may form the view that a goalscoring opportunity that just fell to, say, Knockaert on his right foot (and was not converted) would far more likely have resulted in a goal if the chance had been Mitrovic's to score. Is this factored into the figures or is xG more of a raw score based on a standardised goalscoring continuum (running from 'Worldie' at one end to 'Sitter' at the other)? If it's the latter, this would make the 'Eye Test' even more critical of course.

These are good and important questions with xG! My understanding is that the xG generated for each shot does not take into account the skill of the player, but it does take into account whether they were on their 'strong foot' and also takes into account other factors such as whether they were under pressure at the time of the shot. So to give the best example of this, a penalty is generally given an xG of 0.76. It doesn't matter if the taker is Mitro or Knockaert, it's always the same because that is the rate at which professional players convert penalties. So xG is supposed to reflect the quality of the chance, and say nothing about the quality of the individual it falls to.

This means that a good finisher should end up with more goals than their xG (i.e. they are converting more of the chances than the average player would in a similar position) and by contrast a weaker finisher would tend to end up underperforming against their expected goals. I believe for Fulham (I have not run the numbers myself but have seen this on Twitter), Mitro is scoring almost exactly in line with his xG, BDR and Knockaert however are some way below their xG and AK47 is quite far above it. I have not seen the numbers for a while, but I would imagine Kebano is now through the roof in terms of outperforming xG!


Denver Fulham

Rodak played very well. He did most weeks what you need in ParkerBall -- make the one big save a week. I would say that most of those saves were the sprawling/1v1 type saves that are shot right into his frame when he makes himself big. I'm concerned that he is not as explosive laterally as he needs to be for the Premier League; the Brentford goal was a decent representation of a save a good Premier League keeper may make but Rodak couldn't get there. The second goal vs City in the FA Cup was another one that comes to mind.

I really like Rodak and he's worthy of the praise for this season, but I wouldn't just hand him the gloves for next season without bringing in a serious veteran competitor.

filham

Interesting and it does support my own feeling that it is Scott's re-organization of our defensive tactics that has won us promotion and that it is now our attack that is in need of the most urgent improvement.

Jim©

Thanks for that, very interesting.
I posted last week about how Rodak was my player of the season for the points per game increase he brought. Betts was terrible unfortunately, lowest save percentage in the division at the time, now the best.
Ppg compared to Mitro, for example, was startlingly higher for Rodak.

FulhamStu

Very interesting, thanks for sharing. Does suggest, as most fans have observed, that we need to put the emphasis on strengthening the defence before the new season. Have you thought about sending this analysis to Tony Khan?

CottagersOnTour

Very interesting read - I think he's been great since coming into the team; even despite the immediate red card!


b+w geezer

Just catching up, I have also found this post most interesting. There have been some worthwhile responses and queries arising too.

Before reading this, I was wondering whether Rodak most deserved 'player of the season.' Seems he does. Comparing performance with 'expected performance' has to be as good a way as any of making that call, if such data are available and compiled sensibly. From the sound of it, as clarified by Chris, they are, but he's right to want to reality check via the evidence of his eyes.

Dougie

Quote from: Craven_Chris on August 09, 2020, 12:24:57 AM
Mitro is scoring almost exactly in line with his xG, BDR and Knockaert however are someway below their xG and AK47 is quite far above it. I have not seen the numbers for a while, but I would imagine Kebano is now through the roof in terms of outperforming xG!

According to Infogol, Mitro has 26 goals against an xG of 38.6.

Knockaert has the most damning conversion rate of 3 goals vs. xG of 10.2.

Neeskens unsurprisingly is the biggest over-delivery, 5 goals vs. xG of 2.08.

I don't think xG should be interpreted on its own though. We had a deficit of -14 goals vs. xG across the regular season, but Leeds' was bigger at -21 goals. There's actually a decent statistical *negative* correlation between xG and conversion rate (r-squared of 0.56). So the higher your xG, the more likely you are to have scored fewer goals than your xG in total across the season.



Or to put it another way, the better a Championship side is at creating goalscoring opportunities, the less efficient it seems to be at converting them. And based on that, we were slightly below where we would expect to be once this relationship is taken into account, whereas Leeds, WBA and Brentford in particular are above. This suggests we were slightly wasteful in front of goal overall, whereas those other three teams despite all scoring fewer goals than xG, were nonetheless efficient.


Arthur

In trying to get to grips with these charts, it seems to me as if the roles of goalkeeper and striker are the only positions that lend themselves to a comparison between actual and expected performances with any degree of reliability.

If, for example, statistics tell us we ought to be conceding 2 goals per game and, over a period of time, we're only letting in 1, it's fair to conclude that, even with some poor finishing by the opposition, there must be a number of unexpected saves by the goalkeeper - to whom the credit for a better-than-expected performance can reasonably be attributed.

Likewise, if we are scoring 2 goals when we ought only to be netting 1, it is not unreasonable to presume that a high-scoring striker may be primarily responsible - though this would be a less confident assertion: the goalkeeper is responsible for 100% of saves (barring the odd 'Hectoresque' goal-line clearance) whereas the team with a prolific striker will still likely have more than 50% of its goals scored by other players.

In terms of goals for and against, defenders and midfielders do not have the same opportunity to influence the better-than-expected performance - their roles impact upon the actual performance; e.g. if we were previously expected to concede 2 goals per game but are now only expected to concede 1, this suggests we have tightened up as a defensive unit; equally, if our expected goals per game rises from 1 to 2, the likelihood is our midfield has become more creative.

At least, this is how I interpret it.

A potential weakness to the use of 'expected goals conceded' I am wondering about is the distortion between a goalkeeper who is good at clearing and/or catching crosses and one who stays on his line. A goalkeeper who 'stays at home' may find himself making more saves - thus causing diverging orange and blue lines on the graph. But in the eyes of the supporter, of course, a goalkeeper who is able to prevent chances from occurring by commanding his penalty area is more commendable.

Arthur

A thought-provoking contribution, Dougie.

To you or Chris:

May I ask whether either of you can explain the apparent discrepancy between the statistic in the O.P. which says we underscored by 2 goals and the graph above which shows we underscored by 14 - especially as both seemingly come from the same source - Infogol?

The xG statistics for Mitrovic, Knockaert and Kebano are very revealing. A tally of those three players alone has us down on our expected scoring by 16/17 goals. Logically, therefore, this must mean that the combined xG of every other player must be plus 2 or 3 goals.
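The tally above can be checked in a few lines. This sketch uses only the three player figures and the -14 team deficit quoted by Dougie; the dictionary layout and names are illustrative.

```python
# Goals minus xG for the three players Dougie quoted (Infogol player figures).
players = {
    # name: (goals, xG)
    "Mitrovic":  (26, 38.6),
    "Knockaert": (3, 10.2),
    "Kebano":    (5, 2.08),
}

diffs = {name: goals - xg for name, (goals, xg) in players.items()}
three_player_total = sum(diffs.values())

# Team-level deficit quoted alongside the chart (goals minus xG).
team_deficit = -14
rest_of_squad = team_deficit - three_player_total

for name, diff in diffs.items():
    print(f"{name:<10} {diff:+.2f}")
print(f"Three-player total:        {three_player_total:+.2f}")
print(f"Implied for everyone else: {rest_of_squad:+.2f}")
```

This only holds if the player figures and the team deficit come from the same xG model, which is exactly what the rest of the thread goes on to question.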

So is there a case for saying that Mitrovic has actually under-performed by only scoring two-thirds of the goals he should have done from the quality of chances he was given?

Or is it that the negative correlation between xG and conversion rate applies as much to individual players as it does to whole teams, and that a paradox existed towards the end of the season whereby our most prolific striker was actually less likely to score the next chance than if it fell to any one of a number of other players (but not Knockaert, clearly)? Hmmm. Without knowing in which matches Mitrovic failed to achieve his expected goals return, it seems difficult to assess.

Another interesting dimension to the graph is it tells us we should, statistically at least, have scored one more goal than Brentford. I'll need to think further about this too.

But the Mitrovic xG figure throws up all sorts of questions.

Craven_Chris

Quote from: Arthur on August 09, 2020, 04:28:16 PM
A thought-provoking contribution, Dougie.

To you or Chris:

May I ask whether either of you can explain the apparent discrepancy between the statistic in the O.P. which says we underscored by 2 goals and the graph above which shows we underscored by 14 - especially as both seemingly come from the same source - Infogol?

The xG statistics for Mitrovic, Knockaert and Kebano are very revealing. A tally of those three players alone has us down on our expected scoring by 16/17 goals. Logically, therefore, this must mean that the combined xG of every other player must be plus 2 or 3 goals.

So is there a case for saying that Mitrovic has actually under-performed by only scoring two-thirds of the goals he should have done on the quality of the chances he was given?

Or is it that the negative correlation between xG and conversion rate applies as much to individual players as it does to whole teams, and that a paradox existed towards the end of the season whereby our most prolific striker was actually less likely to score the next chance than if it fell to any one of a number of other players (but not Knockaert, clearly)? Hmmm. Without knowing in which matches Mitrovic failed to achieve his expected goals return, it seems difficult to assess.

Another interesting dimension to the graph is it tells us we should, statistically at least, have scored one more goal than Brentford. I'll need to think further about this too.

But the Mitrovic xG figure throws up all sorts of questions.

I actually messaged Infogol recently to ask this very question: something is not right in the Fulham players' individual xG numbers on Infogol (or my own understanding of what they mean). In particular, the aggregate of the Fulham players' individual xG exceeds the team's total xG! At a team level, Infogol shows Fulham as having an xG of 66 (see here: https://www.infogol.net/en/leagues/english-football-league-championship-table-2019-20/151 ) but the sum of the players' xG comes up with a higher number (it looks close to 80 on Dougie's very interesting chart).

I have done a check previously, by adding up Infogol's xG numbers for Fulham in each individual match, and this matches the team number (so 66 for Fulham). So I am confident in the team aggregate position, which seems to contradict the player-level numbers. I have also seen other charts showing Mitro meeting xG. There is no obvious answer to this that I have seen in the FAQs, but as it's come up here, I will do some more digging!

Anyway lots of interesting observations in the replies, I have some time later and will try to answer some more of the questions, glad there is so much interest!



Dougie

Ah, so I used https://footystats.org/england/championship/xg for the per-club xG rather than the Infogol one, though I used Infogol for the individual player stats.

There's definitely something weird going on. Infogol gives Fulham xG at 66 but the sum of the individual players in our squad via the same site is 100.57xG. And Footystats gives us 1.71xG which is 78.66 xG over 46 games.

So ¯\_(ツ)_/¯

Craven_Chris

Quote from: Dougie on August 09, 2020, 05:17:34 PM
Ah, so I used https://footystats.org/england/championship/xg for the per-club xG rather than the Infogol one, though I used Infogol for the individual player stats.

There's definitely something weird going on. Infogol gives Fulham xG at 66 but the sum of the individual players in our squad via the same site is 100.57xG. And Footystats gives us 1.71xG which is 78.66 xG over 46 games.

So ¯\_(ツ)_/¯

There is definitely a stats mystery here!

I still think Infogol has a problem with their player xG stats. I did a test where I added up the individual player xG for 3 teams and compared it to the team xG. Liverpool came out at 76 (sum of individuals) vs 78 (team), so a pretty close match; Leeds had 116 (sum of individuals) versus 87 (team); and Fulham had 101 (sum of individuals) versus 66 (team). So maybe it is a problem with their Championship data.
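To put a number on how far apart the two views are, this sketch takes the rounded figures from the three-team test and expresses the summed player xG as a ratio of the team xG. The names are illustrative and nothing here goes beyond those six numbers.

```python
# Ratio of summed player xG to team xG, using the rounded figures in this post.
teams = {
    # team: (sum of individual player xG, team xG)
    "Liverpool": (76, 78),
    "Leeds":     (116, 87),
    "Fulham":    (101, 66),
}

ratios = {team: players / team_xg for team, (players, team_xg) in teams.items()}

for team, ratio in ratios.items():
    print(f"{team:<10} players/team = {ratio:.2f}")
```

A ratio near 1 means the player and team numbers agree. Liverpool sits close to 1, while the two Championship sides come out at roughly a third and a half too high respectively.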

Looking at the players, there are a few with very high individual xG - Mitro and Bamford in particular well into the 30s which just does not seem plausible to me.

As for why Footystats does not match Infogol - perhaps they have a different xG model?!?

But... I think the team level xG data on Infogol is fairly robust (and it is this that drives the OP). Reason is that on Infogol you can see the xG on each individual shot in the game shot maps, then see that this data aggregates to the team xG for the game and that this, in turn, aggregates to the season level xG for the team!

So while there is some clear data weirdness, I think the observations on Rodak are unimpacted!

Dougie, I was interested in the chart showing reduced xG conversion the higher the xG is. It may be of interest: I tried to replicate it using Infogol's data instead of Footystats - I would say the correlation looks weaker here (which may give a clue that there is a problem in the Footystats numbers / model, i.e. if it consistently overestimates xG, either through model or data error, then you would expect the correlation that you observed)?

It's all very strange!

overunder by Chris Frank, on Flickr