The Rodak to the Premier League (long read about xG warning)

Started by Craven_Chris, August 08, 2020, 04:38:22 PM


Craven_Chris

Quote from: Arthur on August 09, 2020, 03:38:23 PM
In trying to get to grips with these charts, it seems to me as if the roles of goalkeeper and striker are the only positions that lend themselves to a comparison between actual and expected performances with any degree of reliability.

If, for example, statistics tell us we ought to be conceding 2 goals per game and, over a period of time, we're only letting in 1, it's fair to conclude that, even with some poor finishing by the opposition, there must be a number of unexpected saves by the goalkeeper - to whom the credit for a better-than-expected performance can reasonably be attributed.

Likewise, if we are scoring 2 goals when we ought only to be netting 1, it is not unreasonable to presume that a high-scoring striker may be primarily responsible - though this would be a less confident assertion: the goalkeeper is responsible for 100% of saves (barring the odd 'Hectoresque' goal-line clearance) whereas the team with a prolific striker will still likely have more than 50% of its goals scored by other players.

In terms of goals for and against, defenders and midfielders do not have the same opportunity to influence the better-than-expected performance - their roles impact upon the actual performance; e.g. if we were previously expected to concede 2 goals per game but are now only expected to concede 1, this suggests we have tightened up as a defensive unit; equally, if our expected goals per game rises from 1 to 2, the likelihood is our midfield has become more creative.

At least, this is how I interpret it.

A potential weakness to the use of 'expected goals conceded' I am wondering about is the distortion between a goalkeeper who is good at clearing and/or catching crosses and one who stays on his line. A goalkeeper who 'stays at home' may find himself making more saves - thus causing diverging orange and blue lines on the graph. But in the eyes of the supporter, of course, a goalkeeper who is able to prevent chances from occurring by commanding his penalty area is more commendable.

I think these points are entirely valid. On the last point in particular, it should be noted that although Rodak started making lots of saves when he came in, the other thing that happened was Fulham started giving up lots of shots. In fact, in terms of fewest shots on goal conceded, the Bettinelli games remain Fulham's best of the season (even post Hector).

There are too many variables to attribute that to simply Betts being better on crosses (not obvious to my eye that this is the case, I would say the opposite) or better at organising the defense. For example, when Rodak came in, we started being more direct in playing out of defense, and our possession stats plummeted too (we averaged 63% possession with Betts in goal vs 54% with Rodak).  But I think that must be more down to tactical decisions by Parker rather than Rodak.
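
As a rough illustration of Arthur's comparison between actual and expected goals conceded, here is a minimal Python sketch. The function name and the ten-game sample are assumptions for illustration; the 2-per-game and 1-per-game figures are the hypothetical ones from the quoted post, not real Fulham data.

```python
def goals_prevented_per_game(expected_conceded: float,
                             actual_conceded: float,
                             games: int) -> float:
    """Return expected minus actual goals conceded, per game.

    A positive value means the team conceded fewer goals than the
    xG model expected over the sample.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    return (expected_conceded - actual_conceded) / games


# Hypothetical sample: 10 games, expected to concede 2 per game (20 total),
# actually conceding 1 per game (10 total).
print(goals_prevented_per_game(20.0, 10.0, 10))  # 1.0
```

As the posts above point out, a positive number on its own does not say whether the credit belongs to the keeper, to poor finishing by opponents, or to luck.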

Craven_Chris

Quote from: FulhamStu on August 09, 2020, 11:13:05 AM
Very interesting, thanks for sharing. Does suggest, as most fans have observed, we need to put the emphasis on strengthening the defence before the new season. Have you thought about sending this analysis to Tony Khan?

A very kind suggestion, although I would hope the Fulham analytics team are way ahead of this (and they buy the premium data too) - but I am thinking of starting a blog on this sort of thing...

Craven_Chris

Quote from: Jim© on August 09, 2020, 10:50:18 AM
Thanks for that, very interesting.
I posted last week about how Rodak was my player of the season for the points per game increase he brought. Betts was terrible unfortunately, lowest save percentage in the division at the time, now the best.
Ppg compared to Mitro, for example, was startlingly higher for Rodak.

Thanks, glad you liked it. Just in defense of Betts though, although you are correct that his save percentage this season (57.1%) is very low, it should be noted that he did not underperform against xG. Part of this seems to be that the proportion of shots on target against him was very low (only about 25% of the shots he faced were on target), so he seems to have benefitted from wayward shooting (or, being more charitable, he made opponents miss with excellent positioning?).

Also he suffers from a small sample size - he only faced 35 shots on target: 15 went in, 20 were saved. But just save a couple more and it's 13 and 22 - which gives a much more normal save percentage of 63%.
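
The arithmetic behind those two save percentages can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and the inputs are the 35 shots on target, 20 saves and "two more saves" scenario mentioned above.

```python
def save_percentage(saves: int, shots_on_target: int) -> float:
    """Saves as a percentage of shots on target faced."""
    if shots_on_target <= 0:
        raise ValueError("shots_on_target must be positive")
    return 100.0 * saves / shots_on_target


actual = save_percentage(20, 35)         # 15 conceded, 20 saved
with_two_more = save_percentage(22, 35)  # 13 conceded, 22 saved

print(f"{actual:.1f}%")         # 57.1%
print(f"{with_two_more:.1f}%")  # 62.9%
```

With only 35 shots on target in the sample, each extra save moves the figure by nearly three percentage points, which is why two saves are enough to go from 57.1% to roughly 63%.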


Dougie

Quote from: Craven_Chris on August 09, 2020, 09:30:52 PM
Dougie, I was interested in the chart showing reduced xG conversion the higher the xG is. It may be of interest: I tried to replicate it using Infogol's data instead of Footystats - I would say the correlation looks weaker here (which may give a clue that there is a problem in the Footystats numbers / model, i.e. if it consistently overestimates xG, either through model or data error, then you would expect the correlation that you observed)?

Yeah I've reached a similar conclusion. I also found a third site that had a different xG than the other two! Though it was closer to Infogol than the site I used so the site I used is probably the dodgiest data and can be disregarded.

No correlation in the Championship, but you start to see a weak positive correlation between goals scored and goals minus xG in the Premiership, which makes sense - the top teams don't only create the most chances, but they also employ more lethal attackers.
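
For anyone wanting to run the same check on their own numbers, here is a minimal sketch of a Pearson correlation between goals scored and goals minus xG. The five-team table is invented purely for illustration (it is not Infogol or Footystats data), and the function name is an assumption.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical season totals for five teams (made-up numbers).
goals = [85, 70, 60, 50, 40]
xg = [75.0, 68.0, 61.0, 52.0, 44.0]
goals_minus_xg = [g - x for g, x in zip(goals, xg)]

print(round(pearson(goals, goals_minus_xg), 2))
```

A value near 0 would match the "no correlation" reading for the Championship, while a clearly positive value would match the Premiership pattern described above. With only around twenty teams per league, the estimate is noisy either way.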

HamsterWheel

Great post. I think the XG stuff is massively overrated. Apparently Liverpool shouldn't have won the Prem based on XG so it's clearly a long way from being perfect enough to treat as Gospel.
https://www.theguardian.com/football/2020/aug/09/liverpool-xg-jurgen-klopp

Craven_Chris

Quote from: HamsterWheel on August 10, 2020, 12:14:33 PM
Great post. I think the XG stuff is massively overrated. Apparently Liverpool shouldn't have won the Prem based on XG so it's clearly a long way from being perfect enough to treat as Gospel.
https://www.theguardian.com/football/2020/aug/09/liverpool-xg-jurgen-klopp

Thanks - it's a valid point and there are all sorts of reasons why xG models might get things wrong (I like to think about good crosses - if someone plays a great ball through the 6 yard box, but no striker gets on the end of it, then no shot is taken and no xG is recorded, but at the point the player was hitting the cross, there must have been a reasonable probability of a goal that is missed by the model).

The case of Liverpool in the article is pretty much the case being made in the OP: the xG model has got it wrong because it says Liverpool should not have won the league, and it also says Fulham probably should not have made the playoffs (or at least should have conceded a lot more goals than they did). But in reality both of those things did happen, so I find it interesting to try and dig into why the models got it wrong: is it luck, or something those teams are doing that the model misses (such as having a brilliant keeper)?
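
The cross example can be put in code. This sketch assumes the simple shot-based kind of xG described in this thread, where a match total is the sum of the scoring probabilities assigned to each shot taken. The function name and the probabilities are made up for illustration.

```python
def match_xg(shot_probabilities):
    """Sum per-shot scoring probabilities into a match xG total."""
    for p in shot_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each shot probability must be between 0 and 1")
    return sum(shot_probabilities)


# Hypothetical match: two shots taken, with made-up probabilities.
shots_taken = [0.5, 0.25]

# A dangerous cross that nobody reaches produces no shot, so it adds
# nothing to the list and nothing to the total, however good the chance
# looked when the ball was struck.
print(match_xg(shots_taken))  # 0.75
```

That is the blind spot: a shot-based total can only count what ended in a shot.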


toshes mate

An interesting OP and certainly worthy of the read but, before I could comment, I admit I had to go away and find out a little bit about the history of the xG tag or symbol.   This would also, hopefully, reveal all the benefits and snags of this kind of data.   My research showed xG to have pure betting origins, and is basically a collection of outcomes of similar chances to score, from similar parts of the pitch, with differing difficulty, but with no reference to the skill level of the player(s) involved.  It is, in other words, a measure of chance in a random incident divided between scored or didn't score (which is a strictly fifty-fifty chance outcome like a coin toss) offering a variance to an expected norm without defining what the expected norm really is.    In using this information in betting the user is warned that there is immense variation in data output between the many companies generating this data.


The data assumes a league standard player in every instance, and so does xG increase when a side increases its converted chances and decrease when it doesn't, and when might this be important? Well, I can see its usefulness when you have a side creating more than ten chances a game but never converting better than a 1 in 25 ratio – they are not going to appear to score many goals per game. Likewise, does xG bounce up and down because a side facing this team with a staggeringly poor conversion ratio (and giving them twenty free shots on goal) suddenly has the best team Gc ratio in Christendom?
   

I do not know, and neither can I find the answer to that question, and that has left me scratching my head in frustration... and I don't think it's a reliable way of finding things that are possibly not even intended to be measured by it. One of the better articles I consulted did have a useful footnote of information about someone who could help me with my gambling addiction, I was pleased to note ...