Data in Football


Been enjoying the data stuff put out by Andy and Joe.

Makes interesting reading.

I'm a big fan and follower of baseball, which has been the forerunner of this data-driven approach for nearly 20 years now, and I have seen how it has impacted the game over there, for the players, the front offices (Director of Football/Scouting etc. in football terms) and the general fan.

Its impact will be felt forever more in baseball terms, and whilst not wholly embraced by the supporters, it is now the standard in the sport and has led to a generation of players who play to the 'Moneyball' standards.

Ironically, Billy Beane and the Oakland A's (the team shown to first use the new approach in the film Moneyball (well worth a watch)) haven't actually won the World Series since the 1980s. And their statistical approach has been copied and pasted throughout the league... it's easier to run a Moneyball operation when you have the resources/attache/cash to back up the data!

That said, there has been pushback against the data from large swathes of the supporter base and ex-players/pundits alike, notably Hall of Fame pitcher John Smoltz.

I do often wonder at what point the general fan gets put off by the data-driven approach and the endless stream of statistics that are shown, particularly niche stats like xG?

At what point does the general fan have to acknowledge that statistics are here to stay, while still counterbalancing that with the more traditional 'eye test'?

Also, baseball has seen a rise in general managers coming from the top universities, where they have studied economics, business studies or finance and financial analysis. I wonder how long it may be before we see one of the big clubs (or small clubs for that matter) hire an Oxbridge/London School of Economics graduate to run the data department at a football club?

I'm sure this has probably already happened!

The next 25 years of football will undoubtedly see more stats, more interest in the stats and more new fans approving of new methods to find players and to run their teams.......The Old School vs. New School battle is just beginning. 

Personally, I think the stats data will win through and just as we look back at the days of drinking/poor diets/poor training regimes, we'll wonder how clubs used to send a middle aged bloke to Colchester v Scunthorpe on a wet Tuesday in February... to watch one game of a left back, who looked decent on the night and was then signed for x amount of pounds.... Maybe scouts will be replaced by data analysts? who knows....

 

 

Edited by rog of the rovers

1 hour ago, rog of the rovers said:

Personally, I think the stats data will win through and just as we look back at the days of drinking/poor diets/poor training regimes, we'll wonder how clubs used to send a middle aged bloke to Colchester v Scunthorpe on a wet Tuesday in February... to watch one game of a left back, who looked decent on the night and was then signed for x amount of pounds.... Maybe scouts will be replaced by data analysts? who knows....

I think the point from me is that they already have. The debate supporters and pundits are having surrounding data is futile because football clubs, staff, players, leagues & bodies are all now using data in every single aspect of what they do. 

- Rovers watch their training sessions back on drone footage
- We wear tech sports bras that track running and vitals
- The club use intricate data systems to recruit players
- We use MNF style video interactive boards to post-analyse matches in booths
- Managers use iPads at the touchline to scroll back through key moments in games, coded by data analysts
- Football pundits and TV shows use statistics and data in almost every aspect of production
- Players are introduced to data science and its capabilities at incredibly young ages
- Scouts are even starting to fill out reports on iPads rather than writing with pen and paper
- Goal Line Technology & Video Assistant Referees (VAR) are made possible by modern technologies

The data revolution has already finished, and we're now as supporters having it out. The pushback on the whole is understandable, change is always going to cause debate and that's okay. I think there's definitely an issue of misunderstanding on some of the data science that leads to quickly drawn conclusions - unaided by pre-dispositions against data in the first place. I like to think that scouts won't ever be replaced, I think watching a player still matters, but I also think that the usage of data within our game and within our club is being underplayed.

Edited by JoeH

53 minutes ago, JoeH said:

I think the point from me is that they already have. The debate supporters and pundits are having surrounding data is futile because football clubs, staff, players, leagues & bodies are all now using data in every single aspect of what they do. 

- Rovers watch their training sessions back on drone footage
- We wear tech sports bras that track running and vitals
- The club use intricate data systems to recruit players
- We use MNF style video interactive boards to post-analyse matches in booths
- Managers use iPads at the touchline to scroll back through key moments in games, coded by data analysts
- Football pundits and TV shows use statistics and data in almost every aspect of production
- Players are introduced to data science and its capabilities at incredibly young ages
- Scouts are even starting to fill out reports on iPads rather than writing with pen and paper
- Goal Line Technology & Video Assistant Referees (VAR) are made possible by modern technologies

The data revolution has already finished, and we're now as supporters having it out. The pushback on the whole is understandable, change is always going to cause debate and that's okay. I think there's definitely an issue of misunderstanding on some of the data science that leads to quickly drawn conclusions - unaided by pre-dispositions against data in the first place. I like to think that scouts won't ever be replaced, I think watching a player still matters, but I also think that the usage of data within our game and within our club is being underplayed.

Would a lot of that fall into the category of technology rather than data?

Either way I don’t disagree with much of what you are saying, and of course the data gathering is reliant on tech. Every facet of our lives is heading more and more in that direction. How many times during this pandemic have we been told decisions are being made due to “the data”.

In terms of both tech and data, I’ve seen first hand that its use is increasing at a very quick rate at the bottom end of the pro game. It is definitely here to stay, and definitely has a place, but I worry that the focus on it is getting towards being too heavy. It’s certainly in vogue at present and that is reflected in the UEFA coaching licences.

From a purely personal point of view, the increased reliance on data analysis has coincided with the decrease in my overall enjoyment of the game. It’s one of several factors though.

Edited by Miller11

58 minutes ago, JoeH said:

I think the point from me is that they already have. The debate supporters and pundits are having surrounding data is futile because football clubs, staff, players, leagues & bodies are all now using data in every single aspect of what they do. 

- Rovers watch their training sessions back on drone footage
- We wear tech sports bras that track running and vitals
- The club use intricate data systems to recruit players
- We use MNF style video interactive boards to post-analyse matches in booths
- Managers use iPads at the touchline to scroll back through key moments in games, coded by data analysts
- Football pundits and TV shows use statistics and data in almost every aspect of production
- Players are introduced to data science and its capabilities at incredibly young ages
- Scouts are even starting to fill out reports on iPads rather than writing with pen and paper
- Goal Line Technology & Video Assistant Referees (VAR) are made possible by modern technologies

The data revolution has already finished, and we're now as supporters having it out. The pushback on the whole is understandable, change is always going to cause debate and that's okay. I think there's definitely an issue of misunderstanding on some of the data science that leads to quickly drawn conclusions - unaided by pre-dispositions against data in the first place. I like to think that scouts won't ever be replaced, I think watching a player still matters, but I also think that the usage of data within our game and within our club is being underplayed.

I think technology is less divisive than data, and the ones that I have highlighted are not really data in the sense that people are talking about it. Watching training and matches back is something that has clearly happened for a while and of course has been aided by further technology; I can't see why anyone would doubt that as being useful. Analysing the opposition, and indeed yourself, to improve and to try different tactical things: you can't argue with that really. The sports bras are a good point, and for me that is a good use of data; essentially you can monitor players to try and keep them as fit as possible and avoid muscular injuries. Not an exact science, but it will be of massive help to physios and fitness staff.

Recruitment is seemingly more of a complex issue, but as I mention below, the data is not the key IMO, I maintain that the main variable (the same data is available to everyone) is the people and always will be. You mention our club, we are clearly sorely lacking in competence in the recruitment team headed by our manager.

My main issue with data is that it is constantly taken out of context to make what people believe are objective conclusions. Reading Andy's document, it was a lengthy piece of work, but for example it said "purely based on his defensive output and ridiculous stats" before going on to recommend signing Trybull. That's a personal opinion and a personal interpretation of data, as is the assumption that Rovers are performing (based on data) at a level far exceeding our league finish. With Trybull, for example, he has played in a team that has a specific "style" of keeping the ball, so his passing stats will be above the norm; the data won't show how he slows everything down and kills any momentum. Similarly, the Not The Top 20 podcast said about Swansea that they can't expect to "run away" from the underlying data for too long, on the back of a second successive play-off push. It's this nonsensical idea that the data is factual when all collated together; that's not true. It also sometimes then leads to patronising/condescending comments: you clearly don't understand the data, etc.


6 minutes ago, roversfan99 said:

It also sometimes then leads to patronising/condescending comments: you clearly don't understand the data, etc.

There are many aspects of Data Science in football which the general supporter base doesn't understand. Sometimes because it's not particularly well known about, but often because most people simply don't want to know. It's not patronising or condescending to point this out. If you know a lot about bricklaying and someone claims that a method used during that process is "bullshit" you're going to question their understanding of it before you actually argue out whether they're right or not. Establishing how much somebody understands about xG models before having a debate about their usage is fair for example, in my opinion anyway, because how can you have an informed discussion about something if somebody in the conversation doesn't understand it.


4 minutes ago, JoeH said:

There are many aspects of Data Science in football which the general supporter base doesn't understand. Sometimes because it's not particularly well known about, but often because most people simply don't want to know. It's not patronising or condescending to point this out. If you know a lot about bricklaying and someone claims that a method used during that process is "bullshit" you're going to question their understanding of it before you actually argue out whether they're right or not. Establishing how much somebody understands about xG models before having a debate about their usage is fair for example, in my opinion anyway, because how can you have an informed discussion about something if somebody in the conversation doesn't understand it.

But then why would it ever be mentioned on this forum in the first place if you or anyone else felt that others lacked the full understanding/knowledge to be able to discuss it? That's not me in any way discouraging talking about it, but I don't think that the bricklaying comparison is a like-for-like one.


Unfortunately, looking at the state of the side, it looks as if the manager doesn’t understand what’s being fed to him either… or he just ignores it. Because there’s surely no data analyst promoting the deployment of SG as a wide man (along with the other nonsensical decisions).

And he certainly does ignore data re potential transfers… unless there’s a ‘connection to Teesside’ parameter on their software?

Edited by Mattyblue

5 hours ago, Mattyblue said:

Unfortunately, looking at the state of the side, it looks as if the manager doesn’t understand what’s being fed to him either… or he just ignores it. Because there’s surely no data analyst promoting the deployment of SG as a wide man (along with the other nonsensical decisions).

And he certainly does ignore data re potential transfers… unless there’s a ‘connection to Teesside’ parameter on their software?

It's like Auto Trader. Tony just logs in:

- ticks 'all positions'
- ticks 'all ages'
- ticks 'show injury-prone players'
- within 20 miles of TS3 6RS


7 hours ago, roversfan99 said:

But then why would it ever be mentioned on this forum in the first place if you or anyone else felt that others lacked the full understanding/knowledge to be able to discuss it? That's not me in any way discouraging talking about it, but I don't think that the bricklaying comparison is a like-for-like one.

It probably isn’t like for like, mainly because it’s quite a unique one. Again, I think it can be discussed, but it’s good to clarify people’s understanding if you’re going to discuss it, and I don’t think it’s fair to label that as condescending or patronising.


8 hours ago, JoeH said:

If you know a lot about bricklaying and someone claims that a method used during that process is "bullshit" you're going to question their understanding of it before you actually argue out whether they're right or not. Establishing how much somebody understands about xG models before having a debate about their usage is fair for example, in my opinion anyway, because how can you have an informed discussion about something if somebody in the conversation doesn't understand it.

There's a lot of scepticism on here about stats cos they are used to justify the performances we have seen (by Mowbray and some Twitterati, not by you).

However if a bricklayer does a job and the end result is clearly shite, you tell him it's shite and he starts telling you why his methods are better, it doesn't matter. Ultimately the job he has done is shite.

That's what's happening at Ewood. And therefore the methods are not trusted, even though it's the bricklayer's fault.

Edited by Hasta

I agree with Hasta. Personally I think with the data it all comes down to how it’s interpreted. The first port of call should be whether the side is successful; a distant second is whether the football is enjoyable. With Rovers, for example, it was a massively unsuccessful season with boring sideways possession football to boot, so if the data then shows why it’s not working and offers ways of improving, then great. However, what I’m seeing is people ignoring the only important issue of success or relative success and clinging onto some obscure stat of why things were actually quite good.


That said, I’ll always be soured on stats because of Steve Kean. Whenever I hear them I’m reminded of needing to win at Spurs away, not registering a single shot on target, then having that prick justify it as decent because of box entries. Fuck box entries, what about goals, Steve?

Edited by matt83

Statistics work better in baseball and American football because each is a series of set pieces (pitches in baseball, downs in American football). It is very linear. Football is very fluid, and the data collection (particularly of data that is actually relevant) is much harder and the interpretation of it harder still. The stats are interesting, but strike me as being too basic atm. The level of detailed data required to properly model things would be huge and would then need someone with a very specialist understanding to interpret it and make it useful. I love a good football stat but most of them don't mean very much in real practical terms, I think.


On 08/06/2021 at 14:06, Mattyblue said:

Unfortunately, looking at the state of the side, it looks as if the manager doesn’t understand what’s being fed to him either… or he just ignores it. Because there’s surely no data analyst promoting the deployment of SG as a wide man (along with the other nonsensical decisions).

And he certainly does ignore data re potential transfers… unless there’s a ‘connection to Teesside’ parameter on their software?

In terms of Rovers' use of data, I think this hits the nail on the head: the problem is, yet again, the man in charge. Feed him all the data you like, but I think that ultimately he picks and chooses.

For instance, I've painstakingly demonstrated that statistically we do far worse when we control the possession than we do when the opposition control it, even if they only control it by a bit. This was also borne out in last season's data. Despite this, the twat-whistle keeps insisting that the key to our future is in controlling the possession. I find it hard, at best, to believe that none of the professional data analysts at the club have noticed this basic piece of data that took me no more than an hour to put together on my own, including the write up, with nothing more than google at my fingertips. It's very unlikely this data has never reached his eyes. He is just choosing to ignore it.

Meanwhile he loves to bang on about the amount of possession we have, so he is clearly paying attention to that part of this data and saying 'ooh look how well we are doing' whilst ignoring the data that shows this possession is counterproductive. And he goes on about data in general, but surely the data isn't saying that Bennett/Gallagher make for a better RB than Nyambe, or that as you say, Gally is a good RW. To justify the latter he probably bums the data about how many headers he wins out wide (I think he has mentioned that before) whilst ignoring the fact the man can't cross effectively, which again I doubt the data shows anything contrary to that.

As I've said before, data is only as valuable as the skills of the person interpreting it. And Mogga doesn't have the first clue about something as modern as this. But the stubborn old goat thinks he does and wants others to believe it, so he openly embraces it, but tosses out anything that doesn't confirm his bizarre biases.

As I've also said before, my suspicion is Harvey left because his data-supported recruitment suggestions were being ignored by Tony in favour of the 'connection to Teesside' parameter.


1 minute ago, bluebruce said:

For instance, I've painstakingly demonstrated that statistically we do far worse when we control the possession than we do when the opposition control it, even if they only control it by a bit

Very true. Huge correlation between positive results and us having less of the ball. 55-65% is the absolute horror zone for us. 68%+ possession and we usually win but I like us at around 45% personally. The 4-0 win at Derby this year we had 36%!


26 minutes ago, JoeH said:

Very true. Huge correlation between positive results and us having less of the ball. 55-65% is the absolute horror zone for us. 68%+ possession and we usually win but I like us at around 45% personally. The 4-0 win at Derby this year we had 36%!

I'm guessing now from that post that you didn't see the thread I made, Joe. I had been expecting you to comment in it considering your focus on data, but if you didn't see it then you wouldn't.

If you're interested, there it is. I will finish it off at some point by updating to include the final 7 games that hadn't been played yet when I did my last update at the bottom of that thread, when I can be arsed. I'd be curious to know what you think (and yes I appreciate there is probably a more accurate source of the possession stats than the match stats that show in Google, but I wouldn't know which is best and that was a convenient one for me to work off...I'm sure it was close enough). I imagine you might have a tool that can just fart this data out at will without the legwork I had to do.

Edited by bluebruce

Thought this was an interesting chart from Ben Mayhew re: time spent in the lead, tied, or losing:

I'm not sure exactly how he's ordered this (2nd chart is the Championship), but we were roughly 6th worst in the league in the amount of time we spent winning vs losing/tied, which I think explains a lot of our 'impressive' possession numbers versus actual results. A lot of time passing the ball around chasing games against teams that learned how to nullify us after getting ahead early (or were happy to sit back and wait for their chance). Need to account for such 'score effects' when looking at possession stats...
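
To make the 'score effects' idea a bit more concrete, here is a rough Python sketch of how time spent winning, level and losing could be tallied for a single match from the minutes at which goals were scored. The function name, the input format and the example goal times are all made up for illustration; this is not how Ben Mayhew builds his charts, and it ignores stoppage time.

```python
def game_state_minutes(goals_for, goals_against, match_length=90):
    """Split a match into minutes spent leading, level and trailing.

    goals_for / goals_against are lists of the minutes at which each
    side scored (hypothetical input format, stoppage time ignored).
    """
    # Merge both sides' goals into one timeline: +1 for us, -1 for them.
    events = sorted(
        [(minute, 1) for minute in goals_for]
        + [(minute, -1) for minute in goals_against]
    )

    minutes = {"leading": 0, "level": 0, "trailing": 0}
    margin = 0       # our goals minus theirs so far
    last_minute = 0  # minute at which the current state began

    def state(diff):
        return "leading" if diff > 0 else "trailing" if diff < 0 else "level"

    for minute, change in events:
        # Credit the time since the previous event to the state we were in.
        minutes[state(margin)] += minute - last_minute
        margin += change
        last_minute = minute

    # Account for the stretch from the final goal to full time.
    minutes[state(margin)] += match_length - last_minute
    return minutes


# Made-up example: concede on 10 minutes, equalise on 80.
print(game_state_minutes(goals_for=[80], goals_against=[10]))
```

In that made-up match the side spends 70 of the 90 minutes behind, which is the sort of game where a team chasing the result can pile up possession without it saying much about control.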

I think a general story for our season is we came out like gangbusters offensively and arguably had a few 'unlucky' results (our xG supports that), but then teams quickly figured out how to bottle us up and nullify our possession (and xG roughly supports that too! I can't find a quick chart of our rolling xG average, but it definitely declined markedly during our Feb-Mar collapse)

While I generally like xG as part of an analytical toolbox, we were clearly a statistical outlier last year (no stat is perfect...). I recall reading that xG is more predictive of future performance than actual goals until about halfway through a season, after which actual goals is more predictive. That's not at all to say xG is useless, it's a better early indicator of a team's 'true talent' or whatever you want to call it, but eventually results are what matter...

Edited by RoverCanada

1 hour ago, bluebruce said:

I'm guessing now from that post that you didn't see the thread I made, Joe. I had been expecting you to comment in it considering your focus on data, but if you didn't see it then you wouldn't.

If you're interested, there it is. I will finish it off at some point by updating to include the final 7 games that hadn't been played yet when I did my last update at the bottom of that thread, when I can be arsed. I'd be curious to know what you think (and yes I appreciate there is probably a more accurate source of the possession stats than the match stats that show in Google, but I wouldn't know which is best and that was a convenient one for me to work off...I'm sure it was close enough). I imagine you might have a tool that can just fart this data out at will without the legwork I had to do.

I built a relational database containing this season's stats which as far as I can tell seems to be accurate (BBC stats for possession). Assuming my statistics are correct and I didn't screw up the indexing somehow, it ends up like below:

45% or Less Possession

          Games  Points  PPG
Home          2       6  3.0
Away          4       8  2.0
Combined      6      14  2.3

46-55% Possession

          Games  Points  PPG
Home          5       8  1.6
Away          6       6  1.0
Combined     11      14  1.3

56-65% Possession

          Games  Points  PPG
Home         10      11  1.1
Away          7       3  0.4
Combined     17      14  0.8

66-75% Possession

          Games  Points  PPG
Home          6       9  1.5
Away          6       6  1.0
Combined     12      15  1.3

So yeah, assuming my calculations are accurate, the more possession we have the worse we get across the board until we reach the 66-75% band, and even that is still marginally below 46-55% possession and some way below 45% or less.
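
For anyone wanting to check the arithmetic, here is a minimal Python sketch that recomputes the points-per-game figures from the games and points totals in the tables above. The band labels and the `ppg` and `summarise` helpers are just names picked for illustration; this is not the relational database itself, only a re-derivation from the aggregated numbers, and it assumes the PPG column was rounded half-up to one decimal place.

```python
from decimal import Decimal, ROUND_HALF_UP

# (games, points) per possession band, copied from the tables above.
BANDS = {
    "45% or less": {"Home": (2, 6), "Away": (4, 8)},
    "46-55%": {"Home": (5, 8), "Away": (6, 6)},
    "56-65%": {"Home": (10, 11), "Away": (7, 3)},
    "66-75%": {"Home": (6, 9), "Away": (6, 6)},
}


def ppg(games, points):
    """Points per game, rounded half-up to one decimal place (assumed rounding)."""
    value = Decimal(points) / Decimal(games)
    return float(value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))


def summarise(bands):
    """Return {band: {venue: (games, points, ppg)}} including a Combined row."""
    summary = {}
    for band, venues in bands.items():
        rows = {}
        total_games = total_points = 0
        for venue, (games, points) in venues.items():
            rows[venue] = (games, points, ppg(games, points))
            total_games += games
            total_points += points
        rows["Combined"] = (total_games, total_points, ppg(total_games, total_points))
        summary[band] = rows
    return summary


if __name__ == "__main__":
    for band, rows in summarise(BANDS).items():
        print(band)
        for venue, (games, points, rate) in rows.items():
            print(f"  {venue:<9}{games:>3}{points:>4}{rate:>5.1f}")
```

The Combined rows come out at 2.3, 1.3, 0.8 and 1.3, matching the tables, and the four bands add up to 46 games, which is a full Championship season.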

Edited by DE.

5 minutes ago, DE. said:

I built a relational database containing this season's stats which as far as I can tell seems to be accurate (BBC stats for possession). Assuming my statistics are correct and I didn't screw up the indexing somehow, it ends up like below:

45% or Less Possession

          Games  Points  PPG
Home          2       6  3.0
Away          4       8  2.0
Combined      6      14  2.3

46-55% Possession

          Games  Points  PPG
Home          5       8  1.6
Away          6       6  1.0
Combined     11      14  1.3

56-65% Possession

          Games  Points  PPG
Home         10      11  1.1
Away          7       3  0.4
Combined     17      14  0.8

66-75% Possession

          Games  Points  PPG
Home          6       9  1.5
Away          6       6  1.0
Combined     12      15  1.3

So yeah, assuming my calculations are accurate, the more possession we have the worse we get across the board until we reach the 66-75% band, and even that is still marginally below 46-55% possession and some way below 45% or less.

Amazing stuff. As I thought really which is always satisfying! There’s certainly a sweet spot and I wish we could hit it more often

