Home Activity Blog Project Euler

I'm a Machine Learning engineer working for Klarna in Stockholm, Sweden. I work in the AI & Automation group, where we are tackling the problem of web automation at scale.

Prior to this, I completed my PhD in Astrophysics in 2018. I worked on a wide range of topics in astronomy, machine learning and statistics.

### Contact

Feel free to email me
jimbarrett27@gmail.com

Or find me via my various online presences below

## (un-)Predictability of Sports

#### 31/12/2017 - Jim Barrett

A subject of debate (mostly in the pub) over the last few months has been about the predictability of sports results. My hypothesis was that the total number of points scored in a team-vs-team sport should be roughly Poisson distributed, and as such you would expect greater statistical fluctuations in the scores of sports where the score-lines are typically low (e.g. football), compared to sports with a high scoreline (e.g. Basketball). My thinking was that the "better" team should win more consistently in high scoreline sports.

Over the Christmas break, I decided to put my money where my mouth is, and actually investigate it. You can find the Jupyter notebook here. I downloaded a bunch of historical results from football, ice hockey and basketball, together with bookmakers pre-game odds. The data took a bit of cleaning, but I ultimately looked at how often the bookie's favourite lost for each sport. The tl;dr conclusion is that I was wrong. Football is the most accurately predicted, and has the lowest average scoreline.

I had intended to clean the notebook up a little bit, and make some more investigations (e.g. was I wrong because the score-lines aren't Poisson distributed? Or because each team's individual scoreline are too highly correlated with each other?) but whilst I was clearing out some of my external hard-drives I accidentally formatted my internal hard-drive too. That was a lot of fun to fix.... Anyway, I lost all my squeaky clean data, so I couldn't make prettier plots. Apologies.