Predicting Soccer Attendances (part 3a)

Dataset for Weather (temperature)

As mentioned in the previous update, one of the data items I want in the final dataset (the one used for training and modelling) is the weather, or more specifically the temperature on the day of each game. Getting this is a two-stage process:

  1. Create a weather dataset containing, for every day of the football season, the temperatures at each football location.
  2. Use the dataset from step 1 within the pre-processing script (which will create the final dataset) to derive the temperature on each match day.

Step 1, "creation of a weather dataset", is done in RStudio, using an existing R package called weatherData.
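The actual step 1 script is written in R and is not reproduced here. Purely to illustrate the shape of the dataset it has to produce, here is a minimal Python sketch that builds one row per team per day of the season. The season start and end dates, the second team name and the helper `season_days` are all assumptions made for the example, and the temperature lookup itself is deliberately left out.

```python
from datetime import date, timedelta
from itertools import product


def season_days(start, end):
    """Yield every calendar day from start to end, inclusive."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


# Assumed values, for illustration only (not taken from the real script).
SEASON_START = date(2016, 8, 1)
SEASON_END = date(2017, 5, 31)
TEAMS = ["Barnsley", "Example Town"]  # "Example Town" is a made-up placeholder

# One row per (team, day); the temperature columns would be filled in
# from the weather source for each row.
rows = [
    {"Date": day.strftime("%d/%m/%Y"), "team": team}
    for team, day in product(TEAMS, season_days(SEASON_START, SEASON_END))
]

print(len(rows), rows[0])
```

Building the full team-by-day grid first means every possible match day already has a row, so step 2 never needs to fetch weather on demand.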

For a full view of this weather data, please check out my GitHub link below:

Predicting Soccer Attendances (GitHub) - Weather

For info, here is the format of the weather dataset:

   Date        TemperatureHighC  TemperatureAvgC  TemperatureLowC  team
1  01/08/2016  21                16.5             12               Barnsley
2  02/08/2016  20.3              16.8             13.4             Barnsley
3  03/08/2016  21.6              18.9             16.2             Barnsley
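To show how step 2 might use this format, here is a rough Python sketch that indexes the rows by team and date and then looks up the temperature for a match day. It assumes the dataset is saved as CSV with the column names shown above and that dates are in day/month/year order. The function names `load_weather` and `match_day_temperature` are invented for this example and are not taken from the actual pre-processing script.

```python
import csv
import io
from datetime import date, datetime

# The three sample rows from the table above, written as CSV (assumed layout).
SAMPLE_CSV = """Date,TemperatureHighC,TemperatureAvgC,TemperatureLowC,team
01/08/2016,21,16.5,12,Barnsley
02/08/2016,20.3,16.8,13.4,Barnsley
03/08/2016,21.6,18.9,16.2,Barnsley
"""


def load_weather(csv_text):
    """Index the weather rows by (team, date) for quick match-day lookups."""
    lookup = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: dates are day/month/year, e.g. 01/08/2016 is 1 August.
        day = datetime.strptime(row["Date"], "%d/%m/%Y").date()
        lookup[(row["team"], day)] = {
            "high_c": float(row["TemperatureHighC"]),
            "avg_c": float(row["TemperatureAvgC"]),
            "low_c": float(row["TemperatureLowC"]),
        }
    return lookup


def match_day_temperature(lookup, home_team, match_date):
    """Return the temperatures for a home team's match day, or None if missing."""
    return lookup.get((home_team, match_date))


weather = load_weather(SAMPLE_CSV)
temps = match_day_temperature(weather, "Barnsley", date(2016, 8, 2))
print(temps)
```

Keying on the home team as well as the date matters because the same calendar day has a different temperature at each ground.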

In the next episode we will look at the pre-processing script, written in Python, that creates the main dataset used for training and producing the model.

Part 2
