Dataset for Weather (temperature)
As mentioned in the previous update, one of the data items I want included in the final dataset (to be used for training/modelling) is the weather, or more specifically the temperature on the day of a game. Getting this is a two-stage process:
- Creation of a weather dataset, containing the temperatures for every day within the football season, at each football location.
- Use of the dataset created in step 1, within the pre-processing script (which will create the final dataset), to derive the temperature on each match day.
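As a rough illustration of step 2, the lookup could be sketched in Python as below. This is only a minimal sketch, not the actual pre-processing script: the fixture field names (`match_date`, `home_team`), the function name and the second fixture are hypothetical, and it assumes the weather location can be matched on the home team's name.

```python
from datetime import date

# Weather rows using the weather dataset's column names
# (values taken from the sample rows in this post).
weather_rows = [
    {"Date": date(2016, 8, 1), "team": "Barnsley", "TemperatureAvgC": 16.5},
    {"Date": date(2016, 8, 2), "team": "Barnsley", "TemperatureAvgC": 16.8},
    {"Date": date(2016, 8, 3), "team": "Barnsley", "TemperatureAvgC": 18.9},
]

# Hypothetical fixtures, invented for illustration only.
fixtures = [
    {"match_date": date(2016, 8, 2), "home_team": "Barnsley"},
    {"match_date": date(2016, 8, 20), "home_team": "Barnsley"},  # no weather row
]


def add_match_day_temperature(fixtures, weather_rows):
    """Return copies of the fixtures with the average temperature attached."""
    # Index the weather data by (team, date) so each fixture is a single lookup.
    lookup = {(r["team"], r["Date"]): r["TemperatureAvgC"] for r in weather_rows}
    return [
        # .get() yields None when no weather row exists for that team and day.
        {**f, "temperature_avg_c": lookup.get((f["home_team"], f["match_date"]))}
        for f in fixtures
    ]


matches = add_match_day_temperature(fixtures, weather_rows)
for m in matches:
    print(m["home_team"], m["match_date"], m["temperature_avg_c"])
```

Keying on both team and date matters because the weather dataset holds one row per location per day. A fixture with no matching row comes back with `None`, which makes gaps in the weather data easy to spot.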
Step 1 is the creation of the weather dataset. This is done in RStudio, using an existing R library called weatherData.
For a full view of this weather data, please check out my GitHub link below:
Predicting Soccer Attendances (GitHub) - Weather
For info, here is the format of the weather dataset:
| | Date | TemperatureHighC | TemperatureAvgC | TemperatureLowC | team |
| 1 | 01/08/2016 | 21 | 16.5 | 12 | Barnsley |
| 2 | 02/08/2016 | 20.3 | 16.8 | 13.4 | Barnsley |
| 3 | 03/08/2016 | 21.6 | 18.9 | 16.2 | Barnsley |
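To show how rows in this format might be read on the Python side, here is a small sketch using only the standard library. It assumes the dataset is exported as a comma-separated file with the column names above and without the row index. That layout, the function name and the output keys are my assumptions for illustration. The dates are day-first (01/08/2016 is 1 August), so the format string has to say so explicitly.

```python
import csv
import io
from datetime import datetime

# Sample rows copied from the table above; the CSV layout is assumed.
SAMPLE_CSV = """Date,TemperatureHighC,TemperatureAvgC,TemperatureLowC,team
01/08/2016,21,16.5,12,Barnsley
02/08/2016,20.3,16.8,13.4,Barnsley
03/08/2016,21.6,18.9,16.2,Barnsley
"""


def parse_weather(text):
    """Parse weather CSV text into a list of typed dictionaries."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            # Day-first format: %d/%m/%Y, not the US-style %m/%d/%Y.
            "date": datetime.strptime(raw["Date"], "%d/%m/%Y").date(),
            "high_c": float(raw["TemperatureHighC"]),
            "avg_c": float(raw["TemperatureAvgC"]),
            "low_c": float(raw["TemperatureLowC"]),
            "team": raw["team"],
        })
    return rows


weather = parse_weather(SAMPLE_CSV)
for row in weather:
    print(row["team"], row["date"].isoformat(), row["avg_c"])
```

Converting the dates and temperatures to proper types up front means later comparisons against match dates do not depend on string formatting.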
In the next episode we will look at the pre-processing script, written in Python, which will create the main dataset used for training and producing the model.